linux-kernel.vger.kernel.org archive mirror
* [PATCH bpf-next v3 0/7] bpf trampoline for arm64
@ 2022-04-24 15:40 Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 1/7] arm64: ftrace: Add ftrace direct call support Xu Kuohai
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

Add bpf trampoline support for arm64. Most of the logic is the same as
x86.

Tested on qemu, result:
 #18  bpf_tcp_ca:OK
 #51  dummy_st_ops:OK
 #55  fentry_fexit:OK
 #56  fentry_test:OK
 #57  fexit_bpf2bpf:OK
 #58  fexit_sleep:OK
 #59  fexit_stress:OK
 #60  fexit_test:OK
 #67  get_func_args_test:OK
 #68  get_func_ip_test:OK
 #101 modify_return:OK
 #233 xdp_bpf2bpf:OK

Also tested bpftrace kfunc/kretfunc probes; they worked fine.

v3:
- Append test results for bpf_tcp_ca, dummy_st_ops, fexit_bpf2bpf,
  xdp_bpf2bpf
- Add support for poking bpf progs
- Change the return value of arch_prepare_bpf_trampoline() to the total
  number of bytes instead of the number of instructions
- Do not check whether CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled in
  arch_prepare_bpf_trampoline, since the trampoline may be hooked to a bpf
  prog
- Restrict bpf_arch_text_poke() to poke bpf text only, as kernel functions
  are poked by ftrace
- Rewrite trace_direct_tramp() in inline assembly in trace_selftest.c
  to avoid messing up entry-ftrace.S
- Isolate arch_ftrace_set_direct_caller() with the
  CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS macro to avoid a compile
  error when this option is disabled
- Some trivial code style fixes

v2: https://lore.kernel.org/bpf/20220414162220.1985095-1-xukuohai@huawei.com/
- Add Song's ACK
- Change the multi-line comment in is_valid_bpf_tramp_flags() into net
  style (patch 3)
- Fix a deadloop issue in ftrace selftest (patch 2)
- Replace pt_regs->x0 with pt_regs->orig_x0 in patch 1 commit message 
- Replace "bpf trampoline" with "custom trampoline" in patch 1, as
  ftrace direct call is not only used by bpf trampoline.

v1: https://lore.kernel.org/bpf/20220413054959.1053668-1-xukuohai@huawei.com/

Xu Kuohai (7):
  arm64: ftrace: Add ftrace direct call support
  ftrace: Fix deadloop caused by direct call in ftrace selftest
  bpf: Move is_valid_bpf_tramp_flags() to the public trampoline code
  bpf, arm64: Implement bpf_arch_text_poke() for arm64
  bpf, arm64: Support to poke bpf prog
  bpf, arm64: bpf trampoline for arm64
  selftests/bpf: Fix trivial typo in fentry_fexit.c

 arch/arm64/Kconfig                            |   2 +
 arch/arm64/include/asm/ftrace.h               |  12 +
 arch/arm64/kernel/asm-offsets.c               |   1 +
 arch/arm64/kernel/entry-ftrace.S              |  18 +-
 arch/arm64/net/bpf_jit.h                      |   8 +
 arch/arm64/net/bpf_jit_comp.c                 | 446 +++++++++++++++++-
 arch/x86/net/bpf_jit_comp.c                   |  20 -
 include/linux/bpf.h                           |   5 +
 kernel/bpf/bpf_struct_ops.c                   |   4 +-
 kernel/bpf/trampoline.c                       |  34 +-
 kernel/trace/trace_selftest.c                 |  16 +
 .../selftests/bpf/prog_tests/fentry_fexit.c   |   4 +-
 12 files changed, 531 insertions(+), 39 deletions(-)

-- 
2.30.2



* [PATCH bpf-next v3 1/7] arm64: ftrace: Add ftrace direct call support
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest Xu Kuohai
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

Add ftrace direct call support for arm64.

1. When only a custom trampoline is attached, replace the fentry nop
   with a jump instruction that branches directly to the custom
   trampoline.

2. When the ftrace trampoline and a custom trampoline coexist, jump
   from fentry to the ftrace trampoline first, then jump to the custom
   trampoline when the ftrace trampoline exits. The currently unused
   pt_regs->orig_x0 field is used to pass the custom trampoline address
   from the ftrace trampoline to its exit path (see the sketch below).
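
Roughly, the two cases look like this (a sketch based on this patch, not
literal generated code; <func> and <custom_trampoline> are placeholders):

 Case 1, custom trampoline only:
   <func>:
     mov  x9, x30              // patchable entry saves lr
     bl   <custom_trampoline>  // fentry nop replaced with a branch

 Case 2, ftrace trampoline and custom trampoline coexist:
   <func>:
     mov  x9, x30
     bl   <ftrace trampoline>
   // Inside the ftrace trampoline, a direct-call ftrace_ops stores the
   // custom trampoline address via arch_ftrace_set_direct_caller(),
   // i.e. regs->orig_x0 = addr. On exit, ftrace_common reads orig_x0
   // and, if it is non-zero, returns to the custom trampoline instead
   // of returning into <func>.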

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 arch/arm64/Kconfig               |  2 ++
 arch/arm64/include/asm/ftrace.h  | 12 ++++++++++++
 arch/arm64/kernel/asm-offsets.c  |  1 +
 arch/arm64/kernel/entry-ftrace.S | 18 +++++++++++++++---
 4 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 57c4c995965f..81cc330daafc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -177,6 +177,8 @@ config ARM64
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS \
 		if $(cc-option,-fpatchable-function-entry=2)
+	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+		if DYNAMIC_FTRACE_WITH_REGS
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1494cfa8639b..14a35a5df0a1 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -78,6 +78,18 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 	return addr;
 }
 
+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs,
+						 unsigned long addr)
+{
+	/*
+	 * Place custom trampoline address in regs->orig_x0 to let ftrace
+	 * trampoline jump to it.
+	 */
+	regs->orig_x0 = addr;
+}
+#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 struct dyn_ftrace;
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 1197e7679882..b1ed0bf01c59 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -80,6 +80,7 @@ int main(void)
   DEFINE(S_SDEI_TTBR1,		offsetof(struct pt_regs, sdei_ttbr1));
   DEFINE(S_PMR_SAVE,		offsetof(struct pt_regs, pmr_save));
   DEFINE(S_STACKFRAME,		offsetof(struct pt_regs, stackframe));
+  DEFINE(S_ORIG_X0,		offsetof(struct pt_regs, orig_x0));
   DEFINE(PT_REGS_SIZE,		sizeof(struct pt_regs));
   BLANK();
 #ifdef CONFIG_COMPAT
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index e535480a4069..dfe62c55e3a2 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -60,6 +60,9 @@
 	str	x29, [sp, #S_FP]
 	.endif
 
+	/* Set orig_x0 to zero  */
+	str     xzr, [sp, #S_ORIG_X0]
+
 	/* Save the callsite's SP and LR */
 	add	x10, sp, #(PT_REGS_SIZE + 16)
 	stp	x9, x10, [sp, #S_LR]
@@ -119,12 +122,21 @@ ftrace_common_return:
 	/* Restore the callsite's FP, LR, PC */
 	ldr	x29, [sp, #S_FP]
 	ldr	x30, [sp, #S_LR]
-	ldr	x9, [sp, #S_PC]
-
+	ldr	x10, [sp, #S_PC]
+
+	ldr	x11, [sp, #S_ORIG_X0]
+	cbz	x11, 1f
+	/* Set x9 to parent ip before jump to custom trampoline */
+	mov	x9,  x30
+	/* Set lr to self ip */
+	ldr	x30, [sp, #S_PC]
+	/* Set x10 (used for return address) to custom trampoline */
+	mov	x10, x11
+1:
 	/* Restore the callsite's SP */
 	add	sp, sp, #PT_REGS_SIZE + 16
 
-	ret	x9
+	ret	x10
 SYM_CODE_END(ftrace_common)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-- 
2.30.2



* [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 1/7] arm64: ftrace: Add ftrace direct call support Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-04-25 15:05   ` Steven Rostedt
  2022-04-24 15:40 ` [PATCH bpf-next v3 3/7] bpf: Move is_valid_bpf_tramp_flags() to the public trampoline code Xu Kuohai
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

After direct call support is enabled for arm64, the ftrace selftest
enters a dead loop:

<trace_selftest_dynamic_test_func>:
00  bti     c
01  mov     x9, x30                            <trace_direct_tramp>:
02  bl      <trace_direct_tramp>    ---------->     ret
                                                     |
                                         lr/x30 is 03, return to 03
                                                     |
03  mov     w0, #0x0   <-----------------------------|
     |                                               |
     |                   dead loop!                  |
     |                                               |
04  ret   ---- lr/x30 is still 03, go back to 03 ----|

The reason is that when the direct caller trace_direct_tramp() returns
to the patched function trace_selftest_dynamic_test_func(), lr still
holds the address just after the instrumented instruction in the
patched function, so when the patched function returns, it jumps back
to itself!

To fix this issue, lr must be restored before trace_direct_tramp()
returns, so provide a dedicated trace_direct_tramp() for arm64.
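
With a dedicated trampoline that restores lr from x9, as done below, the
flow roughly becomes:

<trace_selftest_dynamic_test_func>:
00  bti     c
01  mov     x9, x30               // x9 = caller's return address
02  bl      <trace_direct_tramp>  // mov x10, x30; mov x30, x9; ret x10
03  mov     w0, #0x0              // x30 holds the caller's address again
04  ret                           // return to the real caller, no loop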

Reported-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 kernel/trace/trace_selftest.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index abcadbe933bb..d2eff2b1d743 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -785,8 +785,24 @@ static struct fgraph_ops fgraph_ops __initdata  = {
 };
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+#ifdef CONFIG_ARM64
+extern void trace_direct_tramp(void);
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		trace_direct_tramp, %function\n"
+"	.global		trace_direct_tramp\n"
+"trace_direct_tramp:"
+"	mov	x10, x30\n"
+"	mov	x30, x9\n"
+"	ret	x10\n"
+"	.size		trace_direct_tramp, .-trace_direct_tramp\n"
+"	.popsection\n"
+);
+#else
 noinline __noclone static void trace_direct_tramp(void) { }
 #endif
+#endif
 
 /*
  * Pretty much the same than for the function tracer from which the selftest
-- 
2.30.2



* [PATCH bpf-next v3 3/7] bpf: Move is_valid_bpf_tramp_flags() to the public trampoline code
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 1/7] arm64: ftrace: Add ftrace direct call support Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64 Xu Kuohai
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

is_valid_bpf_tramp_flags() is not architecture specific, so move it to
the public trampoline code.

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 arch/x86/net/bpf_jit_comp.c | 20 --------------------
 include/linux/bpf.h         |  5 +++++
 kernel/bpf/bpf_struct_ops.c |  4 ++--
 kernel/bpf/trampoline.c     | 34 +++++++++++++++++++++++++++++++---
 4 files changed, 38 insertions(+), 25 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8fe35ed11fd6..774f05f92737 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1900,23 +1900,6 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
 	return 0;
 }
 
-static bool is_valid_bpf_tramp_flags(unsigned int flags)
-{
-	if ((flags & BPF_TRAMP_F_RESTORE_REGS) &&
-	    (flags & BPF_TRAMP_F_SKIP_FRAME))
-		return false;
-
-	/*
-	 * BPF_TRAMP_F_RET_FENTRY_RET is only used by bpf_struct_ops,
-	 * and it must be used alone.
-	 */
-	if ((flags & BPF_TRAMP_F_RET_FENTRY_RET) &&
-	    (flags & ~BPF_TRAMP_F_RET_FENTRY_RET))
-		return false;
-
-	return true;
-}
-
 /* Example:
  * __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev);
  * its 'struct btf_func_model' will be nr_args=2
@@ -1995,9 +1978,6 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	if (nr_args > 6)
 		return -ENOTSUPP;
 
-	if (!is_valid_bpf_tramp_flags(flags))
-		return -EINVAL;
-
 	/* Generated trampoline stack layout:
 	 *
 	 * RBP + 8         [ return address  ]
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7bf441563ffc..90f878de2842 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -706,6 +706,11 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *i
 				const struct btf_func_model *m, u32 flags,
 				struct bpf_tramp_progs *tprogs,
 				void *orig_call);
+int bpf_prepare_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
+			   const struct btf_func_model *m, u32 flags,
+			   struct bpf_tramp_progs *tprogs,
+			   void *orig_call);
+
 /* these two functions are called from generated trampoline */
 u64 notrace __bpf_prog_enter(struct bpf_prog *prog);
 void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index de01d37c2d3b..3248dd86783a 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -325,8 +325,8 @@ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
 	tprogs[BPF_TRAMP_FENTRY].progs[0] = prog;
 	tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
 	flags = model->ret_size > 0 ? BPF_TRAMP_F_RET_FENTRY_RET : 0;
-	return arch_prepare_bpf_trampoline(NULL, image, image_end,
-					   model, flags, tprogs, NULL);
+	return bpf_prepare_trampoline(NULL, image, image_end, model, flags,
+				      tprogs, NULL);
 }
 
 static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index ada97751ae1b..00e6ad80fed2 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -360,9 +360,9 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
 	if (ip_arg)
 		flags |= BPF_TRAMP_F_IP_ARG;
 
-	err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
-					  &tr->func.model, flags, tprogs,
-					  tr->func.addr);
+	err = bpf_prepare_trampoline(im, im->image, im->image + PAGE_SIZE,
+				     &tr->func.model, flags, tprogs,
+				     tr->func.addr);
 	if (err < 0)
 		goto out;
 
@@ -641,6 +641,34 @@ arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image
 	return -ENOTSUPP;
 }
 
+static bool is_valid_bpf_tramp_flags(unsigned int flags)
+{
+	if ((flags & BPF_TRAMP_F_RESTORE_REGS) &&
+	    (flags & BPF_TRAMP_F_SKIP_FRAME))
+		return false;
+
+	/* BPF_TRAMP_F_RET_FENTRY_RET is only used by bpf_struct_ops,
+	 * and it must be used alone.
+	 */
+	if ((flags & BPF_TRAMP_F_RET_FENTRY_RET) &&
+	    (flags & ~BPF_TRAMP_F_RET_FENTRY_RET))
+		return false;
+
+	return true;
+}
+
+int bpf_prepare_trampoline(struct bpf_tramp_image *tr, void *image,
+			   void *image_end, const struct btf_func_model *m,
+			   u32 flags, struct bpf_tramp_progs *tprogs,
+			   void *orig_call)
+{
+	if (!is_valid_bpf_tramp_flags(flags))
+		return -EINVAL;
+
+	return arch_prepare_bpf_trampoline(tr, image, image_end, m, flags,
+					   tprogs, orig_call);
+}
+
 static int __init init_trampolines(void)
 {
 	int i;
-- 
2.30.2



* [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
                   ` (2 preceding siblings ...)
  2022-04-24 15:40 ` [PATCH bpf-next v3 3/7] bpf: Move is_valid_bpf_tramp_flags() to the public trampoline code Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-05-10 11:45   ` Jakub Sitnicki
  2022-05-13 14:59   ` Mark Rutland
  2022-04-24 15:40 ` [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog Xu Kuohai
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

Implement bpf_arch_text_poke() for arm64, so bpf trampoline code can use
it to replace nop with jump, or replace jump with nop.
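
For illustration, the generic trampoline attach/detach paths are expected
to drive it roughly like this (a sketch, not code from this series; ip,
old_tramp and new_tramp are placeholders):

	/* attach: patch the nop at ip into "bl new_tramp" */
	err = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_tramp);

	/* update: re-point an existing "bl old_tramp" to new_tramp */
	err = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_tramp, new_tramp);

	/* detach: turn "bl old_tramp" back into a nop */
	err = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_tramp, NULL);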

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 8ab4035dea27..3f9bdfec54c4 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -9,6 +9,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/bpf.h>
+#include <linux/memory.h>
 #include <linux/filter.h>
 #include <linux/printk.h>
 #include <linux/slab.h>
@@ -18,6 +19,7 @@
 #include <asm/cacheflush.h>
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
+#include <asm/patching.h>
 #include <asm/set_memory.h>
 
 #include "bpf_jit.h"
@@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
 {
 	return vfree(addr);
 }
+
+static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
+			     void *addr, u32 *insn)
+{
+	if (!addr)
+		*insn = aarch64_insn_gen_nop();
+	else
+		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
+						    (unsigned long)addr,
+						    type);
+
+	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
+}
+
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
+		       void *old_addr, void *new_addr)
+{
+	int ret;
+	u32 old_insn;
+	u32 new_insn;
+	u32 replaced;
+	enum aarch64_insn_branch_type branch_type;
+
+	if (!is_bpf_text_address((long)ip))
+		/* Only poking bpf text is supported. Since kernel function
+		 * entry is set up by ftrace, we reply on ftrace to poke kernel
+		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
+		 * called after a failed poke with ftrace. In this case, there
+		 * is probably something wrong with fentry, so there is nothing
+		 * we can do here. See register_fentry, unregister_fentry and
+		 * modify_fentry for details.
+		 */
+		return -EINVAL;
+
+	if (poke_type == BPF_MOD_CALL)
+		branch_type = AARCH64_INSN_BRANCH_LINK;
+	else
+		branch_type = AARCH64_INSN_BRANCH_NOLINK;
+
+	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
+		return -EFAULT;
+
+	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
+		return -EFAULT;
+
+	mutex_lock(&text_mutex);
+	if (aarch64_insn_read(ip, &replaced)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	if (replaced != old_insn) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
+out:
+	mutex_unlock(&text_mutex);
+	return ret;
+}
-- 
2.30.2



* [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
                   ` (3 preceding siblings ...)
  2022-04-24 15:40 ` [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64 Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-05-10  9:36   ` Jakub Sitnicki
  2022-04-24 15:40 ` [PATCH bpf-next v3 6/7] bpf, arm64: bpf trampoline for arm64 Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 7/7] selftests/bpf: Fix trivial typo in fentry_fexit.c Xu Kuohai
  6 siblings, 1 reply; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

1. Set up the bpf prog entry in the same way as fentry to support
   trampoline. Now bpf prog entry looks like this:

   bti c        // if BTI enabled
   mov x9, x30  // save lr
   nop          // to be replaced with jump instruction
   paciasp      // if PAC enabled

2. Update bpf_arch_text_poke() to poke bpf prog. If the instruction
   to be poked is bpf prog's first instruction, skip to the nop
   instruction in the prog entry.
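
   For example, after a trampoline is attached, the prog entry would
   look roughly like this (a sketch; <bpf_trampoline> stands for the
   trampoline image address):

   bti c                 // if BTI enabled
   mov x9, x30           // save lr
   bl  <bpf_trampoline>  // nop replaced by bpf_arch_text_poke()
   paciasp               // if PAC enabled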

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 arch/arm64/net/bpf_jit.h      |  1 +
 arch/arm64/net/bpf_jit_comp.c | 41 +++++++++++++++++++++++++++--------
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index 194c95ccc1cf..1c4b0075a3e2 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -270,6 +270,7 @@
 #define A64_BTI_C  A64_HINT(AARCH64_INSN_HINT_BTIC)
 #define A64_BTI_J  A64_HINT(AARCH64_INSN_HINT_BTIJ)
 #define A64_BTI_JC A64_HINT(AARCH64_INSN_HINT_BTIJC)
+#define A64_NOP    A64_HINT(AARCH64_INSN_HINT_NOP)
 
 /* DMB */
 #define A64_DMB_ISH aarch64_insn_gen_dmb(AARCH64_INSN_MB_ISH)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 3f9bdfec54c4..293bdefc5d0c 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -237,14 +237,23 @@ static bool is_lsi_offset(int offset, int scale)
 	return true;
 }
 
-/* Tail call offset to jump into */
-#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) || \
-	IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
-#define PROLOGUE_OFFSET 9
+#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)
+#define BTI_INSNS	1
+#else
+#define BTI_INSNS	0
+#endif
+
+#if IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
+#define PAC_INSNS	1
 #else
-#define PROLOGUE_OFFSET 8
+#define PAC_INSNS	0
 #endif
 
+/* Tail call offset to jump into */
+#define PROLOGUE_OFFSET	(BTI_INSNS + 2 + PAC_INSNS + 8)
+/* Offset of nop instruction in bpf prog entry to be poked */
+#define POKE_OFFSET	(BTI_INSNS + 1)
+
 static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
 {
 	const struct bpf_prog *prog = ctx->prog;
@@ -281,12 +290,15 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
 	 *
 	 */
 
+	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
+		emit(A64_BTI_C, ctx);
+
+	emit(A64_MOV(1, A64_R(9), A64_LR), ctx);
+	emit(A64_NOP, ctx);
+
 	/* Sign lr */
 	if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL))
 		emit(A64_PACIASP, ctx);
-	/* BTI landing pad */
-	else if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
-		emit(A64_BTI_C, ctx);
 
 	/* Save FP and LR registers to stay align with ARM64 AAPCS */
 	emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
@@ -1552,9 +1564,11 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 	u32 old_insn;
 	u32 new_insn;
 	u32 replaced;
+	unsigned long offset = ~0UL;
 	enum aarch64_insn_branch_type branch_type;
+	char namebuf[KSYM_NAME_LEN];
 
-	if (!is_bpf_text_address((long)ip))
+	if (!__bpf_address_lookup((unsigned long)ip, NULL, &offset, namebuf))
 		/* Only poking bpf text is supported. Since kernel function
 		 * entry is set up by ftrace, we reply on ftrace to poke kernel
 		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
@@ -1565,6 +1579,15 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
 		 */
 		return -EINVAL;
 
+	/* bpf entry */
+	if (offset == 0UL)
+		/* skip to the nop instruction in bpf prog entry:
+		 * bti c	// if BTI enabled
+		 * mov x9, x30
+		 * nop
+		 */
+		ip = (u32 *)ip + POKE_OFFSET;
+
 	if (poke_type == BPF_MOD_CALL)
 		branch_type = AARCH64_INSN_BRANCH_LINK;
 	else
-- 
2.30.2



* [PATCH bpf-next v3 6/7] bpf, arm64: bpf trampoline for arm64
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
                   ` (4 preceding siblings ...)
  2022-04-24 15:40 ` [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  2022-04-24 15:40 ` [PATCH bpf-next v3 7/7] selftests/bpf: Fix trivial typo in fentry_fexit.c Xu Kuohai
  6 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

Add bpf trampoline support for arm64. Most of the logic is the same as
x86.

fentry before bpf trampoline hooked:
 mov x9, x30
 nop

fentry after bpf trampoline hooked:
 mov x9, x30
 bl  <bpf_trampoline>
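
At runtime the generated trampoline then drives the attached progs roughly
as follows (a sketch of the flow implemented below, not literal output):

 <traced function> --bl--> <bpf_trampoline>:
     save args and callee-saved regs on the trampoline stack
     run fentry progs
     run fmod_ret progs (a non-zero return value skips the original body)
     call the original function body if BPF_TRAMP_F_CALL_ORIG is set
     run fexit progs
     restore regs and return, either back into the traced function or
     directly to its caller when BPF_TRAMP_F_SKIP_FRAME is set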

Tested on qemu, result:
 #18  bpf_tcp_ca:OK
 #51  dummy_st_ops:OK
 #55  fentry_fexit:OK
 #56  fentry_test:OK
 #57  fexit_bpf2bpf:OK
 #58  fexit_sleep:OK
 #59  fexit_stress:OK
 #60  fexit_test:OK
 #67  get_func_args_test:OK
 #68  get_func_ip_test:OK
 #101 modify_return:OK
 #233 xdp_bpf2bpf:OK

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 arch/arm64/net/bpf_jit.h      |   7 +
 arch/arm64/net/bpf_jit_comp.c | 344 +++++++++++++++++++++++++++++++++-
 2 files changed, 350 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index 1c4b0075a3e2..82a624b0d16e 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -90,6 +90,13 @@
 /* Rt = Rn[0]; Rt2 = Rn[8]; Rn += 16; */
 #define A64_POP(Rt, Rt2, Rn)  A64_LS_PAIR(Rt, Rt2, Rn, 16, LOAD, POST_INDEX)
 
+/* Rn[imm] = Xt1; Rn[imm + 8] = Xt2 */
+#define A64_STP(Xt1, Xt2, Xn, imm) \
+	A64_LS_PAIR(Xt1, Xt2, Xn, imm, STORE, SIGNED_OFFSET)
+/* Xt1 = Rn[imm]; Xt2 = Rn[imm + 8] */
+#define A64_LDP(Xt1, Xt2, Xn, imm) \
+	A64_LS_PAIR(Xt1, Xt2, Xn, imm, LOAD, SIGNED_OFFSET)
+
 /* Load/store exclusive */
 #define A64_SIZE(sf) \
 	((sf) ? AARCH64_INSN_SIZE_64 : AARCH64_INSN_SIZE_32)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 293bdefc5d0c..cf8ca957c747 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1349,6 +1349,13 @@ static int validate_code(struct jit_ctx *ctx)
 		if (a64_insn == AARCH64_BREAK_FAULT)
 			return -1;
 	}
+	return 0;
+}
+
+static int validate_ctx(struct jit_ctx *ctx)
+{
+	if (validate_code(ctx))
+		return -1;
 
 	if (WARN_ON_ONCE(ctx->exentry_idx != ctx->prog->aux->num_exentries))
 		return -1;
@@ -1473,7 +1480,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	build_epilogue(&ctx);
 
 	/* 3. Extra pass to validate JITed code. */
-	if (validate_code(&ctx)) {
+	if (validate_ctx(&ctx)) {
 		bpf_jit_binary_free(header);
 		prog = orig_prog;
 		goto out_off;
@@ -1544,6 +1551,341 @@ void bpf_jit_free_exec(void *addr)
 	return vfree(addr);
 }
 
+static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_prog *p,
+			    int args_off, int retval_off, bool save_ret)
+{
+	u32 *branch;
+	u64 enter_prog;
+	u64 exit_prog;
+	u8 tmp = bpf2a64[TMP_REG_1];
+	u8 r0 = bpf2a64[BPF_REG_0];
+
+	if (p->aux->sleepable) {
+		enter_prog = (u64)__bpf_prog_enter_sleepable;
+		exit_prog = (u64)__bpf_prog_exit_sleepable;
+	} else {
+		enter_prog = (u64)__bpf_prog_enter;
+		exit_prog = (u64)__bpf_prog_exit;
+	}
+
+	/* arg1: prog */
+	emit_addr_mov_i64(A64_R(0), (const u64)p, ctx);
+	/* bl __bpf_prog_enter */
+	emit_addr_mov_i64(tmp, enter_prog, ctx);
+	emit(A64_BLR(tmp), ctx);
+
+	/* if (__bpf_prog_enter(prog) == 0)
+	 *         goto skip_exec_of_prog;
+	 */
+	branch = ctx->image + ctx->idx;
+	emit(A64_NOP, ctx);
+
+	/* move return value to x19 */
+	emit(A64_MOV(1, A64_R(19), r0), ctx);
+
+	/* bl bpf_prog */
+	emit(A64_ADD_I(1, A64_R(0), A64_SP, args_off), ctx);
+	if (!p->jited)
+		emit_addr_mov_i64(A64_R(1), (const u64)p->insnsi, ctx);
+	emit_addr_mov_i64(tmp, (const u64)p->bpf_func, ctx);
+	emit(A64_BLR(tmp), ctx);
+
+	/* store return value */
+	if (save_ret)
+		emit(A64_STR64I(r0, A64_SP, retval_off), ctx);
+
+	if (ctx->image) {
+		int offset = &ctx->image[ctx->idx] - branch;
+		*branch = A64_CBZ(1, A64_R(0), offset);
+	}
+
+	/* arg1: prog */
+	emit_addr_mov_i64(A64_R(0), (const u64)p, ctx);
+	/* arg2: start time */
+	emit(A64_MOV(1, A64_R(1), A64_R(19)), ctx);
+	/* bl __bpf_prog_exit */
+	emit_addr_mov_i64(tmp, exit_prog, ctx);
+	emit(A64_BLR(tmp), ctx);
+}
+
+static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_progs *tp,
+			       int args_off, int retval_off, u32 **branches)
+{
+	int i;
+
+	/* The first fmod_ret program will receive a garbage return value.
+	 * Set this to 0 to avoid confusing the program.
+	 */
+	emit(A64_STR64I(A64_ZR, A64_SP, retval_off), ctx);
+	for (i = 0; i < tp->nr_progs; i++) {
+		invoke_bpf_prog(ctx, tp->progs[i], args_off, retval_off, true);
+		/* if (*(u64 *)(sp + retval_off) !=  0)
+		 *	goto do_fexit;
+		 */
+		emit(A64_LDR64I(A64_R(10), A64_SP, retval_off), ctx);
+		/* Save the location of branch, and generate a nop.
+		 * This nop will be replaced with a cbnz later.
+		 */
+		branches[i] = ctx->image + ctx->idx;
+		emit(A64_NOP, ctx);
+	}
+}
+
+static void save_args(struct jit_ctx *ctx, int args_off, int nargs)
+{
+	int i;
+
+	for (i = 0; i < nargs; i++) {
+		emit(A64_STR64I(i, A64_SP, args_off), ctx);
+		args_off += 8;
+	}
+}
+
+static void restore_args(struct jit_ctx *ctx, int args_off, int nargs)
+{
+	int i;
+
+	for (i = 0; i < nargs; i++) {
+		emit(A64_LDR64I(i, A64_SP, args_off), ctx);
+		args_off += 8;
+	}
+}
+
+/*
+ * Based on the x86's implementation of arch_prepare_bpf_trampoline().
+ *
+ * We rely on DYNAMIC_FTRACE_WITH_REGS to set return address and nop
+ * for fentry.
+ *
+ * fentry before bpf trampoline hooked:
+ *   mov x9, x30
+ *   nop
+ *
+ * fentry after bpf trampoline hooked:
+ *   mov x9, x30
+ *   bl  <bpf_trampoline>
+ *
+ */
+static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
+			      struct bpf_tramp_progs *tprogs, void *orig_call,
+			      int nargs, u32 flags)
+{
+	int i;
+	int stack_size;
+	int retaddr_off;
+	int regs_off;
+	int retval_off;
+	int args_off;
+	int nargs_off;
+	int ip_off;
+	struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
+	struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
+	struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
+	bool save_ret;
+	u32 **branches = NULL;
+
+	/*
+	 * trampoline stack layout:
+	 *                  [ parent ip       ]
+	 *                  [ FP              ]
+	 * SP + retaddr_off [ self ip         ]
+	 * FP               [ FP              ]
+	 *
+	 * sp + regs_off    [ x19             ] callee-saved regs, currently
+	 *                                      only x19 is used
+	 *
+	 * SP + retval_off  [ return value    ] BPF_TRAMP_F_CALL_ORIG or
+	 *                                      BPF_TRAMP_F_RET_FENTRY_RET flags
+	 *
+	 *                  [ argN            ]
+	 *                  [ ...             ]
+	 * sp + args_off    [ arg1            ]
+	 *
+	 * SP + nargs_off   [ args count      ]
+	 *
+	 * SP + ip_off      [ traced function ] BPF_TRAMP_F_IP_ARG flag
+	 */
+
+	stack_size = 0;
+	ip_off = stack_size;
+
+	/* room for IP address argument */
+	if (flags & BPF_TRAMP_F_IP_ARG)
+		stack_size += 8;
+
+	nargs_off = stack_size;
+	/* room for args count */
+	stack_size += 8;
+
+	args_off = stack_size;
+	/* room for args */
+	stack_size += nargs * 8;
+
+	/* room for return value */
+	retval_off = stack_size;
+	save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
+	if (save_ret)
+		stack_size += 8;
+
+	/* room for callee-saved registers, currently only x19 is used */
+	regs_off = stack_size;
+	stack_size += 8;
+
+	retaddr_off = stack_size + 8;
+
+	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
+		emit(A64_BTI_C, ctx);
+
+	/* frame for parent function */
+	emit(A64_PUSH(A64_FP, A64_R(9), A64_SP), ctx);
+	emit(A64_MOV(1, A64_FP, A64_SP), ctx);
+
+	/* frame for patched function */
+	emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
+	emit(A64_MOV(1, A64_FP, A64_SP), ctx);
+
+	/* allocate stack space */
+	emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
+
+	if (flags & BPF_TRAMP_F_IP_ARG) {
+		/* save ip address of the traced function */
+		emit_addr_mov_i64(A64_R(10), (const u64)orig_call, ctx);
+		emit(A64_STR64I(A64_R(10), A64_SP, ip_off), ctx);
+	}
+
+	/* save args count*/
+	emit(A64_MOVZ(1, A64_R(10), nargs, 0), ctx);
+	emit(A64_STR64I(A64_R(10), A64_SP, nargs_off), ctx);
+
+	/* save args */
+	save_args(ctx, args_off, nargs);
+
+	/* save callee saved registers */
+	emit(A64_STR64I(A64_R(19), A64_SP, regs_off), ctx);
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		emit_addr_mov_i64(A64_R(0), (const u64)im, ctx);
+		emit_addr_mov_i64(A64_R(10), (const u64)__bpf_tramp_enter, ctx);
+		emit(A64_BLR(A64_R(10)), ctx);
+	}
+
+	for (i = 0; i < fentry->nr_progs; i++)
+		invoke_bpf_prog(ctx, fentry->progs[i], args_off, retval_off,
+				flags & BPF_TRAMP_F_RET_FENTRY_RET);
+
+	if (fmod_ret->nr_progs) {
+		branches = kcalloc(fmod_ret->nr_progs, sizeof(u32 *),
+				   GFP_KERNEL);
+		if (!branches)
+			return -ENOMEM;
+
+		invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
+				   branches);
+	}
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		restore_args(ctx, args_off, nargs);
+		emit(A64_LDR64I(A64_R(10), A64_SP, retaddr_off), ctx);
+		/* call original func */
+		emit(A64_BLR(A64_R(10)), ctx);
+		/* store return value */
+		emit(A64_STR64I(A64_R(0), A64_SP, retval_off), ctx);
+		/* reserve a nop */
+		im->ip_after_call = ctx->image + ctx->idx;
+		emit(A64_NOP, ctx);
+	}
+
+	/* update the branches saved in invoke_bpf_mod_ret with cbnz */
+	for (i = 0; i < fmod_ret->nr_progs && ctx->image != NULL; i++) {
+		int offset = &ctx->image[ctx->idx] - branches[i];
+		*branches[i] = A64_CBNZ(1, A64_R(10), offset);
+	}
+
+	for (i = 0; i < fexit->nr_progs; i++)
+		invoke_bpf_prog(ctx, fexit->progs[i], args_off, retval_off,
+				false);
+
+	if (flags & BPF_TRAMP_F_RESTORE_REGS)
+		restore_args(ctx, args_off, nargs);
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		im->ip_epilogue = ctx->image + ctx->idx;
+		emit_addr_mov_i64(A64_R(0), (const u64)im, ctx);
+		emit_addr_mov_i64(A64_R(10), (const u64)__bpf_tramp_exit, ctx);
+		emit(A64_BLR(A64_R(10)), ctx);
+	}
+
+	/* restore x19 */
+	emit(A64_LDR64I(A64_R(19), A64_SP, regs_off), ctx);
+
+	if (save_ret)
+		emit(A64_LDR64I(A64_R(0), A64_SP, retval_off), ctx);
+
+	/* reset SP  */
+	emit(A64_MOV(1, A64_SP, A64_FP), ctx);
+
+	/* pop frames  */
+	emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
+	emit(A64_POP(A64_FP, A64_R(9), A64_SP), ctx);
+
+	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
+		/* skip patched function, return to parent */
+		emit(A64_MOV(1, A64_LR, A64_R(9)), ctx);
+		emit(A64_RET(A64_R(9)), ctx);
+	} else {
+		/* return to patched function */
+		emit(A64_MOV(1, A64_R(10), A64_LR), ctx);
+		emit(A64_MOV(1, A64_LR, A64_R(9)), ctx);
+		emit(A64_RET(A64_R(10)), ctx);
+	}
+
+	if (ctx->image)
+		bpf_flush_icache(ctx->image, ctx->image + ctx->idx);
+
+	kfree(branches);
+
+	return ctx->idx;
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
+				void *image_end, const struct btf_func_model *m,
+				u32 flags, struct bpf_tramp_progs *tprogs,
+				void *orig_call)
+{
+	int ret;
+	int nargs = m->nr_args;
+	int max_insns = ((long)image_end - (long)image) / AARCH64_INSN_SIZE;
+	struct jit_ctx ctx = {
+		.image = NULL,
+		.idx = 0
+	};
+
+	/* the first 8 arguments are passed by registers */
+	if (nargs > 8)
+		return -ENOTSUPP;
+
+	ret = prepare_trampoline(&ctx, im, tprogs, orig_call, nargs, flags);
+	if (ret < 0)
+		return ret;
+
+	if (ret > max_insns)
+		return -EFBIG;
+
+	ctx.image = image;
+	ctx.idx = 0;
+
+	jit_fill_hole(image, (unsigned int)(image_end - image));
+	ret = prepare_trampoline(&ctx, im, tprogs, orig_call, nargs, flags);
+
+	if (ret > 0 && validate_code(&ctx) < 0)
+		ret = -EINVAL;
+
+	if (ret > 0)
+		ret *= AARCH64_INSN_SIZE;
+
+	return ret;
+}
+
 static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
 			     void *addr, u32 *insn)
 {
-- 
2.30.2



* [PATCH bpf-next v3 7/7] selftests/bpf: Fix trivial typo in fentry_fexit.c
  2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
                   ` (5 preceding siblings ...)
  2022-04-24 15:40 ` [PATCH bpf-next v3 6/7] bpf, arm64: bpf trampoline for arm64 Xu Kuohai
@ 2022-04-24 15:40 ` Xu Kuohai
  6 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-24 15:40 UTC (permalink / raw)
  To: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest
  Cc: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

The "ipv6" word in assertion message should be "fentry_fexit".

Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
---
 tools/testing/selftests/bpf/prog_tests/fentry_fexit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
index 130f5b82d2e6..e3c139bde46e 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
@@ -28,8 +28,8 @@ void test_fentry_fexit(void)
 
 	prog_fd = fexit_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
-	ASSERT_OK(err, "ipv6 test_run");
-	ASSERT_OK(topts.retval, "ipv6 test retval");
+	ASSERT_OK(err, "fentry_fexit test_run");
+	ASSERT_OK(topts.retval, "fentry_fexit test retval");
 
 	fentry_res = (__u64 *)fentry_skel->bss;
 	fexit_res = (__u64 *)fexit_skel->bss;
-- 
2.30.2



* Re: [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest
  2022-04-24 15:40 ` [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest Xu Kuohai
@ 2022-04-25 15:05   ` Steven Rostedt
  2022-04-26  7:36     ` Xu Kuohai
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2022-04-25 15:05 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Ingo Molnar, Daniel Borkmann,
	Alexei Starovoitov, Zi Shen Lim, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, David S . Miller, Hideaki YOSHIFUJI, David Ahern,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86, hpa,
	Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer, Mark Rutland,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On Sun, 24 Apr 2022 11:40:23 -0400
Xu Kuohai <xukuohai@huawei.com> wrote:

> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
> index abcadbe933bb..d2eff2b1d743 100644
> --- a/kernel/trace/trace_selftest.c
> +++ b/kernel/trace/trace_selftest.c
> @@ -785,8 +785,24 @@ static struct fgraph_ops fgraph_ops __initdata  = {
>  };
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> +#ifdef CONFIG_ARM64

Please find a way to add this in arm specific code. Do not add architecture
defines in generic code.

You could add:

#ifndef ARCH_HAVE_FTRACE_DIRECT_TEST_FUNC
noinline __noclone static void trace_direct_tramp(void) { }
#endif

here, and in arch/arm64/include/ftrace.h

#define ARCH_HAVE_FTRACE_DIRECT_TEST_FUNC

and define your test function in the arm64 specific code.

-- Steve




> +extern void trace_direct_tramp(void);
> +
> +asm (
> +"	.pushsection	.text, \"ax\", @progbits\n"
> +"	.type		trace_direct_tramp, %function\n"
> +"	.global		trace_direct_tramp\n"
> +"trace_direct_tramp:"
> +"	mov	x10, x30\n"
> +"	mov	x30, x9\n"
> +"	ret	x10\n"
> +"	.size		trace_direct_tramp, .-trace_direct_tramp\n"
> +"	.popsection\n"
> +);
> +#else
>  noinline __noclone static void trace_direct_tramp(void) { }
>  #endif
> +#endif
>  
>  /*
>   * Pretty much the same than for the function tracer from which the selftest



* Re: [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest
  2022-04-25 15:05   ` Steven Rostedt
@ 2022-04-26  7:36     ` Xu Kuohai
  0 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-04-26  7:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Ingo Molnar, Daniel Borkmann,
	Alexei Starovoitov, Zi Shen Lim, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, David S . Miller, Hideaki YOSHIFUJI, David Ahern,
	Thomas Gleixner, Borislav Petkov, Dave Hansen, x86, hpa,
	Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer, Mark Rutland,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On 4/25/2022 11:05 PM, Steven Rostedt wrote:
> On Sun, 24 Apr 2022 11:40:23 -0400
> Xu Kuohai <xukuohai@huawei.com> wrote:
> 
>> diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
>> index abcadbe933bb..d2eff2b1d743 100644
>> --- a/kernel/trace/trace_selftest.c
>> +++ b/kernel/trace/trace_selftest.c
>> @@ -785,8 +785,24 @@ static struct fgraph_ops fgraph_ops __initdata  = {
>>  };
>>  
>>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +#ifdef CONFIG_ARM64
> 
> Please find a way to add this in arm specific code. Do not add architecture
> defines in generic code.
> 
> You could add:
> 
> #ifndef ARCH_HAVE_FTRACE_DIRECT_TEST_FUNC
> noinline __noclone static void trace_direct_tramp(void) { }
> #endif
> 
> here, and in arch/arm64/include/ftrace.h
> 
> #define ARCH_HAVE_FTRACE_DIRECT_TEST_FUNC
> 
> and define your test function in the arm64 specific code.
> 
> -- Steve
> 
> 

will move this to arch/arm64/ in v4, thanks.

> 
> 
>> +extern void trace_direct_tramp(void);
>> +
>> +asm (
>> +"	.pushsection	.text, \"ax\", @progbits\n"
>> +"	.type		trace_direct_tramp, %function\n"
>> +"	.global		trace_direct_tramp\n"
>> +"trace_direct_tramp:"
>> +"	mov	x10, x30\n"
>> +"	mov	x30, x9\n"
>> +"	ret	x10\n"
>> +"	.size		trace_direct_tramp, .-trace_direct_tramp\n"
>> +"	.popsection\n"
>> +);
>> +#else
>>  noinline __noclone static void trace_direct_tramp(void) { }
>>  #endif
>> +#endif
>>  
>>  /*
>>   * Pretty much the same than for the function tracer from which the selftest
> 
> .



* Re: [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog
  2022-04-24 15:40 ` [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog Xu Kuohai
@ 2022-05-10  9:36   ` Jakub Sitnicki
  2022-05-11  3:12     ` Xu Kuohai
  0 siblings, 1 reply; 19+ messages in thread
From: Jakub Sitnicki @ 2022-05-10  9:36 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

Thanks for incorporating the attach to BPF progs bits into the series.

I have a couple minor comments. Please see below.

On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:
> 1. Set up the bpf prog entry in the same way as fentry to support
>    trampoline. Now bpf prog entry looks like this:
>
>    bti c        // if BTI enabled
>    mov x9, x30  // save lr
>    nop          // to be replaced with jump instruction
>    paciasp      // if PAC enabled
>
> 2. Update bpf_arch_text_poke() to poke bpf prog. If the instruction
>    to be poked is bpf prog's first instruction, skip to the nop
>    instruction in the prog entry.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  arch/arm64/net/bpf_jit.h      |  1 +
>  arch/arm64/net/bpf_jit_comp.c | 41 +++++++++++++++++++++++++++--------
>  2 files changed, 33 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
> index 194c95ccc1cf..1c4b0075a3e2 100644
> --- a/arch/arm64/net/bpf_jit.h
> +++ b/arch/arm64/net/bpf_jit.h
> @@ -270,6 +270,7 @@
>  #define A64_BTI_C  A64_HINT(AARCH64_INSN_HINT_BTIC)
>  #define A64_BTI_J  A64_HINT(AARCH64_INSN_HINT_BTIJ)
>  #define A64_BTI_JC A64_HINT(AARCH64_INSN_HINT_BTIJC)
> +#define A64_NOP    A64_HINT(AARCH64_INSN_HINT_NOP)
>  
>  /* DMB */
>  #define A64_DMB_ISH aarch64_insn_gen_dmb(AARCH64_INSN_MB_ISH)
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 3f9bdfec54c4..293bdefc5d0c 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -237,14 +237,23 @@ static bool is_lsi_offset(int offset, int scale)
>  	return true;
>  }
>  
> -/* Tail call offset to jump into */
> -#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) || \
> -	IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
> -#define PROLOGUE_OFFSET 9
> +#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)
> +#define BTI_INSNS	1
> +#else
> +#define BTI_INSNS	0
> +#endif
> +
> +#if IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
> +#define PAC_INSNS	1
>  #else
> -#define PROLOGUE_OFFSET 8
> +#define PAC_INSNS	0
>  #endif

Above can be folded into:

#define BTI_INSNS (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) ? 1 : 0)
#define PAC_INSNS (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) ? 1 : 0)

>  
> +/* Tail call offset to jump into */
> +#define PROLOGUE_OFFSET	(BTI_INSNS + 2 + PAC_INSNS + 8)
> +/* Offset of nop instruction in bpf prog entry to be poked */
> +#define POKE_OFFSET	(BTI_INSNS + 1)
> +
>  static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>  {
>  	const struct bpf_prog *prog = ctx->prog;
> @@ -281,12 +290,15 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>  	 *
>  	 */
>  
> +	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> +		emit(A64_BTI_C, ctx);

I'm no arm64 expert, but this looks like a fix for BTI.

Currently we never emit BTI because ARM64_BTI_KERNEL depends on
ARM64_PTR_AUTH_KERNEL, while BTI must be the first instruction for the
jump target [1]. Am I following correctly?

[1] https://lwn.net/Articles/804982/

> +
> +	emit(A64_MOV(1, A64_R(9), A64_LR), ctx);
> +	emit(A64_NOP, ctx);
> +
>  	/* Sign lr */
>  	if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL))
>  		emit(A64_PACIASP, ctx);
> -	/* BTI landing pad */
> -	else if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> -		emit(A64_BTI_C, ctx);
>  
>  	/* Save FP and LR registers to stay align with ARM64 AAPCS */
>  	emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
> @@ -1552,9 +1564,11 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>  	u32 old_insn;
>  	u32 new_insn;
>  	u32 replaced;
> +	unsigned long offset = ~0UL;
>  	enum aarch64_insn_branch_type branch_type;
> +	char namebuf[KSYM_NAME_LEN];
>  
> -	if (!is_bpf_text_address((long)ip))
> +	if (!__bpf_address_lookup((unsigned long)ip, NULL, &offset, namebuf))
>  		/* Only poking bpf text is supported. Since kernel function
>  		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>  		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
> @@ -1565,6 +1579,15 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>  		 */
>  		return -EINVAL;
>  
> +	/* bpf entry */
> +	if (offset == 0UL)
> +		/* skip to the nop instruction in bpf prog entry:
> +		 * bti c	// if BTI enabled
> +		 * mov x9, x30
> +		 * nop
> +		 */
> +		ip = (u32 *)ip + POKE_OFFSET;

This is very much personal preference; however, I find the use of pointer
arithmetic too clever here. I would go for the more verbose:

        offset = POKE_OFFSET * AARCH64_INSN_SIZE;          
        ip = (void *)((unsigned long)ip + offset);

> +
>  	if (poke_type == BPF_MOD_CALL)
>  		branch_type = AARCH64_INSN_BRANCH_LINK;
>  	else

I think it'd make more sense to merge this patch with patch 4 (the
preceding one).

The initial implementation of bpf_arch_text_poke() from patch 4 is not
fully functional, as it will always fail for bpf_arch_text_poke(ip,
BPF_MOD_CALL, ...) calls. At least, I find it a bit confusing.

Otherwise than that:

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>


* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64
  2022-04-24 15:40 ` [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64 Xu Kuohai
@ 2022-05-10 11:45   ` Jakub Sitnicki
  2022-05-11  3:18     ` Xu Kuohai
  2022-05-13 14:59   ` Mark Rutland
  1 sibling, 1 reply; 19+ messages in thread
From: Jakub Sitnicki @ 2022-05-10 11:45 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:
> Implement bpf_arch_text_poke() for arm64, so bpf trampoline code can use
> it to replace nop with jump, or replace jump with nop.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> Acked-by: Song Liu <songliubraving@fb.com>
> ---
>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
>
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 8ab4035dea27..3f9bdfec54c4 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -9,6 +9,7 @@
>  
>  #include <linux/bitfield.h>
>  #include <linux/bpf.h>
> +#include <linux/memory.h>
>  #include <linux/filter.h>
>  #include <linux/printk.h>
>  #include <linux/slab.h>
> @@ -18,6 +19,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/debug-monitors.h>
>  #include <asm/insn.h>
> +#include <asm/patching.h>
>  #include <asm/set_memory.h>
>  
>  #include "bpf_jit.h"
> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>  {
>  	return vfree(addr);
>  }
> +
> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
> +			     void *addr, u32 *insn)
> +{
> +	if (!addr)
> +		*insn = aarch64_insn_gen_nop();
> +	else
> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
> +						    (unsigned long)addr,
> +						    type);
> +
> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> +		       void *old_addr, void *new_addr)
> +{
> +	int ret;
> +	u32 old_insn;
> +	u32 new_insn;
> +	u32 replaced;
> +	enum aarch64_insn_branch_type branch_type;
> +
> +	if (!is_bpf_text_address((long)ip))
> +		/* Only poking bpf text is supported. Since kernel function
> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only

Nit: s/funcitons/functions/

> +		 * called after a failed poke with ftrace. In this case, there
> +		 * is probably something wrong with fentry, so there is nothing
> +		 * we can do here. See register_fentry, unregister_fentry and
> +		 * modify_fentry for details.
> +		 */
> +		return -EINVAL;
> +
> +	if (poke_type == BPF_MOD_CALL)
> +		branch_type = AARCH64_INSN_BRANCH_LINK;
> +	else
> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
> +
> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
> +		return -EFAULT;
> +
> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
> +		return -EFAULT;
> +
> +	mutex_lock(&text_mutex);
> +	if (aarch64_insn_read(ip, &replaced)) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	if (replaced != old_insn) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);

Nit: No need for the explicit cast to void *. Type already matches.

> +out:
> +	mutex_unlock(&text_mutex);
> +	return ret;
> +}



* Re: [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog
  2022-05-10  9:36   ` Jakub Sitnicki
@ 2022-05-11  3:12     ` Xu Kuohai
  2022-05-12 10:54       ` Jakub Sitnicki
  0 siblings, 1 reply; 19+ messages in thread
From: Xu Kuohai @ 2022-05-11  3:12 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

On 5/10/2022 5:36 PM, Jakub Sitnicki wrote:
> Thanks for incorporating the attach to BPF progs bits into the series.
> 
> I have a couple minor comments. Please see below.
> 
> On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:
>> 1. Set up the bpf prog entry in the same way as fentry to support
>>    trampoline. Now bpf prog entry looks like this:
>>
>>    bti c        // if BTI enabled
>>    mov x9, x30  // save lr
>>    nop          // to be replaced with jump instruction
>>    paciasp      // if PAC enabled
>>
>> 2. Update bpf_arch_text_poke() to poke bpf prog. If the instruction
>>    to be poked is bpf prog's first instruction, skip to the nop
>>    instruction in the prog entry.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
>>  arch/arm64/net/bpf_jit.h      |  1 +
>>  arch/arm64/net/bpf_jit_comp.c | 41 +++++++++++++++++++++++++++--------
>>  2 files changed, 33 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
>> index 194c95ccc1cf..1c4b0075a3e2 100644
>> --- a/arch/arm64/net/bpf_jit.h
>> +++ b/arch/arm64/net/bpf_jit.h
>> @@ -270,6 +270,7 @@
>>  #define A64_BTI_C  A64_HINT(AARCH64_INSN_HINT_BTIC)
>>  #define A64_BTI_J  A64_HINT(AARCH64_INSN_HINT_BTIJ)
>>  #define A64_BTI_JC A64_HINT(AARCH64_INSN_HINT_BTIJC)
>> +#define A64_NOP    A64_HINT(AARCH64_INSN_HINT_NOP)
>>  
>>  /* DMB */
>>  #define A64_DMB_ISH aarch64_insn_gen_dmb(AARCH64_INSN_MB_ISH)
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 3f9bdfec54c4..293bdefc5d0c 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
>> @@ -237,14 +237,23 @@ static bool is_lsi_offset(int offset, int scale)
>>  	return true;
>>  }
>>  
>> -/* Tail call offset to jump into */
>> -#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) || \
>> -	IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
>> -#define PROLOGUE_OFFSET 9
>> +#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)
>> +#define BTI_INSNS	1
>> +#else
>> +#define BTI_INSNS	0
>> +#endif
>> +
>> +#if IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
>> +#define PAC_INSNS	1
>>  #else
>> -#define PROLOGUE_OFFSET 8
>> +#define PAC_INSNS	0
>>  #endif
> 
> Above can be folded into:
> 
> #define BTI_INSNS (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) ? 1 : 0)
> #define PAC_INSNS (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) ? 1 : 0)
> 

will fix in v4

>>  
>> +/* Tail call offset to jump into */
>> +#define PROLOGUE_OFFSET	(BTI_INSNS + 2 + PAC_INSNS + 8)
>> +/* Offset of nop instruction in bpf prog entry to be poked */
>> +#define POKE_OFFSET	(BTI_INSNS + 1)
>> +
>>  static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>>  {
>>  	const struct bpf_prog *prog = ctx->prog;
>> @@ -281,12 +290,15 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>>  	 *
>>  	 */
>>  
>> +	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
>> +		emit(A64_BTI_C, ctx);
> 
> I'm no arm64 expert, but this looks like a fix for BTI.
> 
> Currently we never emit BTI because ARM64_BTI_KERNEL depends on
> ARM64_PTR_AUTH_KERNEL, while BTI must be the first instruction for the
> jump target [1]. Am I following correctly?
> 
> [1] https://lwn.net/Articles/804982/
> 

Not quite correct. When the jump target is a PACIASP instruction, no
Branch Target Exception is generated, so there is no need to insert a
BTI before PACIASP [2].

In order to attach a trampoline to a bpf prog, a MOV and a NOP are
inserted before the PACIASP, so a BTI instruction is required to avoid a
Branch Target Exception.

The reason for inserting the NOP before the PACIASP instead of after it
is that no call frame is built before entering the trampoline, so there
is no return address on the stack and nothing to be protected by PACIASP.

[2]
https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/BTI--Branch-Target-Identification-?lang=en
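
For reference, the patched prog entry described above would look roughly
like this (a sketch assuming both CONFIG_ARM64_BTI_KERNEL and
CONFIG_ARM64_PTR_AUTH_KERNEL are enabled, pieced together from the commit
message and the prologue code, not the exact JIT output):

	bti c                       // landing pad, since MOV (not PACIASP) is now the first insn
	mov x9, x30                 // save lr, which a patched-in bl would clobber
	nop                         // POKE_OFFSET target, patched to "bl <trampoline>" by bpf_arch_text_poke()
	paciasp                     // sign lr, protecting the frame pushed below
	stp x29, x30, [sp, #-16]!   // save FP and LR to stay aligned with AAPCS
	...                         // rest of the regular BPF prologue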

>> +
>> +	emit(A64_MOV(1, A64_R(9), A64_LR), ctx);
>> +	emit(A64_NOP, ctx);
>> +
>>  	/* Sign lr */
>>  	if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL))
>>  		emit(A64_PACIASP, ctx);
>> -	/* BTI landing pad */
>> -	else if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
>> -		emit(A64_BTI_C, ctx);
>>  
>>  	/* Save FP and LR registers to stay align with ARM64 AAPCS */
>>  	emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
>> @@ -1552,9 +1564,11 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>>  	u32 old_insn;
>>  	u32 new_insn;
>>  	u32 replaced;
>> +	unsigned long offset = ~0UL;
>>  	enum aarch64_insn_branch_type branch_type;
>> +	char namebuf[KSYM_NAME_LEN];
>>  
>> -	if (!is_bpf_text_address((long)ip))
>> +	if (!__bpf_address_lookup((unsigned long)ip, NULL, &offset, namebuf))
>>  		/* Only poking bpf text is supported. Since kernel function
>>  		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>>  		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
>> @@ -1565,6 +1579,15 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>>  		 */
>>  		return -EINVAL;
>>  
>> +	/* bpf entry */
>> +	if (offset == 0UL)
>> +		/* skip to the nop instruction in bpf prog entry:
>> +		 * bti c	// if BTI enabled
>> +		 * mov x9, x30
>> +		 * nop
>> +		 */
>> +		ip = (u32 *)ip + POKE_OFFSET;
> 
> This is very much personal preference, however, I find the use of
> pointer arithmetic too clever here. Would go for a more verbose:
> 
>         offset = POKE_OFFSET * AARCH64_INSN_SIZE;          
>         ip = (void *)((unsigned long)ip + offset);
> 

will change in v4.

>> +
>>  	if (poke_type == BPF_MOD_CALL)
>>  		branch_type = AARCH64_INSN_BRANCH_LINK;
>>  	else
> 
> I think it'd make more sense to merge this patch with patch 4 (the
> preceding one).
> 
> Initial implementation of of bpf_arch_text_poke() from patch 4 is not
> fully functional, as it will always fail for bpf_arch_text_poke(ip,
> BPF_MOD_CALL, ...) calls. At least, I find it a bit confusing.

will merge in v4

> 
> Otherwise than that:
> 
> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> 
> .
Thanks for the review!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
  2022-05-10 11:45   ` Jakub Sitnicki
@ 2022-05-11  3:18     ` Xu Kuohai
  0 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-05-11  3:18 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

On 5/10/2022 7:45 PM, Jakub Sitnicki wrote:
> On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:
>> Impelment bpf_arch_text_poke() for arm64, so bpf trampoline code can use
>> it to replace nop with jump, or replace jump with nop.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> Acked-by: Song Liu <songliubraving@fb.com>
>> ---
>>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 8ab4035dea27..3f9bdfec54c4 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
>> @@ -9,6 +9,7 @@
>>  
>>  #include <linux/bitfield.h>
>>  #include <linux/bpf.h>
>> +#include <linux/memory.h>
>>  #include <linux/filter.h>
>>  #include <linux/printk.h>
>>  #include <linux/slab.h>
>> @@ -18,6 +19,7 @@
>>  #include <asm/cacheflush.h>
>>  #include <asm/debug-monitors.h>
>>  #include <asm/insn.h>
>> +#include <asm/patching.h>
>>  #include <asm/set_memory.h>
>>  
>>  #include "bpf_jit.h"
>> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>>  {
>>  	return vfree(addr);
>>  }
>> +
>> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
>> +			     void *addr, u32 *insn)
>> +{
>> +	if (!addr)
>> +		*insn = aarch64_insn_gen_nop();
>> +	else
>> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
>> +						    (unsigned long)addr,
>> +						    type);
>> +
>> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
>> +}
>> +
>> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>> +		       void *old_addr, void *new_addr)
>> +{
>> +	int ret;
>> +	u32 old_insn;
>> +	u32 new_insn;
>> +	u32 replaced;
>> +	enum aarch64_insn_branch_type branch_type;
>> +
>> +	if (!is_bpf_text_address((long)ip))
>> +		/* Only poking bpf text is supported. Since kernel function
>> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
> 
> Nit: s/funcitons/functions/
> 
>> +		 * called after a failed poke with ftrace. In this case, there
>> +		 * is probably something wrong with fentry, so there is nothing
>> +		 * we can do here. See register_fentry, unregister_fentry and
>> +		 * modify_fentry for details.
>> +		 */
>> +		return -EINVAL;
>> +
>> +	if (poke_type == BPF_MOD_CALL)
>> +		branch_type = AARCH64_INSN_BRANCH_LINK;
>> +	else
>> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	mutex_lock(&text_mutex);
>> +	if (aarch64_insn_read(ip, &replaced)) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	if (replaced != old_insn) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
> 
> Nit: No need for the explicit cast to void *. Type already matches.
> 
>> +out:
>> +	mutex_unlock(&text_mutex);
>> +	return ret;
>> +}
> 
> .
will fix in v4, thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog
  2022-05-11  3:12     ` Xu Kuohai
@ 2022-05-12 10:54       ` Jakub Sitnicki
  0 siblings, 0 replies; 19+ messages in thread
From: Jakub Sitnicki @ 2022-05-12 10:54 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Mark Rutland, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
	Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
	Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi

On Wed, May 11, 2022 at 11:12 AM +08, Xu Kuohai wrote:
> On 5/10/2022 5:36 PM, Jakub Sitnicki wrote:
>> On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:

[...]

>>> @@ -281,12 +290,15 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>>>  	 *
>>>  	 */
>>>  
>>> +	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
>>> +		emit(A64_BTI_C, ctx);
>> 
>> I'm no arm64 expert, but this looks like a fix for BTI.
>> 
>> Currently we never emit BTI because ARM64_BTI_KERNEL depends on
>> ARM64_PTR_AUTH_KERNEL, while BTI must be the first instruction for the
>> jump target [1]. Am I following correctly?
>> 
>> [1] https://lwn.net/Articles/804982/
>> 
>
> Not quite correct. When the jump target is a PACIASP instruction, no
> Branch Target Exception is generated, so there is no need to insert a
> BTI before PACIASP [2].
>
> In order to attach a trampoline to a bpf prog, a MOV and a NOP are
> inserted before the PACIASP, so a BTI instruction is required to avoid a
> Branch Target Exception.
>
> The reason for inserting the NOP before the PACIASP instead of after it
> is that no call frame is built before entering the trampoline, so there
> is no return address on the stack and nothing to be protected by PACIASP.
>
> [2]
> https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/BTI--Branch-Target-Identification-?lang=en

That makes sense. Thanks for the explanation!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
  2022-04-24 15:40 ` [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64 Xu Kuohai
  2022-05-10 11:45   ` Jakub Sitnicki
@ 2022-05-13 14:59   ` Mark Rutland
  2022-05-16  6:55     ` Xu Kuohai
  1 sibling, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2022-05-13 14:59 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
> Impelment bpf_arch_text_poke() for arm64, so bpf trampoline code can use
> it to replace nop with jump, or replace jump with nop.
> 
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> Acked-by: Song Liu <songliubraving@fb.com>
> ---
>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 8ab4035dea27..3f9bdfec54c4 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -9,6 +9,7 @@
>  
>  #include <linux/bitfield.h>
>  #include <linux/bpf.h>
> +#include <linux/memory.h>
>  #include <linux/filter.h>
>  #include <linux/printk.h>
>  #include <linux/slab.h>
> @@ -18,6 +19,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/debug-monitors.h>
>  #include <asm/insn.h>
> +#include <asm/patching.h>
>  #include <asm/set_memory.h>
>  
>  #include "bpf_jit.h"
> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>  {
>  	return vfree(addr);
>  }
> +
> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
> +			     void *addr, u32 *insn)
> +{
> +	if (!addr)
> +		*insn = aarch64_insn_gen_nop();
> +	else
> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
> +						    (unsigned long)addr,
> +						    type);
> +
> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> +		       void *old_addr, void *new_addr)
> +{
> +	int ret;
> +	u32 old_insn;
> +	u32 new_insn;
> +	u32 replaced;
> +	enum aarch64_insn_branch_type branch_type;
> +
> +	if (!is_bpf_text_address((long)ip))
> +		/* Only poking bpf text is supported. Since kernel function
> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
> +		 * called after a failed poke with ftrace. In this case, there
> +		 * is probably something wrong with fentry, so there is nothing
> +		 * we can do here. See register_fentry, unregister_fentry and
> +		 * modify_fentry for details.
> +		 */
> +		return -EINVAL;

If you rely on ftrace to poke functions, why do you need to patch text
at all? Why does the rest of this function exist?

I really don't like having another piece of code outside of ftrace
patching the ftrace patch-site; this needs a much better explanation.

> +
> +	if (poke_type == BPF_MOD_CALL)
> +		branch_type = AARCH64_INSN_BRANCH_LINK;
> +	else
> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
> +
> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
> +		return -EFAULT;
> +
> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
> +		return -EFAULT;
> +
> +	mutex_lock(&text_mutex);
> +	if (aarch64_insn_read(ip, &replaced)) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	if (replaced != old_insn) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);

... and where does the actual synchronization come from in this case?

Thanks,
Mark.

> +out:
> +	mutex_unlock(&text_mutex);
> +	return ret;
> +}
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
  2022-05-13 14:59   ` Mark Rutland
@ 2022-05-16  6:55     ` Xu Kuohai
  2022-05-16  7:18       ` Mark Rutland
  0 siblings, 1 reply; 19+ messages in thread
From: Xu Kuohai @ 2022-05-16  6:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On 5/13/2022 10:59 PM, Mark Rutland wrote:
> On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
>> Impelment bpf_arch_text_poke() for arm64, so bpf trampoline code can use
>> it to replace nop with jump, or replace jump with nop.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> Acked-by: Song Liu <songliubraving@fb.com>
>> ---
>>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 8ab4035dea27..3f9bdfec54c4 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
>> @@ -9,6 +9,7 @@
>>  
>>  #include <linux/bitfield.h>
>>  #include <linux/bpf.h>
>> +#include <linux/memory.h>
>>  #include <linux/filter.h>
>>  #include <linux/printk.h>
>>  #include <linux/slab.h>
>> @@ -18,6 +19,7 @@
>>  #include <asm/cacheflush.h>
>>  #include <asm/debug-monitors.h>
>>  #include <asm/insn.h>
>> +#include <asm/patching.h>
>>  #include <asm/set_memory.h>
>>  
>>  #include "bpf_jit.h"
>> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>>  {
>>  	return vfree(addr);
>>  }
>> +
>> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
>> +			     void *addr, u32 *insn)
>> +{
>> +	if (!addr)
>> +		*insn = aarch64_insn_gen_nop();
>> +	else
>> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
>> +						    (unsigned long)addr,
>> +						    type);
>> +
>> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
>> +}
>> +
>> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>> +		       void *old_addr, void *new_addr)
>> +{
>> +	int ret;
>> +	u32 old_insn;
>> +	u32 new_insn;
>> +	u32 replaced;
>> +	enum aarch64_insn_branch_type branch_type;
>> +
>> +	if (!is_bpf_text_address((long)ip))
>> +		/* Only poking bpf text is supported. Since kernel function
>> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
>> +		 * called after a failed poke with ftrace. In this case, there
>> +		 * is probably something wrong with fentry, so there is nothing
>> +		 * we can do here. See register_fentry, unregister_fentry and
>> +		 * modify_fentry for details.
>> +		 */
>> +		return -EINVAL;
> 
> If you rely on ftrace to poke functions, why do you need to patch text
> at all? Why does the rest of this function exist?
> 
> I really don't like having another piece of code outside of ftrace
> patching the ftrace patch-site; this needs a much better explanation.
> 

Sorry for the incorrect explanation in the comment. I don't think it's
reasonable to patch an ftrace patch-site outside of the ftrace code
either.

The patching logic in register_fentry, unregister_fentry and
modify_fentry is as follows:

if (tr->func.ftrace_managed)
        ret = register_ftrace_direct((long)ip, (long)new_addr);
else
        ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr,
                                 true);

ftrace patch-site is patched by ftrace code. bpf_arch_text_poke() is
only used to patch bpf prog and bpf trampoline, which are not managed by
ftrace.

>> +
>> +	if (poke_type == BPF_MOD_CALL)
>> +		branch_type = AARCH64_INSN_BRANCH_LINK;
>> +	else
>> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	mutex_lock(&text_mutex);
>> +	if (aarch64_insn_read(ip, &replaced)) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	if (replaced != old_insn) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
> 
> ... and where does the actual synchronization come from in this case?
> 

aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
no other CPUs will fetch a half-new and half-old instruction.

The scenario here is that there is a chance that another CPU fetches the
old instruction after bpf_arch_text_poke() finishes, that is, different
CPUs may execute different versions of instructions at the same time.

1. When a new trampoline is attached, it doesn't seem to be an issue for
different CPUs to jump to different trampolines temporarily.

2. When an old trampoline is freed, we should wait for all other CPUs to
exit the trampoline and make sure the trampoline is no longer reachable.
IIUC, the bpf_tramp_image_put() function already uses percpu_ref and rcu
tasks to do this.

> Thanks,
> Mark.
> 
>> +out:
>> +	mutex_unlock(&text_mutex);
>> +	return ret;
>> +}
>> -- 
>> 2.30.2
>>
> .


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
  2022-05-16  6:55     ` Xu Kuohai
@ 2022-05-16  7:18       ` Mark Rutland
  2022-05-16  7:58         ` Xu Kuohai
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Rutland @ 2022-05-16  7:18 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On Mon, May 16, 2022 at 02:55:46PM +0800, Xu Kuohai wrote:
> On 5/13/2022 10:59 PM, Mark Rutland wrote:
> > On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
> >> Impelment bpf_arch_text_poke() for arm64, so bpf trampoline code can use
> >> it to replace nop with jump, or replace jump with nop.
> >>
> >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> >> Acked-by: Song Liu <songliubraving@fb.com>
> >> ---
> >>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
> >>  1 file changed, 63 insertions(+)
> >>
> >> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> >> index 8ab4035dea27..3f9bdfec54c4 100644
> >> --- a/arch/arm64/net/bpf_jit_comp.c
> >> +++ b/arch/arm64/net/bpf_jit_comp.c
> >> @@ -9,6 +9,7 @@
> >>  
> >>  #include <linux/bitfield.h>
> >>  #include <linux/bpf.h>
> >> +#include <linux/memory.h>
> >>  #include <linux/filter.h>
> >>  #include <linux/printk.h>
> >>  #include <linux/slab.h>
> >> @@ -18,6 +19,7 @@
> >>  #include <asm/cacheflush.h>
> >>  #include <asm/debug-monitors.h>
> >>  #include <asm/insn.h>
> >> +#include <asm/patching.h>
> >>  #include <asm/set_memory.h>
> >>  
> >>  #include "bpf_jit.h"
> >> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
> >>  {
> >>  	return vfree(addr);
> >>  }
> >> +
> >> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
> >> +			     void *addr, u32 *insn)
> >> +{
> >> +	if (!addr)
> >> +		*insn = aarch64_insn_gen_nop();
> >> +	else
> >> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
> >> +						    (unsigned long)addr,
> >> +						    type);
> >> +
> >> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
> >> +}
> >> +
> >> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> >> +		       void *old_addr, void *new_addr)
> >> +{
> >> +	int ret;
> >> +	u32 old_insn;
> >> +	u32 new_insn;
> >> +	u32 replaced;
> >> +	enum aarch64_insn_branch_type branch_type;
> >> +
> >> +	if (!is_bpf_text_address((long)ip))
> >> +		/* Only poking bpf text is supported. Since kernel function
> >> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
> >> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
> >> +		 * called after a failed poke with ftrace. In this case, there
> >> +		 * is probably something wrong with fentry, so there is nothing
> >> +		 * we can do here. See register_fentry, unregister_fentry and
> >> +		 * modify_fentry for details.
> >> +		 */
> >> +		return -EINVAL;
> > 
> > If you rely on ftrace to poke functions, why do you need to patch text
> > at all? Why does the rest of this function exist?
> > 
> > I really don't like having another piece of code outside of ftrace
> > patching the ftrace patch-site; this needs a much better explanation.
> > 
> 
> Sorry for the incorrect explanation in the comment. I don't think it's
> reasonable to patch an ftrace patch-site outside of the ftrace code
> either.
> 
> The patching logic in register_fentry, unregister_fentry and
> modify_fentry is as follows:
> 
> if (tr->func.ftrace_managed)
>         ret = register_ftrace_direct((long)ip, (long)new_addr);
> else
>         ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr,
>                                  true);
> 
> ftrace patch-site is patched by ftrace code. bpf_arch_text_poke() is
> only used to patch bpf prog and bpf trampoline, which are not managed by
> ftrace.

Sorry, I had misunderstood. Thanks for the correction!

I'll have another look with that in mind.

> >> +
> >> +	if (poke_type == BPF_MOD_CALL)
> >> +		branch_type = AARCH64_INSN_BRANCH_LINK;
> >> +	else
> >> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
> >> +
> >> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
> >> +		return -EFAULT;
> >> +
> >> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
> >> +		return -EFAULT;
> >> +
> >> +	mutex_lock(&text_mutex);
> >> +	if (aarch64_insn_read(ip, &replaced)) {
> >> +		ret = -EFAULT;
> >> +		goto out;
> >> +	}
> >> +
> >> +	if (replaced != old_insn) {
> >> +		ret = -EFAULT;
> >> +		goto out;
> >> +	}
> >> +
> >> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
> > 
> > ... and where does the actual synchronization come from in this case?
> 
> aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
> no other CPUs will fetch a half-new and half-old instruction.
> 
> The scenario here is that there is a chance that another CPU fetches the
> old instruction after bpf_arch_text_poke() finishes, that is, different
> CPUs may execute different versions of instructions at the same time.
> 
> 1. When a new trampoline is attached, it doesn't seem to be an issue for
> different CPUs to jump to different trampolines temporarily.
>
> 2. When an old trampoline is freed, we should wait for all other CPUs to
> exit the trampoline and make sure the trampoline is no longer reachable.
> IIUC, the bpf_tramp_image_put() function already uses percpu_ref and rcu
> tasks to do this.

It would be good to have a comment for these points.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
  2022-05-16  7:18       ` Mark Rutland
@ 2022-05-16  7:58         ` Xu Kuohai
  0 siblings, 0 replies; 19+ messages in thread
From: Xu Kuohai @ 2022-05-16  7:58 UTC (permalink / raw)
  To: Mark Rutland
  Cc: bpf, linux-arm-kernel, linux-kernel, netdev, linux-kselftest,
	Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
	Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, David S . Miller, Hideaki YOSHIFUJI,
	David Ahern, Thomas Gleixner, Borislav Petkov, Dave Hansen, x86,
	hpa, Shuah Khan, Jakub Kicinski, Jesper Dangaard Brouer,
	Pasha Tatashin, Ard Biesheuvel, Daniel Kiss, Steven Price,
	Sudeep Holla, Marc Zyngier, Peter Collingbourne, Mark Brown,
	Delyan Kratunov, Kumar Kartikeya Dwivedi

On 5/16/2022 3:18 PM, Mark Rutland wrote:
> On Mon, May 16, 2022 at 02:55:46PM +0800, Xu Kuohai wrote:
>> On 5/13/2022 10:59 PM, Mark Rutland wrote:
>>> On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
>>>> Impelment bpf_arch_text_poke() for arm64, so bpf trampoline code can use
>>>> it to replace nop with jump, or replace jump with nop.
>>>>
>>>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>>>> Acked-by: Song Liu <songliubraving@fb.com>
>>>> ---
>>>>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 63 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>>>> index 8ab4035dea27..3f9bdfec54c4 100644
>>>> --- a/arch/arm64/net/bpf_jit_comp.c
>>>> +++ b/arch/arm64/net/bpf_jit_comp.c
>>>> @@ -9,6 +9,7 @@
>>>>  
>>>>  #include <linux/bitfield.h>
>>>>  #include <linux/bpf.h>
>>>> +#include <linux/memory.h>
>>>>  #include <linux/filter.h>
>>>>  #include <linux/printk.h>
>>>>  #include <linux/slab.h>
>>>> @@ -18,6 +19,7 @@
>>>>  #include <asm/cacheflush.h>
>>>>  #include <asm/debug-monitors.h>
>>>>  #include <asm/insn.h>
>>>> +#include <asm/patching.h>
>>>>  #include <asm/set_memory.h>
>>>>  
>>>>  #include "bpf_jit.h"
>>>> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>>>>  {
>>>>  	return vfree(addr);
>>>>  }
>>>> +
>>>> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
>>>> +			     void *addr, u32 *insn)
>>>> +{
>>>> +	if (!addr)
>>>> +		*insn = aarch64_insn_gen_nop();
>>>> +	else
>>>> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
>>>> +						    (unsigned long)addr,
>>>> +						    type);
>>>> +
>>>> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
>>>> +}
>>>> +
>>>> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>>>> +		       void *old_addr, void *new_addr)
>>>> +{
>>>> +	int ret;
>>>> +	u32 old_insn;
>>>> +	u32 new_insn;
>>>> +	u32 replaced;
>>>> +	enum aarch64_insn_branch_type branch_type;
>>>> +
>>>> +	if (!is_bpf_text_address((long)ip))
>>>> +		/* Only poking bpf text is supported. Since kernel function
>>>> +		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>>>> +		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
>>>> +		 * called after a failed poke with ftrace. In this case, there
>>>> +		 * is probably something wrong with fentry, so there is nothing
>>>> +		 * we can do here. See register_fentry, unregister_fentry and
>>>> +		 * modify_fentry for details.
>>>> +		 */
>>>> +		return -EINVAL;
>>>
>>> If you rely on ftrace to poke functions, why do you need to patch text
>>> at all? Why does the rest of this function exist?
>>>
>>> I really don't like having another piece of code outside of ftrace
>>> patching the ftrace patch-site; this needs a much better explanation.
>>>
>>
>> Sorry for the incorrect explanation in the comment. I don't think it's
>> reasonable to patch an ftrace patch-site outside of the ftrace code
>> either.
>>
>> The patching logic in register_fentry, unregister_fentry and
>> modify_fentry is as follows:
>>
>> if (tr->func.ftrace_managed)
>>         ret = register_ftrace_direct((long)ip, (long)new_addr);
>> else
>>         ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr,
>>                                  true);
>>
>> ftrace patch-site is patched by ftrace code. bpf_arch_text_poke() is
>> only used to patch bpf prog and bpf trampoline, which are not managed by
>> ftrace.
> 
> Sorry, I had misunderstood. Thanks for the correction!
> 
> I'll have another look with that in mind.
>>>> +
>>>> +	if (poke_type == BPF_MOD_CALL)
>>>> +		branch_type = AARCH64_INSN_BRANCH_LINK;
>>>> +	else
>>>> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
>>>> +
>>>> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
>>>> +		return -EFAULT;
>>>> +
>>>> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
>>>> +		return -EFAULT;
>>>> +
>>>> +	mutex_lock(&text_mutex);
>>>> +	if (aarch64_insn_read(ip, &replaced)) {
>>>> +		ret = -EFAULT;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	if (replaced != old_insn) {
>>>> +		ret = -EFAULT;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
>>>
>>> ... and where does the actual synchronization come from in this case?
>>
>> aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
>> no other CPUs will fetch a half-new and half-old instruction.
>>
>> The scenario here is that there is a chance that another CPU fetches the
>> old instruction after bpf_arch_text_poke() finishes, that is, different
>> CPUs may execute different versions of instructions at the same time.
>>
>> 1. When a new trampoline is attached, it doesn't seem to be an issue for
>> different CPUs to jump to different trampolines temporarily.
>>
>> 2. When an old trampoline is freed, we should wait for all other CPUs to
>> exit the trampoline and make sure the trampoline is no longer reachable.
>> IIUC, the bpf_tramp_image_put() function already uses percpu_ref and rcu
>> tasks to do this.
> 
> It would be good to have a comment for these points.

will add a comment for this in v4, thanks!
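
For illustration, the comment could capture the two points above roughly
like this (just a sketch, not the final v4 wording):

/* Replacing one instruction with aarch64_insn_patch_text_nosync() is
 * atomic, so no CPU can fetch a half-new and half-old instruction, but
 * other CPUs may still execute the old instruction for a short window
 * after the poke:
 *
 * 1. When a new trampoline is attached, it is harmless for different
 *    CPUs to temporarily jump to different trampolines.
 *
 * 2. When an old trampoline is freed, all other CPUs must have exited
 *    the trampoline and it must no longer be reachable;
 *    bpf_tramp_image_put() already ensures this with percpu_ref and
 *    rcu tasks.
 */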

> Thanks,
> Mark.
> .


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-05-16  7:59 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-24 15:40 [PATCH bpf-next v3 0/7] bpf trampoline for arm64 Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 1/7] arm64: ftrace: Add ftrace direct call support Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 2/7] ftrace: Fix deadloop caused by direct call in ftrace selftest Xu Kuohai
2022-04-25 15:05   ` Steven Rostedt
2022-04-26  7:36     ` Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 3/7] bpf: Move is_valid_bpf_tramp_flags() to the public trampoline code Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64 Xu Kuohai
2022-05-10 11:45   ` Jakub Sitnicki
2022-05-11  3:18     ` Xu Kuohai
2022-05-13 14:59   ` Mark Rutland
2022-05-16  6:55     ` Xu Kuohai
2022-05-16  7:18       ` Mark Rutland
2022-05-16  7:58         ` Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog Xu Kuohai
2022-05-10  9:36   ` Jakub Sitnicki
2022-05-11  3:12     ` Xu Kuohai
2022-05-12 10:54       ` Jakub Sitnicki
2022-04-24 15:40 ` [PATCH bpf-next v3 6/7] bpf, arm64: bpf trampoline for arm64 Xu Kuohai
2022-04-24 15:40 ` [PATCH bpf-next v3 7/7] selftests/bpf: Fix trivial typo in fentry_fexit.c Xu Kuohai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).