* [RFC PATCH riscv/for-next 0/4] Enable ftrace with kernel preemption for RISC-V
@ 2022-05-29 14:33 Andy Chiu
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size Andy Chiu
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Andy Chiu @ 2022-05-29 14:33 UTC (permalink / raw)
  To: linux-riscv, palmer, mingo, paul.walmsley, aou, rostedt; +Cc: guoren, Andy Chiu

This patchset removes dynamic ftrace's dependency on stop_machine() and
makes it compatible with kernel preemption. Originally, we ran into stack
corruption, or execution of partially updated instructions, when starting
or stopping ftrace on a fully preemptible kernel configuration. The reason
is that the kernel periodically calls rcu_momentary_dyntick_idle() on cores
waiting for the core that is patching code for ftrace. Although
rcu_momentary_dyntick_idle() itself is marked notrace, it calls a number of
traceable functions when the kernel is configured as preemptible. For
example, these functions happen to have a symbol and are not marked notrace
on a preemptible RISC-V kernel compiled with GCC 11:
 - __rcu_report_exp_rnp()
 - rcu_report_exp_cpu_mult()
 - rcu_preempt_deferred_qs()
 - rcu_preempt_need_deferred_qs()
 - rcu_preempt_deferred_qs_irqrestore()

This makes it impractical to rely on stop_machine() and hand-placed
"notrace" annotations to perform runtime code patching. To remove that
dependency, code updates must appear atomic to the running cores. This is
not straightforward on RISC-V, since it usually uses an AUIPC + JALR pair
to perform a long jump, and that pair cannot be modified and executed
concurrently once preemption is considered. This patchset therefore
proposes a way to make it possible: it embeds a 32-bit relative address
into each ftrace prologue and jumps to it indirectly. That way the address
can be stored and loaded atomically, so the code-patching core can run
simultaneously with the rest of the running cores.
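
A minimal sketch of the resulting update sequence, using the names
introduced in patch 3 (the helper name below is made up for illustration,
and error handling is omitted):

struct ftrace_prologue {
	unsigned int insn0;		/* "J FUNC" when disabled, "J FTRACE" when enabled */
	unsigned int jmp_target;	/* 32-bit relative offset, always 4-byte aligned */
};

static void enable_ftrace_call(struct ftrace_prologue *p, unsigned long target)
{
	unsigned int jump = INSN0;	/* "J PC + 0x08", into the trampoline */
	unsigned int offset = (unsigned int)(target -
				((unsigned long)p + FUNC_ENTRY_JMP));

	/* 1. publish the new offset; an aligned 4-byte store/load is atomic */
	patch_insn_write(&p->jmp_target, &offset, sizeof(offset));
	/* 2. order the offset against the jump that makes it reachable */
	smp_wmb();
	/* 3. flip the first instruction so callers start taking the detour */
	patch_text_nosync(&p->insn0, &jump, sizeof(jump));
}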

After applying the patchset, we compiled a preemptible kernel with all
tracers and the ftrace selftests enabled, and booted it on a 2-core QEMU
virt machine. The kernel booted successfully and passed all ftrace test
suites. In addition, we ran a script that randomly picks a tracer every
0-5 seconds. The kernel has sustained over 20K rounds of this test. In
contrast, a preemptible kernel without our patches panics within a few
rounds on the same machine.

However, we did run into errors when using the hwlat or irqsoff tracers
together with the cpu-online stressor from stress-ng on a preemptible
kernel. The stressor randomly takes one of CPUs 1-N offline and back online
through the hotplug mechanism. Nonetheless, since we think it is unlikely
to be related to our ftrace changes, we are working on that issue in
parallel and may come up with another patch soon.

Andy Chiu (4):
  riscv: align ftrace to 4 Byte boundary and increase ftrace prologue
    size
  riscv: export patch_insn_write
  riscv: ftrace: use indirect jump to work with kernel preemption
  riscv: ftrace: do not use stop_machine to update code

 arch/riscv/Makefile             |   2 +-
 arch/riscv/include/asm/ftrace.h |  24 ------
 arch/riscv/include/asm/patch.h  |   1 +
 arch/riscv/kernel/ftrace.c      | 135 ++++++++++++++++++++------------
 arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
 arch/riscv/kernel/patch.c       |   4 +-
 6 files changed, 136 insertions(+), 92 deletions(-)

-- 
2.36.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size
  2022-05-29 14:33 [RFC PATCH riscv/for-next 0/4] Enable ftrace with kernel preemption for RISC-V Andy Chiu
@ 2022-05-29 14:33 ` Andy Chiu
  2022-07-21  5:40   ` Palmer Dabbelt
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 2/4] riscv: export patch_insn_write Andy Chiu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Andy Chiu @ 2022-05-29 14:33 UTC (permalink / raw)
  To: linux-riscv, palmer, mingo, paul.walmsley, aou, rostedt
  Cc: guoren, Andy Chiu, Greentime Hu, Zong Li

We are introducing a new ftrace mechanism in order to phase out
stop_machine() and enable kernel preemption. The new mechanism requires
ftrace patchable function entries to be 24 bytes and aligned to 4 Byte
boundaries.

Before applying this patch, the size of the kernel code, with 43432
ftrace entries, was 12.31 MB. Under the same configuration, the size
has increased to 12.68 MB after applying this patch set.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
---
 arch/riscv/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 1f19bdac6767..cb6cfdffc49c 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -11,7 +11,7 @@ LDFLAGS_vmlinux :=
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
 	LDFLAGS_vmlinux := --no-relax
 	KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
-	CC_FLAGS_FTRACE := -fpatchable-function-entry=8
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=12  -falign-functions=4
 endif
 
 ifeq ($(CONFIG_CMODEL_MEDLOW),y)
-- 
2.36.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* [RFC PATCH riscv/for-next 2/4] riscv: export patch_insn_write
  2022-05-29 14:33 [RFC PATCH riscv/for-next 0/4] Enable ftrace with kernel preemption for RISC-V Andy Chiu
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size Andy Chiu
@ 2022-05-29 14:33 ` Andy Chiu
  2022-07-21  5:40   ` Palmer Dabbelt
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 3/4] riscv: ftrace: use indirect jump to work with kernel preemption Andy Chiu
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code Andy Chiu
  3 siblings, 1 reply; 10+ messages in thread
From: Andy Chiu @ 2022-05-29 14:33 UTC (permalink / raw)
  To: linux-riscv, palmer, mingo, paul.walmsley, aou, rostedt
  Cc: guoren, Andy Chiu, Greentime Hu, Zong Li

So that we may patch code without issuing fence.i
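
For reference, patch_text_nosync() in arch/riscv/kernel/patch.c is roughly
patch_insn_write() followed by an icache flush, along the lines of this
simplified sketch:

int patch_text_nosync(void *addr, const void *insns, size_t len)
{
	int ret = patch_insn_write(addr, insns, len);

	if (!ret)
		flush_icache_range((uintptr_t)addr, (uintptr_t)addr + len);

	return ret;
}

Exporting patch_insn_write() lets ftrace update the 32-bit jump-target word
in the prologue without paying for a fence.i on every update.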

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
---
 arch/riscv/include/asm/patch.h | 1 +
 arch/riscv/kernel/patch.c      | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
index 9a7d7346001e..327e99114d67 100644
--- a/arch/riscv/include/asm/patch.h
+++ b/arch/riscv/include/asm/patch.h
@@ -6,6 +6,7 @@
 #ifndef _ASM_RISCV_PATCH_H
 #define _ASM_RISCV_PATCH_H
 
+int patch_insn_write(void *addr, const void *insn, size_t len);
 int patch_text_nosync(void *addr, const void *insns, size_t len);
 int patch_text(void *addr, u32 insn);
 
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 0b552873a577..834855cfd0fb 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -49,7 +49,7 @@ static void patch_unmap(int fixmap)
 }
 NOKPROBE_SYMBOL(patch_unmap);
 
-static int patch_insn_write(void *addr, const void *insn, size_t len)
+int patch_insn_write(void *addr, const void *insn, size_t len)
 {
 	void *waddr = addr;
 	bool across_pages = (((uintptr_t) addr & ~PAGE_MASK) + len) > PAGE_SIZE;
@@ -78,7 +78,7 @@ static int patch_insn_write(void *addr, const void *insn, size_t len)
 }
 NOKPROBE_SYMBOL(patch_insn_write);
 #else
-static int patch_insn_write(void *addr, const void *insn, size_t len)
+int patch_insn_write(void *addr, const void *insn, size_t len)
 {
 	return copy_to_kernel_nofault(addr, insn, len);
 }
-- 
2.36.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* [RFC PATCH riscv/for-next 3/4] riscv: ftrace: use indirect jump to work with kernel preemption
  2022-05-29 14:33 [RFC PATCH riscv/for-next 0/4] Enable ftrace with kernel preemption for RISC-V Andy Chiu
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size Andy Chiu
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 2/4] riscv: export patch_insn_write Andy Chiu
@ 2022-05-29 14:33 ` Andy Chiu
  2022-07-21  5:40   ` Palmer Dabbelt
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code Andy Chiu
  3 siblings, 1 reply; 10+ messages in thread
From: Andy Chiu @ 2022-05-29 14:33 UTC (permalink / raw)
  To: linux-riscv, palmer, mingo, paul.walmsley, aou, rostedt
  Cc: guoren, Andy Chiu, Greentime Hu, Zong Li

On RISC-V, we must use an AUIPC + JALR pair to encode an immediate and
form a jump to an address more than 4K away. This may cause errors if we
want to enable kernel preemption and drop the stop_machine() dependency
when patching code. For example, a task could be switched out on the
auipc; if we then changed the ftrace function before it was switched back,
it would jump to an address whose updated 11:0 bits are mixed with the
previous XLEN:12 part.

p: patched area performed by dynamic ftrace
ftrace_prologue:
p|	REG_S	ra, -SZREG(sp)
p|	auipc	ra, 0x? ------------> preempted
					...
				change ftrace function
					...
p|	jalr	-?(ra) <------------- switched back
p|	REG_L	ra, -SZREG(sp)
func:
	xxx
	ret

To prevent this condition, we propose a way to load and store target
addresses atomically. We store a naturally aligned 32-bit relative address
into each ftrace prologue and use a jump at the front to decide whether to
take the ftrace detour. To reduce the footprint of ftrace prologues, we
clobber t0 and t1 and move the (re-)storing of ra into ftrace_{regs_}caller.
This is similar to arm64, which also clobbers x9 in each prologue.

.align 4
ftrace_prologue:
p|	j	ftrace_cont <----> func
p|	.word	0x? <=== a word store to a 4B-aligned address can be
			 considered atomic for readers using a word load
ftrace_cont:
	auipc	t0, 0
	lw	t1, -4(t0) <=== read side
	add	t0, t0, t1
	jalr	t0, t0
func:
	xxx
	ret

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
---
 arch/riscv/include/asm/ftrace.h |  24 ------
 arch/riscv/kernel/ftrace.c      | 130 ++++++++++++++++++++------------
 arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
 3 files changed, 127 insertions(+), 89 deletions(-)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 04dad3380041..eaa611e491fc 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -47,30 +47,6 @@ struct dyn_arch_ftrace {
  */
 
 #define MCOUNT_ADDR		((unsigned long)MCOUNT_NAME)
-#define JALR_SIGN_MASK		(0x00000800)
-#define JALR_OFFSET_MASK	(0x00000fff)
-#define AUIPC_OFFSET_MASK	(0xfffff000)
-#define AUIPC_PAD		(0x00001000)
-#define JALR_SHIFT		20
-#define JALR_BASIC		(0x000080e7)
-#define AUIPC_BASIC		(0x00000097)
-#define NOP4			(0x00000013)
-
-#define make_call(caller, callee, call)					\
-do {									\
-	call[0] = to_auipc_insn((unsigned int)((unsigned long)callee -	\
-				(unsigned long)caller));		\
-	call[1] = to_jalr_insn((unsigned int)((unsigned long)callee -	\
-			       (unsigned long)caller));			\
-} while (0)
-
-#define to_jalr_insn(offset)						\
-	(((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC)
-
-#define to_auipc_insn(offset)						\
-	((offset & JALR_SIGN_MASK) ?					\
-	(((offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_BASIC) :	\
-	((offset & AUIPC_OFFSET_MASK) | AUIPC_BASIC))
 
 /*
  * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here.
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 4716f4cdc038..d4bf0e5255f6 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -25,31 +25,29 @@ int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
 }
 
 static int ftrace_check_current_call(unsigned long hook_pos,
-				     unsigned int *expected)
+				     unsigned long expected_addr)
 {
-	unsigned int replaced[2];
-	unsigned int nops[2] = {NOP4, NOP4};
+	unsigned long replaced;
 
-	/* we expect nops at the hook position */
-	if (!expected)
-		expected = nops;
+	/* we expect ftrace_stub at the hook position */
+	if (!expected_addr)
+		expected_addr = (unsigned long) ftrace_stub;
 
 	/*
 	 * Read the text we want to modify;
 	 * return must be -EFAULT on read error
 	 */
-	if (copy_from_kernel_nofault(replaced, (void *)hook_pos,
-			MCOUNT_INSN_SIZE))
+	if (copy_from_kernel_nofault(&replaced, (void *)hook_pos,
+			(sizeof(unsigned long))))
 		return -EFAULT;
 
 	/*
 	 * Make sure it is what we expect it to be;
 	 * return must be -EINVAL on failed comparison
 	 */
-	if (memcmp(expected, replaced, sizeof(replaced))) {
-		pr_err("%p: expected (%08x %08x) but got (%08x %08x)\n",
-		       (void *)hook_pos, expected[0], expected[1], replaced[0],
-		       replaced[1]);
+	if (expected_addr != replaced) {
+		pr_err("%p: expected (%016lx) but got (%016lx)\n",
+		       (void *)hook_pos, expected_addr, replaced);
 		return -EINVAL;
 	}
 
@@ -59,55 +57,80 @@ static int ftrace_check_current_call(unsigned long hook_pos,
 static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target,
 				bool enable)
 {
-	unsigned int call[2];
-	unsigned int nops[2] = {NOP4, NOP4};
+	unsigned long call = target;
+	unsigned long nops = (unsigned long)ftrace_stub;
 
-	make_call(hook_pos, target, call);
-
-	/* Replace the auipc-jalr pair at once. Return -EPERM on write error. */
+	/* Replace the target address at once. Return -EPERM on write error. */
 	if (patch_text_nosync
-	    ((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE))
+	    ((void *)hook_pos, enable ? &call : &nops, sizeof(unsigned long)))
 		return -EPERM;
 
 	return 0;
 }
 
 /*
- * Put 5 instructions with 16 bytes at the front of function within
- * patchable function entry nops' area.
- *
- * 0: REG_S  ra, -SZREG(sp)
- * 1: auipc  ra, 0x?
- * 2: jalr   -?(ra)
- * 3: REG_L  ra, -SZREG(sp)
+ * Put 5 instructions and a 32-bit address relative to INSN1 into the 24-byte
+ * patchable function entry nops' area at the front of the function.
  *
  * So the opcodes is:
- * 0: 0xfe113c23 (sd)/0xfe112e23 (sw)
- * 1: 0x???????? -> auipc
- * 2: 0x???????? -> jalr
- * 3: 0xff813083 (ld)/0xffc12083 (lw)
+ * INSN0_NOP: J     PC + 0x18 (jump to the real function entry)
+ * INSN0    : J     PC + 0x08 (jump to the first insn)
+ * INSN1    : AUIPC T0, 0
+ * INSN2    : LW    T1, -4(T0)
+ * INSN3    : ADD   T0, T0, T1
+ * INSN4    : JALR  T0, T0
+ *
+ * At runtime, we want to patch the jump target atomically in order to work
+ * with kernel preemption. If we patched with an AUIPC + JALR pair, a task
+ * could be preempted after loading the upper bits with AUIPC; things would
+ * then go wrong if we updated the jump target before the task was switched back.
+ *
+ * We also want to align all patchable function entries, and thus the jump
+ * offset, to a 4-byte aligned address so that each of them can be naturally
+ * updated and observed by the patching and running cores.
+ *
+ * | ADDR   | COMPILED | DISABLED         | ENABLED                |
+ * +--------+----------+------------------+------------------------+
+ * | 0x00   | NOP      | J     FUNC       | J  FTRACE              | <-
+ * | 0x04   | NOP      | XXXXXXXX         | 32-bit rel-jump offset | <- 4B aligned
+ * | FTRACE | NOP      | AUIPC T0, 0                               |
+ * | 0x0c   | NOP      | LW    T1, -4(T0)                          |
+ * | 0x10   | NOP      | ADD   T0, T0, T1                          |
+ * | 0x14   | NOP      | JALR  T0, T0                              |
+ * | FUNC   | X                                                    |
  */
-#if __riscv_xlen == 64
-#define INSN0	0xfe113c23
-#define INSN3	0xff813083
-#elif __riscv_xlen == 32
-#define INSN0	0xfe112e23
-#define INSN3	0xffc12083
-#endif
+#define INSN0_NOP	0x0180006f
+#define INSN0		0x0080006f
+#define INSN1		0x00000297
+#define INSN2		0xffc2a303
+#define INSN3		0x006282b3
+#define INSN4		0x000282e7
+#define INSN_SIZE	4
+
+#define FUNC_ENTRY_JMP	8
 
-#define FUNC_ENTRY_SIZE	16
-#define FUNC_ENTRY_JMP	4
+struct ftrace_prologue {
+	unsigned int insn0;
+	unsigned int jmp_target;
+};
 
 int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
-	unsigned int call[4] = {INSN0, 0, 0, INSN3};
+	struct ftrace_prologue call = { .insn0 = INSN0 };
 	unsigned long target = addr;
 	unsigned long caller = rec->ip + FUNC_ENTRY_JMP;
 
-	call[1] = to_auipc_insn((unsigned int)(target - caller));
-	call[2] = to_jalr_insn((unsigned int)(target - caller));
+	call.jmp_target = (unsigned int)(target - caller);
+	patch_insn_write(&((struct ftrace_prologue *)rec->ip)->jmp_target,
+			 &call.jmp_target, sizeof(call.jmp_target));
+
+	/*
+	 * Make sure other cores do not read an outdated jump target after they
+	 * have jumped to INSN1.
+	 */
+	smp_wmb();
 
-	if (patch_text_nosync((void *)rec->ip, call, FUNC_ENTRY_SIZE))
+	if (patch_text_nosync((void *)rec->ip, &call, INSN_SIZE))
 		return -EPERM;
 
 	return 0;
@@ -116,14 +139,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 		    unsigned long addr)
 {
-	unsigned int nops[4] = {NOP4, NOP4, NOP4, NOP4};
+	unsigned int nops[1] = {INSN0_NOP};
 
-	if (patch_text_nosync((void *)rec->ip, nops, FUNC_ENTRY_SIZE))
+	if (patch_text_nosync((void *)rec->ip, nops, INSN_SIZE))
 		return -EPERM;
 
 	return 0;
 }
 
+int __ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec,
+		    unsigned long addr)
+{
+	unsigned int nops[6] = {INSN0_NOP, 0, INSN1, INSN2, INSN3, INSN4};
+
+	if (patch_text_nosync((void *)rec->ip, nops, sizeof(nops)))
+		return -EPERM;
+
+	return 0;
+}
 
 /*
  * This is called early on, and isn't wrapped by
@@ -137,7 +170,7 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
 	int out;
 
 	ftrace_arch_code_modify_prepare();
-	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
+	out = __ftrace_init_nop(mod, rec, MCOUNT_ADDR);
 	ftrace_arch_code_modify_post_process();
 
 	return out;
@@ -160,17 +193,14 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 		       unsigned long addr)
 {
-	unsigned int call[2];
-	unsigned long caller = rec->ip + FUNC_ENTRY_JMP;
 	int ret;
 
-	make_call(caller, old_addr, call);
-	ret = ftrace_check_current_call(caller, call);
+	ret = ftrace_check_current_call(rec->ip, old_addr);
 
 	if (ret)
 		return ret;
 
-	return __ftrace_modify_call(caller, addr, true);
+	return __ftrace_modify_call(rec->ip, addr, true);
 }
 #endif
 
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index d171eca623b6..6b7be23d02a0 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -13,7 +13,7 @@
 
 	.text
 
-#define FENTRY_RA_OFFSET	12
+#define FENTRY_RA_OFFSET	24
 #define ABI_SIZE_ON_STACK	72
 #define ABI_A0			0
 #define ABI_A1			8
@@ -25,7 +25,12 @@
 #define ABI_A7			56
 #define ABI_RA			64
 
+# t0 points to return of ftrace
+# ra points to the return address of traced function
+
 	.macro SAVE_ABI
+	REG_S	ra, -SZREG(sp)
+	mv	ra, t0
 	addi	sp, sp, -SZREG
 	addi	sp, sp, -ABI_SIZE_ON_STACK
 
@@ -53,10 +58,14 @@
 
 	addi	sp, sp, ABI_SIZE_ON_STACK
 	addi	sp, sp, SZREG
+	mv	t0, ra
+	REG_L	ra, -SZREG(sp)
 	.endm
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 	.macro SAVE_ALL
+	REG_S	ra, -SZREG(sp)
+	mv	ra, t0
 	addi	sp, sp, -SZREG
 	addi	sp, sp, -PT_SIZE_ON_STACK
 
@@ -138,6 +147,8 @@
 
 	addi	sp, sp, PT_SIZE_ON_STACK
 	addi	sp, sp, SZREG
+	mv	t0, ra # t0 is equal to ra here
+	REG_L	ra, -SZREG(sp)
 	.endm
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 
@@ -150,9 +161,9 @@ ENTRY(ftrace_caller)
 	REG_L	a1, ABI_SIZE_ON_STACK(sp)
 	mv	a3, sp
 
-ftrace_call:
-	.global ftrace_call
-	call	ftrace_stub
+ftrace_call_site:
+	REG_L	ra, ftrace_call
+	jalr	0(ra)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	addi	a0, sp, ABI_SIZE_ON_STACK
@@ -161,12 +172,12 @@ ftrace_call:
 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
 	mv	a2, s0
 #endif
-ftrace_graph_call:
-	.global ftrace_graph_call
-	call	ftrace_stub
+ftrace_graph_call_site:
+	REG_L	ra, ftrace_graph_call
+	jalr	0(ra)
 #endif
 	RESTORE_ABI
-	ret
+	jr	t0
 ENDPROC(ftrace_caller)
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
@@ -179,9 +190,9 @@ ENTRY(ftrace_regs_caller)
 	REG_L	a1, PT_SIZE_ON_STACK(sp)
 	mv	a3, sp
 
-ftrace_regs_call:
-	.global ftrace_regs_call
-	call	ftrace_stub
+ftrace_regs_call_site:
+	REG_L	ra, ftrace_regs_call
+	jalr	0(ra)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	addi	a0, sp, PT_RA
@@ -190,12 +201,33 @@ ftrace_regs_call:
 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
 	mv	a2, s0
 #endif
-ftrace_graph_regs_call:
-	.global ftrace_graph_regs_call
-	call	ftrace_stub
+ftrace_graph_regs_call_site:
+	REG_L	ra, ftrace_graph_regs_call
+	jalr	0(ra)
 #endif
 
 	RESTORE_ALL
-	ret
+	jr	t0
 ENDPROC(ftrace_regs_caller)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+
+.align RISCV_SZPTR
+ftrace_call:
+	.global ftrace_call
+	RISCV_PTR ftrace_stub
+
+.align RISCV_SZPTR
+ftrace_graph_call:
+	.global ftrace_graph_call
+	RISCV_PTR ftrace_stub
+
+.align RISCV_SZPTR
+ftrace_regs_call:
+	.global ftrace_regs_call
+	RISCV_PTR ftrace_stub
+
+.align RISCV_SZPTR
+ftrace_graph_regs_call:
+	.global ftrace_graph_regs_call
+	RISCV_PTR ftrace_stub
+
-- 
2.36.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code
  2022-05-29 14:33 [RFC PATCH riscv/for-next 0/4] Enable ftrace with kernel preemption for RISC-V Andy Chiu
                   ` (2 preceding siblings ...)
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 3/4] riscv: ftrace: use indirect jump to work with kernel preemption Andy Chiu
@ 2022-05-29 14:33 ` Andy Chiu
  2022-06-05 16:14   ` Steven Rostedt
  2022-07-21  5:40   ` Palmer Dabbelt
  3 siblings, 2 replies; 10+ messages in thread
From: Andy Chiu @ 2022-05-29 14:33 UTC (permalink / raw)
  To: linux-riscv, palmer, mingo, paul.walmsley, aou, rostedt
  Cc: guoren, Andy Chiu, Greentime Hu, Zong Li

Now it is safe to remove the dependency on stop_machine() when patching
code in ftrace.
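
For context, the generic weak fallback in kernel/trace/ftrace.c wraps the
update in stop_machine(), roughly:

void __weak arch_ftrace_update_code(int command)
{
	ftrace_run_stop_machine(command);
}

Overriding it with an arch_ftrace_update_code() that calls
ftrace_modify_all_code() directly is what takes stop_machine() out of the
update path.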

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
---
 arch/riscv/kernel/ftrace.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index d4bf0e5255f6..e7b8bf0a699b 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -12,6 +12,11 @@
 #include <asm/patch.h>
 
 #ifdef CONFIG_DYNAMIC_FTRACE
+void arch_ftrace_update_code(int command)
+{
+	ftrace_modify_all_code(command);
+}
+
 int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
 {
 	mutex_lock(&text_mutex);
-- 
2.36.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code Andy Chiu
@ 2022-06-05 16:14   ` Steven Rostedt
  2022-07-21  5:40   ` Palmer Dabbelt
  1 sibling, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2022-06-05 16:14 UTC (permalink / raw)
  To: Andy Chiu
  Cc: linux-riscv, palmer, mingo, paul.walmsley, aou, guoren,
	Greentime Hu, Zong Li

On Sun, 29 May 2022 22:33:15 +0800
Andy Chiu <andy.chiu@sifive.com> wrote:

> Now it is safe to remove the dependency on stop_machine() when patching
> code in ftrace.
> 

I wasn't able to test it, and I don't know the intrinsic details of
RISC-V, but from the change logs and from what I could understand of the
code, it all seems reasonable to me.

For the entire patch set:

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
> Reviewed-by: Zong Li <zong.li@sifive.com>
> ---
>  arch/riscv/kernel/ftrace.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index d4bf0e5255f6..e7b8bf0a699b 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -12,6 +12,11 @@
>  #include <asm/patch.h>
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
> +void arch_ftrace_update_code(int command)
> +{
> +	ftrace_modify_all_code(command);
> +}
> +
>  int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
>  {
>  	mutex_lock(&text_mutex);


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 1/4] riscv: align ftrace to 4 Byte boundary and increase ftrace prologue size Andy Chiu
@ 2022-07-21  5:40   ` Palmer Dabbelt
  0 siblings, 0 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2022-07-21  5:40 UTC (permalink / raw)
  To: andy.chiu
  Cc: linux-riscv, mingo, Paul Walmsley, aou, rostedt, guoren,
	andy.chiu, greentime.hu, zong.li

On Sun, 29 May 2022 07:33:12 PDT (-0700), andy.chiu@sifive.com wrote:
> We are introducing a new ftrace mechanism in order to phase out
> stop_machine() and enable kernel preemption. The new mechanism requires
> ftrace patchable function entries to be 24 bytes and aligned to 4 Byte
> boundaries.
>
> Before applying this patch, the size of the kernel code, with 43432
> ftrace entries, was 12.31 MB. Under the same configuration, the size
> has increased to 12.68 MB after applying this patch set.
>
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
> Reviewed-by: Zong Li <zong.li@sifive.com>
> ---
>  arch/riscv/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 1f19bdac6767..cb6cfdffc49c 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -11,7 +11,7 @@ LDFLAGS_vmlinux :=
>  ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
>  	LDFLAGS_vmlinux := --no-relax
>  	KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> -	CC_FLAGS_FTRACE := -fpatchable-function-entry=8
> +	CC_FLAGS_FTRACE := -fpatchable-function-entry=12  -falign-functions=4
>  endif
>
>  ifeq ($(CONFIG_CMODEL_MEDLOW),y)

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [RFC PATCH riscv/for-next 2/4] riscv: export patch_insn_write
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 2/4] riscv: export patch_insn_write Andy Chiu
@ 2022-07-21  5:40   ` Palmer Dabbelt
  0 siblings, 0 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2022-07-21  5:40 UTC (permalink / raw)
  To: andy.chiu
  Cc: linux-riscv, mingo, Paul Walmsley, aou, rostedt, guoren,
	andy.chiu, greentime.hu, zong.li

On Sun, 29 May 2022 07:33:13 PDT (-0700), andy.chiu@sifive.com wrote:
> So that we may patch code without issuing fence.i
>
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
> Reviewed-by: Zong Li <zong.li@sifive.com>
> ---
>  arch/riscv/include/asm/patch.h | 1 +
>  arch/riscv/kernel/patch.c      | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
> index 9a7d7346001e..327e99114d67 100644
> --- a/arch/riscv/include/asm/patch.h
> +++ b/arch/riscv/include/asm/patch.h
> @@ -6,6 +6,7 @@
>  #ifndef _ASM_RISCV_PATCH_H
>  #define _ASM_RISCV_PATCH_H
>
> +int patch_insn_write(void *addr, const void *insn, size_t len);
>  int patch_text_nosync(void *addr, const void *insns, size_t len);
>  int patch_text(void *addr, u32 insn);
>
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 0b552873a577..834855cfd0fb 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -49,7 +49,7 @@ static void patch_unmap(int fixmap)
>  }
>  NOKPROBE_SYMBOL(patch_unmap);
>
> -static int patch_insn_write(void *addr, const void *insn, size_t len)
> +int patch_insn_write(void *addr, const void *insn, size_t len)
>  {
>  	void *waddr = addr;
>  	bool across_pages = (((uintptr_t) addr & ~PAGE_MASK) + len) > PAGE_SIZE;
> @@ -78,7 +78,7 @@ static int patch_insn_write(void *addr, const void *insn, size_t len)
>  }
>  NOKPROBE_SYMBOL(patch_insn_write);
>  #else
> -static int patch_insn_write(void *addr, const void *insn, size_t len)
> +int patch_insn_write(void *addr, const void *insn, size_t len)
>  {
>  	return copy_to_kernel_nofault(addr, insn, len);
>  }

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [RFC PATCH riscv/for-next 3/4] riscv: ftrace: use indirect jump to work with kernel preemption
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 3/4] riscv: ftrace: use indirect jump to work with kernel preemption Andy Chiu
@ 2022-07-21  5:40   ` Palmer Dabbelt
  0 siblings, 0 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2022-07-21  5:40 UTC (permalink / raw)
  To: andy.chiu
  Cc: linux-riscv, mingo, Paul Walmsley, aou, rostedt, guoren,
	andy.chiu, greentime.hu, zong.li

On Sun, 29 May 2022 07:33:14 PDT (-0700), andy.chiu@sifive.com wrote:
> On RISC-V, we must use an AUIPC + JALR pair to encode an immediate and
> form a jump to an address more than 4K away. This may cause errors if we
> want to enable kernel preemption and drop the stop_machine() dependency
> when patching code. For example, a task could be switched out on the
> auipc; if we then changed the ftrace function before it was switched back,
> it would jump to an address whose updated 11:0 bits are mixed with the
> previous XLEN:12 part.
>
> p: patched area performed by dynamic ftrace
> ftrace_prologue:
> p|	REG_S	ra, -SZREG(sp)
> p|	auipc	ra, 0x? ------------> preempted
> 					...
> 				change ftrace function
> 					...
> p|	jalr	-?(ra) <------------- switched back
> p|	REG_L	ra, -SZREG(sp)
> func:
> 	xxx
> 	ret
>
> To prevent this condition, we propose a way to load and store target
> addresses atomically. We store a naturally aligned 32-bit relative address
> into each ftrace prologue and use a jump at the front to decide whether to
> take the ftrace detour. To reduce the footprint of ftrace prologues, we
> clobber t0 and t1 and move the (re-)storing of ra into ftrace_{regs_}caller.
> This is similar to arm64, which also clobbers x9 in each prologue.
>
> .align 4
> ftrace_prologue:
> p|	j	ftrace_cont <----> func
> p|	.word	0x? <=== a word store to a 4B-aligned address can be
> 			 considered atomic for readers using a word load
> ftrace_cont:
> 	auipc	t0, 0
> 	lw	t1, -4(t0) <=== read side
> 	add	t0, t0, t1
> 	jalr	t0, t0
> func:
> 	xxx
> 	ret
>
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
> Reviewed-by: Zong Li <zong.li@sifive.com>
> ---
>  arch/riscv/include/asm/ftrace.h |  24 ------
>  arch/riscv/kernel/ftrace.c      | 130 ++++++++++++++++++++------------
>  arch/riscv/kernel/mcount-dyn.S  |  62 +++++++++++----
>  3 files changed, 127 insertions(+), 89 deletions(-)
>
> diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
> index 04dad3380041..eaa611e491fc 100644
> --- a/arch/riscv/include/asm/ftrace.h
> +++ b/arch/riscv/include/asm/ftrace.h
> @@ -47,30 +47,6 @@ struct dyn_arch_ftrace {
>   */
>
>  #define MCOUNT_ADDR		((unsigned long)MCOUNT_NAME)
> -#define JALR_SIGN_MASK		(0x00000800)
> -#define JALR_OFFSET_MASK	(0x00000fff)
> -#define AUIPC_OFFSET_MASK	(0xfffff000)
> -#define AUIPC_PAD		(0x00001000)
> -#define JALR_SHIFT		20
> -#define JALR_BASIC		(0x000080e7)
> -#define AUIPC_BASIC		(0x00000097)
> -#define NOP4			(0x00000013)
> -
> -#define make_call(caller, callee, call)					\
> -do {									\
> -	call[0] = to_auipc_insn((unsigned int)((unsigned long)callee -	\
> -				(unsigned long)caller));		\
> -	call[1] = to_jalr_insn((unsigned int)((unsigned long)callee -	\
> -			       (unsigned long)caller));			\
> -} while (0)
> -
> -#define to_jalr_insn(offset)						\
> -	(((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC)
> -
> -#define to_auipc_insn(offset)						\
> -	((offset & JALR_SIGN_MASK) ?					\
> -	(((offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_BASIC) :	\
> -	((offset & AUIPC_OFFSET_MASK) | AUIPC_BASIC))
>
>  /*
>   * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here.
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 4716f4cdc038..d4bf0e5255f6 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -25,31 +25,29 @@ int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
>  }
>
>  static int ftrace_check_current_call(unsigned long hook_pos,
> -				     unsigned int *expected)
> +				     unsigned long expected_addr)
>  {
> -	unsigned int replaced[2];
> -	unsigned int nops[2] = {NOP4, NOP4};
> +	unsigned long replaced;
>
> -	/* we expect nops at the hook position */
> -	if (!expected)
> -		expected = nops;
> +	/* we expect ftrace_stub at the hook position */
> +	if (!expected_addr)
> +		expected_addr = (unsigned long) ftrace_stub;
>
>  	/*
>  	 * Read the text we want to modify;
>  	 * return must be -EFAULT on read error
>  	 */
> -	if (copy_from_kernel_nofault(replaced, (void *)hook_pos,
> -			MCOUNT_INSN_SIZE))
> +	if (copy_from_kernel_nofault(&replaced, (void *)hook_pos,
> +			(sizeof(unsigned long))))
>  		return -EFAULT;
>
>  	/*
>  	 * Make sure it is what we expect it to be;
>  	 * return must be -EINVAL on failed comparison
>  	 */
> -	if (memcmp(expected, replaced, sizeof(replaced))) {
> -		pr_err("%p: expected (%08x %08x) but got (%08x %08x)\n",
> -		       (void *)hook_pos, expected[0], expected[1], replaced[0],
> -		       replaced[1]);
> +	if (expected_addr != replaced) {
> +		pr_err("%p: expected (%016lx) but got (%016lx)\n",
> +		       (void *)hook_pos, expected_addr, replaced);
>  		return -EINVAL;
>  	}
>
> @@ -59,55 +57,80 @@ static int ftrace_check_current_call(unsigned long hook_pos,
>  static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target,
>  				bool enable)
>  {
> -	unsigned int call[2];
> -	unsigned int nops[2] = {NOP4, NOP4};
> +	unsigned long call = target;
> +	unsigned long nops = (unsigned long)ftrace_stub;
>
> -	make_call(hook_pos, target, call);
> -
> -	/* Replace the auipc-jalr pair at once. Return -EPERM on write error. */
> +	/* Replace the target address at once. Return -EPERM on write error. */
>  	if (patch_text_nosync
> -	    ((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE))
> +	    ((void *)hook_pos, enable ? &call : &nops, sizeof(unsigned long)))
>  		return -EPERM;
>
>  	return 0;
>  }
>
>  /*
> - * Put 5 instructions with 16 bytes at the front of function within
> - * patchable function entry nops' area.
> - *
> - * 0: REG_S  ra, -SZREG(sp)
> - * 1: auipc  ra, 0x?
> - * 2: jalr   -?(ra)
> - * 3: REG_L  ra, -SZREG(sp)
> + * Put 5 instructions and a 32-bit address relative to INSN1 into the 24-byte
> + * patchable function entry nops' area at the front of the function.
>   *
>   * So the opcodes is:
> - * 0: 0xfe113c23 (sd)/0xfe112e23 (sw)
> - * 1: 0x???????? -> auipc
> - * 2: 0x???????? -> jalr
> - * 3: 0xff813083 (ld)/0xffc12083 (lw)
> + * INSN0_NOP: J     PC + 0x18 (jump to the real function entry)
> + * INSN0    : J     PC + 0x08 (jump to the first insn)

You could cut out a few instructions here by using a full 64-bit address 
and generating the PC by replacing the jump-past-trampoline with a "jal 
t0, ...", but it's probably best to sort out the ordering issues first.

> + * INSN1    : AUIPC T0, 0
> + * INSN2    : LW    T1, -4(T0)
> + * INSN3    : ADD   T0, T0, T1
> + * INSN4    : JALR  T0, T0
> + *
> + * At runtime, we want to patch the jump target atomically in order to work
> + * with kernel preemption. If we patched with an AUIPC + JALR pair, a task
> + * could be preempted after loading the upper bits with AUIPC; things would
> + * then go wrong if we updated the jump target before the task was switched back.
> + *
> + * We also want to align all patchable function entries, and thus the jump
> + * offset, to a 4-byte aligned address so that each of them can be naturally
> + * updated and observed by the patching and running cores.
> + *
> + * | ADDR   | COMPILED | DISABLED         | ENABLED                |
> + * +--------+----------+------------------+------------------------+
> + * | 0x00   | NOP      | J     FUNC       | J  FTRACE              | <-
> + * | 0x04   | NOP      | XXXXXXXX         | 32-bit rel-jump offset | <- 4B aligned
> + * | FTRACE | NOP      | AUIPC T0, 0                               |
> + * | 0x0c   | NOP      | LW    T1, -4(T0)                          |
> + * | 0x10   | NOP      | ADD   T0, T0, T1                          |
> + * | 0x14   | NOP      | JALR  T0, T0                              |
> + * | FUNC   | X                                                    |
>   */
> -#if __riscv_xlen == 64
> -#define INSN0	0xfe113c23
> -#define INSN3	0xff813083
> -#elif __riscv_xlen == 32
> -#define INSN0	0xfe112e23
> -#define INSN3	0xffc12083
> -#endif
> +#define INSN0_NOP	0x0180006f
> +#define INSN0		0x0080006f
> +#define INSN1		0x00000297
> +#define INSN2		0xffc2a303
> +#define INSN3		0x006282b3
> +#define INSN4		0x000282e7
> +#define INSN_SIZE	4
> +
> +#define FUNC_ENTRY_JMP	8
>
> -#define FUNC_ENTRY_SIZE	16
> -#define FUNC_ENTRY_JMP	4
> +struct ftrace_prologue {
> +	unsigned int insn0;
> +	unsigned int jmp_target;
> +};
>
>  int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  {
> -	unsigned int call[4] = {INSN0, 0, 0, INSN3};
> +	struct ftrace_prologue call = { .insn0 = INSN0 };
>  	unsigned long target = addr;
>  	unsigned long caller = rec->ip + FUNC_ENTRY_JMP;
>
> -	call[1] = to_auipc_insn((unsigned int)(target - caller));
> -	call[2] = to_jalr_insn((unsigned int)(target - caller));
> +	call.jmp_target = (unsigned int)(target - caller);
> +	patch_insn_write(&((struct ftrace_prologue *)rec->ip)->jmp_target,
> +			 &call.jmp_target, sizeof(call.jmp_target));
> +
> +	/*
> +	 * Make sure other cores do not read an outdated jump target after they
> +	 * have jumped to INSN1.
> +	 */
> +	smp_wmb();

We're definitely in a grey area here, but I don't see the ISA manual
requiring this behavior from this fence: this orders the stores, but it
doesn't even prevent reordering on the load side (and we don't have a
load->load dependency here, but a fetch->load dependency).  I'm not
entirely sure what the best way to enforce this ordering is, as by the
time we're executing any code on a remote hart we've gone through enough
fences that it doesn't really matter any more.

Without the ordering, there are three possible executions:

* Take the jump past the trampoline.  This is safe.
* Take the jump into the trampoline, then load the target offset.  This 
  is also safe.
* Take the jump into the trampoline, then load the old target offset.  
  This isn't safe, as the address may be invalid (it starts at 0, for 
  example).

I can't really figure out why hardware would produce that last case, but
I also can't find anything that forbids it in the spec.  The whole
instruction fetch ordering stuff is pretty loosely written, so I'm not
sure how strong an argument that is, but we'd at least need some
testing on actual hardware to sort out whether this works in practice.

That said, I think we can just avoid the problem entirely by making sure
that last case executes correctly: just fill the initial data with a
stub routine that does a full data fence (and a fence.i too) and
re-tries the load.  It's still kind of vague when it comes to ISA
requirements, but we've got this dependency baked into a bunch of other
software and it's the biggest fence we can do, so it's not like there
are any better options.

> -	if (patch_text_nosync((void *)rec->ip, call, FUNC_ENTRY_SIZE))
> +	if (patch_text_nosync((void *)rec->ip, &call, INSN_SIZE))
>  		return -EPERM;
>
>  	return 0;
> @@ -116,14 +139,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>  		    unsigned long addr)
>  {
> -	unsigned int nops[4] = {NOP4, NOP4, NOP4, NOP4};
> +	unsigned int nops[1] = {INSN0_NOP};
>
> -	if (patch_text_nosync((void *)rec->ip, nops, FUNC_ENTRY_SIZE))
> +	if (patch_text_nosync((void *)rec->ip, nops, INSN_SIZE))
>  		return -EPERM;
>
>  	return 0;
>  }
>
> +int __ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec,
> +		    unsigned long addr)
> +{
> +	unsigned int nops[6] = {INSN0_NOP, 0, INSN1, INSN2, INSN3, INSN4};
> +
> +	if (patch_text_nosync((void *)rec->ip, nops, sizeof(nops)))
> +		return -EPERM;
> +
> +	return 0;
> +}
>
>  /*
>   * This is called early on, and isn't wrapped by
> @@ -137,7 +170,7 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
>  	int out;
>
>  	ftrace_arch_code_modify_prepare();
> -	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
> +	out = __ftrace_init_nop(mod, rec, MCOUNT_ADDR);
>  	ftrace_arch_code_modify_post_process();
>
>  	return out;
> @@ -160,17 +193,14 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>  		       unsigned long addr)
>  {
> -	unsigned int call[2];
> -	unsigned long caller = rec->ip + FUNC_ENTRY_JMP;
>  	int ret;
>
> -	make_call(caller, old_addr, call);
> -	ret = ftrace_check_current_call(caller, call);
> +	ret = ftrace_check_current_call(rec->ip, old_addr);
>
>  	if (ret)
>  		return ret;
>
> -	return __ftrace_modify_call(caller, addr, true);
> +	return __ftrace_modify_call(rec->ip, addr, true);
>  }
>  #endif
>
> diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
> index d171eca623b6..6b7be23d02a0 100644
> --- a/arch/riscv/kernel/mcount-dyn.S
> +++ b/arch/riscv/kernel/mcount-dyn.S
> @@ -13,7 +13,7 @@
>
>  	.text
>
> -#define FENTRY_RA_OFFSET	12
> +#define FENTRY_RA_OFFSET	24
>  #define ABI_SIZE_ON_STACK	72
>  #define ABI_A0			0
>  #define ABI_A1			8
> @@ -25,7 +25,12 @@
>  #define ABI_A7			56
>  #define ABI_RA			64
>
> +# t0 points to return of ftrace
> +# ra points to the return address of traced function
> +
>  	.macro SAVE_ABI
> +	REG_S	ra, -SZREG(sp)
> +	mv	ra, t0
>  	addi	sp, sp, -SZREG
>  	addi	sp, sp, -ABI_SIZE_ON_STACK
>
> @@ -53,10 +58,14 @@
>
>  	addi	sp, sp, ABI_SIZE_ON_STACK
>  	addi	sp, sp, SZREG
> +	mv	t0, ra
> +	REG_L	ra, -SZREG(sp)
>  	.endm
>
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  	.macro SAVE_ALL
> +	REG_S	ra, -SZREG(sp)
> +	mv	ra, t0
>  	addi	sp, sp, -SZREG
>  	addi	sp, sp, -PT_SIZE_ON_STACK
>
> @@ -138,6 +147,8 @@
>
>  	addi	sp, sp, PT_SIZE_ON_STACK
>  	addi	sp, sp, SZREG
> +	mv	t0, ra # t0 is equal to ra here
> +	REG_L	ra, -SZREG(sp)
>  	.endm
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
>
> @@ -150,9 +161,9 @@ ENTRY(ftrace_caller)
>  	REG_L	a1, ABI_SIZE_ON_STACK(sp)
>  	mv	a3, sp
>
> -ftrace_call:
> -	.global ftrace_call
> -	call	ftrace_stub
> +ftrace_call_site:
> +	REG_L	ra, ftrace_call
> +	jalr	0(ra)
>
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	addi	a0, sp, ABI_SIZE_ON_STACK
> @@ -161,12 +172,12 @@ ftrace_call:
>  #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
>  	mv	a2, s0
>  #endif
> -ftrace_graph_call:
> -	.global ftrace_graph_call
> -	call	ftrace_stub
> +ftrace_graph_call_site:
> +	REG_L	ra, ftrace_graph_call
> +	jalr	0(ra)
>  #endif
>  	RESTORE_ABI
> -	ret
> +	jr	t0
>  ENDPROC(ftrace_caller)
>
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> @@ -179,9 +190,9 @@ ENTRY(ftrace_regs_caller)
>  	REG_L	a1, PT_SIZE_ON_STACK(sp)
>  	mv	a3, sp
>
> -ftrace_regs_call:
> -	.global ftrace_regs_call
> -	call	ftrace_stub
> +ftrace_regs_call_site:
> +	REG_L	ra, ftrace_regs_call
> +	jalr	0(ra)
>
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	addi	a0, sp, PT_RA
> @@ -190,12 +201,33 @@ ftrace_regs_call:
>  #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
>  	mv	a2, s0
>  #endif
> -ftrace_graph_regs_call:
> -	.global ftrace_graph_regs_call
> -	call	ftrace_stub
> +ftrace_graph_regs_call_site:
> +	REG_L	ra, ftrace_graph_regs_call
> +	jalr	0(ra)
>  #endif
>
>  	RESTORE_ALL
> -	ret
> +	jr	t0
>  ENDPROC(ftrace_regs_caller)
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
> +
> +.align RISCV_SZPTR
> +ftrace_call:
> +	.global ftrace_call
> +	RISCV_PTR ftrace_stub
> +
> +.align RISCV_SZPTR
> +ftrace_graph_call:
> +	.global ftrace_graph_call
> +	RISCV_PTR ftrace_stub
> +
> +.align RISCV_SZPTR
> +ftrace_regs_call:
> +	.global ftrace_regs_call
> +	RISCV_PTR ftrace_stub
> +
> +.align RISCV_SZPTR
> +ftrace_graph_regs_call:
> +	.global ftrace_graph_regs_call
> +	RISCV_PTR ftrace_stub
> +

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code
  2022-05-29 14:33 ` [RFC PATCH riscv/for-next 4/4] riscv: ftrace: do not use stop_machine to update code Andy Chiu
  2022-06-05 16:14   ` Steven Rostedt
@ 2022-07-21  5:40   ` Palmer Dabbelt
  1 sibling, 0 replies; 10+ messages in thread
From: Palmer Dabbelt @ 2022-07-21  5:40 UTC (permalink / raw)
  To: andy.chiu
  Cc: linux-riscv, mingo, Paul Walmsley, aou, rostedt, guoren,
	andy.chiu, greentime.hu, zong.li

On Sun, 29 May 2022 07:33:15 PDT (-0700), andy.chiu@sifive.com wrote:
> Now it is safe to remove the dependency on stop_machine() when patching
> code in ftrace.
>
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
> Reviewed-by: Zong Li <zong.li@sifive.com>
> ---
>  arch/riscv/kernel/ftrace.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index d4bf0e5255f6..e7b8bf0a699b 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -12,6 +12,11 @@
>  #include <asm/patch.h>
>
>  #ifdef CONFIG_DYNAMIC_FTRACE
> +void arch_ftrace_update_code(int command)
> +{
> +	ftrace_modify_all_code(command);
> +}
> +
>  int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
>  {
>  	mutex_lock(&text_mutex);

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

Assuming we sort out the issues in #3.  IIUC we don't need to ensure the 
command is fully visible before returning, as the concurrent 
modification scheme doesn't flush the remote icaches and thus may skip 
instrumentation.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


