* [PATCH v6 00/13] Add OPTPROBES feature on RISCV
@ 2023-01-27 13:05 Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES Chen Guokai
                   ` (14 more replies)
  0 siblings, 15 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Add jump optimization support for RISC-V.

Replace the ebreak instruction used by normal kprobes with an AUIPC/JALR
instruction pair in order to reduce the probe-hit overhead.

All known optprobe-capable RISC architectures use a single jump or
branch instruction, but this series does not. The branch and jump
instructions of RISC-V have a quite limited range (4KB or 2MB), which
prevents the optimization from supporting probes that are spread all
over the kernel.

The AUIPC/JALR instruction pair provides a much wider jump range (4GB):
AUIPC loads the upper 20 bits of the offset into a free register and
JALR adds the lower 12 bits to form a 32-bit immediate. Note that
returning from the probe handler requires another free register. As
kprobes can appear almost anywhere inside the kernel, the free registers
must be found generically, without depending on the calling convention
or any other rules.
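
As an illustration of how a 32-bit PC-relative offset maps onto the two
immediates, here is a minimal user-space sketch. It is not code from
this series: the struct and function names are hypothetical, and only
the range bounds are taken from in_auipc_jalr_range() in patch 03.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two immediates of an AUIPC/JALR pair. */
struct auipc_jalr_imm {
	int32_t hi20;	/* AUIPC immediate, already placed in bits 31:12 */
	int32_t lo12;	/* JALR immediate, in [-2048, 2047] */
};

/*
 * Split a signed PC-relative offset into the two immediates. JALR
 * sign-extends its 12-bit immediate, so 0x800 is added before masking
 * to compensate when bit 11 of the offset is set.
 */
static struct auipc_jalr_imm split_offset(int64_t offset)
{
	struct auipc_jalr_imm imm;

	/* Same bounds as the RV64 check in in_auipc_jalr_range(). */
	assert(offset >= -(1LL << 31) - (1LL << 11));
	assert(offset < (1LL << 31) - (1LL << 11));

	imm.hi20 = (int32_t)(uint32_t)((uint64_t)(offset + 0x800) & 0xfffff000u);
	imm.lo12 = (int32_t)(offset - (int64_t)imm.hi20);
	return imm;
}
```

For any offset in range, hi20 + lo12 equals the offset, which is why
the reachable window is shifted down by 2^11 relative to a plain signed
32-bit range.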

The algorithm for finding the free registers is inspired by register
renaming in modern processors. From the perspective of register
renaming, a register can be treated as two different registers if two
neighboring instructions both write to it and nothing reads it in
between. Extending this observation, a register is considered free if
it is not read before its next write in the execution flow, so its
value can be changed without interfering with normal execution.

Static analysis shows that 51% of the instructions in the kernel
(default config) can be replaced, i.e. a free register can be found at
both the start and the end of the replaced instruction pair, and the
replaced instructions can be executed directly. An efficiency test on
Gem 5 RISC-V shows a more than 5x speedup over the breakpoint-based
implementation.

Contribution:
Chen Guokai invented the algorithm for searching for free registers,
evaluated the ratio of optimizable instructions, and implemented the
basic functionality for RVI kernel binaries.
Liao Chang added support for hybrid RVI and RVC kernel binaries, fixed
some bugs seen with different kernel configurations, and split the
feature into individual patches.

v6:
1. Correct grammar and spelling errors in commit messages and comments.
2. Add instruction boundary check for RVI/RVC hybrid kernel.
3. Use addi/c.addi instead of 'nop/c.nop' in the detour assembly
   template.
4. Fix the instruction simulation of JALR.
5. Mark some symbols used in the path of kprobe and uprobe handler as
   NOKPROBE.
6. Add one selftest testcase that covers more complex opcode patterns
   in the instruction decoding and free register searching code.
7. Run all tests in tools/testing/selftests/ftrace on RISCV64 QEMU
   platform, no regression.
8. Run with the CONFIG_KPROBES_SANITY_TEST module on RISCV64 QEMU
   platform, no regression.

v5:
1. Correct known nits
2. Enable the usage of unused caller-saved registers
3. Append an efficiency test result on Gem 5

v4:
Correct the sequence of Signed-off-by and Co-developed-by.

v3:
1. Support of hybrid RVI and RVC kernel binary.
2. Refactor out entire feature into some individual patches.

v2:
1. Adjust comments
2. Remove improper copyright
3. Clean up formatting that does not follow common practice
4. Extract common definition of instruction decoder
5. Fix race issue in SMP platform.

v1:
Chen Guokai contributed the basic functionality code.

Chen Guokai (1):
  riscv/kprobe: Search free registers from unused caller-saved ones

Liao Chang (12):
  riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES
  riscv/kprobe: Allocate detour buffer from module region
  riscv/kprobe: Add skeleton for preparing optimized kprobe
  riscv/kprobe: Add common RVI and RVC instruction decoder code
  riscv/kprobe: Introduce free register(s) searching algorithm
  riscv/kprobe: Add code to check if kprobe can be optimized
  riscv/kprobe: Prepare detour buffer for optimized kprobe
  riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe
  riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel
  riscv/kprobe: Fix instruction simulation of JALR
  riscv/kprobe: Move exception related symbols to .kprobe_blacklist
  selftest/kprobes: Add testcase for kprobe SYM[+offs]

 arch/riscv/Kconfig                            |   1 +
 arch/riscv/include/asm/asm.h                  |  10 +
 arch/riscv/include/asm/bug.h                  |   5 +-
 arch/riscv/include/asm/kprobes.h              |  49 ++
 arch/riscv/include/asm/patch.h                |   1 +
 arch/riscv/kernel/entry.S                     |  12 +
 arch/riscv/kernel/mcount.S                    |   1 +
 arch/riscv/kernel/patch.c                     |  23 +-
 arch/riscv/kernel/probes/Makefile             |   1 +
 arch/riscv/kernel/probes/decode-insn.h        | 177 +++++
 arch/riscv/kernel/probes/kprobes.c            |  48 +-
 arch/riscv/kernel/probes/opt.c                | 684 ++++++++++++++++++
 arch/riscv/kernel/probes/opt_trampoline.S     | 137 ++++
 arch/riscv/kernel/probes/simulate-insn.c      |   6 +-
 arch/riscv/kernel/probes/simulate-insn.h      |  42 ++
 .../ftrace/test.d/kprobe/kprobe_sym_offs.tc   |  49 ++
 16 files changed, 1235 insertions(+), 11 deletions(-)
 create mode 100644 arch/riscv/kernel/probes/opt.c
 create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc

-- 
2.34.1



* [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 02/13] riscv/kprobe: Allocate detour buffer from module region Chen Guokai
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

Prepare the skeleton for implementing optimized kprobes on RISC-V. Some
architecture-specific functions are left blank, but because they just
return zero they do not affect the correctness of the existing kprobe
code. To keep each patch easy to review and test, these functions will
be implemented incrementally.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/Kconfig                        |  1 +
 arch/riscv/include/asm/kprobes.h          | 32 ++++++++++++++
 arch/riscv/kernel/probes/Makefile         |  1 +
 arch/riscv/kernel/probes/opt.c            | 51 +++++++++++++++++++++++
 arch/riscv/kernel/probes/opt_trampoline.S | 12 ++++++
 5 files changed, 97 insertions(+)
 create mode 100644 arch/riscv/kernel/probes/opt.c
 create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 9c687da7756d..48a639c7c055 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -101,6 +101,7 @@ config RISCV
 	select HAVE_KPROBES if !XIP_KERNEL
 	select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
 	select HAVE_KRETPROBES if !XIP_KERNEL
+	select HAVE_OPTPROBES if !XIP_KERNEL
 	select HAVE_RETHOOK if !XIP_KERNEL
 	select HAVE_MOVE_PMD
 	select HAVE_MOVE_PUD
diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index e7882ccb0fd4..96cd36e67e2e 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -41,5 +41,37 @@ int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 bool kprobe_breakpoint_handler(struct pt_regs *regs);
 bool kprobe_single_step_handler(struct pt_regs *regs);
 
+#ifdef CONFIG_OPTPROBES
+
+/* optinsn template addresses */
+extern __visible kprobe_opcode_t optprobe_template_entry[];
+extern __visible kprobe_opcode_t optprobe_template_end[];
+
+#define MAX_OPTINSN_SIZE				\
+	((unsigned long)optprobe_template_end -		\
+	 (unsigned long)optprobe_template_entry)
+
+/*
+ * For RVI and RVC hybrid encoding kernel, although long jump just needs
+ * 2 RVI instructions(AUIPC/JALR), optimized instructions are 10 bytes long
+ * at most to ensure no RVI would be truncated actually, so it means four
+ * combinations:
+ * - 2 RVI
+ * - 4 RVC
+ * - 2 RVC + 1 RVI
+ * - 3 RVC + 1 RVI (truncated, need padding)
+ */
+#define MAX_COPIED_INSN		4
+#define MAX_OPTIMIZED_LENGTH	10
+
+struct arch_optimized_insn {
+	kprobe_opcode_t copied_insn[MAX_COPIED_INSN];
+	/* detour code buffer */
+	kprobe_opcode_t *insn;
+	unsigned long length;
+	int rd;
+};
+
+#endif /* CONFIG_OPTPROBES */
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_RISCV_KPROBES_H */
diff --git a/arch/riscv/kernel/probes/Makefile b/arch/riscv/kernel/probes/Makefile
index c40139e9ca47..3d837eb5f9be 100644
--- a/arch/riscv/kernel/probes/Makefile
+++ b/arch/riscv/kernel/probes/Makefile
@@ -3,4 +3,5 @@ obj-$(CONFIG_KPROBES)		+= kprobes.o decode-insn.o simulate-insn.o
 obj-$(CONFIG_RETHOOK)		+= rethook.o rethook_trampoline.o
 obj-$(CONFIG_KPROBES_ON_FTRACE)	+= ftrace.o
 obj-$(CONFIG_UPROBES)		+= uprobes.o decode-insn.o simulate-insn.o
+obj-$(CONFIG_OPTPROBES)		+= opt.o opt_trampoline.o
 CFLAGS_REMOVE_simulate-insn.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
new file mode 100644
index 000000000000..56c8a227c857
--- /dev/null
+++ b/arch/riscv/kernel/probes/opt.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ *  Kernel Probes Jump Optimization (Optprobes)
+ *
+ * Copyright (C) Guokai Chen, 2022
+ * Author: Guokai Chen chenguokai17@mails.ucas.ac.cn
+ */
+
+#define pr_fmt(fmt)	"optprobe: " fmt
+
+#include <linux/kprobes.h>
+#include <asm/kprobes.h>
+
+int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
+{
+	return 0;
+}
+
+int arch_check_optimized_kprobe(struct optimized_kprobe *op)
+{
+	return 0;
+}
+
+int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
+				  struct kprobe *orig)
+{
+	return 0;
+}
+
+void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
+{
+}
+
+void arch_optimize_kprobes(struct list_head *oplist)
+{
+}
+
+void arch_unoptimize_kprobes(struct list_head *oplist,
+			     struct list_head *done_list)
+{
+}
+
+void arch_unoptimize_kprobe(struct optimized_kprobe *op)
+{
+}
+
+int arch_within_optimized_kprobe(struct optimized_kprobe *op,
+				 kprobe_opcode_t *addr)
+{
+	return 0;
+}
diff --git a/arch/riscv/kernel/probes/opt_trampoline.S b/arch/riscv/kernel/probes/opt_trampoline.S
new file mode 100644
index 000000000000..16160c4367ff
--- /dev/null
+++ b/arch/riscv/kernel/probes/opt_trampoline.S
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2022 Guokai Chen
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/csr.h>
+#include <asm/asm-offsets.h>
+
+SYM_ENTRY(optprobe_template_entry, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_ENTRY(optprobe_template_end, SYM_L_GLOBAL, SYM_A_NONE)
-- 
2.34.1



* [PATCH v6 02/13] riscv/kprobe: Allocate detour buffer from module region
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 03/13] riscv/kprobe: Add skeleton for preparing optimized kprobe Chen Guokai
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

To work around the limited range of PC-relative branch instructions on
RISC-V, the detour buffer slots used by optprobes have to be allocated
at virtual addresses that are reachable from kernel and module text via
AUIPC/JALR.

For the time being, the vmalloc region is far from kernel/module text:
the distance between them is half of the kernel address space [1],
which is beyond the reach of a 32-bit PC-relative address. Hence
alloc_optinsn_page() is overridden to allocate the detour buffer from
the module region.

[1] Documentation/riscv/vm-layout.rst

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/kernel/probes/kprobes.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index f21592d20306..e1856b04db04 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -6,6 +6,7 @@
 #include <linux/extable.h>
 #include <linux/slab.h>
 #include <linux/stop_machine.h>
+#include <linux/set_memory.h>
 #include <asm/ptrace.h>
 #include <linux/uaccess.h>
 #include <asm/sections.h>
@@ -84,6 +85,29 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
 }
 
 #ifdef CONFIG_MMU
+#if defined(CONFIG_OPTPROBES) && defined(CONFIG_64BIT)
+void *alloc_optinsn_page(void)
+{
+	void *page;
+
+	page = __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR,
+				    MODULES_END, GFP_KERNEL,
+				    PAGE_KERNEL, 0, NUMA_NO_NODE,
+				    __builtin_return_address(0));
+	if (!page)
+		return NULL;
+
+	set_vm_flush_reset_perms(page);
+	/*
+	 * First make the page read-only, and only then make it executable to
+	 * prevent it from being W+X in between.
+	 */
+	set_memory_rox((unsigned long)page, 1);
+
+	return page;
+}
+#endif
+
 void *alloc_insn_page(void)
 {
 	return  __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
-- 
2.34.1



* [PATCH v6 03/13] riscv/kprobe: Add skeleton for preparing optimized kprobe
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 02/13] riscv/kprobe: Allocate detour buffer from module region Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

The skeleton for preparing an optprobe consists of three major parts:

 - Check if the kprobe satisfies the requirements for optimization.
 - Search for two free registers to form the AUIPC/JALR and JR
   instructions.
 - Prepare the detour buffer for the optimized kprobe.

To avoid introducing too much code in a single patch, just add some
dummy implementations so that the code compiles.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/kernel/probes/opt.c | 98 +++++++++++++++++++++++++++++++++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index 56c8a227c857..c03cdb1512a6 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -10,6 +10,53 @@
 
 #include <linux/kprobes.h>
 #include <asm/kprobes.h>
+#include <asm/patch.h>
+
+static int in_auipc_jalr_range(long val)
+{
+#ifdef CONFIG_ARCH_RV32I
+	return 1;
+#else
+	/*
+	 * Note that the set of address offsets that can be formed
+	 * by pairing LUI with LD, AUIPC with JALR, etc. in RV64I is
+	 * [−2^31−2^11, 2^31−2^11−1].
+	 */
+	return ((-(1L << 31) - (1L << 11)) <= val) &&
+	       (val < ((1L << 31) - (1L << 11)));
+#endif
+}
+
+/*
+ * Copy optprobe assembly code template into detour buffer and modify some
+ * instructions for each kprobe.
+ */
+static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
+				  int rd, struct optimized_kprobe *op,
+				  kprobe_opcode_t opcode)
+{
+}
+
+/*
+ * In RISC-V ISA, AUIPC/JALR clobber one register to form target address,
+ * inspired by register renaming in OoO processor, this involves search
+ * backward that is not previously used as a source register and is used
+ * as a destination register before any branch or jump instruction.
+ */
+static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
+				int *rd, int *ra)
+{
+}
+
+/*
+ * The kprobe based on breakpoint just requires the instrumented instruction
+ * supports execute out-of-line or simulation, besides that, optimized kprobe
+ * requires no near instruction jump to any instruction replaced by AUIPC/JALR.
+ */
+static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
+{
+	return false;
+}
 
 int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
 {
@@ -24,7 +71,56 @@ int arch_check_optimized_kprobe(struct optimized_kprobe *op)
 int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 				  struct kprobe *orig)
 {
-	return 0;
+	long rel;
+	int rd = 0, ra = 0, ret;
+	kprobe_opcode_t *code = NULL, *slot = NULL;
+
+	if (!can_optimize((unsigned long)orig->addr, op))
+		return -EILSEQ;
+
+	code = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL);
+	slot = get_optinsn_slot();
+	if (!code || !slot) {
+		ret = -ENOMEM;
+		goto on_error;
+	}
+
+	/* Check if the detour buffer is in the 32-bit pc-relative range. */
+	rel = (unsigned long)slot - (unsigned long)orig->addr;
+	if (!in_auipc_jalr_range(rel)) {
+		ret = -ERANGE;
+		goto on_error;
+	}
+
+	/*
+	 * Search two free registers, rd is used to form AUIPC/JALR jumping
+	 * to detour buffer, ra is used to form JR jumping back from detour
+	 * buffer.
+	 */
+	find_free_registers(orig, op, &rd, &ra);
+	if (rd == 0 || ra == 0) {
+		ret = -EILSEQ;
+		goto on_error;
+	}
+
+	op->optinsn.rd = rd;
+	prepare_detour_buffer(code, slot, ra, op, orig->opcode);
+
+	ret = patch_text_nosync((void *)slot, code, MAX_OPTINSN_SIZE);
+	if (!ret) {
+		op->optinsn.insn = slot;
+		kfree(code);
+		return 0;
+	}
+
+on_error:
+	if (slot) {
+		free_optinsn_slot(slot, 0);
+		op->optinsn.insn = NULL;
+		op->optinsn.length = 0;
+	}
+	kfree(code);
+	return ret;
 }
 
 void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
-- 
2.34.1



* [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (2 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 03/13] riscv/kprobe: Add skeleton for preparing optimized kprobe Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:29   ` Björn Töpel
  2023-02-02 10:16   ` Conor Dooley
  2023-01-27 13:05 ` [PATCH v6 05/13] riscv/kprobe: Introduce free register(s) searching algorithm Chen Guokai
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

These RVI and RVC instruction decoders are used by the free register
searching algorithm: each instruction of the instrumented function needs
to be decoded and checked for a free register that can form AUIPC/JALR.

In the RVI instruction formats, the position and length of the
rs1/rs2/rd/opcode fields are uniform [1], but the RVC instruction formats
are more complicated, so this patch adds a series of functions to decode
rs1/rs2/rd for RVC [1].

[1] https://github.com/riscv/riscv-isa-manual/releases
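
To make the contrast between the uniform RVI fields and the
format-dependent RVC fields concrete, here is a minimal user-space
sketch. The sketch_* names and insn_bits() are hypothetical and are not
the helpers added by this patch; the bit positions follow the RISC-V ISA
manual [1].

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits of `insn` starting at bit `shift`. */
static inline uint32_t insn_bits(uint32_t insn, unsigned int shift,
				 unsigned int width)
{
	assert(width > 0 && width < 32 && shift + width <= 32);
	return (insn >> shift) & ((1u << width) - 1u);
}

/* RVI: register fields sit at the same position in every format. */
static inline uint32_t sketch_rvi_rd(uint32_t insn)
{
	return insn_bits(insn, 7, 5);
}

static inline uint32_t sketch_rvi_rs1(uint32_t insn)
{
	return insn_bits(insn, 15, 5);
}

static inline uint32_t sketch_rvi_rs2(uint32_t insn)
{
	return insn_bits(insn, 20, 5);
}

/* RVC CR format (e.g. c.mv): full 5-bit register numbers. */
static inline uint32_t sketch_rvc_cr_rd(uint32_t insn)
{
	return insn_bits(insn, 7, 5);
}

static inline uint32_t sketch_rvc_cr_rs2(uint32_t insn)
{
	return insn_bits(insn, 2, 5);
}

/* RVC CL format (e.g. c.lw): 3-bit fields that name x8..x15. */
static inline uint32_t sketch_rvc_cl_rd(uint32_t insn)
{
	return 8 + insn_bits(insn, 2, 3);
}

static inline uint32_t sketch_rvc_cl_rs1(uint32_t insn)
{
	return 8 + insn_bits(insn, 7, 3);
}
```

For example, 0x00c58533 (add a0,a1,a2) decodes to rd = x10, rs1 = x11,
rs2 = x12, while 0x4188 (c.lw a0,0(a1)) stores a0 and a1 as the 3-bit
values 2 and 3.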

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/include/asm/bug.h             |   5 +-
 arch/riscv/kernel/probes/decode-insn.h   | 148 +++++++++++++++++++++++
 arch/riscv/kernel/probes/simulate-insn.h |  42 +++++++
 3 files changed, 194 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
index 1aaea81fb141..9c33d3b58225 100644
--- a/arch/riscv/include/asm/bug.h
+++ b/arch/riscv/include/asm/bug.h
@@ -19,11 +19,14 @@
 #define __BUG_INSN_32	_UL(0x00100073) /* ebreak */
 #define __BUG_INSN_16	_UL(0x9002) /* c.ebreak */
 
+#define RVI_INSN_LEN	4UL
+#define RVC_INSN_LEN	2UL
+
 #define GET_INSN_LENGTH(insn)						\
 ({									\
 	unsigned long __len;						\
 	__len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ?	\
-		4UL : 2UL;						\
+		RVI_INSN_LEN : RVC_INSN_LEN;				\
 	__len;								\
 })
 
diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
index 42269a7d676d..785b023a62ea 100644
--- a/arch/riscv/kernel/probes/decode-insn.h
+++ b/arch/riscv/kernel/probes/decode-insn.h
@@ -3,6 +3,7 @@
 #ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H
 #define _RISCV_KERNEL_KPROBES_DECODE_INSN_H
 
+#include <linux/bitops.h>
 #include <asm/sections.h>
 #include <asm/kprobes.h>
 
@@ -15,4 +16,151 @@ enum probe_insn {
 enum probe_insn __kprobes
 riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);
 
+#ifdef CONFIG_KPROBES
+
+static inline u16 rvi_rs1(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 15) & 0x1f);
+}
+
+static inline u16 rvi_rs2(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 20) & 0x1f);
+}
+
+static inline u16 rvi_rd(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 7) & 0x1f);
+}
+
+static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)
+{
+	u32 imme = 0;
+
+	imme |= (((opcode >> 8)  & 0xf)   << 1)  |
+		(((opcode >> 25) & 0x3f)  << 5)  |
+		(((opcode >> 7)  & 0x1)   << 11) |
+		(((opcode >> 31) & 0x1)   << 12);
+
+	return sign_extend32(imme, 12);
+}
+
+static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)
+{
+	u32 imme = 0;
+
+	imme |= (((opcode >> 21) & 0x3ff) << 1)  |
+		(((opcode >> 20) & 0x1)   << 11) |
+		(((opcode >> 12) & 0xff)  << 12) |
+		(((opcode >> 31) & 0x1)   << 20);
+
+	return sign_extend32(imme, 20);
+}
+
+#ifdef CONFIG_RISCV_ISA_C
+static inline u16 rvc_r_rs1(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 7) & 0x1f);
+}
+
+static inline u16 rvc_r_rs2(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 2) & 0x1f);
+}
+
+static inline u16 rvc_r_rd(kprobe_opcode_t opcode)
+{
+	return rvc_r_rs1(opcode);
+}
+
+static inline u16 rvc_i_rs1(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 7) & 0x1f);
+}
+
+static inline u16 rvc_i_rd(kprobe_opcode_t opcode)
+{
+	return rvc_i_rs1(opcode);
+}
+
+static inline u16 rvc_ss_rs2(kprobe_opcode_t opcode)
+{
+	return (u16)((opcode >> 2) & 0x1f);
+}
+
+static inline u16 rvc_l_rd(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 2) & 0x7));
+}
+
+static inline u16 rvc_l_rs(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 7) & 0x7));
+}
+
+static inline u16 rvc_s_rs2(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 2) & 0x7));
+}
+
+static inline u16 rvc_s_rs1(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 7) & 0x7));
+}
+
+static inline u16 rvc_a_rs2(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 2) & 0x7));
+}
+
+static inline u16 rvc_a_rs1(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 7) & 0x7));
+}
+
+static inline u16 rvc_a_rd(kprobe_opcode_t opcode)
+{
+	return rvc_a_rs1(opcode);
+}
+
+static inline u16 rvc_b_rd(kprobe_opcode_t opcode)
+{
+	return (u16)(8 + ((opcode >> 7) & 0x7));
+}
+
+static inline u16 rvc_b_rs(kprobe_opcode_t opcode)
+{
+	return rvc_b_rd(opcode);
+}
+
+static inline s32 rvc_branch_imme(kprobe_opcode_t opcode)
+{
+	u32 imme = 0;
+
+	imme |= (((opcode >> 3)  & 0x3) << 1) |
+		(((opcode >> 10) & 0x3) << 3) |
+		(((opcode >> 2)  & 0x1) << 5) |
+		(((opcode >> 5)  & 0x3) << 6) |
+		(((opcode >> 12) & 0x1) << 8);
+
+	return sign_extend32(imme, 8);
+}
+
+static inline s32 rvc_jal_imme(kprobe_opcode_t opcode)
+{
+	u32 imme = 0;
+
+	imme |= (((opcode >> 3)  & 0x3) << 1) |
+		(((opcode >> 11) & 0x1) << 4) |
+		(((opcode >> 2)  & 0x1) << 5) |
+		(((opcode >> 7)  & 0x1) << 6) |
+		(((opcode >> 6)  & 0x1) << 7) |
+		(((opcode >> 9)  & 0x3) << 8) |
+		(((opcode >> 8)  & 0x1) << 10) |
+		(((opcode >> 12) & 0x1) << 11);
+
+	return sign_extend32(imme, 11);
+}
+#endif /* CONFIG_RISCV_ISA_C */
+#endif /* CONFIG_KPROBES */
 #endif /* _RISCV_KERNEL_KPROBES_DECODE_INSN_H */
diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
index a19aaa0feb44..e89747dfabbb 100644
--- a/arch/riscv/kernel/probes/simulate-insn.h
+++ b/arch/riscv/kernel/probes/simulate-insn.h
@@ -28,4 +28,46 @@ bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs);
 bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs);
 bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
 
+/* RVC(S) instructions contain rs1 and rs2 */
+__RISCV_INSN_FUNCS(c_sq,	0xe003, 0xa000);
+__RISCV_INSN_FUNCS(c_sw,	0xe003, 0xc000);
+__RISCV_INSN_FUNCS(c_sd,	0xe003, 0xe000);
+/* RVC(A) instructions contain rs1 and rs2 */
+__RISCV_INSN_FUNCS(c_sub,	0xfc63, 0x8c01);
+__RISCV_INSN_FUNCS(c_subw,	0xfc43, 0x9c01);
+/* RVC(L) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_lq,	0xe003, 0x2000);
+__RISCV_INSN_FUNCS(c_lw,	0xe003, 0x4000);
+__RISCV_INSN_FUNCS(c_ld,	0xe003, 0x6000);
+/* RVC(I) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_addi,	0xe003, 0x0001);
+__RISCV_INSN_FUNCS(c_addiw,	0xe003, 0x2001);
+__RISCV_INSN_FUNCS(c_addi16sp,	0xe183, 0x6101);
+__RISCV_INSN_FUNCS(c_slli,	0xe003, 0x0002);
+/* RVC(B) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_sri,	0xe803, 0x8001);
+__RISCV_INSN_FUNCS(c_andi,	0xec03, 0x8801);
+/* RVC(SS) instructions contain rs2 */
+__RISCV_INSN_FUNCS(c_sqsp,	0xe003, 0xa002);
+__RISCV_INSN_FUNCS(c_swsp,	0xe003, 0xc002);
+__RISCV_INSN_FUNCS(c_sdsp,	0xe003, 0xe002);
+/* RVC(R) instructions contain rs2 and rd */
+__RISCV_INSN_FUNCS(c_mv,	0xf003, 0x8002);
+/* RVC(I) instructions contain sp and rd */
+__RISCV_INSN_FUNCS(c_lqsp,	0xe003, 0x2002);
+__RISCV_INSN_FUNCS(c_lwsp,	0xe003, 0x4002);
+__RISCV_INSN_FUNCS(c_ldsp,	0xe003, 0x6002);
+/* RVC(CW) instructions contain sp and rd */
+__RISCV_INSN_FUNCS(c_addi4spn,	0xe003, 0x0000);
+/* RVC(I) instructions contain rd */
+__RISCV_INSN_FUNCS(c_li,	0xe003, 0x4001);
+__RISCV_INSN_FUNCS(c_lui,	0xe003, 0x6001);
+
+__RISCV_INSN_FUNCS(arith_rr,	0x77, 0x33);
+__RISCV_INSN_FUNCS(arith_ri,	0x77, 0x13);
+__RISCV_INSN_FUNCS(lui,		0x7f, 0x37);
+__RISCV_INSN_FUNCS(load,	0x7f, 0x03);
+__RISCV_INSN_FUNCS(store,	0x7f, 0x23);
+__RISCV_INSN_FUNCS(amo,		0x7f, 0x2f);
+
 #endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */
-- 
2.34.1



* [PATCH v6 05/13] riscv/kprobe: Introduce free register(s) searching algorithm
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (3 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized Chen Guokai
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

Jump optimization needs to clobber two integer GPRs: the first one is
used to form the AUIPC/JALR pair that jumps to the detour buffer, and
the second one is used to form the JR in the detour buffer. Since a
kprobe can be installed anywhere in kernel/module text, the registers
to be clobbered must be chosen carefully to avoid changing the original
logic.

The algorithm for finding free registers is inspired by register
renaming in modern processors. From the perspective of register
renaming, a register can be treated as two different registers if two
neighboring instructions both write to it and nothing reads it in
between. Extending this observation, a register is considered free if
its first use in the execution flow is a write, i.e. it is never read
before being written.

The example below explains how the algorithm works. Assume the kernel is
a hybrid RVI and RVC binary, and one kprobe is placed at the entry of
the function idle_dummy().

Before			Optimized		Detour buffer
<idle_dummy>:					...
 #1 add  sp,sp,-16	auipc a0, #?		add  sp,sp,-16
 #2 sd   s0,8(sp)				sd   s0,8(sp)
 #3 addi s0,sp,16	jalr  a0, #?(a0)	addi s0,sp,16
 #4 ld   s0,8(sp)				ld   s0,8(sp)
 #5 li   a0,0		li   a0,0		auipc a0, #?
 #6 addi sp,sp,16	addi sp,sp,16		jr    x0, #?(a0)
 #7 ret			ret

To optimize this kprobe, the first 8 bytes are patched with AUIPC/JALR.
From #1 to #7, a0 is the only register that satisfies both conditions:

 - It is never read before it is written
 - It is never updated in the detour buffer

So a0 is chosen to form AUIPC/JALR and JR.
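
The first-use bookkeeping behind the example can be sketched in plain C
as follows. This is only an illustration under simplifying assumptions:
the instructions are given pre-decoded, control flow and the detour
buffer condition are ignored, and struct sketch_insn,
sketch_find_free_reg() and the idle_dummy[] table are hypothetical names
that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-decoded instruction; register 0 (x0) means "unused". */
struct sketch_insn {
	uint8_t rd;
	uint8_t rs1;
	uint8_t rs2;
};

/* Instructions #1 to #6 of idle_dummy() (sp = x2, s0 = x8, a0 = x10). */
static const struct sketch_insn idle_dummy[] = {
	{ .rd = 2,  .rs1 = 2, .rs2 = 0 },	/* addi sp,sp,-16 */
	{ .rd = 0,  .rs1 = 2, .rs2 = 8 },	/* sd   s0,8(sp)  */
	{ .rd = 8,  .rs1 = 2, .rs2 = 0 },	/* addi s0,sp,16  */
	{ .rd = 8,  .rs1 = 2, .rs2 = 0 },	/* ld   s0,8(sp)  */
	{ .rd = 10, .rs1 = 0, .rs2 = 0 },	/* li   a0,0      */
	{ .rd = 2,  .rs1 = 2, .rs2 = 0 },	/* addi sp,sp,16  */
};

/*
 * Return the lowest-numbered register whose first use is a write,
 * or 0 if there is none. Bit N of each mask stands for register xN.
 */
static int sketch_find_free_reg(const struct sketch_insn *insns, size_t n)
{
	uint32_t read = 0, write = 0;
	size_t i;
	int reg;

	for (i = 0; i < n; i++) {
		assert(insns[i].rd < 32 && insns[i].rs1 < 32 &&
		       insns[i].rs2 < 32);
		/* Sources are consumed before the destination is written. */
		read |= (1u << insns[i].rs1) & ~write;
		read |= (1u << insns[i].rs2) & ~write;
		write |= (1u << insns[i].rd) & ~read;
	}

	/* x0 is hardwired to zero and can never be used as a scratch. */
	for (reg = 1; reg < 32; reg++)
		if (write & (1u << reg))
			return reg;
	return 0;
}
```

Running the sketch over the six instructions yields x10 (a0), matching
the choice above; over only the first two instructions it finds no free
register, because sp and s0 are both read before being written.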

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/kernel/probes/opt.c | 221 +++++++++++++++++++++++++++++++++
 1 file changed, 221 insertions(+)

diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index c03cdb1512a6..d38ed1a52c93 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -12,6 +12,9 @@
 #include <asm/kprobes.h>
 #include <asm/patch.h>
 
+#include "simulate-insn.h"
+#include "decode-insn.h"
+
 static int in_auipc_jalr_range(long val)
 {
 #ifdef CONFIG_ARCH_RV32I
@@ -37,15 +40,233 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
 {
 }
 
+/* Registers the first usage of which is the destination of instruction */
+#define WRITE_ON(reg)	\
+	(*write |= (((*read >> (reg)) ^ 1UL) & 1) << (reg))
+/* Registers the first usage of which is the source of instruction */
+#define READ_ON(reg)	\
+	(*read |= (((*write >> (reg)) ^ 1UL) & 1) << (reg))
+
 /*
  * In RISC-V ISA, AUIPC/JALR clobber one register to form target address,
  * inspired by register renaming in OoO processor, this involves search
  * backward that is not previously used as a source register and is used
  * as a destination register before any branch or jump instruction.
  */
+static void find_register(unsigned long start, unsigned long end,
+			       unsigned long *write, unsigned long *read)
+{
+	kprobe_opcode_t insn;
+	unsigned long addr, offset = 0UL;
+
+	for (addr = start; addr < end; addr += offset) {
+		insn = *(kprobe_opcode_t *)addr;
+		offset = GET_INSN_LENGTH(insn);
+
+#ifdef CONFIG_RISCV_ISA_C
+		if (offset == RVI_INSN_LEN)
+			goto is_rvi;
+
+		insn &= __COMPRESSED_INSN_MASK;
+		/* Stop searching at any control transfer instruction */
+		if (riscv_insn_is_c_ebreak(insn) || riscv_insn_is_c_j(insn))
+			break;
+
+		if (riscv_insn_is_c_jal(insn)) {
+			/* The rd of C.JAL is x1 by default */
+			WRITE_ON(1);
+			break;
+		}
+
+		if (riscv_insn_is_c_jr(insn)) {
+			READ_ON(rvc_r_rs1(insn));
+			break;
+		}
+
+		if (riscv_insn_is_c_jalr(insn)) {
+			READ_ON(rvc_r_rs1(insn));
+			/* The rd of C.JALR is x1 by default */
+			WRITE_ON(1);
+			break;
+		}
+
+		if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) {
+			READ_ON(rvc_b_rs(insn));
+			break;
+		}
+
+		/*
+		 * Decode RVC instructions to find destination registers
+		 * that are written before being read as a source register.
+		 */
+		if (riscv_insn_is_c_sub(insn) || riscv_insn_is_c_subw(insn)) {
+			READ_ON(rvc_a_rs1(insn));
+			READ_ON(rvc_a_rs2(insn));
+			continue;
+		} else if (riscv_insn_is_c_sq(insn) ||
+			   riscv_insn_is_c_sw(insn) ||
+			   riscv_insn_is_c_sd(insn)) {
+			READ_ON(rvc_s_rs1(insn));
+			READ_ON(rvc_s_rs2(insn));
+			continue;
+		} else if (riscv_insn_is_c_addi16sp(insn) ||
+			   riscv_insn_is_c_addi(insn) ||
+			   riscv_insn_is_c_addiw(insn) ||
+			   riscv_insn_is_c_slli(insn)) {
+			READ_ON(rvc_i_rs1(insn));
+			continue;
+		} else if (riscv_insn_is_c_sri(insn) ||
+			   riscv_insn_is_c_andi(insn)) {
+			READ_ON(rvc_b_rs(insn));
+			continue;
+		} else if (riscv_insn_is_c_sqsp(insn) ||
+			   riscv_insn_is_c_swsp(insn) ||
+			   riscv_insn_is_c_sdsp(insn)) {
+			READ_ON(rvc_ss_rs2(insn));
+			/* C.SQSP/SWSP/SDSP implicitly use x2 (sp) as the base register */
+			READ_ON(2);
+			continue;
+		} else if (riscv_insn_is_c_mv(insn)) {
+			READ_ON(rvc_r_rs2(insn));
+			WRITE_ON(rvc_r_rd(insn));
+		} else if (riscv_insn_is_c_addi4spn(insn)) {
+			/* The rs of C.ADDI4SPN is x2 by default */
+			READ_ON(2);
+			WRITE_ON(rvc_l_rd(insn));
+		} else if (riscv_insn_is_c_lq(insn) ||
+			   riscv_insn_is_c_lw(insn) ||
+			   riscv_insn_is_c_ld(insn)) {
+			/* FIXME: c.lw/c.ld share opcode with c.flw/c.fld */
+			READ_ON(rvc_l_rs(insn));
+			WRITE_ON(rvc_l_rd(insn));
+		} else if (riscv_insn_is_c_lqsp(insn) ||
+			   riscv_insn_is_c_lwsp(insn) ||
+			   riscv_insn_is_c_ldsp(insn)) {
+			/*
+			 * FIXME: c.lwsp/c.ldsp share opcode with c.flwsp/c.fldsp
+			 * The rs of C.LQSP/C.LWSP/C.LDSP is x2 by default.
+			 */
+			READ_ON(2);
+			WRITE_ON(rvc_i_rd(insn));
+		} else if (riscv_insn_is_c_li(insn) ||
+			   riscv_insn_is_c_lui(insn)) {
+			WRITE_ON(rvc_i_rd(insn));
+		}
+
+		if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL))
+			return;
+is_rvi:
+#endif
+		/* Stop searching at any control transfer instruction */
+		if (riscv_insn_is_branch(insn)) {
+			READ_ON(rvi_rs1(insn));
+			READ_ON(rvi_rs2(insn));
+			break;
+		}
+
+		if (riscv_insn_is_jal(insn)) {
+			WRITE_ON(rvi_rd(insn));
+			break;
+		}
+
+		if (riscv_insn_is_jalr(insn)) {
+			READ_ON(rvi_rs1(insn));
+			WRITE_ON(rvi_rd(insn));
+			break;
+		}
+
+		if (riscv_insn_is_system(insn)) {
+			/* csrrw, csrrs, csrrc */
+			if (rvi_rs1(insn))
+				READ_ON(rvi_rs1(insn));
+			/* csrrwi, csrrsi, csrrci, csrrw, csrrs, csrrc */
+			if (rvi_rd(insn))
+				WRITE_ON(rvi_rd(insn));
+			break;
+		}
+
+		/*
+		 * Decode RVI instructions to find destination registers
+		 * that are written before being read as a source register.
+		 */
+		if (riscv_insn_is_lui(insn) || riscv_insn_is_auipc(insn)) {
+			WRITE_ON(rvi_rd(insn));
+		} else if (riscv_insn_is_arith_ri(insn) ||
+			   riscv_insn_is_load(insn)) {
+			READ_ON(rvi_rs1(insn));
+			WRITE_ON(rvi_rd(insn));
+		} else if (riscv_insn_is_arith_rr(insn) ||
+			   riscv_insn_is_store(insn) ||
+			   riscv_insn_is_amo(insn)) {
+			READ_ON(rvi_rs1(insn));
+			READ_ON(rvi_rs2(insn));
+			WRITE_ON(rvi_rd(insn));
+		}
+
+		if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL))
+			return;
+	}
+}
+
 static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
 				int *rd, int *ra)
 {
+	unsigned long start, end;
+	/*
+	 * Searching algorithm explanation:
+	 *
+	 * 1. Define two types of instruction areas firstly:
+	 *
+	 * +-----+
+	 * +     +
+	 * +     + ---> instructions modified by optprobe, named 'O-Area'.
+	 * +     +
+	 * +-----+
+	 * +     +
+	 * +     + ---> instructions after optprobe, named 'K-Area'.
+	 * +     +
+	 * +  ~  +
+	 *
+	 * 2. There are two usages for each GPR in the given instruction area.
+	 *
+	 *   - W: GPR is used as the RD operand at its first occurrence.
+	 *   - R: GPR is used as the RS operand at its first occurrence.
+	 *
+	 * Then there are 4 different usages for each GPR in total:
+	 *
+	 *   1. Used as W in O-Area, Used as W in K-Area.
+	 *   2. Used as W in O-Area, Used as R in K-Area.
+	 *   3. Used as R in O-Area, Used as W in K-Area.
+	 *   4. Used as R in O-Area, Used as R in K-Area.
+	 *
+	 * Any register satisfying #1 or #3 can be chosen to form 'AUIPC/JALR'
+	 * jumping to detour buffer.
+	 *
+	 * Any register satisfying #1 or #2 can be chosen to form 'JR' jumping
+	 * back from detour buffer.
+	 */
+	unsigned long kw = 0UL, kr = 0UL, ow = 0UL, or = 0UL;
+
+	/* Search one free register used to form AUIPC/JALR */
+	start = (unsigned long)&kp->opcode;
+	end = start + GET_INSN_LENGTH(kp->opcode);
+	find_register(start, end, &ow, &or);
+
+	start = (unsigned long)kp->addr + GET_INSN_LENGTH(kp->opcode);
+	end = (unsigned long)kp->addr + op->optinsn.length;
+	find_register(start, end, &ow, &or);
+
+	/* Search one free register used to form JR */
+	find_register(end, (unsigned long)_end, &kw, &kr);
+
+	if ((kw & ow) > 1UL) {
+		*rd = __builtin_ctzl((kw & ow) & ~1UL);
+		*ra = *rd;
+		return;
+	}
+
+	*rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
+	*ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
 }
 
 /*
-- 
2.34.1



* [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (4 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 05/13] riscv/kprobe: Introduce free register(s) searching algorithm Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:30   ` Björn Töpel
  2023-01-27 13:05 ` [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe Chen Guokai
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

In a kernel that mixes RVI and RVC encodings, the AUIPC/JALR pair
occupies only 8 bytes, but the patched region can be up to 10 bytes in
the worst case so that no RVI instruction is truncated. To check whether
a kprobe can be jump-optimized, the code therefore has to find an
instruction window large enough to hold AUIPC/JALR (plus a padding
C.NOP) and ensure that no nearby instruction jumps into the patching
window.

Besides that, this series does not yet support simulating pc-relative
instructions in the optprobe handler, so the patching window must not
include any pc-relative instruction.
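
The window sizing described above can be illustrated with a small
userspace sketch. It is not code from this series: patch_window_len()
and the constants are hypothetical names, and the instruction lengths
are passed in directly instead of being decoded from memory. It only
shows why mixed 2-byte and 4-byte instructions can push the window from
8 to 10 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RVC_LEN	2u			/* compressed instruction */
#define SKETCH_RVI_LEN	4u			/* standard instruction */
#define SKETCH_JUMP_LEN	(2u * SKETCH_RVI_LEN)	/* AUIPC + JALR */

/*
 * Hypothetical helper: accumulate whole instructions, starting at the
 * probed one, until at least 8 bytes are covered. Returns the window
 * length in bytes, or 0 if the given instructions cannot cover the jump.
 */
static unsigned int patch_window_len(const unsigned int *insn_len, size_t n)
{
	unsigned int window = 0;
	size_t i;

	for (i = 0; i < n && window < SKETCH_JUMP_LEN; i++) {
		assert(insn_len[i] == SKETCH_RVC_LEN ||
		       insn_len[i] == SKETCH_RVI_LEN);
		window += insn_len[i];
	}

	return window >= SKETCH_JUMP_LEN ? window : 0;
}
```

With lengths {2, 2, 2, 4} the loop reaches 6 bytes after three RVC
instructions and must take the whole RVI instruction, which gives the
10-byte worst case; the two bytes beyond AUIPC/JALR are what the C.NOP
padding fills.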

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/kernel/probes/opt.c | 94 +++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index d38ed1a52c93..d84aa1420fa2 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -269,6 +269,50 @@ static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
 	*ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
 }
 
+static bool insn_jump_into_range(unsigned long addr, unsigned long start,
+				 unsigned long end)
+{
+	kprobe_opcode_t insn = *(kprobe_opcode_t *)addr;
+	unsigned long target, offset = GET_INSN_LENGTH(insn);
+
+#ifdef CONFIG_RISCV_ISA_C
+	if (offset == RVC_INSN_LEN) {
+		if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn))
+			target = addr + rvc_branch_imme(insn);
+		else if (riscv_insn_is_c_jal(insn) || riscv_insn_is_c_j(insn))
+			target = addr + rvc_jal_imme(insn);
+		else
+			target = 0;
+		return (target >= start) && (target < end);
+	}
+#endif
+
+	if (riscv_insn_is_branch(insn))
+		target = addr + rvi_branch_imme(insn);
+	else if (riscv_insn_is_jal(insn))
+		target = addr + rvi_jal_imme(insn);
+	else
+		target = 0;
+	return (target >= start) && (target < end);
+}
+
+static int search_copied_insn(unsigned long paddr, struct optimized_kprobe *op)
+{
+	int i =  1;
+	struct arch_probe_insn api;
+	unsigned long offset = GET_INSN_LENGTH(*(kprobe_opcode_t *)paddr);
+
+	while ((i++ < MAX_COPIED_INSN) && (offset < 2 * RVI_INSN_LEN)) {
+		if (riscv_probe_decode_insn((kprobe_opcode_t *)(paddr + offset),
+					    &api) != INSN_GOOD)
+			return -1;
+		offset += GET_INSN_LENGTH(*(kprobe_opcode_t *)(paddr + offset));
+	}
+
+	op->optinsn.length = offset;
+	return 0;
+}
+
 /*
  * The kprobe based on breakpoint just requires the instrumented instruction
  * supports execute out-of-line or simulation, besides that, optimized kprobe
@@ -276,7 +320,55 @@ static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
  */
 static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
 {
-	return false;
+	int ret;
+	struct arch_probe_insn api;
+	unsigned long addr, size = 0, offset = 0;
+	struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr);
+
+	/*
+	 * Skip optimization if kprobe has been disarmed or instrumented
+	 * instruction does not support XOI.
+	 */
+	if (!kp || (riscv_probe_decode_insn(&kp->opcode, &api) != INSN_GOOD))
+		return false;
+
+	/*
+	 * Find an instruction window large enough to contain a pair
+	 * of AUIPC/JALR, and ensure each instruction in this window
+	 * supports XOI.
+	 */
+	ret = search_copied_insn(paddr, op);
+	if (ret)
+		return false;
+
+	if (!kallsyms_lookup_size_offset(paddr, &size, &offset))
+		return false;
+
+	/* Check there is enough space for relative jump(AUIPC/JALR) */
+	if (size - offset <= op->optinsn.length)
+		return false;
+
+	/*
+	 * Decode instructions until function end, and check that no
+	 * instruction jumps into the window used to emit optprobe(AUIPC/JALR).
+	 */
+	addr = paddr - offset;
+	while (addr < paddr) {
+		if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
+					 paddr + op->optinsn.length))
+			return false;
+		addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
+	}
+
+	addr = paddr + op->optinsn.length;
+	while (addr < paddr - offset + size) {
+		if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
+					 paddr + op->optinsn.length))
+			return false;
+		addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
+	}
+
+	return true;
 }
 
 int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
-- 
2.34.1



* [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (5 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:30   ` Björn Töpel
  2023-01-27 13:05 ` [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Chen Guokai
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

To avoid corrupting the execution context when calling the optprobe
handler, the detour buffer has to save and restore the GPR/CSR context.

The detour buffer payload differs from one optprobe to another in the
following places:

  - 'CALL optimized_callback': the relative offset of the 'call'
    instruction is different for each detour buffer.
  - 'EXECUTE INSN OUT-OF-LINE': the instructions copied from the probe
    site differ for each probe.
  - 'RETURN BACK': the chosen free register is reused here as the
    destination register for jumping back.

So the payload has to be customized for each optimized kprobe.
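
The per-buffer 'call' offset is encoded as an AUIPC/JALR immediate pair.
A minimal userspace sketch of that split follows. split_offset() and
join_offset() are hypothetical names used only for illustration; the
patch itself computes the same values inline with
(offs + (1 << 11)) >> 12 and offs & 0xFFF. The point of the sketch is
the rounding: JALR sign-extends its 12-bit immediate, so the upper part
has to be rounded up whenever bit 11 of the offset is set.

```c
#include <assert.h>
#include <stdint.h>

struct auipc_jalr_imm {
	int32_t hi20;	/* immediate for AUIPC (upper 20 bits) */
	int32_t lo12;	/* sign-extended immediate for JALR */
};

/* Hypothetical helper: split a pc-relative offset into hi20/lo12. */
static struct auipc_jalr_imm split_offset(int64_t offs)
{
	struct auipc_jalr_imm imm;
	int64_t lo;

	/* Range reachable by one AUIPC/JALR pair. */
	assert(offs >= -(INT64_C(1) << 31) - 2048 &&
	       offs < (INT64_C(1) << 31) - 2048);

	/* Bring the low part into the signed 12-bit range [-2048, 2047]. */
	lo = offs % 4096;
	if (lo >= 2048)
		lo -= 4096;
	else if (lo < -2048)
		lo += 4096;

	imm.lo12 = (int32_t)lo;
	/* offs - lo is an exact multiple of 4096, so the division is exact. */
	imm.hi20 = (int32_t)((offs - lo) / 4096);
	return imm;
}

/* What the hardware computes: (hi20 << 12) + sign_extend(lo12). */
static int64_t join_offset(struct auipc_jalr_imm imm)
{
	return (int64_t)imm.hi20 * 4096 + imm.lo12;
}
```

For example, an offset of 0x1800 splits into hi20 = 2 and lo12 = -2048
rather than hi20 = 1 and lo12 = 0x800, because 0x800 would be
sign-extended to -2048 by JALR.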

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/include/asm/kprobes.h          |  16 +++
 arch/riscv/kernel/probes/opt.c            |  71 ++++++++++++
 arch/riscv/kernel/probes/opt_trampoline.S | 125 ++++++++++++++++++++++
 3 files changed, 212 insertions(+)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index 96cd36e67e2e..75ebd02be171 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -46,10 +46,26 @@ bool kprobe_single_step_handler(struct pt_regs *regs);
 /* optinsn template addresses */
 extern __visible kprobe_opcode_t optprobe_template_entry[];
 extern __visible kprobe_opcode_t optprobe_template_end[];
+extern __visible kprobe_opcode_t optprobe_template_save[];
+extern __visible kprobe_opcode_t optprobe_template_call[];
+extern __visible kprobe_opcode_t optprobe_template_insn[];
+extern __visible kprobe_opcode_t optprobe_template_return[];
 
 #define MAX_OPTINSN_SIZE				\
 	((unsigned long)optprobe_template_end -		\
 	 (unsigned long)optprobe_template_entry)
+#define DETOUR_SAVE_OFFSET				\
+	((unsigned long)optprobe_template_save -	\
+	 (unsigned long)optprobe_template_entry)
+#define DETOUR_CALL_OFFSET				\
+	((unsigned long)optprobe_template_call -	\
+	 (unsigned long)optprobe_template_entry)
+#define DETOUR_INSN_OFFSET				\
+	((unsigned long)optprobe_template_insn -	\
+	 (unsigned long)optprobe_template_entry)
+#define DETOUR_RETURN_OFFSET				\
+	((unsigned long)optprobe_template_return -	\
+	 (unsigned long)optprobe_template_entry)
 
 /*
  * For RVI and RVC hybrid encoding kernel, although long jump just needs
diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index d84aa1420fa2..a47f7d2bf3a6 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -11,9 +11,32 @@
 #include <linux/kprobes.h>
 #include <asm/kprobes.h>
 #include <asm/patch.h>
+#include <asm/asm-offsets.h>
 
 #include "simulate-insn.h"
 #include "decode-insn.h"
+#include "../../net/bpf_jit.h"
+
+static void optimized_callback(struct optimized_kprobe *op,
+			       struct pt_regs *regs)
+{
+	if (kprobe_disabled(&op->kp))
+		return;
+
+	preempt_disable();
+	if (kprobe_running()) {
+		kprobes_inc_nmissed_count(&op->kp);
+	} else {
+		__this_cpu_write(current_kprobe, &op->kp);
+		/* Save skipped registers */
+		instruction_pointer_set(regs, (unsigned long)op->kp.addr);
+		get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
+		opt_pre_handler(&op->kp, regs);
+		__this_cpu_write(current_kprobe, NULL);
+	}
+	preempt_enable();
+}
+NOKPROBE_SYMBOL(optimized_callback)
 
 static int in_auipc_jalr_range(long val)
 {
@@ -30,6 +53,11 @@ static int in_auipc_jalr_range(long val)
 #endif
 }
 
+#define DETOUR_ADDR(code, offs) \
+	((void *)((unsigned long)(code) + (offs)))
+#define DETOUR_INSN(code, offs) \
+	(*(kprobe_opcode_t *)((unsigned long)(code) + (offs)))
+
 /*
  * Copy optprobe assembly code template into detour buffer and modify some
  * instructions for each kprobe.
@@ -38,6 +66,49 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
 				  int rd, struct optimized_kprobe *op,
 				  kprobe_opcode_t opcode)
 {
+	long offs;
+	unsigned long data;
+
+	memcpy(code, optprobe_template_entry, MAX_OPTINSN_SIZE);
+
+	/* Step1: record optimized_kprobe pointer into detour buffer */
+	memcpy(DETOUR_ADDR(code, DETOUR_SAVE_OFFSET), &op, sizeof(op));
+
+	/*
+	 * Step2
+	 * auipc ra, 0     --> auipc ra, HI20.{optimized_callback - pc}
+	 * jalr  ra, 0(ra) --> jalr  ra, LO12.{optimized_callback - pc}(ra)
+	 */
+	offs = (unsigned long)&optimized_callback -
+	       (unsigned long)DETOUR_ADDR(slot, DETOUR_CALL_OFFSET);
+	DETOUR_INSN(code, DETOUR_CALL_OFFSET) =
+				rv_auipc(1, (offs + (1 << 11)) >> 12);
+	DETOUR_INSN(code, DETOUR_CALL_OFFSET + 0x4) =
+				rv_jalr(1, 1, offs & 0xFFF);
+
+	/* Step3: copy replaced instructions into detour buffer */
+	memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), op->kp.addr,
+	       op->optinsn.length);
+	memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), &opcode,
+	       GET_INSN_LENGTH(opcode));
+
+	/* Step4: record return address of long jump into detour buffer */
+	data = (unsigned long)op->kp.addr + op->optinsn.length;
+	memcpy(DETOUR_ADDR(code, DETOUR_RETURN_OFFSET), &data, sizeof(data));
+
+	/*
+	 * Step5
+	 * auipc ra, 0      --> auipc rd, 0
+	 * ld/w  ra, -8(ra) --> ld/w  rd, -8(rd)
+	 * jalr  x0,  0(ra) --> jalr  x0,  0(rd)
+	 */
+	DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x8) = rv_auipc(rd, 0);
+#if __riscv_xlen == 32
+	DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_lw(rd, -8, rd);
+#else
+	DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_ld(rd, -8, rd);
+#endif
+	DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x10) = rv_jalr(0, rd, 0);
 }
 
 /* Registers the first usage of which is the destination of instruction */
diff --git a/arch/riscv/kernel/probes/opt_trampoline.S b/arch/riscv/kernel/probes/opt_trampoline.S
index 16160c4367ff..5187e71d8e61 100644
--- a/arch/riscv/kernel/probes/opt_trampoline.S
+++ b/arch/riscv/kernel/probes/opt_trampoline.S
@@ -1,12 +1,137 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
  * Copyright (C) 2022 Guokai Chen
+ * Copyright (C) 2022 Liao, Chang <liaochang1@huawei.com>
  */
 
 #include <linux/linkage.h>
 
+#include <asm/asm.h>
 #include <asm/csr.h>
 #include <asm/asm-offsets.h>
 
 SYM_ENTRY(optprobe_template_entry, SYM_L_GLOBAL, SYM_A_NONE)
+	addi  sp, sp, -(PT_SIZE_ON_STACK)
+	REG_S x1,  PT_RA(sp)
+	REG_S x2,  PT_SP(sp)
+	REG_S x3,  PT_GP(sp)
+	REG_S x4,  PT_TP(sp)
+	REG_S x5,  PT_T0(sp)
+	REG_S x6,  PT_T1(sp)
+	REG_S x7,  PT_T2(sp)
+	REG_S x8,  PT_S0(sp)
+	REG_S x9,  PT_S1(sp)
+	REG_S x10, PT_A0(sp)
+	REG_S x11, PT_A1(sp)
+	REG_S x12, PT_A2(sp)
+	REG_S x13, PT_A3(sp)
+	REG_S x14, PT_A4(sp)
+	REG_S x15, PT_A5(sp)
+	REG_S x16, PT_A6(sp)
+	REG_S x17, PT_A7(sp)
+	REG_S x18, PT_S2(sp)
+	REG_S x19, PT_S3(sp)
+	REG_S x20, PT_S4(sp)
+	REG_S x21, PT_S5(sp)
+	REG_S x22, PT_S6(sp)
+	REG_S x23, PT_S7(sp)
+	REG_S x24, PT_S8(sp)
+	REG_S x25, PT_S9(sp)
+	REG_S x26, PT_S10(sp)
+	REG_S x27, PT_S11(sp)
+	REG_S x28, PT_T3(sp)
+	REG_S x29, PT_T4(sp)
+	REG_S x30, PT_T5(sp)
+	REG_S x31, PT_T6(sp)
+	/* Update fp so that stack traces keep working */
+	addi  s0, sp, (PT_SIZE_ON_STACK)
+	j 1f
+
+SYM_ENTRY(optprobe_template_save, SYM_L_GLOBAL, SYM_A_NONE)
+	/*
+	 * Step1:
+	 * Filled with the pointer to optimized_kprobe data
+	 */
+	.dword 0
+1:
+	/* Load optimized_kprobe pointer from .dword above */
+	auipc a0, 0
+	REG_L a0, -8(a0)
+	add   a1, sp, x0
+
+SYM_ENTRY(optprobe_template_call, SYM_L_GLOBAL, SYM_A_NONE)
+	/*
+	 * Step2:
+	 * The <IMME> fields of AUIPC/JALR are patched with the offset to
+	 * optimized_callback; its a0 argument was loaded from the .dword above.
+	 */
+	auipc ra, 0
+	jalr  ra, 0(ra)
+
+	REG_L x1,  PT_RA(sp)
+	REG_L x3,  PT_GP(sp)
+	REG_L x4,  PT_TP(sp)
+	REG_L x5,  PT_T0(sp)
+	REG_L x6,  PT_T1(sp)
+	REG_L x7,  PT_T2(sp)
+	REG_L x8,  PT_S0(sp)
+	REG_L x9,  PT_S1(sp)
+	REG_L x10, PT_A0(sp)
+	REG_L x11, PT_A1(sp)
+	REG_L x12, PT_A2(sp)
+	REG_L x13, PT_A3(sp)
+	REG_L x14, PT_A4(sp)
+	REG_L x15, PT_A5(sp)
+	REG_L x16, PT_A6(sp)
+	REG_L x17, PT_A7(sp)
+	REG_L x18, PT_S2(sp)
+	REG_L x19, PT_S3(sp)
+	REG_L x20, PT_S4(sp)
+	REG_L x21, PT_S5(sp)
+	REG_L x22, PT_S6(sp)
+	REG_L x23, PT_S7(sp)
+	REG_L x24, PT_S8(sp)
+	REG_L x25, PT_S9(sp)
+	REG_L x26, PT_S10(sp)
+	REG_L x27, PT_S11(sp)
+	REG_L x28, PT_T3(sp)
+	REG_L x29, PT_T4(sp)
+	REG_L x30, PT_T5(sp)
+	REG_L x31, PT_T6(sp)
+	REG_L x2,  PT_SP(sp)
+	addi  sp, sp, (PT_SIZE_ON_STACK)
+
+SYM_ENTRY(optprobe_template_insn, SYM_L_GLOBAL, SYM_A_NONE)
+	/*
+	 * Step3:
+	 * NOPs will be replaced by the probed instructions; in the worst case
+	 * 3 RVC and 1 RVI instructions are executed out of line.
+	 */
+#ifdef CONFIG_RISCV_ISA_C
+	c.addi zero, 0
+	c.addi zero, 0
+	c.addi zero, 0
+	c.addi zero, 0
+	c.addi zero, 0
+#else
+	addi zero, zero, 0
+	addi zero, zero, 0
+#endif
+	j 2f
+
+SYM_ENTRY(optprobe_template_return, SYM_L_GLOBAL, SYM_A_NONE)
+	/*
+	 * Step4:
+	 * Filled with the return address of long jump(AUIPC/JALR)
+	 */
+	.dword 0
+2:
+	/*
+	 * Step5:
+	 * The <RA> of AUIPC/LD/JALR will be replaced for each kprobe,
+	 * used to read return address saved in .dword above.
+	 */
+	auipc ra, 0
+	REG_L ra, -8(ra)
+	jalr  x0, 0(ra)
 SYM_ENTRY(optprobe_template_end, SYM_L_GLOBAL, SYM_A_NONE)
-- 
2.34.1



* [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (6 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:31   ` Björn Töpel
  2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

From: Liao Chang <liaochang1@huawei.com>

Replacing EBREAK with an AUIPC/JALR pair is racy under SMP, so multiple
instructions have to be patched safely. This patch enhances
patch_text_cb() to ensure that no race occurs when patching the
AUIPC/JALR pair.
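
The arrival counting that patch_text_cb() relies on can be sketched in
userspace as below. This is only an illustration under simplifying
assumptions: struct patch_req and patch_arrive() are hypothetical names,
a plain memcpy() stands in for patch_text_nosync(), and the CPUs are
simulated by sequential calls, so the spin-wait and the instruction
cache flush performed by the other CPUs are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of struct patch_insn after this patch. */
struct patch_req {
	void *addr;		/* location to patch */
	const void *insn;	/* new instruction bytes */
	size_t size;		/* number of bytes to write */
	atomic_int cpu_count;	/* CPUs that have reached the callback */
};

/*
 * Called once per (simulated) CPU. Only the last CPU to arrive writes
 * the whole window, so no CPU can be executing inside it meanwhile.
 * Returns true if this caller did the patching.
 */
static bool patch_arrive(struct patch_req *req, int ncpus)
{
	if (atomic_fetch_add(&req->cpu_count, 1) + 1 == ncpus) {
		memcpy(req->addr, req->insn, req->size);
		/* One extra increment signals "patching finished". */
		atomic_fetch_add(&req->cpu_count, 1);
		return true;
	}

	/* In the kernel the other CPUs spin here until the count exceeds ncpus. */
	return false;
}
```

Carrying a pointer and a size instead of a single u32 is what lets the
same rendezvous write the 8 or 10 byte AUIPC/JALR window in one step.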

Signed-off-by: Liao Chang <liaochang1@huawei.com>
Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
---
 arch/riscv/include/asm/patch.h |  1 +
 arch/riscv/kernel/patch.c      | 23 +++++++++---
 arch/riscv/kernel/probes/opt.c | 65 ++++++++++++++++++++++++++++++++--
 3 files changed, 83 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
index 9a7d7346001e..ee31539de65f 100644
--- a/arch/riscv/include/asm/patch.h
+++ b/arch/riscv/include/asm/patch.h
@@ -8,5 +8,6 @@
 
 int patch_text_nosync(void *addr, const void *insns, size_t len);
 int patch_text(void *addr, u32 insn);
+int patch_text_batch(void *addr, const void *insn, size_t size);
 
 #endif /* _ASM_RISCV_PATCH_H */
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 765004b60513..ce324b6a6998 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -15,7 +15,8 @@
 
 struct patch_insn {
 	void *addr;
-	u32 insn;
+	const void *insn;
+	size_t size;
 	atomic_t cpu_count;
 };
 
@@ -106,8 +107,7 @@ static int patch_text_cb(void *data)
 
 	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
 		ret =
-		    patch_text_nosync(patch->addr, &patch->insn,
-					    GET_INSN_LENGTH(patch->insn));
+		    patch_text_nosync(patch->addr, patch->insn, patch->size);
 		atomic_inc(&patch->cpu_count);
 	} else {
 		while (atomic_read(&patch->cpu_count) <= num_online_cpus())
@@ -123,7 +123,8 @@ int patch_text(void *addr, u32 insn)
 {
 	struct patch_insn patch = {
 		.addr = addr,
-		.insn = insn,
+		.insn = &insn,
+		.size = GET_INSN_LENGTH(insn),
 		.cpu_count = ATOMIC_INIT(0),
 	};
 
@@ -131,3 +132,17 @@ int patch_text(void *addr, u32 insn)
 				       &patch, cpu_online_mask);
 }
 NOKPROBE_SYMBOL(patch_text);
+
+int patch_text_batch(void *addr, const void *insn, size_t size)
+{
+	struct patch_insn patch = {
+		.addr = addr,
+		.insn = insn,
+		.size = size,
+		.cpu_count = ATOMIC_INIT(0),
+	};
+
+	return stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask);
+}
+
+NOKPROBE_SYMBOL(patch_text_batch);
diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index a47f7d2bf3a6..c52d5bdc748c 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -8,6 +8,7 @@
 
 #define pr_fmt(fmt)	"optprobe: " fmt
 
+#include <linux/types.h>
 #include <linux/kprobes.h>
 #include <asm/kprobes.h>
 #include <asm/patch.h>
@@ -444,11 +445,19 @@ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
 
 int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
 {
-	return 0;
+	return optinsn->length;
 }
 
 int arch_check_optimized_kprobe(struct optimized_kprobe *op)
 {
+	unsigned long i;
+	struct kprobe *p;
+
+	for (i = RVC_INSN_LEN; i < op->optinsn.length; i += RVC_INSN_LEN) {
+		p = get_kprobe(op->kp.addr + i);
+		if (p && !kprobe_disabled(p))
+			return -EEXIST;
+	}
 	return 0;
 }
 
@@ -509,23 +518,75 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 
 void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
 {
+	if (op->optinsn.insn) {
+		free_optinsn_slot(op->optinsn.insn, 1);
+		op->optinsn.insn = NULL;
+		op->optinsn.length = 0;
+	}
 }
 
 void arch_optimize_kprobes(struct list_head *oplist)
 {
+	long offs;
+	kprobe_opcode_t insn[3];
+	struct optimized_kprobe *op, *tmp;
+
+	list_for_each_entry_safe(op, tmp, oplist, list) {
+		WARN_ON(kprobe_disabled(&op->kp));
+
+		/* Back up the instructions that will be replaced by the jump */
+		memcpy(op->optinsn.copied_insn,
+		       DETOUR_ADDR(op->optinsn.insn, DETOUR_INSN_OFFSET),
+		       op->optinsn.length);
+
+		/*
+		 * After patching, it should be:
+		 * auipc free_register, %hi(detour_buffer)
+		 * jalr free_register, free_register, %lo(detour_buffer)
+		 * where free_register will eventually save the return address
+		 */
+		offs = (unsigned long)op->optinsn.insn -
+		       (unsigned long)op->kp.addr;
+		insn[0] = rv_auipc(op->optinsn.rd, (offs + (1 << 11)) >> 12);
+		insn[1] = rv_jalr(op->optinsn.rd, op->optinsn.rd, offs & 0xFFF);
+		/* For 3 RVC + 1 RVI scenario, fill C.NOP for padding */
+		if (op->optinsn.length > 2 * RVI_INSN_LEN)
+			insn[2] = rvc_addi(0, 0);
+
+		patch_text_batch(op->kp.addr, insn, op->optinsn.length);
+		if (memcmp(op->kp.addr, insn, op->optinsn.length))
+			continue;
+
+		list_del_init(&op->list);
+	}
 }
 
 void arch_unoptimize_kprobes(struct list_head *oplist,
 			     struct list_head *done_list)
 {
+	struct optimized_kprobe *op, *tmp;
+
+	list_for_each_entry_safe(op, tmp, oplist, list) {
+		arch_unoptimize_kprobe(op);
+		list_move(&op->list, done_list);
+	}
 }
 
 void arch_unoptimize_kprobe(struct optimized_kprobe *op)
 {
+	kprobe_opcode_t buf[MAX_COPIED_INSN];
+
+	memcpy(buf, op->optinsn.copied_insn, op->optinsn.length);
+	if (GET_INSN_LENGTH(op->kp.opcode) == RVI_INSN_LEN)
+		*(u32 *)buf = __BUG_INSN_32;
+	else
+		*(u16 *)buf = __BUG_INSN_16;
+	patch_text_batch(op->kp.addr, buf, op->optinsn.length);
 }
 
 int arch_within_optimized_kprobe(struct optimized_kprobe *op,
 				 kprobe_opcode_t *addr)
 {
-	return 0;
+	return (op->kp.addr <= addr &&
+		op->kp.addr + op->optinsn.length > addr);
 }
-- 
2.34.1



* [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (7 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:31   ` Björn Töpel
  2023-02-02  9:08   ` Conor Dooley
  2023-01-27 13:05 ` [PATCH v6 10/13] riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel Chen Guokai
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai,
	Björn Töpel

This patch further allows optprobe to use caller-saved registers that
are not used anywhere in the function being optimized as free registers.
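
A minimal sketch of how such a register could be picked from a bitmask
of registers read by the function. How the series consumes the mask is
not shown in this excerpt, so pick_free_reg() and
SKETCH_NON_CALLER_SAVED_MASK are hypothetical; only the set of excluded
registers (x0-x4, s0, s1, s2-s11) follows the table added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* x0 (zero), x1 (ra), x2 (sp), x3 (gp), x4 (tp), x8-x9 (s0-s1), x18-x27 (s2-s11) */
#define SKETCH_NON_CALLER_SAVED_MASK				\
	((UINT32_C(1) << 0) | (UINT32_C(1) << 1) |		\
	 (UINT32_C(1) << 2) | (UINT32_C(1) << 3) |		\
	 (UINT32_C(1) << 4) | (UINT32_C(1) << 8) |		\
	 (UINT32_C(1) << 9) | (UINT32_C(0x3ff) << 18))

/*
 * Hypothetical helper: return the lowest-numbered caller-saved register
 * whose bit is clear in used_reg, or -1 if every candidate is in use.
 */
static int pick_free_reg(uint32_t used_reg)
{
	uint32_t candidates = ~(used_reg | SKETCH_NON_CALLER_SAVED_MASK);
	int reg;

	/* By construction no excluded register can be a candidate. */
	assert((candidates & SKETCH_NON_CALLER_SAVED_MASK) == 0);

	for (reg = 0; reg < 32; reg++) {
		if (candidates & (UINT32_C(1) << reg))
			return reg;
	}

	return -1;
}
```

With an empty used_reg the first candidate is x5 (t0); if t0-t2 and
a0-a7 are all read somewhere in the function, the search falls through
to x28 (t3).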

Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Co-developed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Reported-by: Björn Töpel <bjorn@kernel.org>
---
 arch/riscv/include/asm/kprobes.h       |   1 +
 arch/riscv/kernel/probes/decode-insn.h |  29 +++++++
 arch/riscv/kernel/probes/opt.c         | 116 ++++++++++++++++++++++---
 3 files changed, 134 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index 75ebd02be171..f7d33f6861c6 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -86,6 +86,7 @@ struct arch_optimized_insn {
 	kprobe_opcode_t *insn;
 	unsigned long length;
 	int rd;
+	u32 free_reg;
 };
 
 #endif /* CONFIG_OPTPROBES */
diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
index 785b023a62ea..140f5b6a9886 100644
--- a/arch/riscv/kernel/probes/decode-insn.h
+++ b/arch/riscv/kernel/probes/decode-insn.h
@@ -13,6 +13,35 @@ enum probe_insn {
 	INSN_GOOD,
 };
 
+#define NRREG 32
+#define ALL_REG_OCCUPIED 0xffffffffu
+/*
+ * Register	ABI Name	Saver
+ * x0		zero		--
+ * x1		ra		Caller
+ * x2		sp		Callee
+ * x3		gp		--
+ * x4		tp		--
+ * x5-7		t0-2		Caller
+ * x8		s0/fp		Callee
+ * x9		s1		Callee
+ * x10-11	a0-1		Caller
+ * x12-17	a2-7		Caller
+ * x18-27	s2-11		Callee
+ * x28-31	t3-6		Caller
+ *
+ * A register that is not caller-saved is potentially unsafe to use as
+ * a free register to form AUIPC/JALR, so this bitmask filters such
+ * registers out. Because ra holds the return address of a function
+ * call, it is treated as a non-caller-saved register here.
+ */
+#define NON_CALLER_SAVED_MASK				\
+	((1 <<  0) | (1 <<  1) | (1 <<  2) | (1 <<  3) |	\
+	 (1 <<  4) | (1 <<  8) | (1 <<  9) | (1 << 18) |	\
+	 (1 << 19) | (1 << 20) | (1 << 21) | (1 << 22) |	\
+	 (1 << 23) | (1 << 24) | (1 << 25) | (1 << 26) |	\
+	 (1 << 27))
+
 enum probe_insn __kprobes
 riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);
 
diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index c52d5bdc748c..e151b1c60d6d 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -13,6 +13,7 @@
 #include <asm/kprobes.h>
 #include <asm/patch.h>
 #include <asm/asm-offsets.h>
+#include <linux/extable.h>
 
 #include "simulate-insn.h"
 #include "decode-insn.h"
@@ -126,7 +127,7 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
  * as a destination register before any branch or jump instruction.
  */
 static void find_register(unsigned long start, unsigned long end,
-			       unsigned long *write, unsigned long *read)
+			  unsigned long *write, unsigned long *read)
 {
 	kprobe_opcode_t insn;
 	unsigned long addr, offset = 0UL;
@@ -385,18 +386,101 @@ static int search_copied_insn(unsigned long paddr, struct optimized_kprobe *op)
 	return 0;
 }
 
+static void update_free_reg(unsigned long addr, uint32_t *used_reg)
+{
+	kprobe_opcode_t insn = *(kprobe_opcode_t *)addr;
+	unsigned long offset = GET_INSN_LENGTH(insn);
+
+#ifdef CONFIG_RISCV_ISA_C
+	if (offset == RVI_INSN_LEN)
+		goto is_rvi;
+
+	insn &= __COMPRESSED_INSN_MASK;
+	if (riscv_insn_is_c_jal(insn)) {
+		*used_reg |= 1 << 1;
+	} else if (riscv_insn_is_c_jr(insn)) {
+		*used_reg |= 1 << rvc_r_rs1(insn);
+	} else if (riscv_insn_is_c_jalr(insn)) {
+		*used_reg |= 1 << rvc_r_rs1(insn);
+	} else if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) {
+		*used_reg |= 1 << rvc_b_rs(insn);
+	} else if (riscv_insn_is_c_sub(insn) || riscv_insn_is_c_subw(insn)) {
+		*used_reg |= 1 << rvc_a_rs1(insn);
+		*used_reg |= 1 << rvc_a_rs2(insn);
+	} else if (riscv_insn_is_c_sq(insn) || riscv_insn_is_c_sw(insn) ||
+			   riscv_insn_is_c_sd(insn)) {
+		*used_reg |= 1 << rvc_s_rs1(insn);
+		*used_reg |= 1 << rvc_s_rs2(insn);
+	} else if (riscv_insn_is_c_addi16sp(insn) || riscv_insn_is_c_addi(insn) ||
+			   riscv_insn_is_c_addiw(insn) ||
+			   riscv_insn_is_c_slli(insn)) {
+		*used_reg |= 1 << rvc_i_rs1(insn);
+	} else if (riscv_insn_is_c_sri(insn) ||
+			   riscv_insn_is_c_andi(insn)) {
+		*used_reg |= 1 << rvc_b_rs(insn);
+	} else if (riscv_insn_is_c_sqsp(insn) || riscv_insn_is_c_swsp(insn) ||
+			   riscv_insn_is_c_sdsp(insn)) {
+		*used_reg |= 1 << rvc_ss_rs2(insn);
+		*used_reg |= 1 << 2;
+	} else if (riscv_insn_is_c_mv(insn)) {
+		*used_reg |= 1 << rvc_r_rs2(insn);
+	} else if (riscv_insn_is_c_addi4spn(insn)) {
+		*used_reg |= 1 << 2;
+	} else if (riscv_insn_is_c_lq(insn) || riscv_insn_is_c_lw(insn) ||
+			   riscv_insn_is_c_ld(insn)) {
+		*used_reg |= 1 << rvc_l_rs(insn);
+	} else if (riscv_insn_is_c_lqsp(insn) || riscv_insn_is_c_lwsp(insn) ||
+			   riscv_insn_is_c_ldsp(insn)) {
+		*used_reg |= 1 << 2;
+	}
+	/* li and lui do not have a source register */
+	return;
+is_rvi:
+#endif
+	if (riscv_insn_is_arith_ri(insn) || riscv_insn_is_load(insn)) {
+		*used_reg |= 1 << rvi_rs1(insn);
+	} else if (riscv_insn_is_arith_rr(insn) || riscv_insn_is_store(insn) ||
+		riscv_insn_is_amo(insn)) {
+		*used_reg |= 1 << rvi_rs1(insn);
+		*used_reg |= 1 << rvi_rs2(insn);
+	} else if (riscv_insn_is_branch(insn)) {
+		*used_reg |= 1 << rvi_rs1(insn);
+		*used_reg |= 1 << rvi_rs2(insn);
+	} else if (riscv_insn_is_jalr(insn)) {
+		*used_reg |= 1 << rvi_rs1(insn);
+	}
+}
+
+static bool scan_code(unsigned long *addr, unsigned long paddr,
+		      struct optimized_kprobe *op, uint32_t *used_reg)
+{
+	if (insn_jump_into_range(*addr, paddr + RVC_INSN_LEN,
+				 paddr + op->optinsn.length))
+		return false;
+	if (search_exception_tables(*addr))
+		return false;
+	update_free_reg(*addr, used_reg);
+	*addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
+	return true;
+}
+
 /*
  * The kprobe based on breakpoint just requires the instrumented instruction
  * supports execute out-of-line or simulation, besides that, optimized kprobe
  * requires no near instruction jump to any instruction replaced by AUIPC/JALR.
  */
-static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
+static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op, uint32_t *used_reg)
 {
 	int ret;
 	struct arch_probe_insn api;
 	unsigned long addr, size = 0, offset = 0;
 	struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr);
 
+	/*
+	 * Start by marking every non-caller-saved register as used.
+	 */
+	*used_reg = NON_CALLER_SAVED_MASK;
+
 	/*
 	 * Skip optimization if kprobe has been disarmed or instrumented
 	 * instruction doest not support XOI.
@@ -426,18 +510,14 @@ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
 	 */
 	addr = paddr - offset;
 	while (addr < paddr) {
-		if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
-					 paddr + op->optinsn.length))
+		if (!scan_code(&addr, paddr, op, used_reg))
 			return false;
-		addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
 	}
-
-	addr = paddr + op->optinsn.length;
+	update_free_reg((unsigned long)&kp->opcode, used_reg);
+	addr = paddr + GET_INSN_LENGTH(*(kprobe_opcode_t *)&kp->opcode);
 	while (addr < paddr - offset + size) {
-		if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
-					 paddr + op->optinsn.length))
+		if (!scan_code(&addr, paddr, op, used_reg))
 			return false;
-		addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
 	}
 
 	return true;
@@ -466,10 +546,13 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 {
 	long rel;
 	int rd = 0, ra = 0, ret;
+	u32 used_reg;
 	kprobe_opcode_t *code = NULL, *slot = NULL;
 
-	if (!can_optimize((unsigned long)orig->addr, op))
+	if (!can_optimize((unsigned long)orig->addr, op, &used_reg)) {
+		op->optinsn.rd = -1;
 		return -EILSEQ;
+	}
 
 	code = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL);
 	slot = get_optinsn_slot();
@@ -490,7 +573,14 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 	 * to detour buffer, ra is used to form JR jumping back from detour
 	 * buffer.
 	 */
-	find_free_registers(orig, op, &rd, &ra);
+	if (used_reg == ALL_REG_OCCUPIED) {
+		find_free_registers(orig, op, &rd, &ra);
+	} else {
+		/* Choose one unused caller-saved register. */
+		rd = ffz(used_reg);
+		ra = rd;
+	}
+
 	if (rd == 0 || ra == 0) {
 		ret = -EILSEQ;
 		goto on_error;
@@ -534,6 +624,8 @@ void arch_optimize_kprobes(struct list_head *oplist)
 	list_for_each_entry_safe(op, tmp, oplist, list) {
 		WARN_ON(kprobe_disabled(&op->kp));
 
+		if (op->optinsn.rd < 0)
+			continue;
 		/* Backup instructions which will be replaced by jump address */
 		memcpy(op->optinsn.copied_insn,
 		       DETOUR_ADDR(op->optinsn.insn, DETOUR_INSN_OFFSET),
-- 
2.34.1



* [PATCH v6 10/13] riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (8 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-27 13:05 ` [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR Chen Guokai
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

From: Liao Chang <liaochang1@huawei.com>

Add an instruction boundary check to ensure a kprobe never lands in the
middle of an RVI instruction, which would truncate it and crash the kernel.
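
The scan can be sketched in user space as follows. This is only an
illustration, not the kernel code: insn_length() and on_insn_boundary()
are hypothetical names, the function body is modelled as an array of
16-bit parcels, and encodings longer than 32 bits are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of the instruction whose low 16 bits are 'parcel'. */
static size_t insn_length(uint16_t parcel)
{
	/* Low two bits 0b11 mark a 32-bit (RVI) encoding, anything else is RVC. */
	return (parcel & 0x3) == 0x3 ? 4 : 2;
}

/*
 * Walk from the function entry and report whether byte offset 'offs'
 * lands exactly on the start of an instruction.
 */
static bool on_insn_boundary(const uint16_t *func, size_t offs)
{
	size_t pos = 0;

	assert(func != NULL);
	while (pos < offs)
		pos += insn_length(func[pos / 2]);
	return pos == offs;
}
```

For the sequence c.li / addi / c.jr (parcels 0x4501, 0x0513, 0x0015,
0x8082), offsets 0, 2 and 6 are accepted while offset 4, which points
into the second half of the addi, is rejected.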

Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 arch/riscv/kernel/probes/kprobes.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index e1856b04db04..91a6b46909cc 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -49,11 +49,33 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
 	post_kprobe_handler(p, kcb, regs);
 }
 
+bool __kprobes riscv_insn_boundary_check(unsigned long paddr)
+{
+#if defined(CONFIG_RISCV_ISA_C)
+	unsigned long size = 0, offs = 0, len = 0, entry = 0;
+
+	if (!kallsyms_lookup_size_offset(paddr, &size, &offs))
+		return false;
+
+	/*
+	 * Scan instructions from the function entry to ensure the kprobe
+	 * address is aligned with an RVI or RVC instruction boundary.
+	 */
+	entry = paddr - offs;
+	while ((entry + len) < paddr)
+		len += GET_INSN_LENGTH(*(kprobe_opcode_t *)(entry + len));
+	return (entry + len) == paddr;
+#else
+	return true;
+#endif
+}
+
 int __kprobes arch_prepare_kprobe(struct kprobe *p)
 {
 	unsigned long probe_addr = (unsigned long)p->addr;
 
-	if (probe_addr & 0x1)
+	/* An RVI/RVC hybrid kernel needs an instruction boundary check. */
+	if ((probe_addr & 0x1) || !riscv_insn_boundary_check(probe_addr))
 		return -EILSEQ;
 
 	/* copy instruction */
-- 
2.34.1



* [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (9 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 10/13] riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-31 12:51   ` Björn Töpel
  2023-01-27 13:05 ` [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist Chen Guokai
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

From: Liao Chang <liaochang1@huawei.com>

Setting a kprobe at 'jalr 1140(ra)' in vfs_write results in the following
crash:

[   32.092235] Unable to handle kernel access to user memory without uaccess routines at virtual address 00aaaaaad77b1170
[   32.093115] Oops [#1]
[   32.093251] Modules linked in:
[   32.093626] CPU: 0 PID: 135 Comm: ftracetest Not tainted 6.2.0-rc2-00013-gb0aa5e5df0cb-dirty #16
[   32.093985] Hardware name: riscv-virtio,qemu (DT)
[   32.094280] epc : ksys_read+0x88/0xd6
[   32.094855]  ra : ksys_read+0xc0/0xd6
[   32.095016] epc : ffffffff801cda80 ra : ffffffff801cdab8 sp : ff20000000d7bdc0
[   32.095227]  gp : ffffffff80f14000 tp : ff60000080f9cb40 t0 : ffffffff80f13e80
[   32.095500]  t1 : ffffffff8000c29c t2 : ffffffff800dbc54 s0 : ff20000000d7be60
[   32.095716]  s1 : 0000000000000000 a0 : ffffffff805a64ae a1 : ffffffff80a83708
[   32.095921]  a2 : ffffffff80f160a0 a3 : 0000000000000000 a4 : f229b0afdb165300
[   32.096171]  a5 : f229b0afdb165300 a6 : ffffffff80eeebd0 a7 : 00000000000003ff
[   32.096411]  s2 : ff6000007ff76800 s3 : fffffffffffffff7 s4 : 00aaaaaad77b1170
[   32.096638]  s5 : ffffffff80f160a0 s6 : ff6000007ff76800 s7 : 0000000000000030
[   32.096865]  s8 : 00ffffffc3d97be0 s9 : 0000000000000007 s10: 00aaaaaad77c9410
[   32.097092]  s11: 0000000000000000 t3 : ffffffff80f13e48 t4 : ffffffff8000c29c
[   32.097317]  t5 : ffffffff8000c29c t6 : ffffffff800dbc54
[   32.097505] status: 0000000200000120 badaddr: 00aaaaaad77b1170 cause: 000000000000000d
[   32.098011] [<ffffffff801cdb72>] ksys_write+0x6c/0xd6
[   32.098222] [<ffffffff801cdc06>] sys_write+0x2a/0x38
[   32.098405] [<ffffffff80003c76>] ret_from_syscall+0x0/0x2

Since rs1 and rd may be the same register, as in 'jalr 1140(ra)', the
target address must be obtained from rs1 before rd is updated.
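
The required ordering can be sketched as below. This is a stand-alone
illustration, not the kernel's simulate_jalr(): struct cpu and sim_jalr()
are hypothetical names, and the register file is a plain array instead
of pt_regs.

```c
#include <assert.h>
#include <stdint.h>

struct cpu {
	uint64_t x[32];	/* general-purpose registers, x[0] reads as zero */
	uint64_t pc;
};

/* Simulate 'jalr rd, imm(rs1)' for an instruction located at 'addr'. */
static void sim_jalr(struct cpu *c, unsigned int rd, unsigned int rs1,
		     int32_t imm, uint64_t addr)
{
	uint64_t base;

	assert(rd < 32 && rs1 < 32);
	base = c->x[rs1];		/* read rs1 first ... */
	if (rd != 0)
		c->x[rd] = addr + 4;	/* ... then write the link register */
	c->pc = (base + (uint64_t)(int64_t)imm) & ~(uint64_t)1;
}
```

With rd == rs1 == ra, writing the link value first would make the jump
target depend on addr + 4 instead of the original ra.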

Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 arch/riscv/kernel/probes/simulate-insn.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/riscv/kernel/probes/simulate-insn.c b/arch/riscv/kernel/probes/simulate-insn.c
index 7441ac8a6843..8402020010d5 100644
--- a/arch/riscv/kernel/probes/simulate-insn.c
+++ b/arch/riscv/kernel/probes/simulate-insn.c
@@ -75,13 +75,9 @@ bool __kprobes simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *reg
 	if (!ret)
 		return ret;
 
-	ret = rv_insn_reg_set_val(regs, rd_index, addr + 4);
-	if (!ret)
-		return ret;
-
 	instruction_pointer_set(regs, (base_addr + sign_extend32((imm), 11))&~1);
 
-	return ret;
+	return rv_insn_reg_set_val(regs, rd_index, addr + 4);
 }
 
 #define auipc_rd_idx(opcode) \
-- 
2.34.1



* [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (10 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-02-01 13:30   ` Björn Töpel
  2023-01-27 13:05 ` [PATCH v6 13/13] selftest/kprobes: Add testcase for kprobe SYM[+offs] Chen Guokai
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

From: Liao Chang <liaochang1@huawei.com>

The generic exception entry and exit code is part of the critical path
for kprobe breakpoints and uprobe syscall entry, so setting a kprobe on
the assembly symbols in entry.S results in a kernel stack overflow crash.
Blacklist these symbols explicitly, which requires a new _ASM_NOKPROBE()
asm helper.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 arch/riscv/include/asm/asm.h | 10 ++++++++++
 arch/riscv/kernel/entry.S    | 12 ++++++++++++
 arch/riscv/kernel/mcount.S   |  1 +
 3 files changed, 23 insertions(+)

diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index 816e753de636..5d9f13d8b809 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -81,6 +81,16 @@
 	.endr
 .endm
 
+#ifdef CONFIG_KPROBES
+#define _ASM_NOKPROBE(entry)				\
+	.pushsection "_kprobe_blacklist", "aw" ;	\
+	.balign SZREG ;					\
+	REG_ASM entry ;					\
+	.popsection
+#else
+#define _ASM_NOKPROBE(entry)
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_ASM_H */
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 99d38fdf8b18..9e8882a78523 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -606,3 +606,15 @@ ENTRY(__user_rt_sigreturn)
 	scall
 END(__user_rt_sigreturn)
 #endif
+
+_ASM_NOKPROBE(handle_exception)
+_ASM_NOKPROBE(_restore_kernel_tpsp)
+_ASM_NOKPROBE(_save_context)
+_ASM_NOKPROBE(ret_from_exception)
+_ASM_NOKPROBE(ret_from_syscall)
+_ASM_NOKPROBE(__switch_to)
+_ASM_NOKPROBE(ret_from_syscall_rejected)
+_ASM_NOKPROBE(restore_all)
+_ASM_NOKPROBE(resume_kernel)
+_ASM_NOKPROBE(resume_userspace)
+_ASM_NOKPROBE(check_syscall_nr)
diff --git a/arch/riscv/kernel/mcount.S b/arch/riscv/kernel/mcount.S
index 30102aadc4d7..7393b8895ef3 100644
--- a/arch/riscv/kernel/mcount.S
+++ b/arch/riscv/kernel/mcount.S
@@ -54,6 +54,7 @@ ENTRY(ftrace_stub)
 #endif
 	ret
 ENDPROC(ftrace_stub)
+_ASM_NOKPROBE(MCOUNT_NAME)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(return_to_handler)
-- 
2.34.1



* [PATCH v6 13/13] selftest/kprobes: Add testcase for kprobe SYM[+offs]
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (11 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist Chen Guokai
@ 2023-01-27 13:05 ` Chen Guokai
  2023-01-30 12:31 ` [PATCH v6 00/13] Add OPTPROBES feature on RISCV Björn Töpel
  2023-02-01 13:29 ` Björn Töpel
  14 siblings, 0 replies; 27+ messages in thread
From: Chen Guokai @ 2023-01-27 13:05 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

From: Liao Chang <liaochang1@huawei.com>

This testcase sets multiple kprobes on a function that contains a
series of complex opcode patterns, which helps discover subtle bugs in
the instruction decoder and kprobe jump optimization.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 .../ftrace/test.d/kprobe/kprobe_sym_offs.tc   | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc

diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc
new file mode 100644
index 000000000000..0007bec48308
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc
@@ -0,0 +1,49 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Kprobe dynamic event with offset
+# requires: kprobe_events
+TARGET_FUNC=vfs_write
+
+dec_addr() { # hexaddr
+  printf "%d" "0x"`echo $1 | tail -c 8`
+}
+
+set_offs() { # target next
+  SYMADDR=$1
+  ENDADDR=$2
+  A1=`dec_addr $SYMADDR`
+  A2=`dec_addr $ENDADDR`
+  NEXT=`expr $A2 - $A1` # offset to the next symbol
+}
+
+# Get the byte distance between two contiguous symbols
+set_offs `grep -A1 -w ${TARGET_FUNC} /proc/kallsyms | cut -f 1 -d " " | xargs`
+
+# Instruction length depends on the machine architecture.
+case `uname -m` in
+  arm64) LEN=4;;
+  riscv32|riscv64) LEN=2;;
+  *) LEN=2;;
+esac
+
+N=0
+OFFS=0
+echo "Set up kprobes on each instruction in function $TARGET_FUNC"
+while true; do
+  N=$(($N+1))
+  ! echo p ${TARGET_FUNC}+${OFFS} >> kprobe_events
+  OFFS=$(($OFFS+$LEN))
+  test $OFFS -eq $NEXT && break
+done
+
+L=`cat kprobe_events | wc -l`
+echo "Registered $L of $N attempted kprobe events in function $TARGET_FUNC"
+
+echo 1 > events/kprobes/enable
+# Trigger vfs_write to test kprobes
+cat kprobe_events >> $testlog
+echo 0 > events/kprobes/enable
+echo > kprobe_events
+echo "Waiting for unoptimizing & freeing"
+sleep 5
+echo "Done"
-- 
2.34.1



* Re: [PATCH v6 00/13] Add OPTPROBES feature on RISCV
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (12 preceding siblings ...)
  2023-01-27 13:05 ` [PATCH v6 13/13] selftest/kprobes: Add testcase for kprobe SYM[+offs] Chen Guokai
@ 2023-01-30 12:31 ` Björn Töpel
  2023-01-30 14:38   ` Xim
  2023-02-01 13:29 ` Björn Töpel
  14 siblings, 1 reply; 27+ messages in thread
From: Björn Töpel @ 2023-01-30 12:31 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> Add jump optimization support for RISC-V.

I'd like to take the series for a spin, but I'm having trouble applying
the patches; what base commit did you use? Or point me to a git
repo.

(It's nice to use "--base" to git-format-patch.)


Thanks!
Björn


* Re: [PATCH v6 00/13] Add OPTPROBES feature on RISCV
  2023-01-30 12:31 ` [PATCH v6 00/13] Add OPTPROBES feature on RISCV Björn Töpel
@ 2023-01-30 14:38   ` Xim
  2023-04-26 18:01     ` Palmer Dabbelt
  0 siblings, 1 reply; 27+ messages in thread
From: Xim @ 2023-01-30 14:38 UTC (permalink / raw)
  To: Björn Töpel
  Cc: paul.walmsley, palmer, aou, rostedt, mingo, sfr, linux-riscv,
	linux-kernel, liaochang (A)

Hi Björn,



> On Jan 30, 2023, at 20:31, Björn Töpel <bjorn@kernel.org> wrote:
> 
> Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:
> 
>> Add jump optimization support for RISC-V.
> 
> I'd like to take the series for a spin, but I'm having trouble applying
> the patches; what base commit did you use? Or point me to a git
> repo.

I generated this patch series based on the next-20230127 tag.

> 
> (It's nice to use "--base" to git-format-patch.)

I will use this option in any following revisions, thanks!

> 
> 
> Thanks!
> Björn



* Re: [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR
  2023-01-27 13:05 ` [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR Chen Guokai
@ 2023-01-31 12:51   ` Björn Töpel
  0 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-01-31 12:51 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> From: Liao Chang <liaochang1@huawei.com>
>
> Setting a kprobe at 'jalr 1140(ra)' in vfs_write results in the following
> crash:
>
> [   32.092235] Unable to handle kernel access to user memory without uaccess routines at virtual address 00aaaaaad77b1170
> [   32.093115] Oops [#1]
> [   32.093251] Modules linked in:
> [   32.093626] CPU: 0 PID: 135 Comm: ftracetest Not tainted 6.2.0-rc2-00013-gb0aa5e5df0cb-dirty #16
> [   32.093985] Hardware name: riscv-virtio,qemu (DT)
> [   32.094280] epc : ksys_read+0x88/0xd6
> [   32.094855]  ra : ksys_read+0xc0/0xd6
> [   32.095016] epc : ffffffff801cda80 ra : ffffffff801cdab8 sp : ff20000000d7bdc0
> [   32.095227]  gp : ffffffff80f14000 tp : ff60000080f9cb40 t0 : ffffffff80f13e80
> [   32.095500]  t1 : ffffffff8000c29c t2 : ffffffff800dbc54 s0 : ff20000000d7be60
> [   32.095716]  s1 : 0000000000000000 a0 : ffffffff805a64ae a1 : ffffffff80a83708
> [   32.095921]  a2 : ffffffff80f160a0 a3 : 0000000000000000 a4 : f229b0afdb165300
> [   32.096171]  a5 : f229b0afdb165300 a6 : ffffffff80eeebd0 a7 : 00000000000003ff
> [   32.096411]  s2 : ff6000007ff76800 s3 : fffffffffffffff7 s4 : 00aaaaaad77b1170
> [   32.096638]  s5 : ffffffff80f160a0 s6 : ff6000007ff76800 s7 : 0000000000000030
> [   32.096865]  s8 : 00ffffffc3d97be0 s9 : 0000000000000007 s10: 00aaaaaad77c9410
> [   32.097092]  s11: 0000000000000000 t3 : ffffffff80f13e48 t4 : ffffffff8000c29c
> [   32.097317]  t5 : ffffffff8000c29c t6 : ffffffff800dbc54
> [   32.097505] status: 0000000200000120 badaddr: 00aaaaaad77b1170 cause: 000000000000000d
> [   32.098011] [<ffffffff801cdb72>] ksys_write+0x6c/0xd6
> [   32.098222] [<ffffffff801cdc06>] sys_write+0x2a/0x38
> [   32.098405] [<ffffffff80003c76>] ret_from_syscall+0x0/0x2
>
> Since rs1 and rd may be the same register, as in 'jalr 1140(ra)', the
> target address must be obtained from rs1 before rd is updated.
>
> Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
> Signed-off-by: Liao Chang <liaochang1@huawei.com>

This has already been picked up to riscv-fixes:
https://lore.kernel.org/linux-riscv/167462581691.3015.5045414056306333462.git-patchwork-notify@kernel.org/

No need to have this patch in the series (and ditto for
https://lore.kernel.org/linux-riscv/20230127130541.1250865-11-chenguokai17@mails.ucas.ac.cn/
that Guo submitted a fix for).


Björn


* Re: [PATCH v6 00/13] Add OPTPROBES feature on RISCV
  2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
                   ` (13 preceding siblings ...)
  2023-01-30 12:31 ` [PATCH v6 00/13] Add OPTPROBES feature on RISCV Björn Töpel
@ 2023-02-01 13:29 ` Björn Töpel
  14 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:29 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> Add jump optimization support for RISC-V.
>
> Replaces ebreak instructions used by normal kprobes with an AUIPC/JALR
> instruction pair with the aim of suppressing the probe-hit overhead.
>
> All known optprobe-capable RISC architectures use a single jump or
> branch instruction, while this patch does not. RISC-V has a quite
> limited jump range (4KB or 2MB) for both its branch and jump
> instructions, which prevents optimization from supporting probes
> spread all over the kernel.
>
> The AUIPC/JALR instruction pair is introduced with a much wider jump range
> (4GB), where AUIPC loads the upper 20 bits to a free register and JALR
> appends the lower 12 bits to form a 32-bit immediate. Note that
> returns from probe handler require another free register. As kprobes
> can appear almost anywhere inside the kernel, the free register should
> be found generically, not depending on calling convention or any other
> regulations.
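
For reference, AUIPC carries a 20-bit upper immediate and JALR a 12-bit
sign-extended lower immediate. A minimal sketch of splitting a 32-bit
pc-relative offset between the two follows; split_offset() is a
hypothetical helper for illustration and is not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split a pc-relative byte offset into the AUIPC 20-bit upper immediate
 * and the JALR 12-bit signed lower immediate.
 */
static void split_offset(int32_t off, int32_t *hi20, int32_t *lo12)
{
	int32_t lo = off & 0xfff;

	/* JALR sign-extends its immediate, so fold bit 11 into the upper part. */
	if (lo >= 0x800)
		lo -= 0x1000;
	*lo12 = lo;
	*hi20 = (int32_t)(((int64_t)off - lo) / 4096);

	/* AUIPC adds hi20 << 12 to pc, JALR then adds lo12. */
	assert((int64_t)*hi20 * 4096 + *lo12 == off);
}
```

Because the lower part is signed, an offset such as 0x12345fff is
encoded as upper 0x12346 and lower -1 rather than 0x12345 and 0xfff.
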
>
> The algorithm for finding the free register is inspired by register
> renaming in modern processors. From the perspective of register
> renaming, a register can be represented as two different registers if
> two neighboring instructions both write to it and nothing reads it in
> between. Extending this fact, a register is considered free if there is
> no read before its next write in the execution flow. We are free to
> change its value without interfering with normal execution.
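
A rough sketch of that rule for a straight-line instruction sequence is
given below. It is an illustration only: struct insn_regs and
free_regs() are hypothetical, each instruction is reduced to its
register numbers, and control flow (branches and jumps) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct insn_regs {
	uint8_t rd;	/* destination register, 0 if none */
	uint8_t rs1;	/* source registers, 0 if unused */
	uint8_t rs2;
};

/*
 * Return a bitmask of registers that are written before being read
 * in the straight-line sequence 'seq'. Bit n corresponds to xn.
 */
static uint32_t free_regs(const struct insn_regs *seq, size_t n)
{
	uint32_t seen_read = 0, avail = 0;
	size_t i;

	assert(seq != NULL || n == 0);
	for (i = 0; i < n; i++) {
		assert(seq[i].rd < 32 && seq[i].rs1 < 32 && seq[i].rs2 < 32);
		/* An instruction reads its sources before writing its destination. */
		seen_read |= UINT32_C(1) << seq[i].rs1;
		seen_read |= UINT32_C(1) << seq[i].rs2;
		/* A write makes the register free only if nothing read it earlier. */
		if (!(seen_read & (UINT32_C(1) << seq[i].rd)))
			avail |= UINT32_C(1) << seq[i].rd;
	}
	return avail & ~UINT32_C(1);	/* x0 is never a usable scratch register */
}
```

In this model a register that is read and later overwritten is not
reported as free, since its old value is still live at the start of the
sequence.
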
>
> Static analysis shows that 51% of the instructions in the kernel (default
> config) can be replaced, i.e. one free register can be found
> at both the start and end of the replaced instruction pair while the
> replaced instructions can be directly executed. We also ran an
> efficiency test on Gem 5 RISCV, which shows a more than 5x speedup over
> the breakpoint-based implementation.
>
> Contribution:
> Chen Guokai invented the algorithm for searching for free registers,
> evaluated the optimization ratio, and wrote the basic support for RVI
> kernel binaries. Liao Chang added support for hybrid RVI and RVC kernel
> binaries, fixed some bugs under different kernel configurations, and
> refactored the entire feature into individual patches.

Thank you for continuing to work on this series! I took it for a spin,
and it worked nicely on my QEMU setup.

It would be nice to have it run on some *actual* hardware as well. :-)

I have some additional comments on the series, but I'll add those to the
relevant patch. It's mostly minor things!


Björn


* Re: [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code
  2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
@ 2023-02-01 13:29   ` Björn Töpel
  2023-02-02 10:16   ` Conor Dooley
  1 sibling, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:29 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai, Heiko Stuebner

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> From: Liao Chang <liaochang1@huawei.com>
>
> These RVI and RVC instruction decoders are used in the free register
> searching algorithm: each instruction of the instrumented function is
> decoded and tested to see if it contains a free register to form AUIPC/JALR.
>
> For the RVI instruction format, the position and length of the
> rs1/rs2/rd/opcode parts are uniform [1], but the RVC instruction formats
> are complicated, so this patch adds a series of functions to decode
> rs1/rs2/rd for RVC [1].
>
> [1] https://github.com/riscv/riscv-isa-manual/releases
>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> ---
>  arch/riscv/include/asm/bug.h             |   5 +-
>  arch/riscv/kernel/probes/decode-insn.h   | 148 +++++++++++++++++++++++
>  arch/riscv/kernel/probes/simulate-insn.h |  42 +++++++
>  3 files changed, 194 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
> index 1aaea81fb141..9c33d3b58225 100644
> --- a/arch/riscv/include/asm/bug.h
> +++ b/arch/riscv/include/asm/bug.h
> @@ -19,11 +19,14 @@
>  #define __BUG_INSN_32	_UL(0x00100073) /* ebreak */
>  #define __BUG_INSN_16	_UL(0x9002) /* c.ebreak */
>  
> +#define RVI_INSN_LEN	4UL
> +#define RVC_INSN_LEN	2UL
> +
>  #define GET_INSN_LENGTH(insn)						\
>  ({									\
>  	unsigned long __len;						\
>  	__len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ?	\
> -		4UL : 2UL;						\
> +		RVI_INSN_LEN : RVC_INSN_LEN;				\
>  	__len;								\
>  })
>  
> diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
> index 42269a7d676d..785b023a62ea 100644
> --- a/arch/riscv/kernel/probes/decode-insn.h
> +++ b/arch/riscv/kernel/probes/decode-insn.h
> @@ -3,6 +3,7 @@
>  #ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H
>  #define _RISCV_KERNEL_KPROBES_DECODE_INSN_H
>  
> +#include <linux/bitops.h>
>  #include <asm/sections.h>
>  #include <asm/kprobes.h>
>  
> @@ -15,4 +16,151 @@ enum probe_insn {
>  enum probe_insn __kprobes
>  riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);
>  
> +#ifdef CONFIG_KPROBES

No reason to hide the static inlines behind an ifdef; leave it out, so
it's less likely that code breakage slips through.

I wonder if the functions below would make more sense in asm/insn.h,
where riscv_insn_is_##name lives (which you're using in later
patches). Heiko (Cc'd) recently did a big cleanup there, which probably
applies to the code below.

> +
> +static inline u16 rvi_rs1(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 15) & 0x1f);
> +}
> +
> +static inline u16 rvi_rs2(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 20) & 0x1f);
> +}
> +
> +static inline u16 rvi_rd(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 7) & 0x1f);
> +}
> +
> +static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)
> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 8)  & 0xf)   << 1)  |
> +		(((opcode >> 25) & 0x3f)  << 5)  |
> +		(((opcode >> 7)  & 0x1)   << 11) |
> +		(((opcode >> 31) & 0x1)   << 12);
> +
> +	return sign_extend32(imme, 13);
> +}
> +
> +static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)
> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 21) & 0x3ff) << 1)  |
> +		(((opcode >> 20) & 0x1)   << 11) |
> +		(((opcode >> 12) & 0xff)  << 12) |
> +		(((opcode >> 31) & 0x1)   << 20);
> +
> +	return sign_extend32(imme, 21);
> +}
> +
> +#ifdef CONFIG_RISCV_ISA_C

Ditto. Just get rid of the ifdef clutter.


Björn


* Re: [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized
  2023-01-27 13:05 ` [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized Chen Guokai
@ 2023-02-01 13:30   ` Björn Töpel
  0 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:30 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> From: Liao Chang <liaochang1@huawei.com>
>
> For the RVI and RVC hybrid encoding kernel, although AUIPC/JALR occupy
> just 8 bytes, the patched code is 10 bytes in the worst case to ensure
> no RVI instruction is truncated. To check whether a kprobe satisfies the
> requirement of jump optimization, it has to find an instruction
> window large enough to patch AUIPC/JALR (and a padding C.NOP), and ensure
> no instruction nearby jumps into the patching window.
>
> Besides that, this series does not support the simulation of pc-relative
> instructions in the optprobe handler yet, so the patching window must not
> include pc-relative instructions.
>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>

Reviewed-by: Björn Töpel <bjorn@kernel.org>


* Re: [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe
  2023-01-27 13:05 ` [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe Chen Guokai
@ 2023-02-01 13:30   ` Björn Töpel
  0 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:30 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
> index d84aa1420fa2..a47f7d2bf3a6 100644
> --- a/arch/riscv/kernel/probes/opt.c
> +++ b/arch/riscv/kernel/probes/opt.c
> @@ -11,9 +11,32 @@
>  #include <linux/kprobes.h>
>  #include <asm/kprobes.h>
>  #include <asm/patch.h>
> +#include <asm/asm-offsets.h>
>  
>  #include "simulate-insn.h"
>  #include "decode-insn.h"
> +#include "../../net/bpf_jit.h"
> +
> +static void optimized_callback(struct optimized_kprobe *op,
> +			       struct pt_regs *regs)
> +{
> +	if (kprobe_disabled(&op->kp))
> +		return;
> +
> +	preempt_disable();
> +	if (kprobe_running()) {
> +		kprobes_inc_nmissed_count(&op->kp);
> +	} else {
> +		__this_cpu_write(current_kprobe, &op->kp);
> +		/* Save skipped registers */
> +		instruction_pointer_set(regs, (unsigned long)op->kp.addr);
> +		get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
> +		opt_pre_handler(&op->kp, regs);
> +		__this_cpu_write(current_kprobe, NULL);
> +	}
> +	preempt_enable();
> +}
> +NOKPROBE_SYMBOL(optimized_callback)
>  
>  static int in_auipc_jalr_range(long val)
>  {
> @@ -30,6 +53,11 @@ static int in_auipc_jalr_range(long val)
>  #endif
>  }
>  
> +#define DETOUR_ADDR(code, offs) \
> +	((void *)((unsigned long)(code) + (offs)))
> +#define DETOUR_INSN(code, offs) \
> +	(*(kprobe_opcode_t *)((unsigned long)(code) + (offs)))

Can this cause a misaligned u32 load exception?
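
One way to sidestep the question is to copy the bytes out instead of
dereferencing a u32 pointer. A user-space sketch, with read_insn32() as
a hypothetical stand-in for whatever unaligned-access helper the kernel
side would use:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit instruction from 'code + offs' without assuming alignment. */
static uint32_t read_insn32(const void *code, size_t offs)
{
	uint32_t insn;

	assert(code != NULL);
	/* memcpy() makes no alignment assumption about the source address. */
	memcpy(&insn, (const unsigned char *)code + offs, sizeof(insn));
	return insn;
}
```

With RVC enabled a 32-bit instruction may start on any 2-byte boundary,
so an offset such as 2 into a 4-byte-aligned buffer is the case to
worry about.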


Björn


* Re: [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist
  2023-01-27 13:05 ` [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist Chen Guokai
@ 2023-02-01 13:30   ` Björn Töpel
  0 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:30 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> From: Liao Chang <liaochang1@huawei.com>
>
> Because the generic exception entry and exit code is part of the critical
> path for kprobe breakpoints and uprobe syscall entry, setting a kprobe on
> the assembly symbols in entry.S results in a kernel stack overflow crash.
> These symbols therefore have to be blacklisted explicitly, which requires
> a new _ASM_NOKPROBE() asm helper.
>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>

Reviewed-by: Björn Töpel <bjorn@kernel.org>

* Re: [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe
  2023-01-27 13:05 ` [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Chen Guokai
@ 2023-02-01 13:31   ` Björn Töpel
  0 siblings, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:31 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h
> index 9a7d7346001e..ee31539de65f 100644
> --- a/arch/riscv/include/asm/patch.h
> +++ b/arch/riscv/include/asm/patch.h
> @@ -8,5 +8,6 @@
>  
>  int patch_text_nosync(void *addr, const void *insns, size_t len);
>  int patch_text(void *addr, u32 insn);
> +int patch_text_batch(void *addr, const void *insn, size_t size);
>  
>  #endif /* _ASM_RISCV_PATCH_H */
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 765004b60513..ce324b6a6998 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -15,7 +15,8 @@
>  
>  struct patch_insn {
>  	void *addr;
> -	u32 insn;
> +	const void *insn;
> +	size_t size;
>  	atomic_t cpu_count;
>  };
>  
> @@ -106,8 +107,7 @@ static int patch_text_cb(void *data)
>  
>  	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
>  		ret =

Nit: Please use the full width. No need for a NL here.

> -		    patch_text_nosync(patch->addr, &patch->insn,
> -					    GET_INSN_LENGTH(patch->insn));
> +		    patch_text_nosync(patch->addr, patch->insn, patch->size);
>  		atomic_inc(&patch->cpu_count);
>  	} else {
>  		while (atomic_read(&patch->cpu_count) <= num_online_cpus())
> @@ -123,7 +123,8 @@ int patch_text(void *addr, u32 insn)
>  {
>  	struct patch_insn patch = {
>  		.addr = addr,
> -		.insn = insn,
> +		.insn = &insn,
> +		.size = GET_INSN_LENGTH(insn),
>  		.cpu_count = ATOMIC_INIT(0),
>  	};
>  
> @@ -131,3 +132,17 @@ int patch_text(void *addr, u32 insn)
>  				       &patch, cpu_online_mask);
>  }
>  NOKPROBE_SYMBOL(patch_text);
> +
> +int patch_text_batch(void *addr, const void *insn, size_t size)
> +{
> +	struct patch_insn patch = {
> +		.addr = addr,
> +		.insn = insn,
> +		.size = size,
> +		.cpu_count = ATOMIC_INIT(0),
> +	};
> +
> +	return stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask);
> +}
> +
> +NOKPROBE_SYMBOL(patch_text_batch);
> diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
> index a47f7d2bf3a6..c52d5bdc748c 100644
> --- a/arch/riscv/kernel/probes/opt.c
> +++ b/arch/riscv/kernel/probes/opt.c
> @@ -8,6 +8,7 @@
>  
>  #define pr_fmt(fmt)	"optprobe: " fmt
>  
> +#include <linux/types.h>
>  #include <linux/kprobes.h>
>  #include <asm/kprobes.h>
>  #include <asm/patch.h>
> @@ -444,11 +445,19 @@ static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
>  
>  int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
>  {
> -	return 0;
> +	return optinsn->length;
>  }
>  
>  int arch_check_optimized_kprobe(struct optimized_kprobe *op)
>  {
> +	unsigned long i;
> +	struct kprobe *p;
> +
> +	for (i = RVC_INSN_LEN; i < op->optinsn.length; i += RVC_INSN_LEN) {
> +		p = get_kprobe(op->kp.addr + i);
> +		if (p && !kprobe_disabled(p))
> +			return -EEXIST;
> +	}
>  	return 0;
>  }
>  
> @@ -509,23 +518,75 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
>  
>  void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
>  {
> +	if (op->optinsn.insn) {
> +		free_optinsn_slot(op->optinsn.insn, 1);
> +		op->optinsn.insn = NULL;
> +		op->optinsn.length = 0;
> +	}
>  }
>  
>  void arch_optimize_kprobes(struct list_head *oplist)
>  {
> +	long offs;
> +	kprobe_opcode_t insn[3];
> +	struct optimized_kprobe *op, *tmp;
> +
> +	list_for_each_entry_safe(op, tmp, oplist, list) {
> +		WARN_ON(kprobe_disabled(&op->kp));
> +
> +		/* Backup instructions which will be replaced by jump address */
> +		memcpy(op->optinsn.copied_insn,
> +		       DETOUR_ADDR(op->optinsn.insn, DETOUR_INSN_OFFSET),
> +		       op->optinsn.length);
> +
> +		/*
> +		 * After patching, it should be:
> +		 * auipc free_register, %hi(detour_buffer)
> +		 * jalr free_register, free_register, %lo(detour_buffer)
> +		 * where free_register will eventually save the return address
> +		 */
> +		offs = (unsigned long)op->optinsn.insn -
> +		       (unsigned long)op->kp.addr;
> +		insn[0] = rv_auipc(op->optinsn.rd, (offs + (1 << 11)) >> 12);
> +		insn[1] = rv_jalr(op->optinsn.rd, op->optinsn.rd, offs & 0xFFF);
> +		/* For 3 RVC + 1 RVI scenario, fill C.NOP for padding */
> +		if (op->optinsn.length > 2 * RVI_INSN_LEN)
> +			insn[2] = rvc_addi(0, 0);
> +
> +		patch_text_batch(op->kp.addr, insn, op->optinsn.length);
> +		if (memcmp(op->kp.addr, insn, op->optinsn.length))
> +			continue;
> +
> +		list_del_init(&op->list);
> +	}
>  }
>  
>  void arch_unoptimize_kprobes(struct list_head *oplist,
>  			     struct list_head *done_list)
>  {
> +	struct optimized_kprobe *op, *tmp;
> +
> +	list_for_each_entry_safe(op, tmp, oplist, list) {
> +		arch_unoptimize_kprobe(op);
> +		list_move(&op->list, done_list);
> +	}
>  }
>  
>  void arch_unoptimize_kprobe(struct optimized_kprobe *op)
>  {
> +	kprobe_opcode_t buf[MAX_COPIED_INSN];
> +
> +	memcpy(buf, op->optinsn.copied_insn, op->optinsn.length);
> +	if (GET_INSN_LENGTH(op->kp.opcode) == RVI_INSN_LEN)
> +		*(u32 *)buf = __BUG_INSN_32;
> +	else
> +		*(u16 *)buf = __BUG_INSN_16;
> +	patch_text_batch(op->kp.addr, buf, op->optinsn.length);
>  }
>  
>  int arch_within_optimized_kprobe(struct optimized_kprobe *op,
>  				 kprobe_opcode_t *addr)
>  {
> -	return 0;
> +	return (op->kp.addr <= addr &&
> +		op->kp.addr + op->optinsn.length > addr);

Nit: Use the whole 100 char line width, please.

With or w/o the nits fixed:

Reviewed-by: Björn Töpel <bjorn@kernel.org>


Björn
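
For readers following the immediate arithmetic in arch_optimize_kprobes():
JALR sign-extends its 12-bit immediate, so the AUIPC part has to be
rounded up whenever bit 11 of the offset is set, which is what the
"+ (1 << 11)" is for. A minimal userspace sketch of that split;
split_offset() and join_offset() are hypothetical names used only for
illustration, and the accepted range is an assumption about what an
AUIPC/JALR pair can reach:

```c
#include <assert.h>
#include <stdint.h>

struct auipc_jalr_imm {
	int32_t hi20;	/* AUIPC immediate, in units of 4 KiB */
	int32_t lo12;	/* JALR immediate, sign-extended by hardware */
};

/* Split a PC-relative offset into an AUIPC/JALR immediate pair. */
static struct auipc_jalr_imm split_offset(int64_t offs)
{
	struct auipc_jalr_imm imm;
	int64_t low;

	/* Assumed reachable range of an AUIPC/JALR pair. */
	assert(offs >= -(1LL << 31) - (1LL << 11));
	assert(offs < (1LL << 31) - (1LL << 11));

	/* Low 12 bits, reinterpreted as a signed value in [-2048, 2047]. */
	low = (int64_t)((uint64_t)offs & 0xfff);
	imm.lo12 = (int32_t)((low ^ 0x800) - 0x800);

	/* The remainder is a multiple of 4096, so the division is exact. */
	imm.hi20 = (int32_t)((offs - imm.lo12) / 4096);
	return imm;
}

/* Recombine the pair the way the hardware would: (hi20 << 12) + lo12. */
static int64_t join_offset(struct auipc_jalr_imm imm)
{
	return (int64_t)imm.hi20 * 4096 + imm.lo12;
}
```

For example, an offset of 0x1800 splits into hi20 = 2 and lo12 = -2048,
matching (0x1800 + 0x800) >> 12 = 2 from the quoted code, and the pair
recombines to 0x1800.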

* Re: [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones
  2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
@ 2023-02-01 13:31   ` Björn Töpel
  2023-02-02  9:08   ` Conor Dooley
  1 sibling, 0 replies; 27+ messages in thread
From: Björn Töpel @ 2023-02-01 13:31 UTC (permalink / raw)
  To: Chen Guokai, paul.walmsley, palmer, aou, rostedt, mingo, sfr
  Cc: linux-riscv, linux-kernel, liaochang1, Chen Guokai

Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:

> This patch further allows optprobe to use caller-saved registers that
> are not used across the function being optimized as free registers.
>
> Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> Co-developed-by: Liao Chang <liaochang1@huawei.com>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> Reported-by: Björn Töpel <bjorn@kernel.org>

Reported-by: should be used for fixes. Please change to Suggested-by:,
or simply remove.


Björn

* Re: [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones
  2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
  2023-02-01 13:31   ` Björn Töpel
@ 2023-02-02  9:08   ` Conor Dooley
  1 sibling, 0 replies; 27+ messages in thread
From: Conor Dooley @ 2023-02-02  9:08 UTC (permalink / raw)
  To: Chen Guokai
  Cc: paul.walmsley, palmer, aou, rostedt, mingo, sfr, linux-riscv,
	linux-kernel, liaochang1, Björn Töpel

Hey Chen,

Was looking at the insn manipulation code in 4/13 and noticed a minor
nit in this patch in the process.

On Fri, Jan 27, 2023 at 09:05:37PM +0800, Chen Guokai wrote:
> +/*
> + * Register	ABI Name	Saver
> + * x0		zero		--
> + * x1		ra		Caller
> + * x2		sp		Callee
> + * x3		gp		--
> + * x4		tp		--
> + * x5-7 	t0-2		Caller

I know it's just a comment, but this line here has a space before the
first tab that makes nvim unhappy.

Thanks,
Conor.



* Re: [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code
  2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
  2023-02-01 13:29   ` Björn Töpel
@ 2023-02-02 10:16   ` Conor Dooley
  1 sibling, 0 replies; 27+ messages in thread
From: Conor Dooley @ 2023-02-02 10:16 UTC (permalink / raw)
  To: Chen Guokai, bjorn, heiko
  Cc: paul.walmsley, palmer, aou, rostedt, mingo, sfr, linux-riscv,
	linux-kernel, liaochang1

Hey Chen, Liao, Bjorn, Heiko,

Heiko certainly has a more complete understanding of the newly added
stuff in insn*.h, but I've attempted to have a look at the insn stuff
that you have added here...

On Fri, Jan 27, 2023 at 09:05:32PM +0800, Chen Guokai wrote:
> From: Liao Chang <liaochang1@huawei.com>
> 
> These RVI and RVC instruction decoders are used by the free register
> searching algorithm: each instruction of the instrumented function needs
> to be decoded and tested for a free register with which to form AUIPC/JALR.
> 
> For the RVI instruction format, the position and length of the
> rs1/rs2/rd/opcode fields are uniform [1], but the RVC instruction formats
> are more complicated, so this patch adds a series of functions to decode
> rs1/rs2/rd for RVC [1].
> 
> [1] https://github.com/riscv/riscv-isa-manual/releases

Please make these regular link tags, so:
Link: https://github.com/riscv/riscv-isa-manual/releases [1]

> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> Co-developed-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> Signed-off-by: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
> ---
>  arch/riscv/include/asm/bug.h             |   5 +-
>  arch/riscv/kernel/probes/decode-insn.h   | 148 +++++++++++++++++++++++
>  arch/riscv/kernel/probes/simulate-insn.h |  42 +++++++
>  3 files changed, 194 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
> index 1aaea81fb141..9c33d3b58225 100644
> --- a/arch/riscv/include/asm/bug.h
> +++ b/arch/riscv/include/asm/bug.h
> @@ -19,11 +19,14 @@
>  #define __BUG_INSN_32	_UL(0x00100073) /* ebreak */
>  #define __BUG_INSN_16	_UL(0x9002) /* c.ebreak */
>  
> +#define RVI_INSN_LEN	4UL
> +#define RVC_INSN_LEN	2UL
> +
>  #define GET_INSN_LENGTH(insn)						\
>  ({									\
>  	unsigned long __len;						\
>  	__len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ?	\
> -		4UL : 2UL;						\
> +		RVI_INSN_LEN : RVC_INSN_LEN;				\
>  	__len;								\
>  })
>  
> diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
> index 42269a7d676d..785b023a62ea 100644
> --- a/arch/riscv/kernel/probes/decode-insn.h
> +++ b/arch/riscv/kernel/probes/decode-insn.h
> @@ -3,6 +3,7 @@
>  #ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H
>  #define _RISCV_KERNEL_KPROBES_DECODE_INSN_H
>  
> +#include <linux/bitops.h>
>  #include <asm/sections.h>
>  #include <asm/kprobes.h>
>  
> @@ -15,4 +16,151 @@ enum probe_insn {
>  enum probe_insn __kprobes
>  riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);
>  
> +#ifdef CONFIG_KPROBES
> +
> +static inline u16 rvi_rs1(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 15) & 0x1f);

insn.h has a bunch of defines for this kind of thing, that have all been
reviewed. We definitely should be using those here, at the very least,
rather than having to review all of these numbers for a second time.
eg:
#define RVG_RS1_OPOFF		15

IMO, anything you need here should either be in that file, or added to
that file by this patch.

> +}
> +
> +static inline u16 rvi_rs2(kprobe_opcode_t opcode)

Also a note, these functions look really odd in their callsites:
+               if (riscv_insn_is_c_jr(insn)) {
+                       READ_ON(rvc_r_rs1(insn));
+                       break;
+               }

Sticking with the existing naming scheme would be great, thanks.
I think these should be moved to insn.h and renamed to:
riscv_insn_extract_rs1(), and ditto for the other things you are newly
adding here.

> +{
> +	return (u16)((opcode >> 20) & 0x1f);
> +}
> +
> +static inline u16 rvi_rd(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 7) & 0x1f);
> +}
> +
> +static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)

RV_EXTRACT_BTYPE_IMM() already exists and provides the same capability,
no? I think the whole patch here should be moved to insn.h, reuse the
defines there and have the function names changed to match the existing,
similar functions.

> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 8)  & 0xf)   << 1)  |
> +		(((opcode >> 25) & 0x3f)  << 5)  |
> +		(((opcode >> 7)  & 0x1)   << 11) |
> +		(((opcode >> 31) & 0x1)   << 12);
> +
> +	return sign_extend32(imme, 13);
> +}
> +
> +static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)

This is a re-implementation of riscv_insn_extract_jtype_imm() except
without the nice defines etc used there.

> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 21) & 0x3ff) << 1)  |
> +		(((opcode >> 20) & 0x1)   << 11) |
> +		(((opcode >> 12) & 0xff)  << 12) |
> +		(((opcode >> 31) & 0x1)   << 20);
> +
> +	return sign_extend32(imme, 21);
> +}
> +
> +#ifdef CONFIG_RISCV_ISA_C

As Bjorn pointed out, this guard can go.

> +static inline u16 rvc_r_rs1(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 2) & 0x1f);

Again, defines exist for all of this stuff already that you can go and
use.
rvc_r_rs1() should be renamed to riscv_insn_extract_csstype_rs1() or
something like that to match the existing users IMO.

Also, perhaps I've missed something, but how does a shift of 2 work for
a CR format rs1? Shouldn't it be a shift of 7?

> +}
> +
> +static inline u16 rvc_r_rs2(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 2) & 0x1f);
> +}

(snip)
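
To make the CR-format question concrete: in the CR layout, bits 15:12
hold funct4, bits 11:7 hold rd/rs1, bits 6:2 hold rs2 and bits 1:0 hold
the opcode. A minimal userspace sketch with two known encodings; the
helper and macro names below are hypothetical and are not the ones from
insn.h:

```c
#include <stdint.h>

/* Hypothetical field offsets for the RVC CR format. */
#define CR_RS1_SHIFT	7	/* rd/rs1 lives in bits 11:7 */
#define CR_RS2_SHIFT	2	/* rs2 lives in bits 6:2 */
#define CR_REG_MASK	0x1f

static inline unsigned int cr_rs1(uint16_t insn)
{
	return (insn >> CR_RS1_SHIFT) & CR_REG_MASK;
}

static inline unsigned int cr_rs2(uint16_t insn)
{
	return (insn >> CR_RS2_SHIFT) & CR_REG_MASK;
}
```

For c.jr ra (0x8082, i.e. "ret"), cr_rs1() returns 1 (ra) while the
bits 6:2 field is 0, so a shift of 2 would report x0 as the source
register. For c.mv a0, a1 (0x852e), cr_rs1() returns 10 (a0) and
cr_rs2() returns 11 (a1).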

> +static inline u16 rvc_b_rd(kprobe_opcode_t opcode)
> +{
> +	return (u16)((opcode >> 7) & 0x7);
> +}

All of these are so common, that I feel you'd be very well served by
defines and some macros.

> +static inline s32 rvc_branch_imme(kprobe_opcode_t opcode)

Similar comments apply here as in the G case, in particular you can use
RVC_EXTRACT_BTYPE_IMM(), no?

> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 3)  & 0x3) << 1) |
> +		(((opcode >> 10) & 0x3) << 3) |
> +		(((opcode >> 2)  & 0x1) << 5) |
> +		(((opcode >> 5)  & 0x3) << 6) |
> +		(((opcode >> 12) & 0x1) << 8);
> +
> +	return sign_extend32(imme, 9);
> +}
> +
> +static inline s32 rvc_jal_imme(kprobe_opcode_t opcode)

Ditto here, but JTYPE instead?

> +{
> +	u32 imme = 0;
> +
> +	imme |= (((opcode >> 3)  & 0x3) << 1) |
> +		(((opcode >> 11) & 0x1) << 4) |
> +		(((opcode >> 2)  & 0x1) << 5) |
> +		(((opcode >> 7)  & 0x1) << 6) |
> +		(((opcode >> 6)  & 0x1) << 7) |
> +		(((opcode >> 9)  & 0x3) << 8) |
> +		(((opcode >> 8)  & 0x1) << 10) |
> +		(((opcode >> 12) & 0x1) << 11);
> +
> +	return sign_extend32(imme, 12);
> +}
> +#endif /* CONFIG_RISCV_ISA_C */
> +#endif /* CONFIG_KPROBES */
>  #endif /* _RISCV_KERNEL_KPROBES_DECODE_INSN_H */
> diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
> index a19aaa0feb44..e89747dfabbb 100644
> --- a/arch/riscv/kernel/probes/simulate-insn.h
> +++ b/arch/riscv/kernel/probes/simulate-insn.h
> @@ -28,4 +28,46 @@ bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs);
>  bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs);
>  bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
>  
> +/* RVC(S) instructions contain rs1 and rs2 */
> +__RISCV_INSN_FUNCS(c_sq,	0xe003, 0xa000);
> +__RISCV_INSN_FUNCS(c_sw,	0xe003, 0xc000);
> +__RISCV_INSN_FUNCS(c_sd,	0xe003, 0xe000);

I think all of these should move to insn.h too, and have defines to
match the existing __RISCV_INSN_FUNCS there.
Perhaps Heiko has a more nuanced opinion on this.

Thanks,
Conor.
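
One more detail that may be worth checking when the open-coded
extractors are replaced: as far as I can tell, the kernel's
sign_extend32() takes the zero-based index of the sign bit rather than
the field width. If that is right, a 13-bit B-type immediate (bits 12:0)
would want index 12, not 13. A minimal userspace sketch under that
assumption; sign_extend() and b_type_imm() are hypothetical stand-ins,
not the kernel helpers:

```c
#include <stdint.h>

/*
 * Sign-extend 'value', treating bit 'sign_bit' (zero-based, 0..31) as
 * the sign bit. The arithmetic is done in 64 bits so that no
 * out-of-range conversion to a signed type is needed.
 */
static int32_t sign_extend(uint32_t value, unsigned int sign_bit)
{
	uint32_t m = (uint32_t)1 << sign_bit;

	/* Keep only bits sign_bit..0 (wraps to all-ones for bit 31). */
	value &= (m << 1) - 1;
	return (int32_t)((int64_t)(value ^ m) - (int64_t)m);
}

/* Decode the branch offset of an RVI B-type instruction. */
static int32_t b_type_imm(uint32_t insn)
{
	uint32_t imm = (((insn >> 8)  & 0xf)  << 1)  |	/* imm[4:1]  */
		       (((insn >> 25) & 0x3f) << 5)  |	/* imm[10:5] */
		       (((insn >> 7)  & 0x1)  << 11) |	/* imm[11]   */
		       (((insn >> 31) & 0x1)  << 12);	/* imm[12]   */

	return sign_extend(imm, 12);
}
```

With "beq zero, zero, -4" (0xfe000ee3) the raw field is 0x1ffc; extending
from bit 12 gives -4, whereas extending from bit 13 would leave it as
the positive value 8188.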


* Re: [PATCH v6 00/13] Add OPTPROBES feature on RISCV
  2023-01-30 14:38   ` Xim
@ 2023-04-26 18:01     ` Palmer Dabbelt
  0 siblings, 0 replies; 27+ messages in thread
From: Palmer Dabbelt @ 2023-04-26 18:01 UTC (permalink / raw)
  To: chenguokai17
  Cc: bjorn, Paul Walmsley, aou, rostedt, mingo, Stephen Rothwell,
	linux-riscv, linux-kernel, liaochang1

On Mon, 30 Jan 2023 06:38:42 PST (-0800), chenguokai17@mails.ucas.ac.cn wrote:
> Hi Björn,
>
>
>
>> On Jan 30, 2023, at 20:31, Björn Töpel <bjorn@kernel.org> wrote:
>>
>> Chen Guokai <chenguokai17@mails.ucas.ac.cn> writes:
>>
>>> Add jump optimization support for RISC-V.
>>
>> I'd like to take the series for a spin, but I'm having trouble applying
>> the patches; what base commit did you use? Or point me to a git
>> repo.
>
> I generated this patch series based on next-20230127 tag
>
>>
>> (It's nice to use "--base" to git-format-patch.)
>
> I will use this option in any following revisions, thanks!

Just checking up on this one, it's got some feedback that seems 
reasonable.  Sorry if I missed the v7, but I'm dropping the v6 from 
patchwork.

If there's no v7 on the lists it's probably too late for 6.4, so no 
rush on my end.

Thanks!

>
>>
>>
>> Thanks!
>> Björn

Thread overview: 27+ messages
2023-01-27 13:05 [PATCH v6 00/13] Add OPTPROBES feature on RISCV Chen Guokai
2023-01-27 13:05 ` [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES Chen Guokai
2023-01-27 13:05 ` [PATCH v6 02/13] riscv/kprobe: Allocate detour buffer from module region Chen Guokai
2023-01-27 13:05 ` [PATCH v6 03/13] riscv/kprobe: Add skeleton for preparing optimized kprobe Chen Guokai
2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
2023-02-01 13:29   ` Björn Töpel
2023-02-02 10:16   ` Conor Dooley
2023-01-27 13:05 ` [PATCH v6 05/13] riscv/kprobe: Introduce free register(s) searching algorithm Chen Guokai
2023-01-27 13:05 ` [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Chen Guokai
2023-02-01 13:31   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
2023-02-01 13:31   ` Björn Töpel
2023-02-02  9:08   ` Conor Dooley
2023-01-27 13:05 ` [PATCH v6 10/13] riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel Chen Guokai
2023-01-27 13:05 ` [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR Chen Guokai
2023-01-31 12:51   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 13/13] selftest/kprobes: Add testcase for kprobe SYM[+offs] Chen Guokai
2023-01-30 12:31 ` [PATCH v6 00/13] Add OPTPROBES feature on RISCV Björn Töpel
2023-01-30 14:38   ` Xim
2023-04-26 18:01     ` Palmer Dabbelt
2023-02-01 13:29 ` Björn Töpel
