linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] LoongArch: Add hardware breakpoints/watchpoints support
@ 2023-02-17  2:37 Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 1/3] " Qing Zhang
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-17  2:37 UTC (permalink / raw)
  To: Huacai Chen, Oleg Nesterov, WANG Xuerui
  Cc: Jiaxun Yang, loongarch, linux-kernel

Use perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoint functions for instruction fetch
and load/store operations, after the software configures hardware watchpoints
for instruction fetch and load/store operations. The processor hardware will
monitor the access address of the instruction fetch and load/store operation,
and will trigger the exception of the watchpoint when it meets the conditions
set by the watchpoint.

The hardware monitoring points for instruction fetching and load/store operations
each have a register for the overall configuration of all monitoring points,
a register for recording the status of all monitoring points, and four registers
required for configuration of each watchpoint individually.

Watchpoint related control status register chapters:
https://github.com/loongson/LoongArch-Documentation/releases/download/2022.08.12/
LoongArch-Vol1-v1.02-CN.pdf

Initial version has the following limitations:
- no support for virtualization

Can be tested: e.g.
1. see samples/hw_breakpoint and register_wide_hw_breakpoint.
2. ptrace(PTRACE_SINGLESTEP/..., pid, NULL, NULL)
3. ptrace (PTRACE_SETREGSET, tid, ... , ...)

TODO:
- Add hardware breakpoints/watchpoints for gdb, kgdb

Changes v1 -> v2:
Suggested by Huacai:
- Use irqentry_enter()/irqentry_exit() instead of
  exception_enter()/exception_exit().
- Add ptrace interface to expose hw-breakpoints to debuggers.
- Use 2022-2023.
- Some other changes.

Changes v2 -> v3:
Suggested by Jinyang:
- get_num_brps/wrps, decode/encode_ctrl_reg which the function
  returns directly.
- Remove irrelevant content from the first patch.
- Use macros to avoid using magic values directly.
- Add new arg to indicate it is breakpoint or watchpoint
  to avoid function coupling.
- Remove redundant tab.
- Modify the breakpoint/watchpoint_handler function type to void.
- Some other changes.

Changes v3 -> v4:
Suggested by Jinyang:
- Add judgment prevent rd == pc from causing single step to fail to stop.
- Add code comments to improve readability.
- Use macros in csr_write32.

Qing Zhang (3):
  LoongArch: Add hardware breakpoints/watchpoints support
  LoongArch: Add ptrace single step support
  LoongArch: ptrace: expose hardware breakpoints to debuggers

 arch/loongarch/Kconfig                     |   1 +
 arch/loongarch/include/asm/hw_breakpoint.h | 145 ++++++
 arch/loongarch/include/asm/inst.h          |  39 ++
 arch/loongarch/include/asm/loongarch.h     |  35 +-
 arch/loongarch/include/asm/processor.h     |  14 +-
 arch/loongarch/include/asm/ptrace.h        |   2 +
 arch/loongarch/include/asm/switch_to.h     |   1 +
 arch/loongarch/include/uapi/asm/ptrace.h   |   9 +
 arch/loongarch/kernel/Makefile             |   1 +
 arch/loongarch/kernel/hw_breakpoint.c      | 544 +++++++++++++++++++++
 arch/loongarch/kernel/process.c            |   7 +
 arch/loongarch/kernel/ptrace.c             | 484 ++++++++++++++++++
 arch/loongarch/kernel/traps.c              |  38 +-
 include/uapi/linux/elf.h                   |   2 +
 14 files changed, 1301 insertions(+), 21 deletions(-)
 create mode 100644 arch/loongarch/include/asm/hw_breakpoint.h
 create mode 100644 arch/loongarch/kernel/hw_breakpoint.c

-- 
2.36.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 1/3] LoongArch: Add hardware breakpoints/watchpoints support
  2023-02-17  2:37 [PATCH v4 0/3] LoongArch: Add hardware breakpoints/watchpoints support Qing Zhang
@ 2023-02-17  2:37 ` Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 2/3] LoongArch: Add ptrace single step support Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 3/3] LoongArch: ptrace: expose hardware breakpoints to debuggers Qing Zhang
  2 siblings, 0 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-17  2:37 UTC (permalink / raw)
  To: Huacai Chen, Oleg Nesterov, WANG Xuerui
  Cc: Jiaxun Yang, loongarch, linux-kernel

Use perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoint functions for instruction fetch
and load/store operations, after the software configures hardware watchpoints
for instruction fetch and load/store operations. The processor hardware will
monitor the access address of the instruction fetch and load/store operation,
and will trigger the exception of the watchpoint when it meets the conditions
set by the watchpoint.

The hardware monitoring points for instruction fetching and load/store operations
each have a register for the overall configuration of all monitoring points,
a register for recording the status of all monitoring points, and four registers
required for configuration of each watchpoint individually.

Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
---
 arch/loongarch/Kconfig                     |   1 +
 arch/loongarch/include/asm/hw_breakpoint.h | 145 ++++++
 arch/loongarch/include/asm/loongarch.h     |  32 +-
 arch/loongarch/include/asm/processor.h     |  11 +-
 arch/loongarch/include/asm/switch_to.h     |   1 +
 arch/loongarch/kernel/Makefile             |   1 +
 arch/loongarch/kernel/hw_breakpoint.c      | 519 +++++++++++++++++++++
 arch/loongarch/kernel/process.c            |   7 +
 arch/loongarch/kernel/traps.c              |  10 +-
 9 files changed, 706 insertions(+), 21 deletions(-)
 create mode 100644 arch/loongarch/include/asm/hw_breakpoint.h
 create mode 100644 arch/loongarch/kernel/hw_breakpoint.c

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 54bd3dbde1f2..e3eba2eb4b44 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -100,6 +100,7 @@ config LOONGARCH
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER
 	select HAVE_GENERIC_VDSO
+	select HAVE_HW_BREAKPOINT if PERF_EVENTS
 	select HAVE_IOREMAP_PROT
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK
 	select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/arch/loongarch/include/asm/hw_breakpoint.h b/arch/loongarch/include/asm/hw_breakpoint.h
new file mode 100644
index 000000000000..c70b56559e00
--- /dev/null
+++ b/arch/loongarch/include/asm/hw_breakpoint.h
@@ -0,0 +1,145 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022-2023 Loongson Technology Corporation Limited
+ */
+#ifndef __ASM_HW_BREAKPOINT_H
+#define __ASM_HW_BREAKPOINT_H
+
+#include <asm/loongarch.h>
+
+#ifdef __KERNEL__
+
+/* Breakpoint */
+#define LOONGARCH_BREAKPOINT_EXECUTE		(0 << 0)
+
+/* Watchpoints */
+#define LOONGARCH_BREAKPOINT_LOAD		(1 << 0)
+#define LOONGARCH_BREAKPOINT_STORE		(1 << 1)
+
+struct arch_hw_breakpoint_ctrl {
+	u32 __reserved1	: 28,
+	len		: 2,
+	type		: 2;
+};
+
+struct arch_hw_breakpoint {
+	u64 address;
+	u64 mask;
+	struct arch_hw_breakpoint_ctrl ctrl;
+};
+
+/* Lengths */
+#define LOONGARCH_BREAKPOINT_LEN_1    0b11
+#define LOONGARCH_BREAKPOINT_LEN_2    0b10
+#define LOONGARCH_BREAKPOINT_LEN_4    0b01
+#define LOONGARCH_BREAKPOINT_LEN_8    0b00
+
+/*
+ * Limits.
+ * Changing these will require modifications to the register accessors.
+ */
+#define LOONGARCH_MAX_BRP		8
+#define LOONGARCH_MAX_WRP		8
+
+/* Virtual debug register bases. */
+#define CSR_CFG_ADDR	0
+#define CSR_CFG_MASK	(CSR_CFG_ADDR + LOONGARCH_MAX_BRP)
+#define CSR_CFG_CTRL	(CSR_CFG_MASK + LOONGARCH_MAX_BRP)
+#define CSR_CFG_ASID	(CSR_CFG_CTRL + LOONGARCH_MAX_WRP)
+
+/* Debug register names. */
+#define LOONGARCH_CSR_NAME_ADDR	ADDR
+#define LOONGARCH_CSR_NAME_MASK	MASK
+#define LOONGARCH_CSR_NAME_CTRL	CTRL
+#define LOONGARCH_CSR_NAME_ASID	ASID
+
+/* Accessor macros for the debug registers. */
+#define LOONGARCH_CSR_WATCH_READ(N, REG, T, VAL)			\
+do {								\
+	if (T == 0)						\
+		VAL = csr_read64(LOONGARCH_CSR_##IB##N##REG);	\
+	else							\
+		VAL = csr_read64(LOONGARCH_CSR_##DB##N##REG);	\
+} while (0)
+
+#define LOONGARCH_CSR_WATCH_WRITE(N, REG, T, VAL)			\
+do {								\
+	if (T == 0)						\
+		csr_write64(VAL, LOONGARCH_CSR_##IB##N##REG);	\
+	else							\
+		csr_write64(VAL, LOONGARCH_CSR_##DB##N##REG);	\
+} while (0)
+
+/* Exact number */
+#define CSR_FWPC_NUM		0x3f
+#define CSR_MWPC_NUM		0x3f
+
+#define CTRL_PLV_ENABLE		0x1e
+
+#define MWPnCFG3_LoadEn		8
+#define MWPnCFG3_StoreEn	9
+
+#define MWPnCFG3_Type_mask	0x3
+#define MWPnCFG3_Size_mask	0x3
+
+static inline u32 encode_ctrl_reg(struct arch_hw_breakpoint_ctrl ctrl)
+{
+	return (ctrl.len << 10) | (ctrl.type << 8);
+}
+
+static inline void decode_ctrl_reg(u32 reg, struct arch_hw_breakpoint_ctrl *ctrl)
+{
+	reg >>= 8;
+	ctrl->type	= reg & MWPnCFG3_Type_mask;
+	reg >>= 2;
+	ctrl->len	= reg & MWPnCFG3_Size_mask;
+}
+
+struct task_struct;
+struct notifier_block;
+struct perf_event_attr;
+struct perf_event;
+
+extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+				  int *gen_len, int *gen_type, int *offset);
+extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
+extern int hw_breakpoint_arch_parse(struct perf_event *bp,
+				    const struct perf_event_attr *attr,
+				    struct arch_hw_breakpoint *hw);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+					   unsigned long val, void *data);
+
+extern int arch_install_hw_breakpoint(struct perf_event *bp);
+extern void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+extern void hw_breakpoint_pmu_read(struct perf_event *bp);
+extern int hw_breakpoint_slots(int type);
+
+void breakpoint_handler(struct pt_regs *regs);
+void watchpoint_handler(struct pt_regs *regs);
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+extern void hw_breakpoint_thread_switch(struct task_struct *next);
+extern void ptrace_hw_copy_thread(struct task_struct *task);
+#else
+static inline void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+}
+static inline void ptrace_hw_copy_thread(struct task_struct *task)
+{
+}
+#endif
+
+/* Determine number of BRP registers available. */
+static inline int get_num_brps(void)
+{
+	return csr_read64(LOONGARCH_CSR_FWPC) & CSR_FWPC_NUM;
+}
+
+/* Determine number of WRP registers available. */
+static inline int get_num_wrps(void)
+{
+	return csr_read64(LOONGARCH_CSR_MWPC) & CSR_MWPC_NUM;
+}
+
+#endif	/* __KERNEL__ */
+#endif	/* __ASM_BREAKPOINT_H */
diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index 7f8d57a61c8b..e9aed583a064 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -970,42 +970,42 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)
 
 #define LOONGARCH_CSR_DB0ADDR		0x310	/* data breakpoint 0 address */
 #define LOONGARCH_CSR_DB0MASK		0x311	/* data breakpoint 0 mask */
-#define LOONGARCH_CSR_DB0CTL		0x312	/* data breakpoint 0 control */
+#define LOONGARCH_CSR_DB0CTRL		0x312	/* data breakpoint 0 control */
 #define LOONGARCH_CSR_DB0ASID		0x313	/* data breakpoint 0 asid */
 
 #define LOONGARCH_CSR_DB1ADDR		0x318	/* data breakpoint 1 address */
 #define LOONGARCH_CSR_DB1MASK		0x319	/* data breakpoint 1 mask */
-#define LOONGARCH_CSR_DB1CTL		0x31a	/* data breakpoint 1 control */
+#define LOONGARCH_CSR_DB1CTRL		0x31a	/* data breakpoint 1 control */
 #define LOONGARCH_CSR_DB1ASID		0x31b	/* data breakpoint 1 asid */
 
 #define LOONGARCH_CSR_DB2ADDR		0x320	/* data breakpoint 2 address */
 #define LOONGARCH_CSR_DB2MASK		0x321	/* data breakpoint 2 mask */
-#define LOONGARCH_CSR_DB2CTL		0x322	/* data breakpoint 2 control */
+#define LOONGARCH_CSR_DB2CTRL		0x322	/* data breakpoint 2 control */
 #define LOONGARCH_CSR_DB2ASID		0x323	/* data breakpoint 2 asid */
 
 #define LOONGARCH_CSR_DB3ADDR		0x328	/* data breakpoint 3 address */
 #define LOONGARCH_CSR_DB3MASK		0x329	/* data breakpoint 3 mask */
-#define LOONGARCH_CSR_DB3CTL		0x32a	/* data breakpoint 3 control */
+#define LOONGARCH_CSR_DB3CTRL		0x32a	/* data breakpoint 3 control */
 #define LOONGARCH_CSR_DB3ASID		0x32b	/* data breakpoint 3 asid */
 
 #define LOONGARCH_CSR_DB4ADDR		0x330	/* data breakpoint 4 address */
 #define LOONGARCH_CSR_DB4MASK		0x331	/* data breakpoint 4 maks */
-#define LOONGARCH_CSR_DB4CTL		0x332	/* data breakpoint 4 control */
+#define LOONGARCH_CSR_DB4CTRL		0x332	/* data breakpoint 4 control */
 #define LOONGARCH_CSR_DB4ASID		0x333	/* data breakpoint 4 asid */
 
 #define LOONGARCH_CSR_DB5ADDR		0x338	/* data breakpoint 5 address */
 #define LOONGARCH_CSR_DB5MASK		0x339	/* data breakpoint 5 mask */
-#define LOONGARCH_CSR_DB5CTL		0x33a	/* data breakpoint 5 control */
+#define LOONGARCH_CSR_DB5CTRL		0x33a	/* data breakpoint 5 control */
 #define LOONGARCH_CSR_DB5ASID		0x33b	/* data breakpoint 5 asid */
 
 #define LOONGARCH_CSR_DB6ADDR		0x340	/* data breakpoint 6 address */
 #define LOONGARCH_CSR_DB6MASK		0x341	/* data breakpoint 6 mask */
-#define LOONGARCH_CSR_DB6CTL		0x342	/* data breakpoint 6 control */
+#define LOONGARCH_CSR_DB6CTRL		0x342	/* data breakpoint 6 control */
 #define LOONGARCH_CSR_DB6ASID		0x343	/* data breakpoint 6 asid */
 
 #define LOONGARCH_CSR_DB7ADDR		0x348	/* data breakpoint 7 address */
 #define LOONGARCH_CSR_DB7MASK		0x349	/* data breakpoint 7 mask */
-#define LOONGARCH_CSR_DB7CTL		0x34a	/* data breakpoint 7 control */
+#define LOONGARCH_CSR_DB7CTRL		0x34a	/* data breakpoint 7 control */
 #define LOONGARCH_CSR_DB7ASID		0x34b	/* data breakpoint 7 asid */
 
 #define LOONGARCH_CSR_FWPC		0x380	/* instruction breakpoint config */
@@ -1013,42 +1013,42 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)
 
 #define LOONGARCH_CSR_IB0ADDR		0x390	/* inst breakpoint 0 address */
 #define LOONGARCH_CSR_IB0MASK		0x391	/* inst breakpoint 0 mask */
-#define LOONGARCH_CSR_IB0CTL		0x392	/* inst breakpoint 0 control */
+#define LOONGARCH_CSR_IB0CTRL		0x392	/* inst breakpoint 0 control */
 #define LOONGARCH_CSR_IB0ASID		0x393	/* inst breakpoint 0 asid */
 
 #define LOONGARCH_CSR_IB1ADDR		0x398	/* inst breakpoint 1 address */
 #define LOONGARCH_CSR_IB1MASK		0x399	/* inst breakpoint 1 mask */
-#define LOONGARCH_CSR_IB1CTL		0x39a	/* inst breakpoint 1 control */
+#define LOONGARCH_CSR_IB1CTRL		0x39a	/* inst breakpoint 1 control */
 #define LOONGARCH_CSR_IB1ASID		0x39b	/* inst breakpoint 1 asid */
 
 #define LOONGARCH_CSR_IB2ADDR		0x3a0	/* inst breakpoint 2 address */
 #define LOONGARCH_CSR_IB2MASK		0x3a1	/* inst breakpoint 2 mask */
-#define LOONGARCH_CSR_IB2CTL		0x3a2	/* inst breakpoint 2 control */
+#define LOONGARCH_CSR_IB2CTRL		0x3a2	/* inst breakpoint 2 control */
 #define LOONGARCH_CSR_IB2ASID		0x3a3	/* inst breakpoint 2 asid */
 
 #define LOONGARCH_CSR_IB3ADDR		0x3a8	/* inst breakpoint 3 address */
 #define LOONGARCH_CSR_IB3MASK		0x3a9	/* breakpoint 3 mask */
-#define LOONGARCH_CSR_IB3CTL		0x3aa	/* inst breakpoint 3 control */
+#define LOONGARCH_CSR_IB3CTRL		0x3aa	/* inst breakpoint 3 control */
 #define LOONGARCH_CSR_IB3ASID		0x3ab	/* inst breakpoint 3 asid */
 
 #define LOONGARCH_CSR_IB4ADDR		0x3b0	/* inst breakpoint 4 address */
 #define LOONGARCH_CSR_IB4MASK		0x3b1	/* inst breakpoint 4 mask */
-#define LOONGARCH_CSR_IB4CTL		0x3b2	/* inst breakpoint 4 control */
+#define LOONGARCH_CSR_IB4CTRL		0x3b2	/* inst breakpoint 4 control */
 #define LOONGARCH_CSR_IB4ASID		0x3b3	/* inst breakpoint 4 asid */
 
 #define LOONGARCH_CSR_IB5ADDR		0x3b8	/* inst breakpoint 5 address */
 #define LOONGARCH_CSR_IB5MASK		0x3b9	/* inst breakpoint 5 mask */
-#define LOONGARCH_CSR_IB5CTL		0x3ba	/* inst breakpoint 5 control */
+#define LOONGARCH_CSR_IB5CTRL		0x3ba	/* inst breakpoint 5 control */
 #define LOONGARCH_CSR_IB5ASID		0x3bb	/* inst breakpoint 5 asid */
 
 #define LOONGARCH_CSR_IB6ADDR		0x3c0	/* inst breakpoint 6 address */
 #define LOONGARCH_CSR_IB6MASK		0x3c1	/* inst breakpoint 6 mask */
-#define LOONGARCH_CSR_IB6CTL		0x3c2	/* inst breakpoint 6 control */
+#define LOONGARCH_CSR_IB6CTRL		0x3c2	/* inst breakpoint 6 control */
 #define LOONGARCH_CSR_IB6ASID		0x3c3	/* inst breakpoint 6 asid */
 
 #define LOONGARCH_CSR_IB7ADDR		0x3c8	/* inst breakpoint 7 address */
 #define LOONGARCH_CSR_IB7MASK		0x3c9	/* inst breakpoint 7 mask */
-#define LOONGARCH_CSR_IB7CTL		0x3ca	/* inst breakpoint 7 control */
+#define LOONGARCH_CSR_IB7CTRL		0x3ca	/* inst breakpoint 7 control */
 #define LOONGARCH_CSR_IB7ASID		0x3cb	/* inst breakpoint 7 asid */
 
 #define LOONGARCH_CSR_DEBUG		0x500	/* debug config */
diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
index 7184f1dc61f2..db060c5a976f 100644
--- a/arch/loongarch/include/asm/processor.h
+++ b/arch/loongarch/include/asm/processor.h
@@ -11,6 +11,7 @@
 
 #include <asm/cpu.h>
 #include <asm/cpu-info.h>
+#include <asm/hw_breakpoint.h>
 #include <asm/loongarch.h>
 #include <asm/vdso/processor.h>
 #include <uapi/asm/ptrace.h>
@@ -126,6 +127,10 @@ struct thread_struct {
 	unsigned long error_code;
 	struct loongarch_vdso_info *vdso;
 
+	/* Hardware breakpoints pinned to this task. */
+	struct perf_event	*hbp_break[LOONGARCH_MAX_BRP];
+	struct perf_event	*hbp_watch[LOONGARCH_MAX_WRP];
+
 	/*
 	 * FPU & vector registers, must be at last because
 	 * they are conditionally copied at fork().
@@ -164,6 +169,8 @@ struct thread_struct {
 	 */							\
 	.trap_nr		= 0,				\
 	.error_code		= 0,				\
+	.hbp_break		= {0},				\
+	.hbp_watch		= {0},				\
 	/*							\
 	 * FPU & vector registers				\
 	 */							\
@@ -184,10 +191,6 @@ extern unsigned long		boot_option_idle_override;
  */
 extern void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp);
 
-static inline void flush_thread(void)
-{
-}
-
 unsigned long __get_wchan(struct task_struct *p);
 
 #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
diff --git a/arch/loongarch/include/asm/switch_to.h b/arch/loongarch/include/asm/switch_to.h
index 43a5ab162d38..24e3094bebab 100644
--- a/arch/loongarch/include/asm/switch_to.h
+++ b/arch/loongarch/include/asm/switch_to.h
@@ -34,6 +34,7 @@ extern asmlinkage struct task_struct *__switch_to(struct task_struct *prev,
 #define switch_to(prev, next, last)						\
 do {										\
 	lose_fpu_inatomic(1, prev);						\
+	hw_breakpoint_thread_switch(next);					\
 	(last) = __switch_to(prev, next, task_thread_info(next),		\
 		 __builtin_return_address(0), __builtin_frame_address(0));	\
 } while (0)
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index 45c78aea63ce..006c892f9be3 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_UNWINDER_GUESS)	+= unwind_guess.o
 obj-$(CONFIG_UNWINDER_PROLOGUE) += unwind_prologue.o
 
 obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o perf_regs.o
+obj-$(CONFIG_HAVE_HW_BREAKPOINT)        += hw_breakpoint.o
 
 obj-$(CONFIG_KPROBES)		+= kprobes.o kprobes_trampoline.o
 
diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
new file mode 100644
index 000000000000..6431cd319c32
--- /dev/null
+++ b/arch/loongarch/kernel/hw_breakpoint.c
@@ -0,0 +1,519 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022-2023 Loongson Technology Corporation Limited
+ */
+#define pr_fmt(fmt) "hw-breakpoint: " fmt
+
+#include <linux/hw_breakpoint.h>
+#include <linux/kprobes.h>
+#include <linux/perf_event.h>
+
+#include <asm/hw_breakpoint.h>
+
+/* Breakpoint currently in use for each BRP. */
+static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[LOONGARCH_MAX_BRP]);
+
+/* Watchpoint currently in use for each WRP. */
+static DEFINE_PER_CPU(struct perf_event *, wp_on_reg[LOONGARCH_MAX_WRP]);
+
+/* Number of BRP/WRP registers on this CPU. */
+static int core_num_brps;
+static int core_num_wrps;
+
+int hw_breakpoint_slots(int type)
+{
+	/*
+	 * We can be called early, so don't rely on
+	 * our static variables being initialised.
+	 */
+	switch (type) {
+	case TYPE_INST:
+		return get_num_brps();
+	case TYPE_DATA:
+		return get_num_wrps();
+	default:
+		pr_warn("unknown slot type: %d\n", type);
+		return 0;
+	}
+}
+
+#define READ_WB_REG_CASE(OFF, N, REG, T, VAL)		\
+	case (OFF + N):					\
+		LOONGARCH_CSR_WATCH_READ(N, REG, T, VAL);	\
+		break
+
+#define WRITE_WB_REG_CASE(OFF, N, REG, T, VAL)		\
+	case (OFF + N):					\
+		LOONGARCH_CSR_WATCH_WRITE(N, REG, T, VAL);	\
+		break
+
+#define GEN_READ_WB_REG_CASES(OFF, REG, T, VAL)		\
+	READ_WB_REG_CASE(OFF, 0, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 1, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 2, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 3, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 4, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 5, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 6, REG, T, VAL);		\
+	READ_WB_REG_CASE(OFF, 7, REG, T, VAL);
+
+#define GEN_WRITE_WB_REG_CASES(OFF, REG, T, VAL)	\
+	WRITE_WB_REG_CASE(OFF, 0, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 1, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 2, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 3, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 4, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 5, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 6, REG, T, VAL);		\
+	WRITE_WB_REG_CASE(OFF, 7, REG, T, VAL);
+
+static u64 read_wb_reg(int reg, int n, int t)
+{
+	u64 val = 0;
+
+	switch (reg + n) {
+	GEN_READ_WB_REG_CASES(CSR_CFG_ADDR, ADDR, t, val);
+	GEN_READ_WB_REG_CASES(CSR_CFG_MASK, MASK, t, val);
+	GEN_READ_WB_REG_CASES(CSR_CFG_CTRL, CTRL, t, val);
+	GEN_READ_WB_REG_CASES(CSR_CFG_ASID, ASID, t, val);
+	default:
+		pr_warn("attempt to read from unknown breakpoint register %d\n", n);
+	}
+
+	return val;
+}
+NOKPROBE_SYMBOL(read_wb_reg);
+
+static void write_wb_reg(int reg, int n, int t, u64 val)
+{
+	switch (reg + n) {
+	GEN_WRITE_WB_REG_CASES(CSR_CFG_ADDR, ADDR, t, val);
+	GEN_WRITE_WB_REG_CASES(CSR_CFG_MASK, MASK, t, val);
+	GEN_WRITE_WB_REG_CASES(CSR_CFG_CTRL, CTRL, t, val);
+	GEN_WRITE_WB_REG_CASES(CSR_CFG_ASID, ASID, t, val);
+	default:
+		pr_warn("attempt to write to unknown breakpoint register %d\n", n);
+	}
+}
+NOKPROBE_SYMBOL(write_wb_reg);
+
+enum hw_breakpoint_ops {
+	HW_BREAKPOINT_INSTALL,
+	HW_BREAKPOINT_UNINSTALL,
+};
+
+/**
+ * hw_breakpoint_slot_setup - Find and setup a perf slot according to
+ *			      operations
+ *
+ * @slots: pointer to array of slots
+ * @max_slots: max number of slots
+ * @bp: perf_event to setup
+ * @ops: operation to be carried out on the slot
+ *
+ * Return:
+ *	slot index on success
+ *	-ENOSPC if no slot is available/matches
+ *	-EINVAL on wrong operations parameter
+ */
+
+static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
+				    struct perf_event *bp,
+				    enum hw_breakpoint_ops ops)
+{
+	int i;
+	struct perf_event **slot;
+
+	for (i = 0; i < max_slots; ++i) {
+		slot = &slots[i];
+		switch (ops) {
+		case HW_BREAKPOINT_INSTALL:
+			if (!*slot) {
+				*slot = bp;
+				return i;
+			}
+			break;
+		case HW_BREAKPOINT_UNINSTALL:
+			if (*slot == bp) {
+				*slot = NULL;
+				return i;
+			}
+			break;
+		default:
+			pr_warn_once("Unhandled hw breakpoint ops %d\n", ops);
+			return -EINVAL;
+		}
+	}
+	return -ENOSPC;
+}
+
+/*
+ * Unregister breakpoints from this task and reset the pointers in
+ * the thread_struct.
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+}
+
+void ptrace_hw_copy_thread(struct task_struct *tsk)
+{
+	memset(&tsk->thread.hbp_break, 0, sizeof(tsk->thread.hbp_break));
+	memset(&tsk->thread.hbp_watch, 0, sizeof(tsk->thread.hbp_watch));
+}
+
+static int hw_breakpoint_control(struct perf_event *bp,
+				 enum hw_breakpoint_ops ops)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slots;
+	int i, max_slots, enable;
+	u32 ctrl;
+
+	if (info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) {
+		/* Breakpoint */
+		slots = this_cpu_ptr(bp_on_reg);
+		max_slots = core_num_brps;
+	} else {
+		/* Watchpoint */
+		slots = this_cpu_ptr(wp_on_reg);
+		max_slots = core_num_wrps;
+	}
+
+	i = hw_breakpoint_slot_setup(slots, max_slots, bp, ops);
+
+	if (WARN_ONCE(i < 0, "Can't find any breakpoint slot"))
+		return i;
+
+	switch (ops) {
+	case HW_BREAKPOINT_INSTALL:
+		/* Set the FWPnCFG/MWPnCFG 1~4 register. */
+		write_wb_reg(CSR_CFG_ADDR, i, 0, info->address);
+		write_wb_reg(CSR_CFG_ADDR, i, 1, info->address);
+		write_wb_reg(CSR_CFG_MASK, i, 0, info->mask);
+		write_wb_reg(CSR_CFG_MASK, i, 1, info->mask);
+		write_wb_reg(CSR_CFG_ASID, i, 0, 0);
+		write_wb_reg(CSR_CFG_ASID, i, 1, 0);
+		if (info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) {
+			write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+		} else {
+			ctrl = encode_ctrl_reg(info->ctrl);
+			write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl | CTRL_PLV_ENABLE |
+				     1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn);
+		}
+		enable = csr_read64(LOONGARCH_CSR_CRMD);
+		csr_write64(CSR_CRMD_WE | enable, LOONGARCH_CSR_CRMD);
+		break;
+	case HW_BREAKPOINT_UNINSTALL:
+		/* Reset the FWPnCFG/MWPnCFG 1~4 register. */
+		write_wb_reg(CSR_CFG_ADDR, i, 0, 0);
+		write_wb_reg(CSR_CFG_ADDR, i, 1, 0);
+		write_wb_reg(CSR_CFG_MASK, i, 0, 0);
+		write_wb_reg(CSR_CFG_MASK, i, 1, 0);
+		write_wb_reg(CSR_CFG_CTRL, i, 0, 0);
+		write_wb_reg(CSR_CFG_CTRL, i, 1, 0);
+		write_wb_reg(CSR_CFG_ASID, i, 0, 0);
+		write_wb_reg(CSR_CFG_ASID, i, 1, 0);
+		break;
+	}
+
+	return 0;
+}
+
+/*
+ * Install a perf counter breakpoint.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	return hw_breakpoint_control(bp, HW_BREAKPOINT_INSTALL);
+}
+
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	hw_breakpoint_control(bp, HW_BREAKPOINT_UNINSTALL);
+}
+
+static int get_hbp_len(u8 hbp_len)
+{
+	unsigned int len_in_bytes = 0;
+
+	switch (hbp_len) {
+	case LOONGARCH_BREAKPOINT_LEN_1:
+		len_in_bytes = 1;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_2:
+		len_in_bytes = 2;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_4:
+		len_in_bytes = 4;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_8:
+		len_in_bytes = 8;
+		break;
+	}
+
+	return len_in_bytes;
+}
+
+/*
+ * Check whether bp virtual address is in kernel space.
+ */
+int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw)
+{
+	unsigned int len;
+	unsigned long va;
+
+	va = hw->address;
+	len = get_hbp_len(hw->ctrl.len);
+
+	return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
+}
+
+/*
+ * Extract generic type and length encodings from an arch_hw_breakpoint_ctrl.
+ * Hopefully this will disappear when ptrace can bypass the conversion
+ * to generic breakpoint descriptions.
+ */
+int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+			   int *gen_len, int *gen_type, int *offset)
+{
+	/* Type */
+	switch (ctrl.type) {
+	case LOONGARCH_BREAKPOINT_EXECUTE:
+		*gen_type = HW_BREAKPOINT_X;
+		break;
+	case LOONGARCH_BREAKPOINT_LOAD:
+		*gen_type = HW_BREAKPOINT_R;
+		break;
+	case LOONGARCH_BREAKPOINT_STORE:
+		*gen_type = HW_BREAKPOINT_W;
+		break;
+	case LOONGARCH_BREAKPOINT_LOAD | LOONGARCH_BREAKPOINT_STORE:
+		*gen_type = HW_BREAKPOINT_RW;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (!ctrl.len)
+		return -EINVAL;
+	*offset = __ffs(ctrl.len);
+
+	/* Len */
+	switch (ctrl.len) {
+	case LOONGARCH_BREAKPOINT_LEN_1:
+		*gen_len = HW_BREAKPOINT_LEN_1;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_2:
+		*gen_len = HW_BREAKPOINT_LEN_2;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_4:
+		*gen_len = HW_BREAKPOINT_LEN_4;
+		break;
+	case LOONGARCH_BREAKPOINT_LEN_8:
+		*gen_len = HW_BREAKPOINT_LEN_8;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * Construct an arch_hw_breakpoint from a perf_event.
+ */
+static int arch_build_bp_info(struct perf_event *bp,
+			      const struct perf_event_attr *attr,
+			      struct arch_hw_breakpoint *hw)
+{
+	/* Type */
+	switch (attr->bp_type) {
+	case HW_BREAKPOINT_X:
+		hw->ctrl.type = LOONGARCH_BREAKPOINT_EXECUTE;
+		break;
+	case HW_BREAKPOINT_R:
+		hw->ctrl.type = LOONGARCH_BREAKPOINT_LOAD;
+		break;
+	case HW_BREAKPOINT_W:
+		hw->ctrl.type = LOONGARCH_BREAKPOINT_STORE;
+		break;
+	case HW_BREAKPOINT_RW:
+		hw->ctrl.type = LOONGARCH_BREAKPOINT_LOAD | LOONGARCH_BREAKPOINT_STORE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* Len */
+	switch (attr->bp_len) {
+	case HW_BREAKPOINT_LEN_1:
+		hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_1;
+		break;
+	case HW_BREAKPOINT_LEN_2:
+		hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_2;
+		break;
+	case HW_BREAKPOINT_LEN_4:
+		hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_4;
+		break;
+	case HW_BREAKPOINT_LEN_8:
+		hw->ctrl.len = LOONGARCH_BREAKPOINT_LEN_8;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* Address */
+	hw->address = attr->bp_addr;
+
+	return 0;
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings.
+ */
+int hw_breakpoint_arch_parse(struct perf_event *bp,
+			     const struct perf_event_attr *attr,
+			     struct arch_hw_breakpoint *hw)
+{
+	int ret;
+	u64 alignment_mask, offset;
+
+	/* Build the arch_hw_breakpoint. */
+	ret = arch_build_bp_info(bp, attr, hw);
+	if (ret)
+		return ret;
+
+	if (hw->ctrl.type != LOONGARCH_BREAKPOINT_EXECUTE)
+		alignment_mask = 0x7;
+	offset = hw->address & alignment_mask;
+
+	hw->address &= ~alignment_mask;
+	hw->ctrl.len <<= offset;
+	return 0;
+}
+
+static void update_bp_registers(struct pt_regs *regs, int enable, int type)
+{
+	int i, max_slots;
+	u32 ctrl;
+	struct perf_event **slots;
+	struct arch_hw_breakpoint *info;
+
+	if (type == 0) {
+		slots = this_cpu_ptr(bp_on_reg);
+		max_slots = core_num_brps;
+	} else if (type == 1) {
+		slots = this_cpu_ptr(wp_on_reg);
+		max_slots = core_num_wrps;
+	} else {
+		return;
+	}
+
+	for (i = 0; i < max_slots; ++i) {
+		if (!slots[i])
+			continue;
+
+		info = counter_arch_bp(slots[i]);
+		if (enable) {
+			if ((info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) && (type == 0)) {
+				write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+				write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+			} else {
+				ctrl = read_wb_reg(CSR_CFG_CTRL, i, 1);
+				if (info->ctrl.type == LOONGARCH_BREAKPOINT_LOAD)
+					ctrl |= 0x1 << MWPnCFG3_LoadEn;
+				if (info->ctrl.type == LOONGARCH_BREAKPOINT_STORE)
+					ctrl |= 0x1 << MWPnCFG3_StoreEn;
+				write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl);
+			}
+			regs->csr_prmd |= CSR_PRMD_PWE;
+		} else {
+			if ((info->ctrl.type == LOONGARCH_BREAKPOINT_EXECUTE) && (type == 0)) {
+				write_wb_reg(CSR_CFG_CTRL, i, 0, 0);
+			} else {
+				ctrl = read_wb_reg(CSR_CFG_CTRL, i, 1);
+				if (info->ctrl.type == LOONGARCH_BREAKPOINT_LOAD)
+					ctrl &= ~0x1 << MWPnCFG3_LoadEn;
+				if (info->ctrl.type == LOONGARCH_BREAKPOINT_STORE)
+					ctrl &= ~0x1 << MWPnCFG3_StoreEn;
+				write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl);
+			}
+			regs->csr_prmd &= ~CSR_PRMD_PWE;
+		}
+	}
+}
+NOKPROBE_SYMBOL(update_bp_registers);
+
+/*
+ * Debug exception handlers.
+ */
+void breakpoint_handler(struct pt_regs *regs)
+{
+	int i;
+	struct perf_event *bp, **slots;
+
+	slots = this_cpu_ptr(bp_on_reg);
+
+	for (i = 0; i < core_num_brps; ++i) {
+		bp = slots[i];
+		if (bp == NULL)
+			continue;
+		perf_bp_event(bp, regs);
+	}
+	update_bp_registers(regs, 0, 0);
+}
+NOKPROBE_SYMBOL(breakpoint_handler);
+
+void watchpoint_handler(struct pt_regs *regs)
+{
+	int i;
+	struct perf_event *wp, **slots;
+
+	slots = this_cpu_ptr(wp_on_reg);
+
+	for (i = 0; i < core_num_wrps; ++i) {
+		wp = slots[i];
+		if (wp == NULL)
+			continue;
+		perf_bp_event(wp, regs);
+	}
+	update_bp_registers(regs, 0, 1);
+}
+NOKPROBE_SYMBOL(watchpoint_handler);
+
+/*
+ * One-time initialisation.
+ */
+static int __init arch_hw_breakpoint_init(void)
+{
+	core_num_brps = get_num_brps();
+	core_num_wrps = get_num_wrps();
+
+	pr_info("found %d breakpoint and %d watchpoint registers.\n",
+		core_num_brps, core_num_wrps);
+	return 0;
+}
+arch_initcall(arch_hw_breakpoint_init);
+
+void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+	struct pt_regs *regs = task_pt_regs(next);
+
+	/* Update breakpoints */
+	update_bp_registers(regs, 1, 0);
+	/* Update watchpoints */
+	update_bp_registers(regs, 1, 1);
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+}
+
+/*
+ * Dummy function to register with die_notifier.
+ */
+int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+				    unsigned long val, void *data)
+{
+	return NOTIFY_DONE;
+}
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
index 9073fab1a487..6b53781c0b59 100644
--- a/arch/loongarch/kernel/process.c
+++ b/arch/loongarch/kernel/process.c
@@ -37,6 +37,7 @@
 #include <asm/cpu.h>
 #include <asm/elf.h>
 #include <asm/fpu.h>
+#include <linux/hw_breakpoint.h>
 #include <asm/io.h>
 #include <asm/irq.h>
 #include <asm/irq_regs.h>
@@ -100,6 +101,11 @@ void exit_thread(struct task_struct *tsk)
 {
 }
 
+void flush_thread(void)
+{
+	flush_ptrace_hw_breakpoint(current);
+}
+
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
 	/*
@@ -186,6 +192,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 	if (clone_flags & CLONE_SETTLS)
 		childregs->regs[2] = tls;
 
+	ptrace_hw_copy_thread(p);
 out:
 	clear_tsk_thread_flag(p, TIF_USEDFPU);
 	clear_tsk_thread_flag(p, TIF_USEDSIMD);
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 4aa3901c5623..2b133079e0f3 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -508,7 +508,15 @@ asmlinkage void noinstr do_bp(struct pt_regs *regs)
 
 asmlinkage void noinstr do_watch(struct pt_regs *regs)
 {
-	pr_warn("Hardware watch point handler not implemented!\n");
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+	irqentry_state_t state = irqentry_enter(regs);
+
+	breakpoint_handler(regs);
+	watchpoint_handler(regs);
+	force_sig(SIGTRAP);
+
+	irqentry_exit(regs, state);
+#endif
 }
 
 asmlinkage void noinstr do_ri(struct pt_regs *regs)
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17  2:37 [PATCH v4 0/3] LoongArch: Add hardware breakpoints/watchpoints support Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 1/3] " Qing Zhang
@ 2023-02-17  2:37 ` Qing Zhang
  2023-02-17  9:50   ` Youling Tang
  2023-02-17 10:16   ` WANG Xuerui
  2023-02-17  2:37 ` [PATCH v4 3/3] LoongArch: ptrace: expose hardware breakpoints to debuggers Qing Zhang
  2 siblings, 2 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-17  2:37 UTC (permalink / raw)
  To: Huacai Chen, Oleg Nesterov, WANG Xuerui
  Cc: Jiaxun Yang, loongarch, linux-kernel

Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
arch_has_single_step in  and implementing the
user_enable_single_step and user_disable_single_step functions.

LongArch has no hardware single-step register. the hardware single-step
function multiplex fetch instruction watchpoint(FWPS) and specifies that
the next instruction must trigger the watch exception by setting the mask bit.
Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not to trigger
the watchpoint exception, and proceed to the next instruction.

Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
---
 arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
 arch/loongarch/include/asm/loongarch.h |  3 ++
 arch/loongarch/include/asm/processor.h |  3 ++
 arch/loongarch/include/asm/ptrace.h    |  2 +
 arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
 arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
 arch/loongarch/kernel/traps.c          | 34 +++++++++++--
 7 files changed, 176 insertions(+), 8 deletions(-)

diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index ba18ce8fbdf2..00c6f261d9a2 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union loongarch_instruction *ip)
 		is_imm12_negative(ip->reg2i12_format.immediate);
 }
 
+static inline bool branch_ins_target_pc(union loongarch_instruction *ip)
+{
+	switch (ip->reg0i26_format.opcode) {
+	case b_op:
+	case bl_op:
+		if (ip->reg0i26_format.immediate_l == 0
+		   && ip->reg0i26_format.immediate_h == 0)
+			return false;
+	}
+
+	switch (ip->reg1i21_format.opcode) {
+	case beqz_op:
+	case bnez_op:
+	case bceqz_op:
+		if (ip->reg1i21_format.immediate_l == 0
+		   && ip->reg1i21_format.immediate_h == 0)
+			return false;
+	}
+
+	switch (ip->reg2i16_format.opcode) {
+	case jirl_op:
+		if (ip->reg2i16_format.rj == 0x1
+		   && ip->reg2i16_format.rd == 0x1
+		   && ip->reg2i16_format.immediate == 0)
+			return false;
+		break;
+	case beq_op:
+	case bne_op:
+	case blt_op:
+	case bge_op:
+	case bltu_op:
+	case bgeu_op:
+		if (ip->reg2i16_format.immediate == 0)
+			return false;
+	}
+
+	return true;
+}
+
 void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
 void simu_branch(struct pt_regs *regs, union loongarch_instruction insn);
 
diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
index e9aed583a064..65b7dcdea16d 100644
--- a/arch/loongarch/include/asm/loongarch.h
+++ b/arch/loongarch/include/asm/loongarch.h
@@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)
 #define LOONGARCH_CSR_DERA		0x501	/* debug era */
 #define LOONGARCH_CSR_DESAVE		0x502	/* debug save */
 
+#define CSR_FWPC_SKIP_SHIFT		16
+#define CSR_FWPC_SKIP			(_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
+
 /*
  * CSR_ECFG IM
  */
diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
index db060c5a976f..3ea0f1910c23 100644
--- a/arch/loongarch/include/asm/processor.h
+++ b/arch/loongarch/include/asm/processor.h
@@ -131,6 +131,9 @@ struct thread_struct {
 	struct perf_event	*hbp_break[LOONGARCH_MAX_BRP];
 	struct perf_event	*hbp_watch[LOONGARCH_MAX_WRP];
 
+	/* Used by ptrace single_step */
+	unsigned long single_step;
+
 	/*
 	 * FPU & vector registers, must be at last because
 	 * they are conditionally copied at fork().
diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
index 58596c4f8a0f..66a0e6c480a3 100644
--- a/arch/loongarch/include/asm/ptrace.h
+++ b/arch/loongarch/include/asm/ptrace.h
@@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct pt_regs *regs,
 	regs->regs[3] = val;
 }
 
+#define arch_has_single_step()		(1)
+
 #endif /* _ASM_PTRACE_H */
diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
index 6431cd319c32..75d3652fbe00 100644
--- a/arch/loongarch/kernel/hw_breakpoint.c
+++ b/arch/loongarch/kernel/hw_breakpoint.c
@@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
  */
 void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
 {
+	int i;
+	struct thread_struct *t = &tsk->thread;
+
+	for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
+		if (t->hbp_break[i]) {
+			unregister_hw_breakpoint(t->hbp_break[i]);
+			t->hbp_break[i] = NULL;
+		}
+	}
+
+	for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
+		if (t->hbp_watch[i]) {
+			unregister_hw_breakpoint(t->hbp_watch[i]);
+			t->hbp_watch[i] = NULL;
+		}
+	}
 }
 
 void ptrace_hw_copy_thread(struct task_struct *tsk)
@@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
 void hw_breakpoint_thread_switch(struct task_struct *next)
 {
 	struct pt_regs *regs = task_pt_regs(next);
-
-	/* Update breakpoints */
-	update_bp_registers(regs, 1, 0);
-	/* Update watchpoints */
-	update_bp_registers(regs, 1, 1);
+	u64 addr, mask;
+
+	if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
+		addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
+		mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
+		if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
+			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
+		regs->csr_prmd |= CSR_PRMD_PWE;
+	} else {
+		/* Update breakpoints */
+		update_bp_registers(regs, 1, 0);
+		/* Update watchpoints */
+		update_bp_registers(regs, 1, 1);
+	}
 }
 
 void hw_breakpoint_pmu_read(struct perf_event *bp)
diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
index bee4194177fd..52a3ee4366f4 100644
--- a/arch/loongarch/kernel/ptrace.c
+++ b/arch/loongarch/kernel/ptrace.c
@@ -20,6 +20,7 @@
 #include <linux/context_tracking.h>
 #include <linux/elf.h>
 #include <linux/errno.h>
+#include <linux/hw_breakpoint.h>
 #include <linux/mm.h>
 #include <linux/ptrace.h>
 #include <linux/regset.h>
@@ -30,6 +31,7 @@
 #include <linux/stddef.h>
 #include <linux/seccomp.h>
 #include <linux/uaccess.h>
+#include <linux/thread_info.h>
 
 #include <asm/byteorder.h>
 #include <asm/cpu.h>
@@ -39,6 +41,7 @@
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
+#include <asm/ptrace.h>
 #include <asm/reg.h>
 #include <asm/syscall.h>
 
@@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long request,
 
 	return ret;
 }
+
+void ptrace_triggered(struct perf_event *bp,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
+static int set_single_step(struct task_struct *tsk, unsigned long addr)
+{
+	struct thread_struct *thread = &tsk->thread;
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+	struct arch_hw_breakpoint *info;
+
+	bp = thread->hbp_break[0];
+	if (!bp) {
+		ptrace_breakpoint_init(&attr);
+
+		attr.bp_addr = addr;
+		attr.bp_len = HW_BREAKPOINT_LEN_8;
+		attr.bp_type = HW_BREAKPOINT_X;
+
+		bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
+						 NULL, tsk);
+		if (IS_ERR(bp))
+			return PTR_ERR(bp);
+
+		thread->hbp_break[0] = bp;
+	} else {
+		int err;
+
+		attr = bp->attr;
+		attr.bp_addr = addr;
+		/* reenable breakpoint */
+		attr.disabled = false;
+		err = modify_user_hw_breakpoint(bp, &attr);
+		if (unlikely(err))
+			return err;
+
+		csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
+	}
+	info = counter_arch_bp(bp);
+	info->mask = 0xffffffffffff;
+
+	return 0;
+}
+
+/* ptrace API */
+void user_enable_single_step(struct task_struct *task)
+{
+	struct thread_info *ti = task_thread_info(task);
+
+	set_single_step(task, task_pt_regs(task)->csr_era);
+	task->thread.single_step = task_pt_regs(task)->csr_era;
+	set_ti_thread_flag(ti, TIF_SINGLESTEP);
+}
+
+void user_disable_single_step(struct task_struct *task)
+{
+	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
+}
diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
index 2b133079e0f3..a59275a43bd1 100644
--- a/arch/loongarch/kernel/traps.c
+++ b/arch/loongarch/kernel/traps.c
@@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs *regs)
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 	irqentry_state_t state = irqentry_enter(regs);
 
-	breakpoint_handler(regs);
-	watchpoint_handler(regs);
-	force_sig(SIGTRAP);
+	if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
+		int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
+		unsigned long pc = regs->csr_era;
+		union loongarch_instruction *ip = (union loongarch_instruction *)pc;
+
+		if (llbit) {
+		/*
+		 * When the ll-sc combo is encountered, it is regarded as an single
+		 * instruction. So don't clear llbit and reset CSR.FWPS.Skip until
+		 * the llsc execution is completed.
+		 */
+			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
+			csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
+		} else if (pc == current->thread.single_step) {
+		/*
+		 * Maybe some hardware Bug caused that certain insns are occasionally
+		 * not skipped when CSR.FWPS.Skip is set, such as fld.d/fst.d. So Singlestep
+		 * needs to compare whether the csr_era is equal to the value of singlestep
+		 * which last time set.
+		 */
+			if (branch_ins_target_pc(ip))
+			/* Prevent rd == pc from causing single step to fail to stop */
+				csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
+		} else {
+			force_sig(SIGTRAP);
+		}
+	} else {
+		breakpoint_handler(regs);
+		watchpoint_handler(regs);
+		force_sig(SIGTRAP);
+	}
 
 	irqentry_exit(regs, state);
 #endif
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 3/3] LoongArch: ptrace: expose hardware breakpoints to debuggers
  2023-02-17  2:37 [PATCH v4 0/3] LoongArch: Add hardware breakpoints/watchpoints support Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 1/3] " Qing Zhang
  2023-02-17  2:37 ` [PATCH v4 2/3] LoongArch: Add ptrace single step support Qing Zhang
@ 2023-02-17  2:37 ` Qing Zhang
  2 siblings, 0 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-17  2:37 UTC (permalink / raw)
  To: Huacai Chen, Oleg Nesterov, WANG Xuerui
  Cc: Jiaxun Yang, loongarch, linux-kernel

Implement regset-based ptrace interface that exposes hardware
breakpoints to user-space debuggers to query and set instruction
and data breakpoints.

Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
---
 arch/loongarch/include/uapi/asm/ptrace.h |   9 +
 arch/loongarch/kernel/ptrace.c           | 416 +++++++++++++++++++++++
 include/uapi/linux/elf.h                 |   2 +
 3 files changed, 427 insertions(+)

diff --git a/arch/loongarch/include/uapi/asm/ptrace.h b/arch/loongarch/include/uapi/asm/ptrace.h
index 46eb40932bb1..7ac4a0e44570 100644
--- a/arch/loongarch/include/uapi/asm/ptrace.h
+++ b/arch/loongarch/include/uapi/asm/ptrace.h
@@ -56,6 +56,15 @@ struct user_lasx_state {
 	uint64_t vregs[32*4];
 };
 
+struct user_watch_state {
+	uint16_t dbg_info;
+	struct {
+		uint64_t    addr;
+		uint64_t    mask;
+		uint32_t    ctrl;
+	}		dbg_regs[8];
+};
+
 #define PTRACE_SYSEMU			0x1f
 #define PTRACE_SYSEMU_SINGLESTEP	0x20
 
diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
index 52a3ee4366f4..7b0ec3b29212 100644
--- a/arch/loongarch/kernel/ptrace.c
+++ b/arch/loongarch/kernel/ptrace.c
@@ -22,6 +22,7 @@
 #include <linux/errno.h>
 #include <linux/hw_breakpoint.h>
 #include <linux/mm.h>
+#include <linux/nospec.h>
 #include <linux/ptrace.h>
 #include <linux/regset.h>
 #include <linux/sched.h>
@@ -333,6 +334,399 @@ static int simd_set(struct task_struct *target,
 
 #endif /* CONFIG_CPU_HAS_LSX */
 
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+
+/*
+ * Handle hitting a HW-breakpoint.
+ */
+static void ptrace_hbptriggered(struct perf_event *bp,
+				struct perf_sample_data *data,
+				struct pt_regs *regs)
+{
+	int i;
+	struct arch_hw_breakpoint *bkpt = counter_arch_bp(bp);
+
+	for (i = 0; i < LOONGARCH_MAX_BRP; ++i)
+		if (current->thread.hbp_break[i] == bp)
+			break;
+
+	for (i = 0; i < LOONGARCH_MAX_WRP; ++i)
+		if (current->thread.hbp_watch[i] == bp)
+			break;
+
+	force_sig_ptrace_errno_trap(i, (void __user *)bkpt->address);
+}
+
+static struct perf_event *ptrace_hbp_get_event(unsigned int note_type,
+					       struct task_struct *tsk,
+					       unsigned long idx)
+{
+	struct perf_event *bp = ERR_PTR(-EINVAL);
+
+	switch (note_type) {
+	case NT_LOONGARCH_HW_BREAK:
+		if (idx >= LOONGARCH_MAX_BRP)
+			goto out;
+		idx = array_index_nospec(idx, LOONGARCH_MAX_BRP);
+		bp = tsk->thread.hbp_break[idx];
+		break;
+	case NT_LOONGARCH_HW_WATCH:
+		if (idx >= LOONGARCH_MAX_WRP)
+			goto out;
+		idx = array_index_nospec(idx, LOONGARCH_MAX_WRP);
+		bp = tsk->thread.hbp_watch[idx];
+		break;
+	}
+
+out:
+	return bp;
+}
+
+static int ptrace_hbp_set_event(unsigned int note_type,
+				struct task_struct *tsk,
+				unsigned long idx,
+				struct perf_event *bp)
+{
+	int err = -EINVAL;
+
+	switch (note_type) {
+	case NT_LOONGARCH_HW_BREAK:
+		if (idx >= LOONGARCH_MAX_BRP)
+			goto out;
+		idx = array_index_nospec(idx, LOONGARCH_MAX_BRP);
+		tsk->thread.hbp_break[idx] = bp;
+		err = 0;
+		break;
+	case NT_LOONGARCH_HW_WATCH:
+		if (idx >= LOONGARCH_MAX_WRP)
+			goto out;
+		idx = array_index_nospec(idx, LOONGARCH_MAX_WRP);
+		tsk->thread.hbp_watch[idx] = bp;
+		err = 0;
+		break;
+	}
+
+out:
+	return err;
+}
+
+static struct perf_event *ptrace_hbp_create(unsigned int note_type,
+					    struct task_struct *tsk,
+					    unsigned long idx)
+{
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+	int err, type;
+
+	switch (note_type) {
+	case NT_LOONGARCH_HW_BREAK:
+		type = HW_BREAKPOINT_X;
+		break;
+	case NT_LOONGARCH_HW_WATCH:
+		type = HW_BREAKPOINT_RW;
+		break;
+	default:
+		return ERR_PTR(-EINVAL);
+	}
+
+	ptrace_breakpoint_init(&attr);
+
+	/*
+	 * Initialise fields to sane defaults
+	 * (i.e. values that will pass validation).
+	 */
+	attr.bp_addr	= 0;
+	attr.bp_len	= HW_BREAKPOINT_LEN_4;
+	attr.bp_type	= type;
+	attr.disabled	= 1;
+
+	bp = register_user_hw_breakpoint(&attr, ptrace_hbptriggered, NULL, tsk);
+	if (IS_ERR(bp))
+		return bp;
+
+	err = ptrace_hbp_set_event(note_type, tsk, idx, bp);
+	if (err)
+		return ERR_PTR(err);
+
+	return bp;
+}
+
+static int ptrace_hbp_fill_attr_ctrl(unsigned int note_type,
+				     struct arch_hw_breakpoint_ctrl ctrl,
+				     struct perf_event_attr *attr)
+{
+	int err, len, type, offset;
+
+	err = arch_bp_generic_fields(ctrl, &len, &type, &offset);
+	if (err)
+		return err;
+
+	switch (note_type) {
+	case NT_LOONGARCH_HW_BREAK:
+		if ((type & HW_BREAKPOINT_X) != type)
+			return -EINVAL;
+		break;
+	case NT_LOONGARCH_HW_WATCH:
+		if ((type & HW_BREAKPOINT_RW) != type)
+			return -EINVAL;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	attr->bp_len	= len;
+	attr->bp_type	= type;
+	attr->bp_addr	+= offset;
+
+	return 0;
+}
+
+static int ptrace_hbp_get_resource_info(unsigned int note_type, u16 *info)
+{
+	u8 num;
+	u16 reg = 0;
+
+	switch (note_type) {
+	case NT_LOONGARCH_HW_BREAK:
+		num = hw_breakpoint_slots(TYPE_INST);
+		break;
+	case NT_LOONGARCH_HW_WATCH:
+		num = hw_breakpoint_slots(TYPE_DATA);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	reg |= num;
+
+	*info = reg;
+	return 0;
+}
+
+static int ptrace_hbp_get_ctrl(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u32 *ctrl)
+{
+	struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+	*ctrl = bp ? encode_ctrl_reg(counter_arch_bp(bp)->ctrl) : 0;
+	return 0;
+}
+
+static int ptrace_hbp_get_mask(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u64 *mask)
+{
+	struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+	*mask = bp ? counter_arch_bp(bp)->mask : 0;
+	return 0;
+}
+
+static int ptrace_hbp_get_addr(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u64 *addr)
+{
+	struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+	*addr = bp ? counter_arch_bp(bp)->address : 0;
+	return 0;
+}
+
+static struct perf_event *ptrace_hbp_get_initialised_bp(unsigned int note_type,
+							struct task_struct *tsk,
+							unsigned long idx)
+{
+	struct perf_event *bp = ptrace_hbp_get_event(note_type, tsk, idx);
+
+	if (!bp)
+		bp = ptrace_hbp_create(note_type, tsk, idx);
+
+	return bp;
+}
+
+static int ptrace_hbp_set_ctrl(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u32 uctrl)
+{
+	int err;
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+	struct arch_hw_breakpoint_ctrl ctrl;
+
+	bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+	if (IS_ERR(bp)) {
+		err = PTR_ERR(bp);
+		return err;
+	}
+
+	attr = bp->attr;
+	decode_ctrl_reg(uctrl, &ctrl);
+	err = ptrace_hbp_fill_attr_ctrl(note_type, ctrl, &attr);
+	if (err)
+		return err;
+
+	return modify_user_hw_breakpoint(bp, &attr);
+}
+
+static int ptrace_hbp_set_mask(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u64 mask)
+{
+	int err;
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+	struct arch_hw_breakpoint *info;
+
+	bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+	if (IS_ERR(bp)) {
+		err = PTR_ERR(bp);
+		return err;
+	}
+
+	attr = bp->attr;
+	info = counter_arch_bp(bp);
+	info->mask = mask;
+	err = modify_user_hw_breakpoint(bp, &attr);
+	return err;
+}
+
+static int ptrace_hbp_set_addr(unsigned int note_type,
+			       struct task_struct *tsk,
+			       unsigned long idx,
+			       u64 addr)
+{
+	int err;
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+
+	bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+	if (IS_ERR(bp)) {
+		err = PTR_ERR(bp);
+		return err;
+	}
+
+	attr = bp->attr;
+	attr.bp_addr = addr;
+	err = modify_user_hw_breakpoint(bp, &attr);
+	return err;
+}
+
+#define PTRACE_HBP_ADDR_SZ	sizeof(u64)
+#define PTRACE_HBP_MASK_SZ	sizeof(u64)
+#define PTRACE_HBP_CTRL_SZ	sizeof(u32)
+
+static int hw_break_get(struct task_struct *target,
+			const struct user_regset *regset,
+			struct membuf to)
+{
+	unsigned int note_type = regset->core_note_type;
+	int ret, idx = 0;
+	u16 info;
+	u32 ctrl;
+	u64 addr, mask;
+
+	/* Resource info */
+	ret = ptrace_hbp_get_resource_info(note_type, &info);
+	if (ret)
+		return ret;
+
+	membuf_write(&to, &info, sizeof(info));
+	/* (address, ctrl) registers */
+	while (to.left) {
+		ret = ptrace_hbp_get_addr(note_type, target, idx, &addr);
+		if (ret)
+			return ret;
+
+		ret = ptrace_hbp_get_mask(note_type, target, idx, &mask);
+		if (ret)
+			return ret;
+
+		ret = ptrace_hbp_get_ctrl(note_type, target, idx, &ctrl);
+		if (ret)
+			return ret;
+
+		membuf_store(&to, addr);
+		membuf_store(&to, mask);
+		membuf_store(&to, ctrl);
+		idx++;
+	}
+	return 0;
+}
+
+static int hw_break_set(struct task_struct *target,
+			const struct user_regset *regset,
+			unsigned int pos, unsigned int count,
+			const void *kbuf, const void __user *ubuf)
+{
+	unsigned int note_type = regset->core_note_type;
+	int ret, idx = 0, offset, limit;
+	u32 ctrl;
+	u64 addr, mask;
+
+	/* Resource info */
+	offset = offsetof(struct user_watch_state, dbg_regs);
+	user_regset_copyin_ignore(&pos, &count, &kbuf, &ubuf, 0, offset);
+
+	/* (address, ctrl) registers */
+	limit = regset->n * regset->size;
+	while (count && offset < limit) {
+		if (count < PTRACE_HBP_ADDR_SZ)
+			return -EINVAL;
+
+		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &addr,
+					 offset, offset + PTRACE_HBP_ADDR_SZ);
+		if (ret)
+			return ret;
+
+		ret = ptrace_hbp_set_addr(note_type, target, idx, addr);
+		if (ret)
+			return ret;
+		offset += PTRACE_HBP_ADDR_SZ;
+
+		if (!count)
+			break;
+
+		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &mask,
+					 offset, offset + PTRACE_HBP_ADDR_SZ);
+		if (ret)
+			return ret;
+
+		ret = ptrace_hbp_set_mask(note_type, target, idx, mask);
+		if (ret)
+			return ret;
+		offset += PTRACE_HBP_MASK_SZ;
+
+		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &mask,
+					 offset, offset + PTRACE_HBP_MASK_SZ);
+		if (ret)
+			return ret;
+
+		ret = ptrace_hbp_set_ctrl(note_type, target, idx, ctrl);
+		if (ret)
+			return ret;
+		offset += PTRACE_HBP_CTRL_SZ;
+		idx++;
+	}
+
+	return 0;
+}
+#endif
+
 struct pt_regs_offset {
 	const char *name;
 	int offset;
@@ -412,6 +806,10 @@ enum loongarch_regset {
 #ifdef CONFIG_CPU_HAS_LASX
 	REGSET_LASX,
 #endif
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+	REGSET_HW_BREAK,
+	REGSET_HW_WATCH,
+#endif
 };
 
 static const struct user_regset loongarch64_regsets[] = {
@@ -459,6 +857,24 @@ static const struct user_regset loongarch64_regsets[] = {
 		.set		= simd_set,
 	},
 #endif
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+	[REGSET_HW_BREAK] = {
+		.core_note_type = NT_LOONGARCH_HW_BREAK,
+		.n = sizeof(struct user_watch_state) / sizeof(u32),
+		.size = sizeof(u32),
+		.align = sizeof(u32),
+		.regset_get = hw_break_get,
+		.set = hw_break_set,
+	},
+	[REGSET_HW_WATCH] = {
+		.core_note_type = NT_LOONGARCH_HW_WATCH,
+		.n = sizeof(struct user_watch_state) / sizeof(u32),
+		.size = sizeof(u32),
+		.align = sizeof(u32),
+		.regset_get = hw_break_get,
+		.set = hw_break_set,
+	},
+#endif
 };
 
 static const struct user_regset_view user_loongarch64_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 4c6a8fa5e7ed..3cf66946a0bb 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -444,6 +444,8 @@ typedef struct elf64_shdr {
 #define NT_LOONGARCH_LSX	0xa02	/* LoongArch Loongson SIMD Extension registers */
 #define NT_LOONGARCH_LASX	0xa03	/* LoongArch Loongson Advanced SIMD Extension registers */
 #define NT_LOONGARCH_LBT	0xa04	/* LoongArch Loongson Binary Translation registers */
+#define NT_LOONGARCH_HW_BREAK   0xa05   /* LoongArch hardware breakpoint registers */
+#define NT_LOONGARCH_HW_WATCH   0xa06   /* LoongArch hardware watchpoint registers */
 
 /* Note types with note name "GNU" */
 #define NT_GNU_PROPERTY_TYPE_0	5
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17  2:37 ` [PATCH v4 2/3] LoongArch: Add ptrace single step support Qing Zhang
@ 2023-02-17  9:50   ` Youling Tang
  2023-02-17 10:00     ` Qing Zhang
  2023-02-17 10:16   ` WANG Xuerui
  1 sibling, 1 reply; 10+ messages in thread
From: Youling Tang @ 2023-02-17  9:50 UTC (permalink / raw)
  To: Qing Zhang, WANG Xuerui
  Cc: Huacai Chen, Oleg Nesterov, Jiaxun Yang, loongarch, linux-kernel

Hi, Qing

On 02/17/2023 10:37 AM, Qing Zhang wrote:
> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
> arch_has_single_step in  and implementing the
> user_enable_single_step and user_disable_single_step functions.
>
> LongArch has no hardware single-step register. the hardware single-step
> function multiplex fetch instruction watchpoint(FWPS) and specifies that
> the next instruction must trigger the watch exception by setting the mask bit.
> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not to trigger
> the watchpoint exception, and proceed to the next instruction.
>
> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
> ---
>  arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>  arch/loongarch/include/asm/loongarch.h |  3 ++
>  arch/loongarch/include/asm/processor.h |  3 ++
>  arch/loongarch/include/asm/ptrace.h    |  2 +
>  arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>  arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>  arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>  7 files changed, 176 insertions(+), 8 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index ba18ce8fbdf2..00c6f261d9a2 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union loongarch_instruction *ip)
>  		is_imm12_negative(ip->reg2i12_format.immediate);
>  }
>
> +static inline bool branch_ins_target_pc(union loongarch_instruction *ip)
> +{
> +	switch (ip->reg0i26_format.opcode) {
> +	case b_op:
> +	case bl_op:
> +		if (ip->reg0i26_format.immediate_l == 0
> +		   && ip->reg0i26_format.immediate_h == 0)
> +			return false;
> +	}
> +
> +	switch (ip->reg1i21_format.opcode) {
> +	case beqz_op:
> +	case bnez_op:
> +	case bceqz_op:
> +		if (ip->reg1i21_format.immediate_l == 0
> +		   && ip->reg1i21_format.immediate_h == 0)
> +			return false;
> +	}
> +
> +	switch (ip->reg2i16_format.opcode) {
> +	case jirl_op:
> +		if (ip->reg2i16_format.rj == 0x1
> +		   && ip->reg2i16_format.rd == 0x1
LOONGARCH_GPR_RA can be used instead of 0x1.

> +		   && ip->reg2i16_format.immediate == 0)
> +			return false;
> +		break;
> +	case beq_op:
> +	case bne_op:
> +	case blt_op:
> +	case bge_op:
> +	case bltu_op:
> +	case bgeu_op:
> +		if (ip->reg2i16_format.immediate == 0)
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
>  void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>  void simu_branch(struct pt_regs *regs, union loongarch_instruction insn);
>
> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> index e9aed583a064..65b7dcdea16d 100644
> --- a/arch/loongarch/include/asm/loongarch.h
> +++ b/arch/loongarch/include/asm/loongarch.h
> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)
>  #define LOONGARCH_CSR_DERA		0x501	/* debug era */
>  #define LOONGARCH_CSR_DESAVE		0x502	/* debug save */
>
> +#define CSR_FWPC_SKIP_SHIFT		16
> +#define CSR_FWPC_SKIP			(_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
> +
>  /*
>   * CSR_ECFG IM
>   */
> diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
> index db060c5a976f..3ea0f1910c23 100644
> --- a/arch/loongarch/include/asm/processor.h
> +++ b/arch/loongarch/include/asm/processor.h
> @@ -131,6 +131,9 @@ struct thread_struct {
>  	struct perf_event	*hbp_break[LOONGARCH_MAX_BRP];
>  	struct perf_event	*hbp_watch[LOONGARCH_MAX_WRP];
>
> +	/* Used by ptrace single_step */
> +	unsigned long single_step;
> +
>  	/*
>  	 * FPU & vector registers, must be at last because
>  	 * they are conditionally copied at fork().
> diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
> index 58596c4f8a0f..66a0e6c480a3 100644
> --- a/arch/loongarch/include/asm/ptrace.h
> +++ b/arch/loongarch/include/asm/ptrace.h
> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct pt_regs *regs,
>  	regs->regs[3] = val;
>  }
>
> +#define arch_has_single_step()		(1)
> +
>  #endif /* _ASM_PTRACE_H */
> diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
> index 6431cd319c32..75d3652fbe00 100644
> --- a/arch/loongarch/kernel/hw_breakpoint.c
> +++ b/arch/loongarch/kernel/hw_breakpoint.c
> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
>   */
>  void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>  {
> +	int i;
> +	struct thread_struct *t = &tsk->thread;
> +
> +	for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
> +		if (t->hbp_break[i]) {
> +			unregister_hw_breakpoint(t->hbp_break[i]);
> +			t->hbp_break[i] = NULL;
> +		}
> +	}
> +
> +	for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
> +		if (t->hbp_watch[i]) {
> +			unregister_hw_breakpoint(t->hbp_watch[i]);
> +			t->hbp_watch[i] = NULL;
> +		}
> +	}
>  }
>
>  void ptrace_hw_copy_thread(struct task_struct *tsk)
> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>  void hw_breakpoint_thread_switch(struct task_struct *next)
>  {
>  	struct pt_regs *regs = task_pt_regs(next);
> -
> -	/* Update breakpoints */
> -	update_bp_registers(regs, 1, 0);
> -	/* Update watchpoints */
> -	update_bp_registers(regs, 1, 1);
> +	u64 addr, mask;
> +
> +	if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
> +		addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
> +		mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
> +		if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
> +			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +		regs->csr_prmd |= CSR_PRMD_PWE;
> +	} else {
> +		/* Update breakpoints */
> +		update_bp_registers(regs, 1, 0);
> +		/* Update watchpoints */
> +		update_bp_registers(regs, 1, 1);
> +	}
>  }
>
>  void hw_breakpoint_pmu_read(struct perf_event *bp)
> diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
> index bee4194177fd..52a3ee4366f4 100644
> --- a/arch/loongarch/kernel/ptrace.c
> +++ b/arch/loongarch/kernel/ptrace.c
> @@ -20,6 +20,7 @@
>  #include <linux/context_tracking.h>
>  #include <linux/elf.h>
>  #include <linux/errno.h>
> +#include <linux/hw_breakpoint.h>
>  #include <linux/mm.h>
>  #include <linux/ptrace.h>
>  #include <linux/regset.h>
> @@ -30,6 +31,7 @@
>  #include <linux/stddef.h>
>  #include <linux/seccomp.h>
>  #include <linux/uaccess.h>
> +#include <linux/thread_info.h>
>
>  #include <asm/byteorder.h>
>  #include <asm/cpu.h>
> @@ -39,6 +41,7 @@
>  #include <asm/page.h>
>  #include <asm/pgtable.h>
>  #include <asm/processor.h>
> +#include <asm/ptrace.h>
>  #include <asm/reg.h>
>  #include <asm/syscall.h>
>
> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long request,
>
>  	return ret;
>  }
> +
> +void ptrace_triggered(struct perf_event *bp,
> +		      struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	struct perf_event_attr attr;
> +
> +	attr = bp->attr;
> +	attr.disabled = true;
> +	modify_user_hw_breakpoint(bp, &attr);
> +}
> +
> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
> +{
> +	struct thread_struct *thread = &tsk->thread;
> +	struct perf_event *bp;
> +	struct perf_event_attr attr;
> +	struct arch_hw_breakpoint *info;
> +
> +	bp = thread->hbp_break[0];
> +	if (!bp) {
> +		ptrace_breakpoint_init(&attr);
> +
> +		attr.bp_addr = addr;
> +		attr.bp_len = HW_BREAKPOINT_LEN_8;
> +		attr.bp_type = HW_BREAKPOINT_X;
> +
> +		bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
> +						 NULL, tsk);
> +		if (IS_ERR(bp))
> +			return PTR_ERR(bp);
> +
> +		thread->hbp_break[0] = bp;
> +	} else {
> +		int err;
> +
> +		attr = bp->attr;
> +		attr.bp_addr = addr;
> +		/* reenable breakpoint */
> +		attr.disabled = false;
> +		err = modify_user_hw_breakpoint(bp, &attr);
> +		if (unlikely(err))
> +			return err;
> +
> +		csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
> +	}
> +	info = counter_arch_bp(bp);
> +	info->mask = 0xffffffffffff;
If `mask` is only used for user space address mask, it should not be
fixed to 0xffffffffffff, the user space address size may have many
situations, which can be defined according to TASK_SIZE.

Use (TASK_SIZE - 1) instead.

> +
> +	return 0;
> +}
> +
> +/* ptrace API */
> +void user_enable_single_step(struct task_struct *task)
> +{
> +	struct thread_info *ti = task_thread_info(task);
> +
> +	set_single_step(task, task_pt_regs(task)->csr_era);
> +	task->thread.single_step = task_pt_regs(task)->csr_era;
> +	set_ti_thread_flag(ti, TIF_SINGLESTEP);
> +}
> +
> +void user_disable_single_step(struct task_struct *task)
> +{
> +	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
> index 2b133079e0f3..a59275a43bd1 100644
> --- a/arch/loongarch/kernel/traps.c
> +++ b/arch/loongarch/kernel/traps.c
> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs *regs)
>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
>  	irqentry_state_t state = irqentry_enter(regs);
>
> -	breakpoint_handler(regs);
> -	watchpoint_handler(regs);
> -	force_sig(SIGTRAP);
> +	if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
> +		int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
> +		unsigned long pc = regs->csr_era;
> +		union loongarch_instruction *ip = (union loongarch_instruction *)pc;
> +
> +		if (llbit) {
> +		/*
> +		 * When the ll-sc combo is encountered, it is regarded as an single
> +		 * instruction. So don't clear llbit and reset CSR.FWPS.Skip until
> +		 * the llsc execution is completed.
> +		 */
Comments should be aligned with code, similar modifications in other
places.

> +			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +			csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
> +		} else if (pc == current->thread.single_step) {
> +		/*
> +		 * Maybe some hardware Bug caused that certain insns are occasionally
> +		 * not skipped when CSR.FWPS.Skip is set, such as fld.d/fst.d. So Singlestep
> +		 * needs to compare whether the csr_era is equal to the value of singlestep
> +		 * which last time set.
> +		 */
> +			if (branch_ins_target_pc(ip))
> +			/* Prevent rd == pc from causing single step to fail to stop */
IMO, that comment description is not accurate enough.

Youling.

> +				csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +		} else {
> +			force_sig(SIGTRAP);
> +		}
> +	} else {
> +		breakpoint_handler(regs);
> +		watchpoint_handler(regs);
> +		force_sig(SIGTRAP);
> +	}
>
>  	irqentry_exit(regs, state);
>  #endif
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17  9:50   ` Youling Tang
@ 2023-02-17 10:00     ` Qing Zhang
  2023-02-18  1:21       ` Jinyang He
  0 siblings, 1 reply; 10+ messages in thread
From: Qing Zhang @ 2023-02-17 10:00 UTC (permalink / raw)
  To: Youling Tang, WANG Xuerui
  Cc: Huacai Chen, Oleg Nesterov, Jiaxun Yang, loongarch, linux-kernel

Hi, Youling

On 2023/2/17 下午5:50, Youling Tang wrote:
> Hi, Qing
> 
> On 02/17/2023 10:37 AM, Qing Zhang wrote:
>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>> arch_has_single_step in  and implementing the
>> user_enable_single_step and user_disable_single_step functions.
>>
>> LongArch has no hardware single-step register. the hardware single-step
>> function multiplex fetch instruction watchpoint(FWPS) and specifies that
>> the next instruction must trigger the watch exception by setting the 
>> mask bit.
>> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not to 
>> trigger
>> the watchpoint exception, and proceed to the next instruction.
>>
>> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
>> ---
>>  arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>>  arch/loongarch/include/asm/loongarch.h |  3 ++
>>  arch/loongarch/include/asm/processor.h |  3 ++
>>  arch/loongarch/include/asm/ptrace.h    |  2 +
>>  arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>  arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>>  arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>>  7 files changed, 176 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/inst.h 
>> b/arch/loongarch/include/asm/inst.h
>> index ba18ce8fbdf2..00c6f261d9a2 100644
>> --- a/arch/loongarch/include/asm/inst.h
>> +++ b/arch/loongarch/include/asm/inst.h
>> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union 
>> loongarch_instruction *ip)
>>          is_imm12_negative(ip->reg2i12_format.immediate);
>>  }
>>
>> +static inline bool branch_ins_target_pc(union loongarch_instruction *ip)
>> +{
>> +    switch (ip->reg0i26_format.opcode) {
>> +    case b_op:
>> +    case bl_op:
>> +        if (ip->reg0i26_format.immediate_l == 0
>> +           && ip->reg0i26_format.immediate_h == 0)
>> +            return false;
>> +    }
>> +
>> +    switch (ip->reg1i21_format.opcode) {
>> +    case beqz_op:
>> +    case bnez_op:
>> +    case bceqz_op:
>> +        if (ip->reg1i21_format.immediate_l == 0
>> +           && ip->reg1i21_format.immediate_h == 0)
>> +            return false;
>> +    }
>> +
>> +    switch (ip->reg2i16_format.opcode) {
>> +    case jirl_op:
>> +        if (ip->reg2i16_format.rj == 0x1
>> +           && ip->reg2i16_format.rd == 0x1
> LOONGARCH_GPR_RA can be used instead of 0x1.
> 
>> +           && ip->reg2i16_format.immediate == 0)
>> +            return false;
>> +        break;
>> +    case beq_op:
>> +    case bne_op:
>> +    case blt_op:
>> +    case bge_op:
>> +    case bltu_op:
>> +    case bgeu_op:
>> +        if (ip->reg2i16_format.immediate == 0)
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>>  void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>>  void simu_branch(struct pt_regs *regs, union loongarch_instruction 
>> insn);
>>
>> diff --git a/arch/loongarch/include/asm/loongarch.h 
>> b/arch/loongarch/include/asm/loongarch.h
>> index e9aed583a064..65b7dcdea16d 100644
>> --- a/arch/loongarch/include/asm/loongarch.h
>> +++ b/arch/loongarch/include/asm/loongarch.h
>> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 
>> val, u32 reg)
>>  #define LOONGARCH_CSR_DERA        0x501    /* debug era */
>>  #define LOONGARCH_CSR_DESAVE        0x502    /* debug save */
>>
>> +#define CSR_FWPC_SKIP_SHIFT        16
>> +#define CSR_FWPC_SKIP            (_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
>> +
>>  /*
>>   * CSR_ECFG IM
>>   */
>> diff --git a/arch/loongarch/include/asm/processor.h 
>> b/arch/loongarch/include/asm/processor.h
>> index db060c5a976f..3ea0f1910c23 100644
>> --- a/arch/loongarch/include/asm/processor.h
>> +++ b/arch/loongarch/include/asm/processor.h
>> @@ -131,6 +131,9 @@ struct thread_struct {
>>      struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>      struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>
>> +    /* Used by ptrace single_step */
>> +    unsigned long single_step;
>> +
>>      /*
>>       * FPU & vector registers, must be at last because
>>       * they are conditionally copied at fork().
>> diff --git a/arch/loongarch/include/asm/ptrace.h 
>> b/arch/loongarch/include/asm/ptrace.h
>> index 58596c4f8a0f..66a0e6c480a3 100644
>> --- a/arch/loongarch/include/asm/ptrace.h
>> +++ b/arch/loongarch/include/asm/ptrace.h
>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct 
>> pt_regs *regs,
>>      regs->regs[3] = val;
>>  }
>>
>> +#define arch_has_single_step()        (1)
>> +
>>  #endif /* _ASM_PTRACE_H */
>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c 
>> b/arch/loongarch/kernel/hw_breakpoint.c
>> index 6431cd319c32..75d3652fbe00 100644
>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct 
>> perf_event **slots, int max_slots,
>>   */
>>  void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>  {
>> +    int i;
>> +    struct thread_struct *t = &tsk->thread;
>> +
>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>> +        if (t->hbp_break[i]) {
>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>> +            t->hbp_break[i] = NULL;
>> +        }
>> +    }
>> +
>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>> +        if (t->hbp_watch[i]) {
>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>> +            t->hbp_watch[i] = NULL;
>> +        }
>> +    }
>>  }
>>
>>  void ptrace_hw_copy_thread(struct task_struct *tsk)
>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>  void hw_breakpoint_thread_switch(struct task_struct *next)
>>  {
>>      struct pt_regs *regs = task_pt_regs(next);
>> -
>> -    /* Update breakpoints */
>> -    update_bp_registers(regs, 1, 0);
>> -    /* Update watchpoints */
>> -    update_bp_registers(regs, 1, 1);
>> +    u64 addr, mask;
>> +
>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>> +    } else {
>> +        /* Update breakpoints */
>> +        update_bp_registers(regs, 1, 0);
>> +        /* Update watchpoints */
>> +        update_bp_registers(regs, 1, 1);
>> +    }
>>  }
>>
>>  void hw_breakpoint_pmu_read(struct perf_event *bp)
>> diff --git a/arch/loongarch/kernel/ptrace.c 
>> b/arch/loongarch/kernel/ptrace.c
>> index bee4194177fd..52a3ee4366f4 100644
>> --- a/arch/loongarch/kernel/ptrace.c
>> +++ b/arch/loongarch/kernel/ptrace.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/context_tracking.h>
>>  #include <linux/elf.h>
>>  #include <linux/errno.h>
>> +#include <linux/hw_breakpoint.h>
>>  #include <linux/mm.h>
>>  #include <linux/ptrace.h>
>>  #include <linux/regset.h>
>> @@ -30,6 +31,7 @@
>>  #include <linux/stddef.h>
>>  #include <linux/seccomp.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/thread_info.h>
>>
>>  #include <asm/byteorder.h>
>>  #include <asm/cpu.h>
>> @@ -39,6 +41,7 @@
>>  #include <asm/page.h>
>>  #include <asm/pgtable.h>
>>  #include <asm/processor.h>
>> +#include <asm/ptrace.h>
>>  #include <asm/reg.h>
>>  #include <asm/syscall.h>
>>
>> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long 
>> request,
>>
>>      return ret;
>>  }
>> +
>> +void ptrace_triggered(struct perf_event *bp,
>> +              struct perf_sample_data *data, struct pt_regs *regs)
>> +{
>> +    struct perf_event_attr attr;
>> +
>> +    attr = bp->attr;
>> +    attr.disabled = true;
>> +    modify_user_hw_breakpoint(bp, &attr);
>> +}
>> +
>> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
>> +{
>> +    struct thread_struct *thread = &tsk->thread;
>> +    struct perf_event *bp;
>> +    struct perf_event_attr attr;
>> +    struct arch_hw_breakpoint *info;
>> +
>> +    bp = thread->hbp_break[0];
>> +    if (!bp) {
>> +        ptrace_breakpoint_init(&attr);
>> +
>> +        attr.bp_addr = addr;
>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>> +        attr.bp_type = HW_BREAKPOINT_X;
>> +
>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>> +                         NULL, tsk);
>> +        if (IS_ERR(bp))
>> +            return PTR_ERR(bp);
>> +
>> +        thread->hbp_break[0] = bp;
>> +    } else {
>> +        int err;
>> +
>> +        attr = bp->attr;
>> +        attr.bp_addr = addr;
>> +        /* reenable breakpoint */
>> +        attr.disabled = false;
>> +        err = modify_user_hw_breakpoint(bp, &attr);
>> +        if (unlikely(err))
>> +            return err;
>> +
>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>> +    }
>> +    info = counter_arch_bp(bp);
>> +    info->mask = 0xffffffffffff;
> If `mask` is only used for user space address mask, it should not be
> fixed to 0xffffffffffff, the user space address size may have many
> situations, which can be defined according to TASK_SIZE.
> 
> Use (TASK_SIZE - 1) instead.
> 
ok, got it.
>> +
>> +    return 0;
>> +}
>> +
>> +/* ptrace API */
>> +void user_enable_single_step(struct task_struct *task)
>> +{
>> +    struct thread_info *ti = task_thread_info(task);
>> +
>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>> +}
>> +
>> +void user_disable_single_step(struct task_struct *task)
>> +{
>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>> +}
>> diff --git a/arch/loongarch/kernel/traps.c 
>> b/arch/loongarch/kernel/traps.c
>> index 2b133079e0f3..a59275a43bd1 100644
>> --- a/arch/loongarch/kernel/traps.c
>> +++ b/arch/loongarch/kernel/traps.c
>> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs 
>> *regs)
>>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>      irqentry_state_t state = irqentry_enter(regs);
>>
>> -    breakpoint_handler(regs);
>> -    watchpoint_handler(regs);
>> -    force_sig(SIGTRAP);
>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>> +        unsigned long pc = regs->csr_era;
>> +        union loongarch_instruction *ip = (union 
>> loongarch_instruction *)pc;
>> +
>> +        if (llbit) {
>> +        /*
>> +         * When the ll-sc combo is encountered, it is regarded as an 
>> single
>> +         * instruction. So don't clear llbit and reset CSR.FWPS.Skip 
>> until
>> +         * the llsc execution is completed.
>> +         */
> Comments should be aligned with code, similar modifications in other
> places.
> 
>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +            csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
>> +        } else if (pc == current->thread.single_step) {
>> +        /*
>> +         * Maybe some hardware Bug caused that certain insns are 
>> occasionally
>> +         * not skipped when CSR.FWPS.Skip is set, such as 
>> fld.d/fst.d. So Singlestep
>> +         * needs to compare whether the csr_era is equal to the value 
>> of singlestep
>> +         * which last time set.
>> +         */
>> +            if (branch_ins_target_pc(ip))
>> +            /* Prevent rd == pc from causing single step to fail to 
>> stop */
> IMO, that comment description is not accurate enough.ok, I'll add more comments.

Thanks,
-Qing
> Youling.
> 
>> +                csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +        } else {
>> +            force_sig(SIGTRAP);
>> +        }
>> +    } else {
>> +        breakpoint_handler(regs);
>> +        watchpoint_handler(regs);
>> +        force_sig(SIGTRAP);
>> +    }
>>
>>      irqentry_exit(regs, state);
>>  #endif
>>
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17  2:37 ` [PATCH v4 2/3] LoongArch: Add ptrace single step support Qing Zhang
  2023-02-17  9:50   ` Youling Tang
@ 2023-02-17 10:16   ` WANG Xuerui
  2023-02-18  6:31     ` Qing Zhang
  1 sibling, 1 reply; 10+ messages in thread
From: WANG Xuerui @ 2023-02-17 10:16 UTC (permalink / raw)
  To: Qing Zhang, Huacai Chen, Oleg Nesterov
  Cc: Jiaxun Yang, loongarch, linux-kernel

On 2023/2/17 10:37, Qing Zhang wrote:
> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
> arch_has_single_step in  and implementing the
> user_enable_single_step and user_disable_single_step functions.
> 
> LongArch has no hardware single-step register. the hardware single-step

"LoongArch cannot do hardware single-stepping per se" could be more 
accurate a description. ("Hardware single-step register" could be 
ambiguous in implying the only way to achieve single-stepping would be 
via software, but in fact you explained in the next sentence that it's 
actually doable, just in an alternate way.)

> function multiplex fetch instruction watchpoint(FWPS) and specifies that
> the next instruction must trigger the watch exception by setting the mask bit.

I assume the "multiplex" is translated from "复用", in which case it's not 
used in the exact same way as in Chinese. Plus you have the subject 
wrong, "hardware single-step function" is the topic but not the actor 
doing the "multiplexing".

My take (continuing from the previous sentence I suggested):

"Instead it is achieved by configuring the instruction fetch watchpoints 
so that WPEF is triggered on the next instruction."

> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not to trigger
> the watchpoint exception, and proceed to the next instruction.

Again, wrong topicalization ("scenarios" is not the actor/subject). "In 
some scenarios CSR.FWPS.Skip is used to ..." sounds better.

And the multiple "next" usages sound unclear, you could probably reword 
this a bit. (I can't immediately understand this part so I'm unable to 
provide a suggestion.)

> 
> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
> ---
>   arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>   arch/loongarch/include/asm/loongarch.h |  3 ++
>   arch/loongarch/include/asm/processor.h |  3 ++
>   arch/loongarch/include/asm/ptrace.h    |  2 +
>   arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>   arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>   arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>   7 files changed, 176 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index ba18ce8fbdf2..00c6f261d9a2 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union loongarch_instruction *ip)
>   		is_imm12_negative(ip->reg2i12_format.immediate);
>   }
>   
> +static inline bool branch_ins_target_pc(union loongarch_instruction *ip)

This name is not faithfully reflecting its purpose, which is to check if 
the given instruction will change PC; and the reason this has to be done 
in the first place being that, single stepping would never stop due to 
current hardware erratum if the PC would stay the same after the insn's 
execution.

You may want to add some comments regarding the erratum too if you're in 
a position to disclose more details.

> +{
> +	switch (ip->reg0i26_format.opcode) {
> +	case b_op:
> +	case bl_op:
> +		if (ip->reg0i26_format.immediate_l == 0
> +		   && ip->reg0i26_format.immediate_h == 0)
> +			return false;
> +	}
> +
> +	switch (ip->reg1i21_format.opcode) {
> +	case beqz_op:
> +	case bnez_op:
> +	case bceqz_op:
> +		if (ip->reg1i21_format.immediate_l == 0
> +		   && ip->reg1i21_format.immediate_h == 0)
> +			return false;
> +	}
> +
> +	switch (ip->reg2i16_format.opcode) {
> +	case jirl_op:
> +		if (ip->reg2i16_format.rj == 0x1
> +		   && ip->reg2i16_format.rd == 0x1
Given the appearance of the other cases, I feel like the triggering 
condition of the nonterminating singlestepping is that "new PC == 
current PC" always hold. So, isn't every `rd == rj && imm == 0` case 
problematic?

> +		   && ip->reg2i16_format.immediate == 0)
> +			return false;
> +		break;
> +	case beq_op:
> +	case bne_op:
> +	case blt_op:
> +	case bge_op:
> +	case bltu_op:
> +	case bgeu_op:
> +		if (ip->reg2i16_format.immediate == 0)
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
>   void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>   void simu_branch(struct pt_regs *regs, union loongarch_instruction insn);
>   
> diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h
> index e9aed583a064..65b7dcdea16d 100644
> --- a/arch/loongarch/include/asm/loongarch.h
> +++ b/arch/loongarch/include/asm/loongarch.h
> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 val, u32 reg)
>   #define LOONGARCH_CSR_DERA		0x501	/* debug era */
>   #define LOONGARCH_CSR_DESAVE		0x502	/* debug save */
>   
> +#define CSR_FWPC_SKIP_SHIFT		16
> +#define CSR_FWPC_SKIP			(_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
> +
>   /*
>    * CSR_ECFG IM
>    */
> diff --git a/arch/loongarch/include/asm/processor.h b/arch/loongarch/include/asm/processor.h
> index db060c5a976f..3ea0f1910c23 100644
> --- a/arch/loongarch/include/asm/processor.h
> +++ b/arch/loongarch/include/asm/processor.h
> @@ -131,6 +131,9 @@ struct thread_struct {
>   	struct perf_event	*hbp_break[LOONGARCH_MAX_BRP];
>   	struct perf_event	*hbp_watch[LOONGARCH_MAX_WRP];
>   
> +	/* Used by ptrace single_step */

"PTRACE_SINGLESTEP" if you want to refer to the technical name of the 
feature, or "ptrace single stepping" which is more natural language.

> +	unsigned long single_step;
> +
>   	/*
>   	 * FPU & vector registers, must be at last because
>   	 * they are conditionally copied at fork().
> diff --git a/arch/loongarch/include/asm/ptrace.h b/arch/loongarch/include/asm/ptrace.h
> index 58596c4f8a0f..66a0e6c480a3 100644
> --- a/arch/loongarch/include/asm/ptrace.h
> +++ b/arch/loongarch/include/asm/ptrace.h
> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct pt_regs *regs,
>   	regs->regs[3] = val;
>   }
>   
> +#define arch_has_single_step()		(1)
> +
>   #endif /* _ASM_PTRACE_H */
> diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
> index 6431cd319c32..75d3652fbe00 100644
> --- a/arch/loongarch/kernel/hw_breakpoint.c
> +++ b/arch/loongarch/kernel/hw_breakpoint.c
> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct perf_event **slots, int max_slots,
>    */
>   void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>   {
> +	int i;
> +	struct thread_struct *t = &tsk->thread;
> +
> +	for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
> +		if (t->hbp_break[i]) {
> +			unregister_hw_breakpoint(t->hbp_break[i]);
> +			t->hbp_break[i] = NULL;
> +		}
> +	}
> +
> +	for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
> +		if (t->hbp_watch[i]) {
> +			unregister_hw_breakpoint(t->hbp_watch[i]);
> +			t->hbp_watch[i] = NULL;
> +		}
> +	}
>   }
>   
>   void ptrace_hw_copy_thread(struct task_struct *tsk)
> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>   void hw_breakpoint_thread_switch(struct task_struct *next)
>   {
>   	struct pt_regs *regs = task_pt_regs(next);
> -
> -	/* Update breakpoints */
> -	update_bp_registers(regs, 1, 0);
> -	/* Update watchpoints */
> -	update_bp_registers(regs, 1, 1);
> +	u64 addr, mask;
> +
> +	if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
> +		addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
> +		mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
> +		if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
> +			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +		regs->csr_prmd |= CSR_PRMD_PWE;
> +	} else {
> +		/* Update breakpoints */
> +		update_bp_registers(regs, 1, 0);
> +		/* Update watchpoints */
> +		update_bp_registers(regs, 1, 1);
> +	}
>   }
>   
>   void hw_breakpoint_pmu_read(struct perf_event *bp)
> diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
> index bee4194177fd..52a3ee4366f4 100644
> --- a/arch/loongarch/kernel/ptrace.c
> +++ b/arch/loongarch/kernel/ptrace.c
> @@ -20,6 +20,7 @@
>   #include <linux/context_tracking.h>
>   #include <linux/elf.h>
>   #include <linux/errno.h>
> +#include <linux/hw_breakpoint.h>
>   #include <linux/mm.h>
>   #include <linux/ptrace.h>
>   #include <linux/regset.h>
> @@ -30,6 +31,7 @@
>   #include <linux/stddef.h>
>   #include <linux/seccomp.h>
>   #include <linux/uaccess.h>
> +#include <linux/thread_info.h>
>   
>   #include <asm/byteorder.h>
>   #include <asm/cpu.h>
> @@ -39,6 +41,7 @@
>   #include <asm/page.h>
>   #include <asm/pgtable.h>
>   #include <asm/processor.h>
> +#include <asm/ptrace.h>
>   #include <asm/reg.h>
>   #include <asm/syscall.h>
>   
> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long request,
>   
>   	return ret;
>   }
> +
> +void ptrace_triggered(struct perf_event *bp,
> +		      struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	struct perf_event_attr attr;
> +
> +	attr = bp->attr;
> +	attr.disabled = true;
> +	modify_user_hw_breakpoint(bp, &attr);
> +}
> +
> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
> +{
> +	struct thread_struct *thread = &tsk->thread;
> +	struct perf_event *bp;
> +	struct perf_event_attr attr;
> +	struct arch_hw_breakpoint *info;
> +
> +	bp = thread->hbp_break[0];
> +	if (!bp) {
> +		ptrace_breakpoint_init(&attr);
> +
> +		attr.bp_addr = addr;
> +		attr.bp_len = HW_BREAKPOINT_LEN_8;
> +		attr.bp_type = HW_BREAKPOINT_X;
> +
> +		bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
> +						 NULL, tsk);
> +		if (IS_ERR(bp))
> +			return PTR_ERR(bp);
> +
> +		thread->hbp_break[0] = bp;
> +	} else {
> +		int err;
> +
> +		attr = bp->attr;
> +		attr.bp_addr = addr;
> +		/* reenable breakpoint */
> +		attr.disabled = false;
> +		err = modify_user_hw_breakpoint(bp, &attr);
> +		if (unlikely(err))
> +			return err;
> +
> +		csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
> +	}
> +	info = counter_arch_bp(bp);
> +	info->mask = 0xffffffffffff;
> +
> +	return 0;
> +}
> +
> +/* ptrace API */
> +void user_enable_single_step(struct task_struct *task)
> +{
> +	struct thread_info *ti = task_thread_info(task);
> +
> +	set_single_step(task, task_pt_regs(task)->csr_era);
> +	task->thread.single_step = task_pt_regs(task)->csr_era;
> +	set_ti_thread_flag(ti, TIF_SINGLESTEP);
> +}
> +
> +void user_disable_single_step(struct task_struct *task)
> +{
> +	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c
> index 2b133079e0f3..a59275a43bd1 100644
> --- a/arch/loongarch/kernel/traps.c
> +++ b/arch/loongarch/kernel/traps.c
> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs *regs)
>   #ifdef CONFIG_HAVE_HW_BREAKPOINT
>   	irqentry_state_t state = irqentry_enter(regs);
>   
> -	breakpoint_handler(regs);
> -	watchpoint_handler(regs);
> -	force_sig(SIGTRAP);
> +	if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
> +		int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
> +		unsigned long pc = regs->csr_era;
> +		union loongarch_instruction *ip = (union loongarch_instruction *)pc;
> +
> +		if (llbit) {
> +		/*
> +		 * When the ll-sc combo is encountered, it is regarded as an single
> +		 * instruction. So don't clear llbit and reset CSR.FWPS.Skip until
> +		 * the llsc execution is completed.
> +		 */
> +			csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +			csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
> +		} else if (pc == current->thread.single_step) {
> +		/*
> +		 * Maybe some hardware Bug caused that certain insns are occasionally
> +		 * not skipped when CSR.FWPS.Skip is set, such as fld.d/fst.d. So Singlestep
> +		 * needs to compare whether the csr_era is equal to the value of singlestep
> +		 * which last time set.
> +		 */
> +			if (branch_ins_target_pc(ip))
> +			/* Prevent rd == pc from causing single step to fail to stop */
> +				csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
> +		} else {
> +			force_sig(SIGTRAP);
> +		}
> +	} else {
> +		breakpoint_handler(regs);
> +		watchpoint_handler(regs);
> +		force_sig(SIGTRAP);
> +	}
>   
>   	irqentry_exit(regs, state);
>   #endif

-- 
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17 10:00     ` Qing Zhang
@ 2023-02-18  1:21       ` Jinyang He
  2023-02-18  6:38         ` Qing Zhang
  0 siblings, 1 reply; 10+ messages in thread
From: Jinyang He @ 2023-02-18  1:21 UTC (permalink / raw)
  To: Qing Zhang, Youling Tang, WANG Xuerui
  Cc: Huacai Chen, Oleg Nesterov, Jiaxun Yang, loongarch, linux-kernel


On 2023-02-17 18:00, Qing Zhang wrote:
> Hi, Youling
>
> On 2023/2/17 下午5:50, Youling Tang wrote:
>> Hi, Qing
>>
>> On 02/17/2023 10:37 AM, Qing Zhang wrote:
>>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>>> arch_has_single_step in  and implementing the
>>> user_enable_single_step and user_disable_single_step functions.
>>>
>>> LongArch has no hardware single-step register. the hardware single-step
>>> function multiplex fetch instruction watchpoint(FWPS) and specifies 
>>> that
>>> the next instruction must trigger the watch exception by setting the 
>>> mask bit.
>>> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not 
>>> to trigger
>>> the watchpoint exception, and proceed to the next instruction.
>>>
>>> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
>>> ---
>>>  arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>>>  arch/loongarch/include/asm/loongarch.h |  3 ++
>>>  arch/loongarch/include/asm/processor.h |  3 ++
>>>  arch/loongarch/include/asm/ptrace.h    |  2 +
>>>  arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>>  arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>>>  arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>>>  7 files changed, 176 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/loongarch/include/asm/inst.h 
>>> b/arch/loongarch/include/asm/inst.h
>>> index ba18ce8fbdf2..00c6f261d9a2 100644
>>> --- a/arch/loongarch/include/asm/inst.h
>>> +++ b/arch/loongarch/include/asm/inst.h
>>> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union 
>>> loongarch_instruction *ip)
>>>          is_imm12_negative(ip->reg2i12_format.immediate);
>>>  }
>>>
>>> +static inline bool branch_ins_target_pc(union loongarch_instruction 
>>> *ip)
>>> +{
>>> +    switch (ip->reg0i26_format.opcode) {
>>> +    case b_op:
>>> +    case bl_op:
>>> +        if (ip->reg0i26_format.immediate_l == 0
>>> +           && ip->reg0i26_format.immediate_h == 0)
>>> +            return false;
>>> +    }
>>> +
>>> +    switch (ip->reg1i21_format.opcode) {
>>> +    case beqz_op:
>>> +    case bnez_op:
>>> +    case bceqz_op:
>>> +        if (ip->reg1i21_format.immediate_l == 0
>>> +           && ip->reg1i21_format.immediate_h == 0)
>>> +            return false;
>>> +    }
>>> +
>>> +    switch (ip->reg2i16_format.opcode) {
>>> +    case jirl_op:
>>> +        if (ip->reg2i16_format.rj == 0x1
>>> +           && ip->reg2i16_format.rd == 0x1
>> LOONGARCH_GPR_RA can be used instead of 0x1.
>>
>>> +           && ip->reg2i16_format.immediate == 0)
>>> +            return false;
>>> +        break;
>>> +    case beq_op:
>>> +    case bne_op:
>>> +    case blt_op:
>>> +    case bge_op:
>>> +    case bltu_op:
>>> +    case bgeu_op:
>>> +        if (ip->reg2i16_format.immediate == 0)
>>> +            return false;
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>> +
>>>  void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>>>  void simu_branch(struct pt_regs *regs, union loongarch_instruction 
>>> insn);
>>>
>>> diff --git a/arch/loongarch/include/asm/loongarch.h 
>>> b/arch/loongarch/include/asm/loongarch.h
>>> index e9aed583a064..65b7dcdea16d 100644
>>> --- a/arch/loongarch/include/asm/loongarch.h
>>> +++ b/arch/loongarch/include/asm/loongarch.h
>>> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 
>>> val, u32 reg)
>>>  #define LOONGARCH_CSR_DERA        0x501    /* debug era */
>>>  #define LOONGARCH_CSR_DESAVE        0x502    /* debug save */
>>>
>>> +#define CSR_FWPC_SKIP_SHIFT        16
>>> +#define CSR_FWPC_SKIP            (_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
>>> +
>>>  /*
>>>   * CSR_ECFG IM
>>>   */
>>> diff --git a/arch/loongarch/include/asm/processor.h 
>>> b/arch/loongarch/include/asm/processor.h
>>> index db060c5a976f..3ea0f1910c23 100644
>>> --- a/arch/loongarch/include/asm/processor.h
>>> +++ b/arch/loongarch/include/asm/processor.h
>>> @@ -131,6 +131,9 @@ struct thread_struct {
>>>      struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>>      struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>>
>>> +    /* Used by ptrace single_step */
>>> +    unsigned long single_step;
>>> +
>>>      /*
>>>       * FPU & vector registers, must be at last because
>>>       * they are conditionally copied at fork().
>>> diff --git a/arch/loongarch/include/asm/ptrace.h 
>>> b/arch/loongarch/include/asm/ptrace.h
>>> index 58596c4f8a0f..66a0e6c480a3 100644
>>> --- a/arch/loongarch/include/asm/ptrace.h
>>> +++ b/arch/loongarch/include/asm/ptrace.h
>>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct 
>>> pt_regs *regs,
>>>      regs->regs[3] = val;
>>>  }
>>>
>>> +#define arch_has_single_step()        (1)
>>> +
>>>  #endif /* _ASM_PTRACE_H */
>>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c 
>>> b/arch/loongarch/kernel/hw_breakpoint.c
>>> index 6431cd319c32..75d3652fbe00 100644
>>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct 
>>> perf_event **slots, int max_slots,
>>>   */
>>>  void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>>  {
>>> +    int i;
>>> +    struct thread_struct *t = &tsk->thread;
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>>> +        if (t->hbp_break[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>>> +            t->hbp_break[i] = NULL;
>>> +        }
>>> +    }
>>> +
>>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>>> +        if (t->hbp_watch[i]) {
>>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>>> +            t->hbp_watch[i] = NULL;
>>> +        }
>>> +    }
>>>  }
>>>
>>>  void ptrace_hw_copy_thread(struct task_struct *tsk)
>>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>>  void hw_breakpoint_thread_switch(struct task_struct *next)
>>>  {
>>>      struct pt_regs *regs = task_pt_regs(next);
>>> -
>>> -    /* Update breakpoints */
>>> -    update_bp_registers(regs, 1, 0);
>>> -    /* Update watchpoints */
>>> -    update_bp_registers(regs, 1, 1);
>>> +    u64 addr, mask;
>>> +
>>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>>> +    } else {
>>> +        /* Update breakpoints */
>>> +        update_bp_registers(regs, 1, 0);
>>> +        /* Update watchpoints */
>>> +        update_bp_registers(regs, 1, 1);
>>> +    }
>>>  }
>>>
>>>  void hw_breakpoint_pmu_read(struct perf_event *bp)
>>> diff --git a/arch/loongarch/kernel/ptrace.c 
>>> b/arch/loongarch/kernel/ptrace.c
>>> index bee4194177fd..52a3ee4366f4 100644
>>> --- a/arch/loongarch/kernel/ptrace.c
>>> +++ b/arch/loongarch/kernel/ptrace.c
>>> @@ -20,6 +20,7 @@
>>>  #include <linux/context_tracking.h>
>>>  #include <linux/elf.h>
>>>  #include <linux/errno.h>
>>> +#include <linux/hw_breakpoint.h>
>>>  #include <linux/mm.h>
>>>  #include <linux/ptrace.h>
>>>  #include <linux/regset.h>
>>> @@ -30,6 +31,7 @@
>>>  #include <linux/stddef.h>
>>>  #include <linux/seccomp.h>
>>>  #include <linux/uaccess.h>
>>> +#include <linux/thread_info.h>
>>>
>>>  #include <asm/byteorder.h>
>>>  #include <asm/cpu.h>
>>> @@ -39,6 +41,7 @@
>>>  #include <asm/page.h>
>>>  #include <asm/pgtable.h>
>>>  #include <asm/processor.h>
>>> +#include <asm/ptrace.h>
>>>  #include <asm/reg.h>
>>>  #include <asm/syscall.h>
>>>
>>> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, 
>>> long request,
>>>
>>>      return ret;
>>>  }
>>> +
>>> +void ptrace_triggered(struct perf_event *bp,
>>> +              struct perf_sample_data *data, struct pt_regs *regs)
>>> +{
>>> +    struct perf_event_attr attr;
>>> +
>>> +    attr = bp->attr;
>>> +    attr.disabled = true;
>>> +    modify_user_hw_breakpoint(bp, &attr);
>>> +}
>>> +
>>> +static int set_single_step(struct task_struct *tsk, unsigned long 
>>> addr)
>>> +{
>>> +    struct thread_struct *thread = &tsk->thread;
>>> +    struct perf_event *bp;
>>> +    struct perf_event_attr attr;
>>> +    struct arch_hw_breakpoint *info;
>>> +
>>> +    bp = thread->hbp_break[0];
>>> +    if (!bp) {
>>> +        ptrace_breakpoint_init(&attr);
>>> +
>>> +        attr.bp_addr = addr;
>>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>>> +        attr.bp_type = HW_BREAKPOINT_X;
>>> +
>>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>>> +                         NULL, tsk);
>>> +        if (IS_ERR(bp))
>>> +            return PTR_ERR(bp);
>>> +
>>> +        thread->hbp_break[0] = bp;
>>> +    } else {
>>> +        int err;
>>> +
>>> +        attr = bp->attr;
>>> +        attr.bp_addr = addr;
>>> +        /* reenable breakpoint */
>>> +        attr.disabled = false;
>>> +        err = modify_user_hw_breakpoint(bp, &attr);
>>> +        if (unlikely(err))
>>> +            return err;
>>> +
>>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>>> +    }
>>> +    info = counter_arch_bp(bp);
>>> +    info->mask = 0xffffffffffff;
>> If `mask` is only used for user space address mask, it should not be
>> fixed to 0xffffffffffff, the user space address size may have many
>> situations, which can be defined according to TASK_SIZE.
>>
>> Use (TASK_SIZE - 1) instead.
>>
> ok, got it.

But according to to [1], the priority of watchpoint exception is highest
in the priority of exception which all fetching stage. Thus, we may get
the exception unexpected if set MASK to (TASK_SIZE - 1).

e.g.
     0x1000: addi.d $rd, $zero, -1 # a way to set a kernel address in rd
     0x1004: jr $rd

If singlestep is set at 0x1004, it will fetch insn in kernel address and
and get operation address error exception if we set MASK to (TASK_SIZE - 1).


[1] 
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#exception-priority


Thanks,

Jinyang


>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/* ptrace API */
>>> +void user_enable_single_step(struct task_struct *task)
>>> +{
>>> +    struct thread_info *ti = task_thread_info(task);
>>> +
>>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>>> +}
>>> +
>>> +void user_disable_single_step(struct task_struct *task)
>>> +{
>>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>>> +}
>>> diff --git a/arch/loongarch/kernel/traps.c 
>>> b/arch/loongarch/kernel/traps.c
>>> index 2b133079e0f3..a59275a43bd1 100644
>>> --- a/arch/loongarch/kernel/traps.c
>>> +++ b/arch/loongarch/kernel/traps.c
>>> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs 
>>> *regs)
>>>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>>      irqentry_state_t state = irqentry_enter(regs);
>>>
>>> -    breakpoint_handler(regs);
>>> -    watchpoint_handler(regs);
>>> -    force_sig(SIGTRAP);
>>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>>> +        unsigned long pc = regs->csr_era;
>>> +        union loongarch_instruction *ip = (union 
>>> loongarch_instruction *)pc;
>>> +
>>> +        if (llbit) {
>>> +        /*
>>> +         * When the ll-sc combo is encountered, it is regarded as 
>>> an single
>>> +         * instruction. So don't clear llbit and reset 
>>> CSR.FWPS.Skip until
>>> +         * the llsc execution is completed.
>>> +         */
>> Comments should be aligned with code, similar modifications in other
>> places.
>>
>>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>> +            csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
>>> +        } else if (pc == current->thread.single_step) {
>>> +        /*
>>> +         * Maybe some hardware Bug caused that certain insns are 
>>> occasionally
>>> +         * not skipped when CSR.FWPS.Skip is set, such as 
>>> fld.d/fst.d. So Singlestep
>>> +         * needs to compare whether the csr_era is equal to the 
>>> value of singlestep
>>> +         * which last time set.
>>> +         */
>>> +            if (branch_ins_target_pc(ip))
>>> +            /* Prevent rd == pc from causing single step to fail to 
>>> stop */
>> IMO, that comment description is not accurate enough.ok, I'll add 
>> more comments.
>
> Thanks,
> -Qing
>> Youling.
>>
>>> + csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>> +        } else {
>>> +            force_sig(SIGTRAP);
>>> +        }
>>> +    } else {
>>> +        breakpoint_handler(regs);
>>> +        watchpoint_handler(regs);
>>> +        force_sig(SIGTRAP);
>>> +    }
>>>
>>>      irqentry_exit(regs, state);
>>>  #endif
>>>
>>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-17 10:16   ` WANG Xuerui
@ 2023-02-18  6:31     ` Qing Zhang
  0 siblings, 0 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-18  6:31 UTC (permalink / raw)
  To: WANG Xuerui, Huacai Chen, Oleg Nesterov
  Cc: Jiaxun Yang, loongarch, linux-kernel

Hi, Xuerui

On 2023/2/17 下午6:16, WANG Xuerui wrote:
> On 2023/2/17 10:37, Qing Zhang wrote:
>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>> arch_has_single_step in  and implementing the
>> user_enable_single_step and user_disable_single_step functions.
>>
>> LongArch has no hardware single-step register. the hardware single-step
> 
> "LoongArch cannot do hardware single-stepping per se" could be more 
> accurate a description. ("Hardware single-step register" could be 
> ambiguous in implying the only way to achieve single-stepping would be 
> via software, but in fact you explained in the next sentence that it's 
> actually doable, just in an alternate way.)
> 
>> function multiplex fetch instruction watchpoint(FWPS) and specifies that
>> the next instruction must trigger the watch exception by setting the 
>> mask bit.
> 
> I assume the "multiplex" is translated from "复用", in which case it's 
> not used in the exact same way as in Chinese. Plus you have the subject 
> wrong, "hardware single-step function" is the topic but not the actor 
> doing the "multiplexing".
> 
> My take (continuing from the previous sentence I suggested):
> 
> "Instead it is achieved by configuring the instruction fetch watchpoints 
> so that WPEF is triggered on the next instruction."
> 
>> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not to 
>> trigger
>> the watchpoint exception, and proceed to the next instruction.
> 
> Again, wrong topicalization ("scenarios" is not the actor/subject). "In 
> some scenarios CSR.FWPS.Skip is used to ..." sounds better.
> 
> And the multiple "next" usages sound unclear, you could probably reword 
> this a bit. (I can't immediately understand this part so I'm unable to 
> provide a suggestion.)
> 
>>
>> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
>> ---
>>   arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>>   arch/loongarch/include/asm/loongarch.h |  3 ++
>>   arch/loongarch/include/asm/processor.h |  3 ++
>>   arch/loongarch/include/asm/ptrace.h    |  2 +
>>   arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>   arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>>   arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>>   7 files changed, 176 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/inst.h 
>> b/arch/loongarch/include/asm/inst.h
>> index ba18ce8fbdf2..00c6f261d9a2 100644
>> --- a/arch/loongarch/include/asm/inst.h
>> +++ b/arch/loongarch/include/asm/inst.h
>> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union 
>> loongarch_instruction *ip)
>>           is_imm12_negative(ip->reg2i12_format.immediate);
>>   }
>> +static inline bool branch_ins_target_pc(union loongarch_instruction *ip)
> 
> This name is not faithfully reflecting its purpose, which is to check if 
> the given instruction will change PC; and the reason this has to be done 
> in the first place being that, single stepping would never stop due to 
> current hardware erratum if the PC would stay the same after the insn's 
> execution.
> 
> You may want to add some comments regarding the erratum too if you're in 
> a position to disclose more details.
> >> +{
>> +    switch (ip->reg0i26_format.opcode) {
>> +    case b_op:
>> +    case bl_op:
>> +        if (ip->reg0i26_format.immediate_l == 0
>> +           && ip->reg0i26_format.immediate_h == 0)
>> +            return false;
>> +    }
>> +
>> +    switch (ip->reg1i21_format.opcode) {
>> +    case beqz_op:
>> +    case bnez_op:
>> +    case bceqz_op:
>> +        if (ip->reg1i21_format.immediate_l == 0
>> +           && ip->reg1i21_format.immediate_h == 0)
>> +            return false;
>> +    }
>> +
>> +    switch (ip->reg2i16_format.opcode) {
>> +    case jirl_op:
>> +        if (ip->reg2i16_format.rj == 0x1
>> +           && ip->reg2i16_format.rd == 0x1
> Given the appearance of the other cases, I feel like the triggering 
> condition of the nonterminating singlestepping is that "new PC == 
> current PC" always hold. So, isn't every `rd == rj && imm == 0` case 
> problematic?
> 
We just have to think about dest_pc == current_pc,
Not every case where 'rd == rj &&imm == 0' is a problem.

current_pc = csr_era (The user mode of the Exception pc)
dest_pc = rj + offset

eg:
jirl ra,ra 0
1. When rj == ra, current_pc = dest_pc.
jirl t0,t0 0
2. When rj is another register, if the value in it is equal to ra, 
current_pc = dest_pc. The value of rj is recovered by pt_reg.

Therefore, rd does not need comparison, I modified jirl's judgment and
others in v5.

Thanks,
-Qing
>> +           && ip->reg2i16_format.immediate == 0)
>> +            return false;
>> +        break;
>> +    case beq_op:
>> +    case bne_op:
>> +    case blt_op:
>> +    case bge_op:
>> +    case bltu_op:
>> +    case bgeu_op:
>> +        if (ip->reg2i16_format.immediate == 0)
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +}
>> +
>>   void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>>   void simu_branch(struct pt_regs *regs, union loongarch_instruction 
>> insn);
>> diff --git a/arch/loongarch/include/asm/loongarch.h 
>> b/arch/loongarch/include/asm/loongarch.h
>> index e9aed583a064..65b7dcdea16d 100644
>> --- a/arch/loongarch/include/asm/loongarch.h
>> +++ b/arch/loongarch/include/asm/loongarch.h
>> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 
>> val, u32 reg)
>>   #define LOONGARCH_CSR_DERA        0x501    /* debug era */
>>   #define LOONGARCH_CSR_DESAVE        0x502    /* debug save */
>> +#define CSR_FWPC_SKIP_SHIFT        16
>> +#define CSR_FWPC_SKIP            (_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
>> +
>>   /*
>>    * CSR_ECFG IM
>>    */
>> diff --git a/arch/loongarch/include/asm/processor.h 
>> b/arch/loongarch/include/asm/processor.h
>> index db060c5a976f..3ea0f1910c23 100644
>> --- a/arch/loongarch/include/asm/processor.h
>> +++ b/arch/loongarch/include/asm/processor.h
>> @@ -131,6 +131,9 @@ struct thread_struct {
>>       struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>       struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>> +    /* Used by ptrace single_step */
> 
> "PTRACE_SINGLESTEP" if you want to refer to the technical name of the 
> feature, or "ptrace single stepping" which is more natural language.
> 
>> +    unsigned long single_step;
>> +
>>       /*
>>        * FPU & vector registers, must be at last because
>>        * they are conditionally copied at fork().
>> diff --git a/arch/loongarch/include/asm/ptrace.h 
>> b/arch/loongarch/include/asm/ptrace.h
>> index 58596c4f8a0f..66a0e6c480a3 100644
>> --- a/arch/loongarch/include/asm/ptrace.h
>> +++ b/arch/loongarch/include/asm/ptrace.h
>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct 
>> pt_regs *regs,
>>       regs->regs[3] = val;
>>   }
>> +#define arch_has_single_step()        (1)
>> +
>>   #endif /* _ASM_PTRACE_H */
>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c 
>> b/arch/loongarch/kernel/hw_breakpoint.c
>> index 6431cd319c32..75d3652fbe00 100644
>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct 
>> perf_event **slots, int max_slots,
>>    */
>>   void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>   {
>> +    int i;
>> +    struct thread_struct *t = &tsk->thread;
>> +
>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>> +        if (t->hbp_break[i]) {
>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>> +            t->hbp_break[i] = NULL;
>> +        }
>> +    }
>> +
>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>> +        if (t->hbp_watch[i]) {
>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>> +            t->hbp_watch[i] = NULL;
>> +        }
>> +    }
>>   }
>>   void ptrace_hw_copy_thread(struct task_struct *tsk)
>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>   void hw_breakpoint_thread_switch(struct task_struct *next)
>>   {
>>       struct pt_regs *regs = task_pt_regs(next);
>> -
>> -    /* Update breakpoints */
>> -    update_bp_registers(regs, 1, 0);
>> -    /* Update watchpoints */
>> -    update_bp_registers(regs, 1, 1);
>> +    u64 addr, mask;
>> +
>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>> +    } else {
>> +        /* Update breakpoints */
>> +        update_bp_registers(regs, 1, 0);
>> +        /* Update watchpoints */
>> +        update_bp_registers(regs, 1, 1);
>> +    }
>>   }
>>   void hw_breakpoint_pmu_read(struct perf_event *bp)
>> diff --git a/arch/loongarch/kernel/ptrace.c 
>> b/arch/loongarch/kernel/ptrace.c
>> index bee4194177fd..52a3ee4366f4 100644
>> --- a/arch/loongarch/kernel/ptrace.c
>> +++ b/arch/loongarch/kernel/ptrace.c
>> @@ -20,6 +20,7 @@
>>   #include <linux/context_tracking.h>
>>   #include <linux/elf.h>
>>   #include <linux/errno.h>
>> +#include <linux/hw_breakpoint.h>
>>   #include <linux/mm.h>
>>   #include <linux/ptrace.h>
>>   #include <linux/regset.h>
>> @@ -30,6 +31,7 @@
>>   #include <linux/stddef.h>
>>   #include <linux/seccomp.h>
>>   #include <linux/uaccess.h>
>> +#include <linux/thread_info.h>
>>   #include <asm/byteorder.h>
>>   #include <asm/cpu.h>
>> @@ -39,6 +41,7 @@
>>   #include <asm/page.h>
>>   #include <asm/pgtable.h>
>>   #include <asm/processor.h>
>> +#include <asm/ptrace.h>
>>   #include <asm/reg.h>
>>   #include <asm/syscall.h>
>> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, long 
>> request,
>>       return ret;
>>   }
>> +
>> +void ptrace_triggered(struct perf_event *bp,
>> +              struct perf_sample_data *data, struct pt_regs *regs)
>> +{
>> +    struct perf_event_attr attr;
>> +
>> +    attr = bp->attr;
>> +    attr.disabled = true;
>> +    modify_user_hw_breakpoint(bp, &attr);
>> +}
>> +
>> +static int set_single_step(struct task_struct *tsk, unsigned long addr)
>> +{
>> +    struct thread_struct *thread = &tsk->thread;
>> +    struct perf_event *bp;
>> +    struct perf_event_attr attr;
>> +    struct arch_hw_breakpoint *info;
>> +
>> +    bp = thread->hbp_break[0];
>> +    if (!bp) {
>> +        ptrace_breakpoint_init(&attr);
>> +
>> +        attr.bp_addr = addr;
>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>> +        attr.bp_type = HW_BREAKPOINT_X;
>> +
>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>> +                         NULL, tsk);
>> +        if (IS_ERR(bp))
>> +            return PTR_ERR(bp);
>> +
>> +        thread->hbp_break[0] = bp;
>> +    } else {
>> +        int err;
>> +
>> +        attr = bp->attr;
>> +        attr.bp_addr = addr;
>> +        /* reenable breakpoint */
>> +        attr.disabled = false;
>> +        err = modify_user_hw_breakpoint(bp, &attr);
>> +        if (unlikely(err))
>> +            return err;
>> +
>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>> +    }
>> +    info = counter_arch_bp(bp);
>> +    info->mask = 0xffffffffffff;
>> +
>> +    return 0;
>> +}
>> +
>> +/* ptrace API */
>> +void user_enable_single_step(struct task_struct *task)
>> +{
>> +    struct thread_info *ti = task_thread_info(task);
>> +
>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>> +}
>> +
>> +void user_disable_single_step(struct task_struct *task)
>> +{
>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>> +}
>> diff --git a/arch/loongarch/kernel/traps.c 
>> b/arch/loongarch/kernel/traps.c
>> index 2b133079e0f3..a59275a43bd1 100644
>> --- a/arch/loongarch/kernel/traps.c
>> +++ b/arch/loongarch/kernel/traps.c
>> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs 
>> *regs)
>>   #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>       irqentry_state_t state = irqentry_enter(regs);
>> -    breakpoint_handler(regs);
>> -    watchpoint_handler(regs);
>> -    force_sig(SIGTRAP);
>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>> +        unsigned long pc = regs->csr_era;
>> +        union loongarch_instruction *ip = (union 
>> loongarch_instruction *)pc;
>> +
>> +        if (llbit) {
>> +        /*
>> +         * When the ll-sc combo is encountered, it is regarded as an 
>> single
>> +         * instruction. So don't clear llbit and reset CSR.FWPS.Skip 
>> until
>> +         * the llsc execution is completed.
>> +         */
>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +            csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
>> +        } else if (pc == current->thread.single_step) {
>> +        /*
>> +         * Maybe some hardware Bug caused that certain insns are 
>> occasionally
>> +         * not skipped when CSR.FWPS.Skip is set, such as 
>> fld.d/fst.d. So Singlestep
>> +         * needs to compare whether the csr_era is equal to the value 
>> of singlestep
>> +         * which last time set.
>> +         */
>> +            if (branch_ins_target_pc(ip))
>> +            /* Prevent rd == pc from causing single step to fail to 
>> stop */
>> +                csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>> +        } else {
>> +            force_sig(SIGTRAP);
>> +        }
>> +    } else {
>> +        breakpoint_handler(regs);
>> +        watchpoint_handler(regs);
>> +        force_sig(SIGTRAP);
>> +    }
>>       irqentry_exit(regs, state);
>>   #endif
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/3] LoongArch: Add ptrace single step support
  2023-02-18  1:21       ` Jinyang He
@ 2023-02-18  6:38         ` Qing Zhang
  0 siblings, 0 replies; 10+ messages in thread
From: Qing Zhang @ 2023-02-18  6:38 UTC (permalink / raw)
  To: Jinyang He, Youling Tang, WANG Xuerui
  Cc: Huacai Chen, Oleg Nesterov, Jiaxun Yang, loongarch, linux-kernel

Hi, Jinyang

On 2023/2/18 上午9:21, Jinyang He wrote:
> 
> On 2023-02-17 18:00, Qing Zhang wrote:
>> Hi, Youling
>>
>> On 2023/2/17 下午5:50, Youling Tang wrote:
>>> Hi, Qing
>>>
>>> On 02/17/2023 10:37 AM, Qing Zhang wrote:
>>>> Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
>>>> PTRACE_KILL and PTRACE_SINGLESTEP. This implies defining
>>>> arch_has_single_step in  and implementing the
>>>> user_enable_single_step and user_disable_single_step functions.
>>>>
>>>> LongArch has no hardware single-step register. the hardware single-step
>>>> function multiplex fetch instruction watchpoint(FWPS) and specifies 
>>>> that
>>>> the next instruction must trigger the watch exception by setting the 
>>>> mask bit.
>>>> Some scenarios use CSR.FWPS.Skip to ignore the next hit result, not 
>>>> to trigger
>>>> the watchpoint exception, and proceed to the next instruction.
>>>>
>>>> Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
>>>> ---
>>>>  arch/loongarch/include/asm/inst.h      | 39 +++++++++++++++
>>>>  arch/loongarch/include/asm/loongarch.h |  3 ++
>>>>  arch/loongarch/include/asm/processor.h |  3 ++
>>>>  arch/loongarch/include/asm/ptrace.h    |  2 +
>>>>  arch/loongarch/kernel/hw_breakpoint.c  | 35 +++++++++++--
>>>>  arch/loongarch/kernel/ptrace.c         | 68 ++++++++++++++++++++++++++
>>>>  arch/loongarch/kernel/traps.c          | 34 +++++++++++--
>>>>  7 files changed, 176 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/loongarch/include/asm/inst.h 
>>>> b/arch/loongarch/include/asm/inst.h
>>>> index ba18ce8fbdf2..00c6f261d9a2 100644
>>>> --- a/arch/loongarch/include/asm/inst.h
>>>> +++ b/arch/loongarch/include/asm/inst.h
>>>> @@ -368,6 +368,45 @@ static inline bool is_stack_alloc_ins(union 
>>>> loongarch_instruction *ip)
>>>>          is_imm12_negative(ip->reg2i12_format.immediate);
>>>>  }
>>>>
>>>> +static inline bool branch_ins_target_pc(union loongarch_instruction 
>>>> *ip)
>>>> +{
>>>> +    switch (ip->reg0i26_format.opcode) {
>>>> +    case b_op:
>>>> +    case bl_op:
>>>> +        if (ip->reg0i26_format.immediate_l == 0
>>>> +           && ip->reg0i26_format.immediate_h == 0)
>>>> +            return false;
>>>> +    }
>>>> +
>>>> +    switch (ip->reg1i21_format.opcode) {
>>>> +    case beqz_op:
>>>> +    case bnez_op:
>>>> +    case bceqz_op:
>>>> +        if (ip->reg1i21_format.immediate_l == 0
>>>> +           && ip->reg1i21_format.immediate_h == 0)
>>>> +            return false;
>>>> +    }
>>>> +
>>>> +    switch (ip->reg2i16_format.opcode) {
>>>> +    case jirl_op:
>>>> +        if (ip->reg2i16_format.rj == 0x1
>>>> +           && ip->reg2i16_format.rd == 0x1
>>> LOONGARCH_GPR_RA can be used instead of 0x1.
>>>
>>>> +           && ip->reg2i16_format.immediate == 0)
>>>> +            return false;
>>>> +        break;
>>>> +    case beq_op:
>>>> +    case bne_op:
>>>> +    case blt_op:
>>>> +    case bge_op:
>>>> +    case bltu_op:
>>>> +    case bgeu_op:
>>>> +        if (ip->reg2i16_format.immediate == 0)
>>>> +            return false;
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>>  void simu_pc(struct pt_regs *regs, union loongarch_instruction insn);
>>>>  void simu_branch(struct pt_regs *regs, union loongarch_instruction 
>>>> insn);
>>>>
>>>> diff --git a/arch/loongarch/include/asm/loongarch.h 
>>>> b/arch/loongarch/include/asm/loongarch.h
>>>> index e9aed583a064..65b7dcdea16d 100644
>>>> --- a/arch/loongarch/include/asm/loongarch.h
>>>> +++ b/arch/loongarch/include/asm/loongarch.h
>>>> @@ -1055,6 +1055,9 @@ static __always_inline void iocsr_write64(u64 
>>>> val, u32 reg)
>>>>  #define LOONGARCH_CSR_DERA        0x501    /* debug era */
>>>>  #define LOONGARCH_CSR_DESAVE        0x502    /* debug save */
>>>>
>>>> +#define CSR_FWPC_SKIP_SHIFT        16
>>>> +#define CSR_FWPC_SKIP            (_ULCAST_(1) << CSR_FWPC_SKIP_SHIFT)
>>>> +
>>>>  /*
>>>>   * CSR_ECFG IM
>>>>   */
>>>> diff --git a/arch/loongarch/include/asm/processor.h 
>>>> b/arch/loongarch/include/asm/processor.h
>>>> index db060c5a976f..3ea0f1910c23 100644
>>>> --- a/arch/loongarch/include/asm/processor.h
>>>> +++ b/arch/loongarch/include/asm/processor.h
>>>> @@ -131,6 +131,9 @@ struct thread_struct {
>>>>      struct perf_event    *hbp_break[LOONGARCH_MAX_BRP];
>>>>      struct perf_event    *hbp_watch[LOONGARCH_MAX_WRP];
>>>>
>>>> +    /* Used by ptrace single_step */
>>>> +    unsigned long single_step;
>>>> +
>>>>      /*
>>>>       * FPU & vector registers, must be at last because
>>>>       * they are conditionally copied at fork().
>>>> diff --git a/arch/loongarch/include/asm/ptrace.h 
>>>> b/arch/loongarch/include/asm/ptrace.h
>>>> index 58596c4f8a0f..66a0e6c480a3 100644
>>>> --- a/arch/loongarch/include/asm/ptrace.h
>>>> +++ b/arch/loongarch/include/asm/ptrace.h
>>>> @@ -150,4 +150,6 @@ static inline void user_stack_pointer_set(struct 
>>>> pt_regs *regs,
>>>>      regs->regs[3] = val;
>>>>  }
>>>>
>>>> +#define arch_has_single_step()        (1)
>>>> +
>>>>  #endif /* _ASM_PTRACE_H */
>>>> diff --git a/arch/loongarch/kernel/hw_breakpoint.c 
>>>> b/arch/loongarch/kernel/hw_breakpoint.c
>>>> index 6431cd319c32..75d3652fbe00 100644
>>>> --- a/arch/loongarch/kernel/hw_breakpoint.c
>>>> +++ b/arch/loongarch/kernel/hw_breakpoint.c
>>>> @@ -153,6 +153,22 @@ static int hw_breakpoint_slot_setup(struct 
>>>> perf_event **slots, int max_slots,
>>>>   */
>>>>  void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
>>>>  {
>>>> +    int i;
>>>> +    struct thread_struct *t = &tsk->thread;
>>>> +
>>>> +    for (i = 0; i < LOONGARCH_MAX_BRP; i++) {
>>>> +        if (t->hbp_break[i]) {
>>>> +            unregister_hw_breakpoint(t->hbp_break[i]);
>>>> +            t->hbp_break[i] = NULL;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    for (i = 0; i < LOONGARCH_MAX_WRP; i++) {
>>>> +        if (t->hbp_watch[i]) {
>>>> +            unregister_hw_breakpoint(t->hbp_watch[i]);
>>>> +            t->hbp_watch[i] = NULL;
>>>> +        }
>>>> +    }
>>>>  }
>>>>
>>>>  void ptrace_hw_copy_thread(struct task_struct *tsk)
>>>> @@ -498,11 +514,20 @@ arch_initcall(arch_hw_breakpoint_init);
>>>>  void hw_breakpoint_thread_switch(struct task_struct *next)
>>>>  {
>>>>      struct pt_regs *regs = task_pt_regs(next);
>>>> -
>>>> -    /* Update breakpoints */
>>>> -    update_bp_registers(regs, 1, 0);
>>>> -    /* Update watchpoints */
>>>> -    update_bp_registers(regs, 1, 1);
>>>> +    u64 addr, mask;
>>>> +
>>>> +    if (test_bit(TIF_SINGLESTEP, &task_thread_info(next)->flags)) {
>>>> +        addr = read_wb_reg(CSR_CFG_ADDR, 0, 0);
>>>> +        mask = read_wb_reg(CSR_CFG_MASK, 0, 0);
>>>> +        if ((task_pt_regs(next)->csr_era & ~mask) == (addr & ~mask))
>>>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>>> +        regs->csr_prmd |= CSR_PRMD_PWE;
>>>> +    } else {
>>>> +        /* Update breakpoints */
>>>> +        update_bp_registers(regs, 1, 0);
>>>> +        /* Update watchpoints */
>>>> +        update_bp_registers(regs, 1, 1);
>>>> +    }
>>>>  }
>>>>
>>>>  void hw_breakpoint_pmu_read(struct perf_event *bp)
>>>> diff --git a/arch/loongarch/kernel/ptrace.c 
>>>> b/arch/loongarch/kernel/ptrace.c
>>>> index bee4194177fd..52a3ee4366f4 100644
>>>> --- a/arch/loongarch/kernel/ptrace.c
>>>> +++ b/arch/loongarch/kernel/ptrace.c
>>>> @@ -20,6 +20,7 @@
>>>>  #include <linux/context_tracking.h>
>>>>  #include <linux/elf.h>
>>>>  #include <linux/errno.h>
>>>> +#include <linux/hw_breakpoint.h>
>>>>  #include <linux/mm.h>
>>>>  #include <linux/ptrace.h>
>>>>  #include <linux/regset.h>
>>>> @@ -30,6 +31,7 @@
>>>>  #include <linux/stddef.h>
>>>>  #include <linux/seccomp.h>
>>>>  #include <linux/uaccess.h>
>>>> +#include <linux/thread_info.h>
>>>>
>>>>  #include <asm/byteorder.h>
>>>>  #include <asm/cpu.h>
>>>> @@ -39,6 +41,7 @@
>>>>  #include <asm/page.h>
>>>>  #include <asm/pgtable.h>
>>>>  #include <asm/processor.h>
>>>> +#include <asm/ptrace.h>
>>>>  #include <asm/reg.h>
>>>>  #include <asm/syscall.h>
>>>>
>>>> @@ -541,3 +544,68 @@ long arch_ptrace(struct task_struct *child, 
>>>> long request,
>>>>
>>>>      return ret;
>>>>  }
>>>> +
>>>> +void ptrace_triggered(struct perf_event *bp,
>>>> +              struct perf_sample_data *data, struct pt_regs *regs)
>>>> +{
>>>> +    struct perf_event_attr attr;
>>>> +
>>>> +    attr = bp->attr;
>>>> +    attr.disabled = true;
>>>> +    modify_user_hw_breakpoint(bp, &attr);
>>>> +}
>>>> +
>>>> +static int set_single_step(struct task_struct *tsk, unsigned long 
>>>> addr)
>>>> +{
>>>> +    struct thread_struct *thread = &tsk->thread;
>>>> +    struct perf_event *bp;
>>>> +    struct perf_event_attr attr;
>>>> +    struct arch_hw_breakpoint *info;
>>>> +
>>>> +    bp = thread->hbp_break[0];
>>>> +    if (!bp) {
>>>> +        ptrace_breakpoint_init(&attr);
>>>> +
>>>> +        attr.bp_addr = addr;
>>>> +        attr.bp_len = HW_BREAKPOINT_LEN_8;
>>>> +        attr.bp_type = HW_BREAKPOINT_X;
>>>> +
>>>> +        bp = register_user_hw_breakpoint(&attr, ptrace_triggered,
>>>> +                         NULL, tsk);
>>>> +        if (IS_ERR(bp))
>>>> +            return PTR_ERR(bp);
>>>> +
>>>> +        thread->hbp_break[0] = bp;
>>>> +    } else {
>>>> +        int err;
>>>> +
>>>> +        attr = bp->attr;
>>>> +        attr.bp_addr = addr;
>>>> +        /* reenable breakpoint */
>>>> +        attr.disabled = false;
>>>> +        err = modify_user_hw_breakpoint(bp, &attr);
>>>> +        if (unlikely(err))
>>>> +            return err;
>>>> +
>>>> +        csr_write64(attr.bp_addr, LOONGARCH_CSR_IB0ADDR);
>>>> +    }
>>>> +    info = counter_arch_bp(bp);
>>>> +    info->mask = 0xffffffffffff;
>>> If `mask` is only used for user space address mask, it should not be
>>> fixed to 0xffffffffffff, the user space address size may have many
>>> situations, which can be defined according to TASK_SIZE.
>>>
>>> Use (TASK_SIZE - 1) instead.
>>>
>> ok, got it.
> 
> But according to to [1], the priority of watchpoint exception is highest
> in the priority of exception which all fetching stage. Thus, we may get
> the exception unexpected if set MASK to (TASK_SIZE - 1).
> 
> e.g.
>      0x1000: addi.d $rd, $zero, -1 # a way to set a kernel address in rd
>      0x1004: jr $rd
> 
> If singlestep is set at 0x1004, it will fetch insn in kernel address and
> and get operation address error exception if we set MASK to (TASK_SIZE - 
> 1).
> 
> 
> [1] 
> https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#exception-priority 
> 
> 
In the future, breakpoints at kernel_enable_single_step will be added, 
possibly with mask set to -1.
Now, because FWPS CTRL is set to 1 for PLV0-3, mask is set to 
0xffffffffffff at user_enable_single_step.
As discussed below, user_code will be added to the breakpoint framework.

Thanks,
-Qing
> 
> Thanks,
> 
> Jinyang
> 
> 
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +/* ptrace API */
>>>> +void user_enable_single_step(struct task_struct *task)
>>>> +{
>>>> +    struct thread_info *ti = task_thread_info(task);
>>>> +
>>>> +    set_single_step(task, task_pt_regs(task)->csr_era);
>>>> +    task->thread.single_step = task_pt_regs(task)->csr_era;
>>>> +    set_ti_thread_flag(ti, TIF_SINGLESTEP);
>>>> +}
>>>> +
>>>> +void user_disable_single_step(struct task_struct *task)
>>>> +{
>>>> +    clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>>>> +}
>>>> diff --git a/arch/loongarch/kernel/traps.c 
>>>> b/arch/loongarch/kernel/traps.c
>>>> index 2b133079e0f3..a59275a43bd1 100644
>>>> --- a/arch/loongarch/kernel/traps.c
>>>> +++ b/arch/loongarch/kernel/traps.c
>>>> @@ -511,9 +511,37 @@ asmlinkage void noinstr do_watch(struct pt_regs 
>>>> *regs)
>>>>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
>>>>      irqentry_state_t state = irqentry_enter(regs);
>>>>
>>>> -    breakpoint_handler(regs);
>>>> -    watchpoint_handler(regs);
>>>> -    force_sig(SIGTRAP);
>>>> +    if (test_tsk_thread_flag(current, TIF_SINGLESTEP)) {
>>>> +        int llbit = (csr_read32(LOONGARCH_CSR_LLBCTL) & 0x1);
>>>> +        unsigned long pc = regs->csr_era;
>>>> +        union loongarch_instruction *ip = (union 
>>>> loongarch_instruction *)pc;
>>>> +
>>>> +        if (llbit) {
>>>> +        /*
>>>> +         * When the ll-sc combo is encountered, it is regarded as 
>>>> an single
>>>> +         * instruction. So don't clear llbit and reset 
>>>> CSR.FWPS.Skip until
>>>> +         * the llsc execution is completed.
>>>> +         */
>>> Comments should be aligned with code, similar modifications in other
>>> places.
>>>
>>>> +            csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>>> +            csr_write32(CSR_LLBCTL_KLO, LOONGARCH_CSR_LLBCTL);
>>>> +        } else if (pc == current->thread.single_step) {
>>>> +        /*
>>>> +         * Maybe some hardware Bug caused that certain insns are 
>>>> occasionally
>>>> +         * not skipped when CSR.FWPS.Skip is set, such as 
>>>> fld.d/fst.d. So Singlestep
>>>> +         * needs to compare whether the csr_era is equal to the 
>>>> value of singlestep
>>>> +         * which last time set.
>>>> +         */
>>>> +            if (branch_ins_target_pc(ip))
>>>> +            /* Prevent rd == pc from causing single step to fail to 
>>>> stop */
>>> IMO, that comment description is not accurate enough.ok, I'll add 
>>> more comments.
>>
>> Thanks,
>> -Qing
>>> Youling.
>>>
>>>> + csr_write32(CSR_FWPC_SKIP, LOONGARCH_CSR_FWPS);
>>>> +        } else {
>>>> +            force_sig(SIGTRAP);
>>>> +        }
>>>> +    } else {
>>>> +        breakpoint_handler(regs);
>>>> +        watchpoint_handler(regs);
>>>> +        force_sig(SIGTRAP);
>>>> +    }
>>>>
>>>>      irqentry_exit(regs, state);
>>>>  #endif
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-02-18  6:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-17  2:37 [PATCH v4 0/3] LoongArch: Add hardware breakpoints/watchpoints support Qing Zhang
2023-02-17  2:37 ` [PATCH v4 1/3] " Qing Zhang
2023-02-17  2:37 ` [PATCH v4 2/3] LoongArch: Add ptrace single step support Qing Zhang
2023-02-17  9:50   ` Youling Tang
2023-02-17 10:00     ` Qing Zhang
2023-02-18  1:21       ` Jinyang He
2023-02-18  6:38         ` Qing Zhang
2023-02-17 10:16   ` WANG Xuerui
2023-02-18  6:31     ` Qing Zhang
2023-02-17  2:37 ` [PATCH v4 3/3] LoongArch: ptrace: expose hardware breakpoints to debuggers Qing Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).