* [PATCH v5 01/21] kprobes: Add generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 02/21] x86/kprobes: Use " Masami Hiramatsu
` (20 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Add a generic kretprobe trampoline handler for unifying
the all cloned /arch/* kretprobe trampoline handlers.
The generic kretprobe trampoline handler is based on the
x86 implementation, because it is the latest implementation.
It has frame pointer checking, kprobe_busy_begin/end and
return address fixup for user handlers.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v5:
- Restore actual current_kprobe instead of kprobe_busy.
Changes in v4:
- Remove orig_ret_address
- Change the type of trampoline_address to void *
---
include/linux/kprobes.h | 32 +++++++++++++--
kernel/kprobes.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 126 insertions(+), 4 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 9be1bff4f586..c6a913e608b7 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -187,10 +187,38 @@ static inline int kprobes_built_in(void)
return 1;
}
+extern struct kprobe kprobe_busy;
+void kprobe_busy_begin(void);
+void kprobe_busy_end(void);
+
#ifdef CONFIG_KRETPROBES
extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs);
extern int arch_trampoline_kprobe(struct kprobe *p);
+
+/* If the trampoline handler called from a kprobe, use this version */
+unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
+ void *trampoline_address,
+ void *frame_pointer);
+
+static nokprobe_inline
+unsigned long kretprobe_trampoline_handler(struct pt_regs *regs,
+ void *trampoline_address,
+ void *frame_pointer)
+{
+ unsigned long ret;
+ /*
+ * Set a dummy kprobe for avoiding kretprobe recursion.
+ * Since kretprobe never runs in kprobe handler, any kprobe must not
+ * be running at this point.
+ */
+ kprobe_busy_begin();
+ ret = __kretprobe_trampoline_handler(regs, trampoline_address, frame_pointer);
+ kprobe_busy_end();
+
+ return ret;
+}
+
#else /* CONFIG_KRETPROBES */
static inline void arch_prepare_kretprobe(struct kretprobe *rp,
struct pt_regs *regs)
@@ -354,10 +382,6 @@ static inline struct kprobe_ctlblk *get_kprobe_ctlblk(void)
return this_cpu_ptr(&kprobe_ctlblk);
}
-extern struct kprobe kprobe_busy;
-void kprobe_busy_begin(void);
-void kprobe_busy_end(void);
-
kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset);
int register_kprobe(struct kprobe *p);
void unregister_kprobe(struct kprobe *p);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 287b263c9cb9..a0afaa79024e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1927,6 +1927,104 @@ unsigned long __weak arch_deref_entry_point(void *entry)
}
#ifdef CONFIG_KRETPROBES
+
+unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
+ void *trampoline_address,
+ void *frame_pointer)
+{
+ struct kretprobe_instance *ri = NULL, *last = NULL;
+ struct hlist_head *head, empty_rp;
+ struct hlist_node *tmp;
+ unsigned long flags;
+ kprobe_opcode_t *correct_ret_addr = NULL;
+ bool skipped = false;
+
+ INIT_HLIST_HEAD(&empty_rp);
+ kretprobe_hash_lock(current, &head, &flags);
+
+ /*
+ * It is possible to have multiple instances associated with a given
+ * task either because multiple functions in the call path have
+ * return probes installed on them, and/or more than one
+ * return probe was registered for a target function.
+ *
+ * We can handle this because:
+ * - instances are always pushed into the head of the list
+ * - when multiple return probes are registered for the same
+ * function, the (chronologically) first instance's ret_addr
+ * will be the real return address, and all the rest will
+ * point to kretprobe_trampoline.
+ */
+ hlist_for_each_entry(ri, head, hlist) {
+ if (ri->task != current)
+ /* another task is sharing our hash bucket */
+ continue;
+ /*
+ * Return probes must be pushed on this hash list correct
+ * order (same as return order) so that it can be popped
+ * correctly. However, if we find it is pushed it incorrect
+ * order, this means we find a function which should not be
+ * probed, because the wrong order entry is pushed on the
+ * path of processing other kretprobe itself.
+ */
+ if (ri->fp != frame_pointer) {
+ if (!skipped)
+ pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n");
+ skipped = true;
+ continue;
+ }
+
+ correct_ret_addr = ri->ret_addr;
+ if (skipped)
+ pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n",
+ ri->rp->kp.addr);
+
+ if (correct_ret_addr != trampoline_address)
+ /*
+ * This is the real return address. Any other
+ * instances associated with this task are for
+ * other calls deeper on the call stack
+ */
+ break;
+ }
+
+ kretprobe_assert(ri, (unsigned long)correct_ret_addr,
+ (unsigned long)trampoline_address);
+ last = ri;
+
+ hlist_for_each_entry_safe(ri, tmp, head, hlist) {
+ if (ri->task != current)
+ /* another task is sharing our hash bucket */
+ continue;
+ if (ri->fp != frame_pointer)
+ continue;
+
+ if (ri->rp && ri->rp->handler) {
+ struct kprobe *prev = kprobe_running();
+
+ __this_cpu_write(current_kprobe, &ri->rp->kp);
+ ri->ret_addr = correct_ret_addr;
+ ri->rp->handler(ri, regs);
+ __this_cpu_write(current_kprobe, prev);
+ }
+
+ recycle_rp_inst(ri, &empty_rp);
+
+ if (ri == last)
+ break;
+ }
+
+ kretprobe_hash_unlock(current, &flags);
+
+ hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
+ hlist_del(&ri->hlist);
+ kfree(ri);
+ }
+
+ return (unsigned long)correct_ret_addr;
+}
+NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
+
/*
* This kprobe pre_handler is registered with every kretprobe. When probe
* hits it will set up the return probe.
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 02/21] x86/kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 01/21] kprobes: Add generic kretprobe trampoline handler Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 03/21] arm: kprobes: " Masami Hiramatsu
` (19 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/x86/kernel/kprobes/core.c | 108 +---------------------------------------
1 file changed, 3 insertions(+), 105 deletions(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index fdadc37d72af..882b95313ad6 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -767,124 +767,22 @@ asm(
NOKPROBE_SYMBOL(kretprobe_trampoline);
STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
+
/*
* Called from kretprobe_trampoline
*/
__used __visible void *trampoline_handler(struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
- kprobe_opcode_t *correct_ret_addr = NULL;
- void *frame_pointer;
- bool skipped = false;
-
- /*
- * Set a dummy kprobe for avoiding kretprobe recursion.
- * Since kretprobe never run in kprobe handler, kprobe must not
- * be running at this point.
- */
- kprobe_busy_begin();
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
/* fixup registers */
regs->cs = __KERNEL_CS;
#ifdef CONFIG_X86_32
regs->cs |= get_kernel_rpl();
regs->gs = 0;
#endif
- /* We use pt_regs->sp for return address holder. */
- frame_pointer = ®s->sp;
- regs->ip = trampoline_address;
+ regs->ip = (unsigned long)&kretprobe_trampoline;
regs->orig_ax = ~0UL;
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * return probes installed on them, and/or more than one
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always pushed into the head of the list
- * - when multiple return probes are registered for the same
- * function, the (chronologically) first instance's ret_addr
- * will be the real return address, and all the rest will
- * point to kretprobe_trampoline.
- */
- hlist_for_each_entry(ri, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
- /*
- * Return probes must be pushed on this hash list correct
- * order (same as return order) so that it can be popped
- * correctly. However, if we find it is pushed it incorrect
- * order, this means we find a function which should not be
- * probed, because the wrong order entry is pushed on the
- * path of processing other kretprobe itself.
- */
- if (ri->fp != frame_pointer) {
- if (!skipped)
- pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n");
- skipped = true;
- continue;
- }
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (skipped)
- pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n",
- ri->rp->kp.addr);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
- if (ri->fp != frame_pointer)
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, &kprobe_busy);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_hash_unlock(current, &flags);
-
- kprobe_busy_end();
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
- return (void *)orig_ret_address;
+ return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, ®s->sp);
}
NOKPROBE_SYMBOL(trampoline_handler);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 03/21] arm: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 01/21] kprobes: Add generic kretprobe trampoline handler Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 02/21] x86/kprobes: Use " Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 04/21] arm64: " Masami Hiramatsu
` (18 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Use the generic kretprobe trampoline handler. Use regs->ARM_fp
for framepointer verification.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v2:
- Fix to cast frame_pointer type to void *.
---
arch/arm/probes/kprobes/core.c | 78 ++--------------------------------------
1 file changed, 3 insertions(+), 75 deletions(-)
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index feefa2055eba..a9653117ca0d 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -413,87 +413,15 @@ void __naked __kprobes kretprobe_trampoline(void)
/* Called from kretprobe_trampoline */
static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
- kprobe_opcode_t *correct_ret_addr = NULL;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * a return probe installed on them, and/or more than one return
- * probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, NULL);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
-
- return (void *)orig_ret_address;
+ return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
+ (void *)regs->ARM_fp);
}
void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->ARM_lr;
+ ri->fp = (void *)regs->ARM_fp;
/* Replace the return addr with trampoline addr. */
regs->ARM_lr = (unsigned long)&kretprobe_trampoline;
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 04/21] arm64: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (2 preceding siblings ...)
2020-08-29 13:00 ` [PATCH v5 03/21] arm: kprobes: " Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 05/21] arc: " Masami Hiramatsu
` (17 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Use the generic kretprobe trampoline handler, and use the
kernel_stack_pointer(regs) for framepointer verification.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/arm64/kernel/probes/kprobes.c | 78 +-----------------------------------
1 file changed, 3 insertions(+), 75 deletions(-)
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index 5290f17a4d80..deba738142ed 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -464,87 +464,15 @@ int __init arch_populate_kprobe_blacklist(void)
void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address =
- (unsigned long)&kretprobe_trampoline;
- kprobe_opcode_t *correct_ret_addr = NULL;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * return probes installed on them, and/or more than one
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always pushed into the head of the list
- * - when multiple return probes are registered for the same
- * function, the (chronologically) first instance's ret_addr
- * will be the real return address, and all the rest will
- * point to kretprobe_trampoline.
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, NULL);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
- return (void *)orig_ret_address;
+ return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
+ (void *)kernel_stack_pointer(regs));
}
void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
+ ri->fp = (void *)kernel_stack_pointer(regs);
/* replace return addr (x30) with trampoline */
regs->regs[30] = (long)&kretprobe_trampoline;
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 05/21] arc: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (3 preceding siblings ...)
2020-08-29 13:00 ` [PATCH v5 04/21] arm64: " Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:00 ` [PATCH v5 06/21] csky: " Masami Hiramatsu
` (16 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/arc/kernel/kprobes.c | 54 ++-------------------------------------------
1 file changed, 2 insertions(+), 52 deletions(-)
diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index 7d3efe83cba7..cabef45f11df 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -388,6 +388,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
{
ri->ret_addr = (kprobe_opcode_t *) regs->blink;
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->blink = (unsigned long)&kretprobe_trampoline;
@@ -396,58 +397,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
static int __kprobes trampoline_probe_handler(struct kprobe *p,
struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler)
- ri->rp->handler(ri, regs);
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address) {
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
- regs->ret = orig_ret_address;
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
+ regs->ret = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
/* By returning a non zero value, we are telling the kprobe handler
* that we don't want the post_handler to run
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 06/21] csky: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (4 preceding siblings ...)
2020-08-29 13:00 ` [PATCH v5 05/21] arc: " Masami Hiramatsu
@ 2020-08-29 13:00 ` Masami Hiramatsu
2020-08-29 13:01 ` [PATCH v5 07/21] ia64: " Masami Hiramatsu
` (15 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:00 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
---
arch/csky/kernel/probes/kprobes.c | 77 +------------------------------------
1 file changed, 2 insertions(+), 75 deletions(-)
diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
index f0f733b7ac5a..589f090f48b9 100644
--- a/arch/csky/kernel/probes/kprobes.c
+++ b/arch/csky/kernel/probes/kprobes.c
@@ -404,87 +404,14 @@ int __init arch_populate_kprobe_blacklist(void)
void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address =
- (unsigned long)&kretprobe_trampoline;
- kprobe_opcode_t *correct_ret_addr = NULL;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * return probes installed on them, and/or more than one
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always pushed into the head of the list
- * - when multiple return probes are registered for the same
- * function, the (chronologically) first instance's ret_addr
- * will be the real return address, and all the rest will
- * point to kretprobe_trampoline.
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, NULL);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
- return (void *)orig_ret_address;
+ return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
}
void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->lr;
+ ri->fp = NULL;
regs->lr = (unsigned long) &kretprobe_trampoline;
}
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 07/21] ia64: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (5 preceding siblings ...)
2020-08-29 13:00 ` [PATCH v5 06/21] csky: " Masami Hiramatsu
@ 2020-08-29 13:01 ` Masami Hiramatsu
2020-08-29 13:01 ` [PATCH v5 08/21] mips: " Masami Hiramatsu
` (14 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:01 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/ia64/kernel/kprobes.c | 77 +-------------------------------------------
1 file changed, 2 insertions(+), 75 deletions(-)
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 7a7df944d798..fc1ff8a4d7de 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -396,83 +396,9 @@ static void kretprobe_trampoline(void)
{
}
-/*
- * At this point the target function has been tricked into
- * returning into our trampoline. Lookup the associated instance
- * and then:
- * - call the handler function
- * - cleanup by marking the instance as unused
- * - long jump back to the original return address
- */
int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address =
- ((struct fnptr *)kretprobe_trampoline)->ip;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- regs->cr_iip = orig_ret_address;
-
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler)
- ri->rp->handler(ri, regs);
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
+ regs->cr_iip = __kretprobe_trampoline_handler(regs, kretprobe_trampoline, NULL);
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we don't want the post_handler
@@ -485,6 +411,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->b0;
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->b0 = ((struct fnptr *)kretprobe_trampoline)->ip;
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 08/21] mips: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (6 preceding siblings ...)
2020-08-29 13:01 ` [PATCH v5 07/21] ia64: " Masami Hiramatsu
@ 2020-08-29 13:01 ` Masami Hiramatsu
2020-08-29 13:01 ` [PATCH v5 09/21] parisc: " Masami Hiramatsu
` (13 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:01 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/mips/kernel/kprobes.c | 54 ++------------------------------------------
1 file changed, 3 insertions(+), 51 deletions(-)
diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c
index d043c2f897fc..54dfba8fa77c 100644
--- a/arch/mips/kernel/kprobes.c
+++ b/arch/mips/kernel/kprobes.c
@@ -477,6 +477,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *) regs->regs[31];
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->regs[31] = (unsigned long)kretprobe_trampoline;
@@ -488,57 +489,8 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
static int __kprobes trampoline_probe_handler(struct kprobe *p,
struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)kretprobe_trampoline;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler)
- ri->rp->handler(ri, regs);
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
- instruction_pointer(regs) = orig_ret_address;
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
+ instruction_pointer(regs) = __kretprobe_trampoline_handler(regs,
+ kretprobe_trampoline, NULL);
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we don't want the post_handler
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 09/21] parisc: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (7 preceding siblings ...)
2020-08-29 13:01 ` [PATCH v5 08/21] mips: " Masami Hiramatsu
@ 2020-08-29 13:01 ` Masami Hiramatsu
2020-08-29 13:01 ` [PATCH v5 10/21] powerpc: " Masami Hiramatsu
` (12 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:01 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/parisc/kernel/kprobes.c | 76 ++----------------------------------------
1 file changed, 4 insertions(+), 72 deletions(-)
diff --git a/arch/parisc/kernel/kprobes.c b/arch/parisc/kernel/kprobes.c
index 77ec51818916..6d21a515eea5 100644
--- a/arch/parisc/kernel/kprobes.c
+++ b/arch/parisc/kernel/kprobes.c
@@ -191,80 +191,11 @@ static struct kprobe trampoline_p = {
static int __kprobes trampoline_probe_handler(struct kprobe *p,
struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)trampoline_p.addr;
- kprobe_opcode_t *correct_ret_addr = NULL;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * a return probe installed on them, and/or more than one return
- * probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, NULL);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
+ unsigned long orig_ret_address;
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
+ orig_ret_address = __kretprobe_trampoline_handler(regs, trampoline_p.addr, NULL);
instruction_pointer_set(regs, orig_ret_address);
+
return 1;
}
@@ -272,6 +203,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->gr[2];
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr. */
regs->gr[2] = (unsigned long)trampoline_p.addr;
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 10/21] powerpc: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (8 preceding siblings ...)
2020-08-29 13:01 ` [PATCH v5 09/21] parisc: " Masami Hiramatsu
@ 2020-08-29 13:01 ` Masami Hiramatsu
2020-08-29 13:02 ` [PATCH v5 11/21] s390: " Masami Hiramatsu
` (11 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:01 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v2:
Fix to use correct trampoline_address.
---
arch/powerpc/kernel/kprobes.c | 53 ++---------------------------------------
1 file changed, 3 insertions(+), 50 deletions(-)
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 6ab9b4d037c3..01ab2163659e 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -218,6 +218,7 @@ bool arch_kprobe_on_func_entry(unsigned long offset)
void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)regs->link;
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->link = (unsigned long)kretprobe_trampoline;
@@ -396,50 +397,9 @@ asm(".global kretprobe_trampoline\n"
*/
static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address =(unsigned long)&kretprobe_trampoline;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler)
- ri->rp->handler(ri, regs);
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
+ unsigned long orig_ret_address;
+ orig_ret_address = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
/*
* We get here through one of two paths:
* 1. by taking a trap -> kprobe_handler() -> here
@@ -458,13 +418,6 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
regs->nip = orig_ret_address - 4;
regs->link = orig_ret_address;
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
-
return 0;
}
NOKPROBE_SYMBOL(trampoline_probe_handler);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 11/21] s390: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (9 preceding siblings ...)
2020-08-29 13:01 ` [PATCH v5 10/21] powerpc: " Masami Hiramatsu
@ 2020-08-29 13:02 ` Masami Hiramatsu
2020-08-29 13:02 ` [PATCH v5 12/21] sh: " Masami Hiramatsu
` (10 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:02 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/s390/kernel/kprobes.c | 79 +-------------------------------------------
1 file changed, 2 insertions(+), 77 deletions(-)
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index d2a71d872638..fc30e799bd84 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -228,6 +228,7 @@ NOKPROBE_SYMBOL(pop_kprobe);
void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *) regs->gprs[14];
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->gprs[14] = (unsigned long) &kretprobe_trampoline;
@@ -331,83 +332,7 @@ static void __used kretprobe_trampoline_holder(void)
*/
static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
- struct kretprobe_instance *ri;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address;
- unsigned long trampoline_address;
- kprobe_opcode_t *correct_ret_addr;
-
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- ri = NULL;
- orig_ret_address = 0;
- correct_ret_addr = NULL;
- trampoline_address = (unsigned long) &kretprobe_trampoline;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long) ri->ret_addr;
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- correct_ret_addr = ri->ret_addr;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- orig_ret_address = (unsigned long) ri->ret_addr;
-
- if (ri->rp && ri->rp->handler) {
- ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
- }
-
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- regs->psw.addr = orig_ret_address;
-
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
+ regs->psw.addr = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we don't want the post_handler
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 12/21] sh: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (10 preceding siblings ...)
2020-08-29 13:02 ` [PATCH v5 11/21] s390: " Masami Hiramatsu
@ 2020-08-29 13:02 ` Masami Hiramatsu
2020-08-29 13:02 ` [PATCH v5 13/21] sparc: " Masami Hiramatsu
` (9 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:02 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/sh/kernel/kprobes.c | 58 ++--------------------------------------------
1 file changed, 3 insertions(+), 55 deletions(-)
diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index 318296f48f1a..756100b01e84 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -204,6 +204,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *) regs->pr;
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->pr = (unsigned long)kretprobe_trampoline;
@@ -302,62 +303,9 @@ static void __used kretprobe_trampoline_holder(void)
*/
int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
+ regs->pc = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more then one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler) {
- __this_cpu_write(current_kprobe, &ri->rp->kp);
- ri->rp->handler(ri, regs);
- __this_cpu_write(current_kprobe, NULL);
- }
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
-
- regs->pc = orig_ret_address;
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
-
- return orig_ret_address;
+ return 1;
}
static int __kprobes post_kprobe_handler(struct pt_regs *regs)
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 13/21] sparc: kprobes: Use generic kretprobe trampoline handler
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (11 preceding siblings ...)
2020-08-29 13:02 ` [PATCH v5 12/21] sh: " Masami Hiramatsu
@ 2020-08-29 13:02 ` Masami Hiramatsu
2020-08-29 13:02 ` [PATCH v5 14/21] kprobes: Remove NMI context check Masami Hiramatsu
` (8 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:02 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
arch/sparc/kernel/kprobes.c | 51 +++----------------------------------------
1 file changed, 3 insertions(+), 48 deletions(-)
diff --git a/arch/sparc/kernel/kprobes.c b/arch/sparc/kernel/kprobes.c
index dfbca2470536..217c21a6986a 100644
--- a/arch/sparc/kernel/kprobes.c
+++ b/arch/sparc/kernel/kprobes.c
@@ -453,6 +453,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
struct pt_regs *regs)
{
ri->ret_addr = (kprobe_opcode_t *)(regs->u_regs[UREG_RETPC] + 8);
+ ri->fp = NULL;
/* Replace the return addr with trampoline addr */
regs->u_regs[UREG_RETPC] =
@@ -465,58 +466,12 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
static int __kprobes trampoline_probe_handler(struct kprobe *p,
struct pt_regs *regs)
{
- struct kretprobe_instance *ri = NULL;
- struct hlist_head *head, empty_rp;
- struct hlist_node *tmp;
- unsigned long flags, orig_ret_address = 0;
- unsigned long trampoline_address =(unsigned long)&kretprobe_trampoline;
+ unsigned long orig_ret_address = 0;
- INIT_HLIST_HEAD(&empty_rp);
- kretprobe_hash_lock(current, &head, &flags);
-
- /*
- * It is possible to have multiple instances associated with a given
- * task either because an multiple functions in the call path
- * have a return probe installed on them, and/or more than one return
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always inserted at the head of the list
- * - when multiple return probes are registered for the same
- * function, the first instance's ret_addr will point to the
- * real return address, and all the rest will point to
- * kretprobe_trampoline
- */
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
-
- if (ri->rp && ri->rp->handler)
- ri->rp->handler(ri, regs);
-
- orig_ret_address = (unsigned long)ri->ret_addr;
- recycle_rp_inst(ri, &empty_rp);
-
- if (orig_ret_address != trampoline_address)
- /*
- * This is the real return address. Any other
- * instances associated with this task are for
- * other calls deeper on the call stack
- */
- break;
- }
-
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
+ orig_ret_address = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
regs->tpc = orig_ret_address;
regs->tnpc = orig_ret_address + 4;
- kretprobe_hash_unlock(current, &flags);
-
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we don't want the post_handler
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 14/21] kprobes: Remove NMI context check
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (12 preceding siblings ...)
2020-08-29 13:02 ` [PATCH v5 13/21] sparc: " Masami Hiramatsu
@ 2020-08-29 13:02 ` Masami Hiramatsu
[not found] ` <20201030213831.04e81962@oasis.local.home>
2020-08-29 13:02 ` [PATCH v5 15/21] kprobes: Free kretprobe_instance with rcu callback Masami Hiramatsu
` (7 subsequent siblings)
21 siblings, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:02 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Since the commit 9b38cc704e84 ("kretprobe: Prevent triggering
kretprobe from within kprobe_flush_task") sets a dummy current
kprobe in the trampoline handler by kprobe_busy_begin/end(),
it is not possible to run a kretprobe pre handler in kretprobe
trampoline handler context even with the NMI. If the NMI interrupts
a kretprobe_trampoline_handler() and it hits a kretprobe, the
2nd kretprobe will detect recursion correctly and it will be
skipped.
This means we have almost no double-lock issue on kretprobes by NMI.
The last one point is in cleanup_rp_inst() which also takes
kretprobe_table_lock without setting up current kprobes.
So adding kprobe_busy_begin/end() there allows us to remove
in_nmi() check.
The above commit applies kprobe_busy_begin/end() on x86, but
now all arch implementation are unified to generic one, we can
safely remove the in_nmi() check from arch independent code.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
kernel/kprobes.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a0afaa79024e..211138225fa5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1359,7 +1359,8 @@ static void cleanup_rp_inst(struct kretprobe *rp)
struct hlist_node *next;
struct hlist_head *head;
- /* No race here */
+ /* To avoid recursive kretprobe by NMI, set kprobe busy here */
+ kprobe_busy_begin();
for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
kretprobe_table_lock(hash, &flags);
head = &kretprobe_inst_table[hash];
@@ -1369,6 +1370,8 @@ static void cleanup_rp_inst(struct kretprobe *rp)
}
kretprobe_table_unlock(hash, &flags);
}
+ kprobe_busy_end();
+
free_rp_inst(rp);
}
NOKPROBE_SYMBOL(cleanup_rp_inst);
@@ -2035,17 +2038,6 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
unsigned long hash, flags = 0;
struct kretprobe_instance *ri;
- /*
- * To avoid deadlocks, prohibit return probing in NMI contexts,
- * just skip the probe and increase the (inexact) 'nmissed'
- * statistical counter, so that the user is informed that
- * something happened:
- */
- if (unlikely(in_nmi())) {
- rp->nmissed++;
- return 0;
- }
-
/* TODO: consider to only swap the RA after the last pre_handler fired */
hash = hash_ptr(current, KPROBE_HASH_BITS);
raw_spin_lock_irqsave(&rp->lock, flags);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 15/21] kprobes: Free kretprobe_instance with rcu callback
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (13 preceding siblings ...)
2020-08-29 13:02 ` [PATCH v5 14/21] kprobes: Remove NMI context check Masami Hiramatsu
@ 2020-08-29 13:02 ` Masami Hiramatsu
2020-08-29 13:03 ` [PATCH v5 16/21] kprobes: Make local used functions static Masami Hiramatsu
` (6 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:02 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Free kretprobe_instance with rcu callback instead of directly
freeing the object in the kretprobe handler context.
This will make kretprobe run safer in NMI context.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v3:
- Stick the rcu_head with hlist_node in kretprobe_instance
- Make recycle_rp_inst() static
---
include/linux/kprobes.h | 6 ++++--
kernel/kprobes.c | 25 ++++++-------------------
2 files changed, 10 insertions(+), 21 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index c6a913e608b7..663be8debf25 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -156,7 +156,10 @@ struct kretprobe {
};
struct kretprobe_instance {
- struct hlist_node hlist;
+ union {
+ struct hlist_node hlist;
+ struct rcu_head rcu;
+ };
struct kretprobe *rp;
kprobe_opcode_t *ret_addr;
struct task_struct *task;
@@ -395,7 +398,6 @@ int register_kretprobes(struct kretprobe **rps, int num);
void unregister_kretprobes(struct kretprobe **rps, int num);
void kprobe_flush_task(struct task_struct *tk);
-void recycle_rp_inst(struct kretprobe_instance *ri, struct hlist_head *head);
int disable_kprobe(struct kprobe *kp);
int enable_kprobe(struct kprobe *kp);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 211138225fa5..0676868f1ac2 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1223,8 +1223,7 @@ void kprobes_inc_nmissed_count(struct kprobe *p)
}
NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);
-void recycle_rp_inst(struct kretprobe_instance *ri,
- struct hlist_head *head)
+static void recycle_rp_inst(struct kretprobe_instance *ri)
{
struct kretprobe *rp = ri->rp;
@@ -1236,8 +1235,7 @@ void recycle_rp_inst(struct kretprobe_instance *ri,
hlist_add_head(&ri->hlist, &rp->free_instances);
raw_spin_unlock(&rp->lock);
} else
- /* Unregistering */
- hlist_add_head(&ri->hlist, head);
+ kfree_rcu(ri, rcu);
}
NOKPROBE_SYMBOL(recycle_rp_inst);
@@ -1313,7 +1311,7 @@ void kprobe_busy_end(void)
void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
- struct hlist_head *head, empty_rp;
+ struct hlist_head *head;
struct hlist_node *tmp;
unsigned long hash, flags = 0;
@@ -1323,19 +1321,14 @@ void kprobe_flush_task(struct task_struct *tk)
kprobe_busy_begin();
- INIT_HLIST_HEAD(&empty_rp);
hash = hash_ptr(tk, KPROBE_HASH_BITS);
head = &kretprobe_inst_table[hash];
kretprobe_table_lock(hash, &flags);
hlist_for_each_entry_safe(ri, tmp, head, hlist) {
if (ri->task == tk)
- recycle_rp_inst(ri, &empty_rp);
+ recycle_rp_inst(ri);
}
kretprobe_table_unlock(hash, &flags);
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
kprobe_busy_end();
}
@@ -1936,13 +1929,12 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
void *frame_pointer)
{
struct kretprobe_instance *ri = NULL, *last = NULL;
- struct hlist_head *head, empty_rp;
+ struct hlist_head *head;
struct hlist_node *tmp;
unsigned long flags;
kprobe_opcode_t *correct_ret_addr = NULL;
bool skipped = false;
- INIT_HLIST_HEAD(&empty_rp);
kretprobe_hash_lock(current, &head, &flags);
/*
@@ -2011,7 +2003,7 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
__this_cpu_write(current_kprobe, prev);
}
- recycle_rp_inst(ri, &empty_rp);
+ recycle_rp_inst(ri);
if (ri == last)
break;
@@ -2019,11 +2011,6 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
kretprobe_hash_unlock(current, &flags);
- hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
- hlist_del(&ri->hlist);
- kfree(ri);
- }
-
return (unsigned long)correct_ret_addr;
}
NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 16/21] kprobes: Make local used functions static
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (14 preceding siblings ...)
2020-08-29 13:02 ` [PATCH v5 15/21] kprobes: Free kretprobe_instance with rcu callback Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-08-29 13:03 ` [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all() Masami Hiramatsu
` (5 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
Since we unified the kretprobe trampoline handler from arch/* code,
some functions and objects no need to be exported anymore.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v4:
- Convert kretprobe_assert() into a BUG_ON().
---
include/linux/kprobes.h | 15 ---------------
kernel/kprobes.c | 9 ++++-----
2 files changed, 4 insertions(+), 20 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 663be8debf25..9c880c8a4e80 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -190,7 +190,6 @@ static inline int kprobes_built_in(void)
return 1;
}
-extern struct kprobe kprobe_busy;
void kprobe_busy_begin(void);
void kprobe_busy_end(void);
@@ -235,16 +234,6 @@ static inline int arch_trampoline_kprobe(struct kprobe *p)
extern struct kretprobe_blackpoint kretprobe_blacklist[];
-static inline void kretprobe_assert(struct kretprobe_instance *ri,
- unsigned long orig_ret_address, unsigned long trampoline_address)
-{
- if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
- printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
- ri->rp, ri->rp->kp.addr);
- BUG();
- }
-}
-
#ifdef CONFIG_KPROBES_SANITY_TEST
extern int init_test_probes(void);
#else
@@ -364,10 +353,6 @@ int arch_check_ftrace_location(struct kprobe *p);
/* Get the kprobe at this addr (if any) - called with preemption disabled */
struct kprobe *get_kprobe(void *addr);
-void kretprobe_hash_lock(struct task_struct *tsk,
- struct hlist_head **head, unsigned long *flags);
-void kretprobe_hash_unlock(struct task_struct *tsk, unsigned long *flags);
-struct hlist_head * kretprobe_inst_table_head(struct task_struct *tsk);
/* kprobe_running() will just return the current_kprobe on this CPU */
static inline struct kprobe *kprobe_running(void)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0676868f1ac2..732a70163584 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1239,7 +1239,7 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
}
NOKPROBE_SYMBOL(recycle_rp_inst);
-void kretprobe_hash_lock(struct task_struct *tsk,
+static void kretprobe_hash_lock(struct task_struct *tsk,
struct hlist_head **head, unsigned long *flags)
__acquires(hlist_lock)
{
@@ -1261,7 +1261,7 @@ __acquires(hlist_lock)
}
NOKPROBE_SYMBOL(kretprobe_table_lock);
-void kretprobe_hash_unlock(struct task_struct *tsk,
+static void kretprobe_hash_unlock(struct task_struct *tsk,
unsigned long *flags)
__releases(hlist_lock)
{
@@ -1282,7 +1282,7 @@ __releases(hlist_lock)
}
NOKPROBE_SYMBOL(kretprobe_table_unlock);
-struct kprobe kprobe_busy = {
+static struct kprobe kprobe_busy = {
.addr = (void *) get_kprobe,
};
@@ -1983,8 +1983,7 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
break;
}
- kretprobe_assert(ri, (unsigned long)correct_ret_addr,
- (unsigned long)trampoline_address);
+ BUG_ON(!correct_ret_addr || (correct_ret_addr == trampoline_address));
last = ri;
hlist_for_each_entry_safe(ri, tmp, head, hlist) {
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all()
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (15 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 16/21] kprobes: Make local used functions static Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-10-12 16:24 ` Ingo Molnar
2020-08-29 13:03 ` [PATCH v5 18/21] kprobes: Remove kretprobe hash Masami Hiramatsu
` (4 subsequent siblings)
21 siblings, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
From: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
drivers/gpu/drm/i915/i915_request.c | 6 ------
include/linux/llist.h | 23 +++++++++++++++++++++++
2 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0b2fe55e6194..0e851b925c8c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -357,12 +357,6 @@ void i915_request_retire_upto(struct i915_request *rq)
} while (i915_request_retire(tmp) && tmp != rq);
}
-static void __llist_add(struct llist_node *node, struct llist_head *head)
-{
- node->next = head->first;
- head->first = node;
-}
-
static struct i915_request * const *
__engine_active(struct intel_engine_cs *engine)
{
diff --git a/include/linux/llist.h b/include/linux/llist.h
index 2e9c7215882b..24f207b0190b 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -197,6 +197,16 @@ static inline struct llist_node *llist_next(struct llist_node *node)
extern bool llist_add_batch(struct llist_node *new_first,
struct llist_node *new_last,
struct llist_head *head);
+
+static inline bool __llist_add_batch(struct llist_node *new_first,
+ struct llist_node *new_last,
+ struct llist_head *head)
+{
+ new_last->next = head->first;
+ head->first = new_first;
+ return new_last->next == NULL;
+}
+
/**
* llist_add - add a new entry
* @new: new entry to be added
@@ -209,6 +219,11 @@ static inline bool llist_add(struct llist_node *new, struct llist_head *head)
return llist_add_batch(new, new, head);
}
+static inline bool __llist_add(struct llist_node *new, struct llist_head *head)
+{
+ return __llist_add_batch(new, new, head);
+}
+
/**
* llist_del_all - delete all entries from lock-less list
* @head: the head of lock-less list to delete all entries
@@ -222,6 +237,14 @@ static inline struct llist_node *llist_del_all(struct llist_head *head)
return xchg(&head->first, NULL);
}
+static inline struct llist_node *__llist_del_all(struct llist_head *head)
+{
+ struct llist_node *first = head->first;
+
+ head->first = NULL;
+ return first;
+}
+
extern struct llist_node *llist_del_first(struct llist_head *head);
struct llist_node *llist_reverse_order(struct llist_node *head);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all()
2020-08-29 13:03 ` [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all() Masami Hiramatsu
@ 2020-10-12 16:24 ` Ingo Molnar
2020-10-14 0:24 ` Masami Hiramatsu
0 siblings, 1 reply; 51+ messages in thread
From: Ingo Molnar @ 2020-10-12 16:24 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: linux-kernel, Peter Zijlstra, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
* Masami Hiramatsu <mhiramat@kernel.org> wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Because you are forwarding this patch here, I've added your SOB:
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
(Let me know if that's not OK.)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all()
2020-10-12 16:24 ` Ingo Molnar
@ 2020-10-14 0:24 ` Masami Hiramatsu
0 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-10-14 0:24 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Peter Zijlstra, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Mon, 12 Oct 2020 18:24:54 +0200
Ingo Molnar <mingo@kernel.org> wrote:
>
> * Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > From: Peter Zijlstra <peterz@infradead.org>
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> Because you are forwarding this patch here, I've added your SOB:
>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
>
> (Let me know if that's not OK.)
OK, I keep it in mind next time.
Thanks Ingo!
>
> Thanks,
>
> Ingo
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH v5 18/21] kprobes: Remove kretprobe hash
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (16 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 17/21] llist: Add nonatomic __llist_add() and __llist_dell_all() Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-08-29 13:03 ` [PATCH v5 19/21] asm-generic/atomic: Add try_cmpxchg() fallbacks Masami Hiramatsu
` (3 subsequent siblings)
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
From: Peter Zijlstra <peterz@infradead.org>
The kretprobe hash is mostly superfluous, replace it with a per-task
variable.
This gets rid of the task hash and it's related locking.
Note that this may change the kprobes module-exported API for kretprobe
handlers. If any out-of-tree kretprobe user uses ri->rp, use
get_kretprobe(ri) instead.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
Changes in v5:
- Fix not to flush current task's kretprobe_instance
- Remove task scan by merging kretprobe_holder patch
- Use READ_ONCE() and rcu_locked check instead of smp_r/wmb()
- Make the failure case in __kretprobe_trampoline_handler() clearer
---
include/linux/kprobes.h | 19 +++
include/linux/sched.h | 4 +
kernel/fork.c | 4 +
kernel/kprobes.c | 236 +++++++++++++------------------------------
kernel/trace/trace_kprobe.c | 3 -
5 files changed, 97 insertions(+), 169 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 9c880c8a4e80..f8f87a13345a 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -27,6 +27,7 @@
#include <linux/rcupdate.h>
#include <linux/mutex.h>
#include <linux/ftrace.h>
+#include <linux/refcount.h>
#include <asm/kprobes.h>
#ifdef CONFIG_KPROBES
@@ -144,6 +145,11 @@ static inline int kprobe_ftrace(struct kprobe *p)
* ignored, due to maxactive being too low.
*
*/
+struct kretprobe_holder {
+ struct kretprobe *rp;
+ refcount_t ref;
+};
+
struct kretprobe {
struct kprobe kp;
kretprobe_handler_t handler;
@@ -152,17 +158,18 @@ struct kretprobe {
int nmissed;
size_t data_size;
struct hlist_head free_instances;
+ struct kretprobe_holder *rph;
raw_spinlock_t lock;
};
struct kretprobe_instance {
union {
+ struct llist_node llist;
struct hlist_node hlist;
struct rcu_head rcu;
};
- struct kretprobe *rp;
+ struct kretprobe_holder *rph;
kprobe_opcode_t *ret_addr;
- struct task_struct *task;
void *fp;
char data[];
};
@@ -221,6 +228,14 @@ unsigned long kretprobe_trampoline_handler(struct pt_regs *regs,
return ret;
}
+static nokprobe_inline struct kretprobe *get_kretprobe(struct kretprobe_instance *ri)
+{
+ RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
+ "Kretprobe is accessed from instance under preemptive context");
+
+ return READ_ONCE(ri->rph->rp);
+}
+
#else /* CONFIG_KRETPROBES */
static inline void arch_prepare_kretprobe(struct kretprobe *rp,
struct pt_regs *regs)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 93ecd930efd3..0f2532f052a9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1315,6 +1315,10 @@ struct task_struct {
struct callback_head mce_kill_me;
#endif
+#ifdef CONFIG_KRETPROBES
+ struct llist_head kretprobe_instances;
+#endif
+
/*
* New fields for task_struct should be added above here, so that
* they are included in the randomized portion of task_struct.
diff --git a/kernel/fork.c b/kernel/fork.c
index 4d32190861bd..2ff5cceb0732 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2161,6 +2161,10 @@ static __latent_entropy struct task_struct *copy_process(
INIT_LIST_HEAD(&p->thread_group);
p->task_works = NULL;
+#ifdef CONFIG_KRETPROBES
+ p->kretprobe_instances.first = NULL;
+#endif
+
/*
* Ensure that the cgroup subsystem policies allow the new process to be
* forked. It should be noted the the new process's css_set can be changed
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 732a70163584..bc65603fce00 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -53,7 +53,6 @@ static int kprobes_initialized;
* - RCU hlist traversal under disabling preempt (breakpoint handlers)
*/
static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
-static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
/* NOTE: change this value only with kprobe_mutex held */
static bool kprobes_all_disarmed;
@@ -61,9 +60,6 @@ static bool kprobes_all_disarmed;
/* This protects kprobe_table and optimizing_list */
static DEFINE_MUTEX(kprobe_mutex);
static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
-static struct {
- raw_spinlock_t lock ____cacheline_aligned_in_smp;
-} kretprobe_table_locks[KPROBE_TABLE_SIZE];
kprobe_opcode_t * __weak kprobe_lookup_name(const char *name,
unsigned int __unused)
@@ -71,11 +67,6 @@ kprobe_opcode_t * __weak kprobe_lookup_name(const char *name,
return ((kprobe_opcode_t *)(kallsyms_lookup_name(name)));
}
-static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
-{
- return &(kretprobe_table_locks[hash].lock);
-}
-
/* Blacklist -- list of struct kprobe_blacklist_entry */
static LIST_HEAD(kprobe_blacklist);
@@ -1223,65 +1214,30 @@ void kprobes_inc_nmissed_count(struct kprobe *p)
}
NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);
+static void free_rp_inst_rcu(struct rcu_head *head)
+{
+ struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+
+ if (refcount_dec_and_test(&ri->rph->ref))
+ kfree(ri->rph);
+ kfree(ri);
+}
+NOKPROBE_SYMBOL(free_rp_inst_rcu);
+
static void recycle_rp_inst(struct kretprobe_instance *ri)
{
- struct kretprobe *rp = ri->rp;
+ struct kretprobe *rp = get_kretprobe(ri);
- /* remove rp inst off the rprobe_inst_table */
- hlist_del(&ri->hlist);
INIT_HLIST_NODE(&ri->hlist);
if (likely(rp)) {
raw_spin_lock(&rp->lock);
hlist_add_head(&ri->hlist, &rp->free_instances);
raw_spin_unlock(&rp->lock);
} else
- kfree_rcu(ri, rcu);
+ call_rcu(&ri->rcu, free_rp_inst_rcu);
}
NOKPROBE_SYMBOL(recycle_rp_inst);
-static void kretprobe_hash_lock(struct task_struct *tsk,
- struct hlist_head **head, unsigned long *flags)
-__acquires(hlist_lock)
-{
- unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
- raw_spinlock_t *hlist_lock;
-
- *head = &kretprobe_inst_table[hash];
- hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_lock_irqsave(hlist_lock, *flags);
-}
-NOKPROBE_SYMBOL(kretprobe_hash_lock);
-
-static void kretprobe_table_lock(unsigned long hash,
- unsigned long *flags)
-__acquires(hlist_lock)
-{
- raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_lock_irqsave(hlist_lock, *flags);
-}
-NOKPROBE_SYMBOL(kretprobe_table_lock);
-
-static void kretprobe_hash_unlock(struct task_struct *tsk,
- unsigned long *flags)
-__releases(hlist_lock)
-{
- unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
- raw_spinlock_t *hlist_lock;
-
- hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_unlock_irqrestore(hlist_lock, *flags);
-}
-NOKPROBE_SYMBOL(kretprobe_hash_unlock);
-
-static void kretprobe_table_unlock(unsigned long hash,
- unsigned long *flags)
-__releases(hlist_lock)
-{
- raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_unlock_irqrestore(hlist_lock, *flags);
-}
-NOKPROBE_SYMBOL(kretprobe_table_unlock);
-
static struct kprobe kprobe_busy = {
.addr = (void *) get_kprobe,
};
@@ -1311,24 +1267,21 @@ void kprobe_busy_end(void)
void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
- struct hlist_head *head;
- struct hlist_node *tmp;
- unsigned long hash, flags = 0;
+ struct llist_node *node;
+ /* Early boot, not yet initialized. */
if (unlikely(!kprobes_initialized))
- /* Early boot. kretprobe_table_locks not yet initialized. */
return;
kprobe_busy_begin();
- hash = hash_ptr(tk, KPROBE_HASH_BITS);
- head = &kretprobe_inst_table[hash];
- kretprobe_table_lock(hash, &flags);
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task == tk)
- recycle_rp_inst(ri);
+ node = __llist_del_all(&tk->kretprobe_instances);
+ while (node) {
+ ri = container_of(node, struct kretprobe_instance, llist);
+ node = node->next;
+
+ recycle_rp_inst(ri);
}
- kretprobe_table_unlock(hash, &flags);
kprobe_busy_end();
}
@@ -1338,36 +1291,19 @@ static inline void free_rp_inst(struct kretprobe *rp)
{
struct kretprobe_instance *ri;
struct hlist_node *next;
+ int count = 0;
hlist_for_each_entry_safe(ri, next, &rp->free_instances, hlist) {
hlist_del(&ri->hlist);
kfree(ri);
+ count++;
}
-}
-
-static void cleanup_rp_inst(struct kretprobe *rp)
-{
- unsigned long flags, hash;
- struct kretprobe_instance *ri;
- struct hlist_node *next;
- struct hlist_head *head;
- /* To avoid recursive kretprobe by NMI, set kprobe busy here */
- kprobe_busy_begin();
- for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
- kretprobe_table_lock(hash, &flags);
- head = &kretprobe_inst_table[hash];
- hlist_for_each_entry_safe(ri, next, head, hlist) {
- if (ri->rp == rp)
- ri->rp = NULL;
- }
- kretprobe_table_unlock(hash, &flags);
+ if (refcount_sub_and_test(count, &rp->rph->ref)) {
+ kfree(rp->rph);
+ rp->rph = NULL;
}
- kprobe_busy_end();
-
- free_rp_inst(rp);
}
-NOKPROBE_SYMBOL(cleanup_rp_inst);
/* Add the new probe to ap->list */
static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
@@ -1928,88 +1864,56 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
void *trampoline_address,
void *frame_pointer)
{
- struct kretprobe_instance *ri = NULL, *last = NULL;
- struct hlist_head *head;
- struct hlist_node *tmp;
- unsigned long flags;
kprobe_opcode_t *correct_ret_addr = NULL;
- bool skipped = false;
+ struct kretprobe_instance *ri = NULL;
+ struct llist_node *first, *node;
+ struct kretprobe *rp;
- kretprobe_hash_lock(current, &head, &flags);
+ /* Find all nodes for this frame. */
+ first = node = current->kretprobe_instances.first;
+ while (node) {
+ ri = container_of(node, struct kretprobe_instance, llist);
- /*
- * It is possible to have multiple instances associated with a given
- * task either because multiple functions in the call path have
- * return probes installed on them, and/or more than one
- * return probe was registered for a target function.
- *
- * We can handle this because:
- * - instances are always pushed into the head of the list
- * - when multiple return probes are registered for the same
- * function, the (chronologically) first instance's ret_addr
- * will be the real return address, and all the rest will
- * point to kretprobe_trampoline.
- */
- hlist_for_each_entry(ri, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
- /*
- * Return probes must be pushed on this hash list correct
- * order (same as return order) so that it can be popped
- * correctly. However, if we find it is pushed it incorrect
- * order, this means we find a function which should not be
- * probed, because the wrong order entry is pushed on the
- * path of processing other kretprobe itself.
- */
- if (ri->fp != frame_pointer) {
- if (!skipped)
- pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n");
- skipped = true;
- continue;
- }
+ BUG_ON(ri->fp != frame_pointer);
- correct_ret_addr = ri->ret_addr;
- if (skipped)
- pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n",
- ri->rp->kp.addr);
-
- if (correct_ret_addr != trampoline_address)
+ if (ri->ret_addr != trampoline_address) {
+ correct_ret_addr = ri->ret_addr;
/*
* This is the real return address. Any other
* instances associated with this task are for
* other calls deeper on the call stack
*/
- break;
+ goto found;
+ }
+
+ node = node->next;
}
+ pr_err("Oops! Kretprobe fails to find correct return address.\n");
+ BUG_ON(1);
- BUG_ON(!correct_ret_addr || (correct_ret_addr == trampoline_address));
- last = ri;
+found:
+ /* Unlink all nodes for this frame. */
+ current->kretprobe_instances.first = node->next;
+ node->next = NULL;
- hlist_for_each_entry_safe(ri, tmp, head, hlist) {
- if (ri->task != current)
- /* another task is sharing our hash bucket */
- continue;
- if (ri->fp != frame_pointer)
- continue;
+ /* Run them.. */
+ while (first) {
+ ri = container_of(first, struct kretprobe_instance, llist);
+ first = first->next;
- if (ri->rp && ri->rp->handler) {
+ rp = get_kretprobe(ri);
+ if (rp && rp->handler) {
struct kprobe *prev = kprobe_running();
- __this_cpu_write(current_kprobe, &ri->rp->kp);
+ __this_cpu_write(current_kprobe, &rp->kp);
ri->ret_addr = correct_ret_addr;
- ri->rp->handler(ri, regs);
+ rp->handler(ri, regs);
__this_cpu_write(current_kprobe, prev);
}
recycle_rp_inst(ri);
-
- if (ri == last)
- break;
}
- kretprobe_hash_unlock(current, &flags);
-
return (unsigned long)correct_ret_addr;
}
NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
@@ -2021,11 +1925,10 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe *rp = container_of(p, struct kretprobe, kp);
- unsigned long hash, flags = 0;
+ unsigned long flags = 0;
struct kretprobe_instance *ri;
/* TODO: consider to only swap the RA after the last pre_handler fired */
- hash = hash_ptr(current, KPROBE_HASH_BITS);
raw_spin_lock_irqsave(&rp->lock, flags);
if (!hlist_empty(&rp->free_instances)) {
ri = hlist_entry(rp->free_instances.first,
@@ -2033,9 +1936,6 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
hlist_del(&ri->hlist);
raw_spin_unlock_irqrestore(&rp->lock, flags);
- ri->rp = rp;
- ri->task = current;
-
if (rp->entry_handler && rp->entry_handler(ri, regs)) {
raw_spin_lock_irqsave(&rp->lock, flags);
hlist_add_head(&ri->hlist, &rp->free_instances);
@@ -2045,11 +1945,8 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
arch_prepare_kretprobe(ri, regs);
- /* XXX(hch): why is there no hlist_move_head? */
- INIT_HLIST_NODE(&ri->hlist);
- kretprobe_table_lock(hash, &flags);
- hlist_add_head(&ri->hlist, &kretprobe_inst_table[hash]);
- kretprobe_table_unlock(hash, &flags);
+ __llist_add(&ri->llist, ¤t->kretprobe_instances);
+
} else {
rp->nmissed++;
raw_spin_unlock_irqrestore(&rp->lock, flags);
@@ -2112,16 +2009,24 @@ int register_kretprobe(struct kretprobe *rp)
}
raw_spin_lock_init(&rp->lock);
INIT_HLIST_HEAD(&rp->free_instances);
+ rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
+ if (!rp->rph)
+ return -ENOMEM;
+
+ rp->rph->rp = rp;
for (i = 0; i < rp->maxactive; i++) {
- inst = kmalloc(sizeof(struct kretprobe_instance) +
+ inst = kzalloc(sizeof(struct kretprobe_instance) +
rp->data_size, GFP_KERNEL);
if (inst == NULL) {
+ refcount_set(&rp->rph->ref, i);
free_rp_inst(rp);
return -ENOMEM;
}
+ inst->rph = rp->rph;
INIT_HLIST_NODE(&inst->hlist);
hlist_add_head(&inst->hlist, &rp->free_instances);
}
+ refcount_set(&rp->rph->ref, i);
rp->nmissed = 0;
/* Establish function entry probe point */
@@ -2163,16 +2068,18 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
if (num <= 0)
return;
mutex_lock(&kprobe_mutex);
- for (i = 0; i < num; i++)
+ for (i = 0; i < num; i++) {
if (__unregister_kprobe_top(&rps[i]->kp) < 0)
rps[i]->kp.addr = NULL;
+ rps[i]->rph->rp = NULL;
+ }
mutex_unlock(&kprobe_mutex);
synchronize_rcu();
for (i = 0; i < num; i++) {
if (rps[i]->kp.addr) {
__unregister_kprobe_bottom(&rps[i]->kp);
- cleanup_rp_inst(rps[i]);
+ free_rp_inst(rps[i]);
}
}
}
@@ -2534,11 +2441,8 @@ static int __init init_kprobes(void)
/* FIXME allocate the probe table, currently defined statically */
/* initialize all list heads */
- for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
+ for (i = 0; i < KPROBE_TABLE_SIZE; i++)
INIT_HLIST_HEAD(&kprobe_table[i]);
- INIT_HLIST_HEAD(&kretprobe_inst_table[i]);
- raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
- }
err = populate_kprobe_blacklist(__start_kprobe_blacklist,
__stop_kprobe_blacklist);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index aefb6065b508..07baf6f6cecc 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1714,7 +1714,8 @@ NOKPROBE_SYMBOL(kprobe_dispatcher);
static int
kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
{
- struct trace_kprobe *tk = container_of(ri->rp, struct trace_kprobe, rp);
+ struct kretprobe *rp = get_kretprobe(ri);
+ struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
raw_cpu_inc(*tk->nhit);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 19/21] asm-generic/atomic: Add try_cmpxchg() fallbacks
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (17 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 18/21] kprobes: Remove kretprobe hash Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-10-12 16:25 ` Ingo Molnar
2020-08-29 13:03 ` [PATCH v5 20/21] freelist: Lock less freelist Masami Hiramatsu
` (2 subsequent siblings)
21 siblings, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
From: Peter Zijlstra <peterz@infradead.org>
Only x86 provides try_cmpxchg() outside of the atomic_t interfaces,
provide generic fallbacks to create this interface from the widely
available cmpxchg() function.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
---
arch/x86/include/asm/atomic.h | 2
arch/x86/include/asm/atomic64_64.h | 2
arch/x86/include/asm/cmpxchg.h | 2
include/asm-generic/atomic-instrumented.h | 216 +++++++++++++++++------------
include/linux/atomic-arch-fallback.h | 90 +++++++++++-
include/linux/atomic-fallback.h | 90 +++++++++++-
scripts/atomic/gen-atomic-fallback.sh | 63 ++++++++
scripts/atomic/gen-atomic-instrumented.sh | 29 +++-
8 files changed, 372 insertions(+), 122 deletions(-)
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index b6cac6e9bb70..f732741ad7c7 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -199,7 +199,7 @@ static __always_inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
static __always_inline bool arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
{
- return try_cmpxchg(&v->counter, old, new);
+ return arch_try_cmpxchg(&v->counter, old, new);
}
#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 809bd010a751..7886d0578fc9 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -187,7 +187,7 @@ static inline s64 arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
{
- return try_cmpxchg(&v->counter, old, new);
+ return arch_try_cmpxchg(&v->counter, old, new);
}
#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index a8bfac131256..4d4ec5cbdc51 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -221,7 +221,7 @@ extern void __add_wrong_size(void)
#define __try_cmpxchg(ptr, pold, new, size) \
__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
-#define try_cmpxchg(ptr, pold, new) \
+#define arch_try_cmpxchg(ptr, pold, new) \
__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
/*
diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
index 379986e40159..2cab3fdaae3b 100644
--- a/include/asm-generic/atomic-instrumented.h
+++ b/include/asm-generic/atomic-instrumented.h
@@ -1642,148 +1642,192 @@ atomic64_dec_if_positive(atomic64_t *v)
#endif
#if !defined(arch_xchg_relaxed) || defined(arch_xchg)
-#define xchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg(__ai_ptr, __VA_ARGS__); \
+#define xchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_xchg_acquire)
-#define xchg_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_acquire(__ai_ptr, __VA_ARGS__); \
+#define xchg_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_acquire(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_xchg_release)
-#define xchg_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_release(__ai_ptr, __VA_ARGS__); \
+#define xchg_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_release(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_xchg_relaxed)
-#define xchg_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_relaxed(__ai_ptr, __VA_ARGS__); \
+#define xchg_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_relaxed(__ai_ptr, __VA_ARGS__); \
})
#endif
#if !defined(arch_cmpxchg_relaxed) || defined(arch_cmpxchg)
-#define cmpxchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg_acquire)
-#define cmpxchg_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_acquire(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_acquire(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg_release)
-#define cmpxchg_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_release(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_release(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg_relaxed)
-#define cmpxchg_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
})
#endif
#if !defined(arch_cmpxchg64_relaxed) || defined(arch_cmpxchg64)
-#define cmpxchg64(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg64(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg64_acquire)
-#define cmpxchg64_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_acquire(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg64_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_acquire(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg64_release)
-#define cmpxchg64_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_release(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg64_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_release(__ai_ptr, __VA_ARGS__); \
})
#endif
#if defined(arch_cmpxchg64_relaxed)
-#define cmpxchg64_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg64_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \
})
#endif
-#define cmpxchg_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_local(__ai_ptr, __VA_ARGS__); \
+#if !defined(arch_try_cmpxchg_relaxed) || defined(arch_try_cmpxchg)
+#define try_cmpxchg(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+#endif
+
+#if defined(arch_try_cmpxchg_acquire)
+#define try_cmpxchg_acquire(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+#endif
+
+#if defined(arch_try_cmpxchg_release)
+#define try_cmpxchg_release(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+#endif
+
+#if defined(arch_try_cmpxchg_relaxed)
+#define try_cmpxchg_relaxed(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+#endif
+
+#define cmpxchg_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_local(__ai_ptr, __VA_ARGS__); \
})
-#define cmpxchg64_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg64_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \
})
-#define sync_cmpxchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
+#define sync_cmpxchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
})
-#define cmpxchg_double(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
- arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg_double(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
+ arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \
})
-#define cmpxchg_double_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
- arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \
+#define cmpxchg_double_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
+ arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \
})
#endif /* _ASM_GENERIC_ATOMIC_INSTRUMENTED_H */
-// 89bf97f3a7509b740845e51ddf31055b48a81f40
+// ff0fe7f81ee97f01f13bb78b0e3ce800bc56d9dd
diff --git a/include/linux/atomic-arch-fallback.h b/include/linux/atomic-arch-fallback.h
index bcb6aa27cfa6..a3dba31df01e 100644
--- a/include/linux/atomic-arch-fallback.h
+++ b/include/linux/atomic-arch-fallback.h
@@ -9,9 +9,9 @@
#include <linux/compiler.h>
#ifndef arch_xchg_relaxed
-#define arch_xchg_relaxed arch_xchg
-#define arch_xchg_acquire arch_xchg
-#define arch_xchg_release arch_xchg
+#define arch_xchg_acquire arch_xchg
+#define arch_xchg_release arch_xchg
+#define arch_xchg_relaxed arch_xchg
#else /* arch_xchg_relaxed */
#ifndef arch_xchg_acquire
@@ -32,9 +32,9 @@
#endif /* arch_xchg_relaxed */
#ifndef arch_cmpxchg_relaxed
-#define arch_cmpxchg_relaxed arch_cmpxchg
-#define arch_cmpxchg_acquire arch_cmpxchg
-#define arch_cmpxchg_release arch_cmpxchg
+#define arch_cmpxchg_acquire arch_cmpxchg
+#define arch_cmpxchg_release arch_cmpxchg
+#define arch_cmpxchg_relaxed arch_cmpxchg
#else /* arch_cmpxchg_relaxed */
#ifndef arch_cmpxchg_acquire
@@ -55,9 +55,9 @@
#endif /* arch_cmpxchg_relaxed */
#ifndef arch_cmpxchg64_relaxed
-#define arch_cmpxchg64_relaxed arch_cmpxchg64
-#define arch_cmpxchg64_acquire arch_cmpxchg64
-#define arch_cmpxchg64_release arch_cmpxchg64
+#define arch_cmpxchg64_acquire arch_cmpxchg64
+#define arch_cmpxchg64_release arch_cmpxchg64
+#define arch_cmpxchg64_relaxed arch_cmpxchg64
#else /* arch_cmpxchg64_relaxed */
#ifndef arch_cmpxchg64_acquire
@@ -77,6 +77,76 @@
#endif /* arch_cmpxchg64_relaxed */
+#ifndef arch_try_cmpxchg_relaxed
+#ifdef arch_try_cmpxchg
+#define arch_try_cmpxchg_acquire arch_try_cmpxchg
+#define arch_try_cmpxchg_release arch_try_cmpxchg
+#define arch_try_cmpxchg_relaxed arch_try_cmpxchg
+#endif /* arch_try_cmpxchg */
+
+#ifndef arch_try_cmpxchg
+#define arch_try_cmpxchg(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg */
+
+#ifndef arch_try_cmpxchg_acquire
+#define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_acquire */
+
+#ifndef arch_try_cmpxchg_release
+#define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_release */
+
+#ifndef arch_try_cmpxchg_relaxed
+#define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_relaxed */
+
+#else /* arch_try_cmpxchg_relaxed */
+
+#ifndef arch_try_cmpxchg_acquire
+#define arch_try_cmpxchg_acquire(...) \
+ __atomic_op_acquire(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg_release
+#define arch_try_cmpxchg_release(...) \
+ __atomic_op_release(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg
+#define arch_try_cmpxchg(...) \
+ __atomic_op_fence(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#endif /* arch_try_cmpxchg_relaxed */
+
#ifndef arch_atomic_read_acquire
static __always_inline int
arch_atomic_read_acquire(const atomic_t *v)
@@ -2288,4 +2358,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
#endif
#endif /* _LINUX_ATOMIC_FALLBACK_H */
-// 90cd26cfd69d2250303d654955a0cc12620fb91b
+// cca554917d7ea73d5e3e7397dd70c484cad9b2c4
diff --git a/include/linux/atomic-fallback.h b/include/linux/atomic-fallback.h
index fd525c71d676..2a3f55d98be9 100644
--- a/include/linux/atomic-fallback.h
+++ b/include/linux/atomic-fallback.h
@@ -9,9 +9,9 @@
#include <linux/compiler.h>
#ifndef xchg_relaxed
-#define xchg_relaxed xchg
-#define xchg_acquire xchg
-#define xchg_release xchg
+#define xchg_acquire xchg
+#define xchg_release xchg
+#define xchg_relaxed xchg
#else /* xchg_relaxed */
#ifndef xchg_acquire
@@ -32,9 +32,9 @@
#endif /* xchg_relaxed */
#ifndef cmpxchg_relaxed
-#define cmpxchg_relaxed cmpxchg
-#define cmpxchg_acquire cmpxchg
-#define cmpxchg_release cmpxchg
+#define cmpxchg_acquire cmpxchg
+#define cmpxchg_release cmpxchg
+#define cmpxchg_relaxed cmpxchg
#else /* cmpxchg_relaxed */
#ifndef cmpxchg_acquire
@@ -55,9 +55,9 @@
#endif /* cmpxchg_relaxed */
#ifndef cmpxchg64_relaxed
-#define cmpxchg64_relaxed cmpxchg64
-#define cmpxchg64_acquire cmpxchg64
-#define cmpxchg64_release cmpxchg64
+#define cmpxchg64_acquire cmpxchg64
+#define cmpxchg64_release cmpxchg64
+#define cmpxchg64_relaxed cmpxchg64
#else /* cmpxchg64_relaxed */
#ifndef cmpxchg64_acquire
@@ -77,6 +77,76 @@
#endif /* cmpxchg64_relaxed */
+#ifndef try_cmpxchg_relaxed
+#ifdef try_cmpxchg
+#define try_cmpxchg_acquire try_cmpxchg
+#define try_cmpxchg_release try_cmpxchg
+#define try_cmpxchg_relaxed try_cmpxchg
+#endif /* try_cmpxchg */
+
+#ifndef try_cmpxchg
+#define try_cmpxchg(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = cmpxchg((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* try_cmpxchg */
+
+#ifndef try_cmpxchg_acquire
+#define try_cmpxchg_acquire(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = cmpxchg_acquire((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* try_cmpxchg_acquire */
+
+#ifndef try_cmpxchg_release
+#define try_cmpxchg_release(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = cmpxchg_release((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* try_cmpxchg_release */
+
+#ifndef try_cmpxchg_relaxed
+#define try_cmpxchg_relaxed(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = cmpxchg_relaxed((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* try_cmpxchg_relaxed */
+
+#else /* try_cmpxchg_relaxed */
+
+#ifndef try_cmpxchg_acquire
+#define try_cmpxchg_acquire(...) \
+ __atomic_op_acquire(try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef try_cmpxchg_release
+#define try_cmpxchg_release(...) \
+ __atomic_op_release(try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef try_cmpxchg
+#define try_cmpxchg(...) \
+ __atomic_op_fence(try_cmpxchg, __VA_ARGS__)
+#endif
+
+#endif /* try_cmpxchg_relaxed */
+
#define arch_atomic_read atomic_read
#define arch_atomic_read_acquire atomic_read_acquire
@@ -2522,4 +2592,4 @@ atomic64_dec_if_positive(atomic64_t *v)
#endif
#endif /* _LINUX_ATOMIC_FALLBACK_H */
-// 9d95b56f98d82a2a26c7b79ccdd0c47572d50a6f
+// d78e6c293c661c15188f0ec05bce45188c8d5892
diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
index 693dfa1de430..317a6cec76e1 100755
--- a/scripts/atomic/gen-atomic-fallback.sh
+++ b/scripts/atomic/gen-atomic-fallback.sh
@@ -144,15 +144,11 @@ gen_proto_order_variants()
printf "#endif /* ${basename}_relaxed */\n\n"
}
-gen_xchg_fallbacks()
+gen_order_fallbacks()
{
local xchg="$1"; shift
+
cat <<EOF
-#ifndef ${xchg}_relaxed
-#define ${xchg}_relaxed ${xchg}
-#define ${xchg}_acquire ${xchg}
-#define ${xchg}_release ${xchg}
-#else /* ${xchg}_relaxed */
#ifndef ${xchg}_acquire
#define ${xchg}_acquire(...) \\
@@ -169,11 +165,62 @@ cat <<EOF
__atomic_op_fence(${xchg}, __VA_ARGS__)
#endif
-#endif /* ${xchg}_relaxed */
+EOF
+}
+
+gen_xchg_fallbacks()
+{
+ local xchg="$1"; shift
+ printf "#ifndef ${xchg}_relaxed\n"
+
+ gen_basic_fallbacks ${xchg}
+
+ printf "#else /* ${xchg}_relaxed */\n"
+
+ gen_order_fallbacks ${xchg}
+
+ printf "#endif /* ${xchg}_relaxed */\n\n"
+}
+
+gen_try_cmpxchg_fallback()
+{
+ local order="$1"; shift;
+
+cat <<EOF
+#ifndef ${ARCH}try_cmpxchg${order}
+#define ${ARCH}try_cmpxchg${order}(_ptr, _oldp, _new) \\
+({ \\
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
+ ___r = ${ARCH}cmpxchg${order}((_ptr), ___o, (_new)); \\
+ if (unlikely(___r != ___o)) \\
+ *___op = ___r; \\
+ likely(___r == ___o); \\
+})
+#endif /* ${ARCH}try_cmpxchg${order} */
EOF
}
+gen_try_cmpxchg_fallbacks()
+{
+ printf "#ifndef ${ARCH}try_cmpxchg_relaxed\n"
+ printf "#ifdef ${ARCH}try_cmpxchg\n"
+
+ gen_basic_fallbacks "${ARCH}try_cmpxchg"
+
+ printf "#endif /* ${ARCH}try_cmpxchg */\n\n"
+
+ for order in "" "_acquire" "_release" "_relaxed"; do
+ gen_try_cmpxchg_fallback "${order}"
+ done
+
+ printf "#else /* ${ARCH}try_cmpxchg_relaxed */\n"
+
+ gen_order_fallbacks "${ARCH}try_cmpxchg"
+
+ printf "#endif /* ${ARCH}try_cmpxchg_relaxed */\n\n"
+}
+
cat << EOF
// SPDX-License-Identifier: GPL-2.0
@@ -191,6 +238,8 @@ for xchg in "${ARCH}xchg" "${ARCH}cmpxchg" "${ARCH}cmpxchg64"; do
gen_xchg_fallbacks "${xchg}"
done
+gen_try_cmpxchg_fallbacks
+
grep '^[a-z]' "$1" | while read name meta args; do
gen_proto "${meta}" "${name}" "${ARCH}" "atomic" "int" ${args}
done
diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
index 6afadf73da17..85dc25685c0d 100755
--- a/scripts/atomic/gen-atomic-instrumented.sh
+++ b/scripts/atomic/gen-atomic-instrumented.sh
@@ -103,14 +103,31 @@ gen_xchg()
local xchg="$1"; shift
local mult="$1"; shift
+ if [ "${xchg%${xchg#try_cmpxchg}}" = "try_cmpxchg" ] ; then
+
+cat <<EOF
+#define ${xchg}(ptr, oldp, ...) \\
+({ \\
+ typeof(ptr) __ai_ptr = (ptr); \\
+ typeof(oldp) __ai_oldp = (oldp); \\
+ instrument_atomic_write(__ai_ptr, ${mult}sizeof(*__ai_ptr)); \\
+ instrument_atomic_write(__ai_oldp, ${mult}sizeof(*__ai_oldp)); \\
+ arch_${xchg}(__ai_ptr, __ai_oldp, __VA_ARGS__); \\
+})
+EOF
+
+ else
+
cat <<EOF
-#define ${xchg}(ptr, ...) \\
-({ \\
- typeof(ptr) __ai_ptr = (ptr); \\
- instrument_atomic_write(__ai_ptr, ${mult}sizeof(*__ai_ptr)); \\
- arch_${xchg}(__ai_ptr, __VA_ARGS__); \\
+#define ${xchg}(ptr, ...) \\
+({ \\
+ typeof(ptr) __ai_ptr = (ptr); \\
+ instrument_atomic_write(__ai_ptr, ${mult}sizeof(*__ai_ptr)); \\
+ arch_${xchg}(__ai_ptr, __VA_ARGS__); \\
})
EOF
+
+ fi
}
gen_optional_xchg()
@@ -160,7 +177,7 @@ grep '^[a-z]' "$1" | while read name meta args; do
gen_proto "${meta}" "${name}" "atomic64" "s64" ${args}
done
-for xchg in "xchg" "cmpxchg" "cmpxchg64"; do
+for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg"; do
for order in "" "_acquire" "_release" "_relaxed"; do
gen_optional_xchg "${xchg}" "${order}"
done
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH v5 19/21] asm-generic/atomic: Add try_cmpxchg() fallbacks
2020-08-29 13:03 ` [PATCH v5 19/21] asm-generic/atomic: Add try_cmpxchg() fallbacks Masami Hiramatsu
@ 2020-10-12 16:25 ` Ingo Molnar
0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2020-10-12 16:25 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: linux-kernel, Peter Zijlstra, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
* Masami Hiramatsu <mhiramat@kernel.org> wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> Only x86 provides try_cmpxchg() outside of the atomic_t interfaces,
> provide generic fallbacks to create this interface from the widely
> available cmpxchg() function.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Acked-by: Will Deacon <will@kernel.org>
Your SOB was missing here too.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH v5 20/21] freelist: Lock less freelist
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (18 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 19/21] asm-generic/atomic: Add try_cmpxchg() fallbacks Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-08-29 13:03 ` [PATCH v5 21/21] kprobes: Replace rp->free_instance with freelist Masami Hiramatsu
2020-09-01 19:08 ` [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Peter Zijlstra
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
From: Peter Zijlstra <peterz@infradead.org>
Cc: cameron@moodycamel.com
Cc: oleg@redhat.com
Cc: will@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/freelist.h | 129 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 129 insertions(+)
create mode 100644 include/linux/freelist.h
diff --git a/include/linux/freelist.h b/include/linux/freelist.h
new file mode 100644
index 000000000000..fc1842b96469
--- /dev/null
+++ b/include/linux/freelist.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
+#ifndef FREELIST_H
+#define FREELIST_H
+
+#include <linux/atomic.h>
+
+/*
+ * Copyright: cameron@moodycamel.com
+ *
+ * A simple CAS-based lock-free free list. Not the fastest thing in the world
+ * under heavy contention, but simple and correct (assuming nodes are never
+ * freed until after the free list is destroyed), and fairly speedy under low
+ * contention.
+ *
+ * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
+ */
+
+struct freelist_node {
+ atomic_t refs;
+ struct freelist_node *next;
+};
+
+struct freelist_head {
+ struct freelist_node *head;
+};
+
+#define REFS_ON_FREELIST 0x80000000
+#define REFS_MASK 0x7FFFFFFF
+
+static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
+{
+ /*
+ * Since the refcount is zero, and nobody can increase it once it's
+ * zero (except us, and we run only one copy of this method per node at
+ * a time, i.e. the single thread case), then we know we can safely
+ * change the next pointer of the node; however, once the refcount is
+ * back above zero, then other threads could increase it (happens under
+ * heavy contention, when the refcount goes to zero in between a load
+ * and a refcount increment of a node in try_get, then back up to
+ * something non-zero, then the refcount increment is done by the other
+ * thread) -- so if the CAS to add the node to the actual list fails,
+ * decrese the refcount and leave the add operation to the next thread
+ * who puts the refcount back to zero (which could be us, hence the
+ * loop).
+ */
+ struct freelist_node *head = READ_ONCE(list->head);
+
+ for (;;) {
+ WRITE_ONCE(node->next, head);
+ atomic_set_release(&node->refs, 1);
+
+ if (!try_cmpxchg_release(&list->head, &head, node)) {
+ /*
+ * Hmm, the add failed, but we can only try again when
+ * the refcount goes back to zero.
+ */
+ if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
+ continue;
+ }
+ return;
+ }
+}
+
+static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
+{
+ /*
+ * We know that the should-be-on-freelist bit is 0 at this point, so
+ * it's safe to set it using a fetch_add.
+ */
+ if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
+ /*
+ * Oh look! We were the last ones referencing this node, and we
+ * know we want to add it to the free list, so let's do it!
+ */
+ __freelist_add(node, list);
+ }
+}
+
+static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
+{
+ struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
+ unsigned int refs;
+
+ while (head) {
+ prev = head;
+ refs = atomic_read(&head->refs);
+ if ((refs & REFS_MASK) == 0 ||
+ !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
+ head = smp_load_acquire(&list->head);
+ continue;
+ }
+
+ /*
+ * Good, reference count has been incremented (it wasn't at
+ * zero), which means we can read the next and not worry about
+ * it changing between now and the time we do the CAS.
+ */
+ next = READ_ONCE(head->next);
+ if (try_cmpxchg_acquire(&list->head, &head, next)) {
+ /*
+ * Yay, got the node. This means it was on the list,
+ * which means should-be-on-freelist must be false no
+ * matter the refcount (because nobody else knows it's
+ * been taken off yet, it can't have been put back on).
+ */
+ WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
+
+ /*
+ * Decrease refcount twice, once for our ref, and once
+ * for the list's ref.
+ */
+ atomic_fetch_add(-2, &head->refs);
+
+ return head;
+ }
+
+ /*
+ * OK, the head must have changed on us, but we still need to decrement
+ * the refcount we increased.
+ */
+ refs = atomic_fetch_add(-1, &prev->refs);
+ if (refs == REFS_ON_FREELIST + 1)
+ __freelist_add(prev, list);
+ }
+
+ return NULL;
+}
+
+#endif /* FREELIST_H */
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v5 21/21] kprobes: Replace rp->free_instance with freelist
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (19 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 20/21] freelist: Lock less freelist Masami Hiramatsu
@ 2020-08-29 13:03 ` Masami Hiramatsu
2020-09-01 19:08 ` [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Peter Zijlstra
21 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-08-29 13:03 UTC (permalink / raw)
To: linux-kernel, Peter Zijlstra
Cc: Eddy_Wu, x86, davem, rostedt, naveen.n.rao, anil.s.keshavamurthy,
linux-arch, cameron, oleg, will, paulmck, mhiramat
From: Peter Zijlstra <peterz@infradead.org>
Gets rid of rp->lock, and as a result kretprobes are now fully
lockless.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
Changes
- [MH] expel the llist from anon union in kretprobe_instance
---
include/linux/kprobes.h | 8 +++----
kernel/kprobes.c | 56 ++++++++++++++++++++---------------------------
2 files changed, 28 insertions(+), 36 deletions(-)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index f8f87a13345a..89ff0fc15286 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -28,6 +28,7 @@
#include <linux/mutex.h>
#include <linux/ftrace.h>
#include <linux/refcount.h>
+#include <linux/freelist.h>
#include <asm/kprobes.h>
#ifdef CONFIG_KPROBES
@@ -157,17 +158,16 @@ struct kretprobe {
int maxactive;
int nmissed;
size_t data_size;
- struct hlist_head free_instances;
+ struct freelist_head freelist;
struct kretprobe_holder *rph;
- raw_spinlock_t lock;
};
struct kretprobe_instance {
union {
- struct llist_node llist;
- struct hlist_node hlist;
+ struct freelist_node freelist;
struct rcu_head rcu;
};
+ struct llist_node llist;
struct kretprobe_holder *rph;
kprobe_opcode_t *ret_addr;
void *fp;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index bc65603fce00..af6551992b91 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1228,11 +1228,8 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
{
struct kretprobe *rp = get_kretprobe(ri);
- INIT_HLIST_NODE(&ri->hlist);
if (likely(rp)) {
- raw_spin_lock(&rp->lock);
- hlist_add_head(&ri->hlist, &rp->free_instances);
- raw_spin_unlock(&rp->lock);
+ freelist_add(&ri->freelist, &rp->freelist);
} else
call_rcu(&ri->rcu, free_rp_inst_rcu);
}
@@ -1290,11 +1287,14 @@ NOKPROBE_SYMBOL(kprobe_flush_task);
static inline void free_rp_inst(struct kretprobe *rp)
{
struct kretprobe_instance *ri;
- struct hlist_node *next;
+ struct freelist_node *node;
int count = 0;
- hlist_for_each_entry_safe(ri, next, &rp->free_instances, hlist) {
- hlist_del(&ri->hlist);
+ node = rp->freelist.head;
+ while (node) {
+ ri = container_of(node, struct kretprobe_instance, freelist);
+ node = node->next;
+
kfree(ri);
count++;
}
@@ -1925,32 +1925,26 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe *rp = container_of(p, struct kretprobe, kp);
- unsigned long flags = 0;
struct kretprobe_instance *ri;
+ struct freelist_node *fn;
- /* TODO: consider to only swap the RA after the last pre_handler fired */
- raw_spin_lock_irqsave(&rp->lock, flags);
- if (!hlist_empty(&rp->free_instances)) {
- ri = hlist_entry(rp->free_instances.first,
- struct kretprobe_instance, hlist);
- hlist_del(&ri->hlist);
- raw_spin_unlock_irqrestore(&rp->lock, flags);
-
- if (rp->entry_handler && rp->entry_handler(ri, regs)) {
- raw_spin_lock_irqsave(&rp->lock, flags);
- hlist_add_head(&ri->hlist, &rp->free_instances);
- raw_spin_unlock_irqrestore(&rp->lock, flags);
- return 0;
- }
-
- arch_prepare_kretprobe(ri, regs);
+ fn = freelist_try_get(&rp->freelist);
+ if (!fn) {
+ rp->nmissed++;
+ return 0;
+ }
- __llist_add(&ri->llist, ¤t->kretprobe_instances);
+ ri = container_of(fn, struct kretprobe_instance, freelist);
- } else {
- rp->nmissed++;
- raw_spin_unlock_irqrestore(&rp->lock, flags);
+ if (rp->entry_handler && rp->entry_handler(ri, regs)) {
+ freelist_add(&ri->freelist, &rp->freelist);
+ return 0;
}
+
+ arch_prepare_kretprobe(ri, regs);
+
+ __llist_add(&ri->llist, ¤t->kretprobe_instances);
+
return 0;
}
NOKPROBE_SYMBOL(pre_handler_kretprobe);
@@ -2007,8 +2001,7 @@ int register_kretprobe(struct kretprobe *rp)
rp->maxactive = num_possible_cpus();
#endif
}
- raw_spin_lock_init(&rp->lock);
- INIT_HLIST_HEAD(&rp->free_instances);
+ rp->freelist.head = NULL;
rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
if (!rp->rph)
return -ENOMEM;
@@ -2023,8 +2016,7 @@ int register_kretprobe(struct kretprobe *rp)
return -ENOMEM;
}
inst->rph = rp->rph;
- INIT_HLIST_NODE(&inst->hlist);
- hlist_add_head(&inst->hlist, &rp->free_instances);
+ freelist_add(&inst->freelist, &rp->freelist);
}
refcount_set(&rp->rph->ref, i);
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-08-29 12:59 [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Masami Hiramatsu
` (20 preceding siblings ...)
2020-08-29 13:03 ` [PATCH v5 21/21] kprobes: Replace rp->free_instance with freelist Masami Hiramatsu
@ 2020-09-01 19:08 ` Peter Zijlstra
2020-09-02 0:37 ` Masami Hiramatsu
21 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2020-09-01 19:08 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: linux-kernel, Eddy_Wu, x86, davem, rostedt, naveen.n.rao,
anil.s.keshavamurthy, linux-arch, cameron, oleg, will, paulmck
On Sat, Aug 29, 2020 at 09:59:49PM +0900, Masami Hiramatsu wrote:
> Masami Hiramatsu (16):
> kprobes: Add generic kretprobe trampoline handler
> x86/kprobes: Use generic kretprobe trampoline handler
> arm: kprobes: Use generic kretprobe trampoline handler
> arm64: kprobes: Use generic kretprobe trampoline handler
> arc: kprobes: Use generic kretprobe trampoline handler
> csky: kprobes: Use generic kretprobe trampoline handler
> ia64: kprobes: Use generic kretprobe trampoline handler
> mips: kprobes: Use generic kretprobe trampoline handler
> parisc: kprobes: Use generic kretprobe trampoline handler
> powerpc: kprobes: Use generic kretprobe trampoline handler
> s390: kprobes: Use generic kretprobe trampoline handler
> sh: kprobes: Use generic kretprobe trampoline handler
> sparc: kprobes: Use generic kretprobe trampoline handler
> kprobes: Remove NMI context check
> kprobes: Free kretprobe_instance with rcu callback
> kprobes: Make local used functions static
>
> Peter Zijlstra (5):
> llist: Add nonatomic __llist_add() and __llist_dell_all()
> kprobes: Remove kretprobe hash
> asm-generic/atomic: Add try_cmpxchg() fallbacks
> freelist: Lock less freelist
> kprobes: Replace rp->free_instance with freelist
This looks good to me, do you want me to merge them through -tip? If so,
do we want to try and get them in this release still?
Ingo, opinions? This basically fixes a regression cauesd by
0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-01 19:08 ` [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless Peter Zijlstra
@ 2020-09-02 0:37 ` Masami Hiramatsu
2020-09-02 7:02 ` peterz
0 siblings, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-02 0:37 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: linux-kernel, Eddy_Wu, x86, davem, rostedt, naveen.n.rao,
anil.s.keshavamurthy, linux-arch, cameron, oleg, will, paulmck
On Tue, 1 Sep 2020 21:08:08 +0200
Peter Zijlstra <peterz@infradead.org> wrote:
> On Sat, Aug 29, 2020 at 09:59:49PM +0900, Masami Hiramatsu wrote:
> > Masami Hiramatsu (16):
> > kprobes: Add generic kretprobe trampoline handler
> > x86/kprobes: Use generic kretprobe trampoline handler
> > arm: kprobes: Use generic kretprobe trampoline handler
> > arm64: kprobes: Use generic kretprobe trampoline handler
> > arc: kprobes: Use generic kretprobe trampoline handler
> > csky: kprobes: Use generic kretprobe trampoline handler
> > ia64: kprobes: Use generic kretprobe trampoline handler
> > mips: kprobes: Use generic kretprobe trampoline handler
> > parisc: kprobes: Use generic kretprobe trampoline handler
> > powerpc: kprobes: Use generic kretprobe trampoline handler
> > s390: kprobes: Use generic kretprobe trampoline handler
> > sh: kprobes: Use generic kretprobe trampoline handler
> > sparc: kprobes: Use generic kretprobe trampoline handler
> > kprobes: Remove NMI context check
> > kprobes: Free kretprobe_instance with rcu callback
> > kprobes: Make local used functions static
> >
> > Peter Zijlstra (5):
> > llist: Add nonatomic __llist_add() and __llist_dell_all()
> > kprobes: Remove kretprobe hash
> > asm-generic/atomic: Add try_cmpxchg() fallbacks
> > freelist: Lock less freelist
> > kprobes: Replace rp->free_instance with freelist
>
> This looks good to me, do you want me to merge them through -tip? If so,
> do we want to try and get them in this release still?
Yes, thanks. For the kretprobe missing issue, we will need the first half
(up to "kprobes: Remove NMI context check"), so we can split the series
if someone think the lockless is still immature.
>
> Ingo, opinions? This basically fixes a regression cauesd by
>
> 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
>
Oops, I missed Ingo in CC.
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 0:37 ` Masami Hiramatsu
@ 2020-09-02 7:02 ` peterz
2020-09-02 8:17 ` Masami Hiramatsu
2020-09-11 2:32 ` Masami Hiramatsu
0 siblings, 2 replies; 51+ messages in thread
From: peterz @ 2020-09-02 7:02 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Wed, Sep 02, 2020 at 09:37:39AM +0900, Masami Hiramatsu wrote:
> On Tue, 1 Sep 2020 21:08:08 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Sat, Aug 29, 2020 at 09:59:49PM +0900, Masami Hiramatsu wrote:
> > > Masami Hiramatsu (16):
> > > kprobes: Add generic kretprobe trampoline handler
> > > x86/kprobes: Use generic kretprobe trampoline handler
> > > arm: kprobes: Use generic kretprobe trampoline handler
> > > arm64: kprobes: Use generic kretprobe trampoline handler
> > > arc: kprobes: Use generic kretprobe trampoline handler
> > > csky: kprobes: Use generic kretprobe trampoline handler
> > > ia64: kprobes: Use generic kretprobe trampoline handler
> > > mips: kprobes: Use generic kretprobe trampoline handler
> > > parisc: kprobes: Use generic kretprobe trampoline handler
> > > powerpc: kprobes: Use generic kretprobe trampoline handler
> > > s390: kprobes: Use generic kretprobe trampoline handler
> > > sh: kprobes: Use generic kretprobe trampoline handler
> > > sparc: kprobes: Use generic kretprobe trampoline handler
> > > kprobes: Remove NMI context check
> > > kprobes: Free kretprobe_instance with rcu callback
> > > kprobes: Make local used functions static
> > >
> > > Peter Zijlstra (5):
> > > llist: Add nonatomic __llist_add() and __llist_dell_all()
> > > kprobes: Remove kretprobe hash
> > > asm-generic/atomic: Add try_cmpxchg() fallbacks
> > > freelist: Lock less freelist
> > > kprobes: Replace rp->free_instance with freelist
> >
> > This looks good to me, do you want me to merge them through -tip? If so,
> > do we want to try and get them in this release still?
>
> Yes, thanks. For the kretprobe missing issue, we will need the first half
> (up to "kprobes: Remove NMI context check"), so we can split the series
> if someone think the lockless is still immature.
Ok, but then lockdep will yell at you if you have that enabled and run
the unoptimized things.
> > Ingo, opinions? This basically fixes a regression cauesd by
> >
> > 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
> >
>
> Oops, I missed Ingo in CC.
You had x86@, he'll have a copy :-)
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 7:02 ` peterz
@ 2020-09-02 8:17 ` Masami Hiramatsu
2020-09-02 9:36 ` peterz
2020-09-11 2:32 ` Masami Hiramatsu
1 sibling, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-02 8:17 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Wed, 2 Sep 2020 09:02:26 +0200
peterz@infradead.org wrote:
> On Wed, Sep 02, 2020 at 09:37:39AM +0900, Masami Hiramatsu wrote:
> > On Tue, 1 Sep 2020 21:08:08 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > On Sat, Aug 29, 2020 at 09:59:49PM +0900, Masami Hiramatsu wrote:
> > > > Masami Hiramatsu (16):
> > > > kprobes: Add generic kretprobe trampoline handler
> > > > x86/kprobes: Use generic kretprobe trampoline handler
> > > > arm: kprobes: Use generic kretprobe trampoline handler
> > > > arm64: kprobes: Use generic kretprobe trampoline handler
> > > > arc: kprobes: Use generic kretprobe trampoline handler
> > > > csky: kprobes: Use generic kretprobe trampoline handler
> > > > ia64: kprobes: Use generic kretprobe trampoline handler
> > > > mips: kprobes: Use generic kretprobe trampoline handler
> > > > parisc: kprobes: Use generic kretprobe trampoline handler
> > > > powerpc: kprobes: Use generic kretprobe trampoline handler
> > > > s390: kprobes: Use generic kretprobe trampoline handler
> > > > sh: kprobes: Use generic kretprobe trampoline handler
> > > > sparc: kprobes: Use generic kretprobe trampoline handler
> > > > kprobes: Remove NMI context check
> > > > kprobes: Free kretprobe_instance with rcu callback
> > > > kprobes: Make local used functions static
> > > >
> > > > Peter Zijlstra (5):
> > > > llist: Add nonatomic __llist_add() and __llist_dell_all()
> > > > kprobes: Remove kretprobe hash
> > > > asm-generic/atomic: Add try_cmpxchg() fallbacks
> > > > freelist: Lock less freelist
> > > > kprobes: Replace rp->free_instance with freelist
> > >
> > > This looks good to me, do you want me to merge them through -tip? If so,
> > > do we want to try and get them in this release still?
> >
> > Yes, thanks. For the kretprobe missing issue, we will need the first half
> > (up to "kprobes: Remove NMI context check"), so we can split the series
> > if someone think the lockless is still immature.
>
> Ok, but then lockdep will yell at you if you have that enabled and run
> the unoptimized things.
Oh, does it warn for all spinlock things in kprobes if it is unoptimized?
Hmm, it has to be noted in the documentation.
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 8:17 ` Masami Hiramatsu
@ 2020-09-02 9:36 ` peterz
2020-09-02 13:19 ` Masami Hiramatsu
0 siblings, 1 reply; 51+ messages in thread
From: peterz @ 2020-09-02 9:36 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Wed, Sep 02, 2020 at 05:17:55PM +0900, Masami Hiramatsu wrote:
> > Ok, but then lockdep will yell at you if you have that enabled and run
> > the unoptimized things.
>
> Oh, does it warn for all spinlock things in kprobes if it is unoptimized?
> Hmm, it has to be noted in the documentation.
Lockdep will warn about spinlocks used in NMI context that are also used
outside NMI context.
Now, for the kretprobe that kprobe_busy flag prevents the actual
recursion self-deadlock, but lockdep isn't smart enough to see that.
One way around this might be to use SINGLE_DEPTH_NESTING for locks when
we use them from INT3 context. That way they'll have a different class
and lockdep will not see the recursion.
pre_handler_kretprobe() is always called from INT3, right?
Something like the below might then work...
---
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 287b263c9cb9..b78f4fe08e86 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1255,11 +1255,11 @@ __acquires(hlist_lock)
NOKPROBE_SYMBOL(kretprobe_hash_lock);
static void kretprobe_table_lock(unsigned long hash,
- unsigned long *flags)
+ unsigned long *flags, int nesting)
__acquires(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_lock_irqsave(hlist_lock, *flags);
+ raw_spin_lock_irqsave_nested(hlist_lock, *flags, nesting);
}
NOKPROBE_SYMBOL(kretprobe_table_lock);
@@ -1326,7 +1326,7 @@ void kprobe_flush_task(struct task_struct *tk)
INIT_HLIST_HEAD(&empty_rp);
hash = hash_ptr(tk, KPROBE_HASH_BITS);
head = &kretprobe_inst_table[hash];
- kretprobe_table_lock(hash, &flags);
+ kretprobe_table_lock(hash, &flags, 0);
hlist_for_each_entry_safe(ri, tmp, head, hlist) {
if (ri->task == tk)
recycle_rp_inst(ri, &empty_rp);
@@ -1361,7 +1361,7 @@ static void cleanup_rp_inst(struct kretprobe *rp)
/* No race here */
for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
- kretprobe_table_lock(hash, &flags);
+ kretprobe_table_lock(hash, &flags, 0);
head = &kretprobe_inst_table[hash];
hlist_for_each_entry_safe(ri, next, head, hlist) {
if (ri->rp == rp)
@@ -1950,7 +1950,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
/* TODO: consider to only swap the RA after the last pre_handler fired */
hash = hash_ptr(current, KPROBE_HASH_BITS);
- raw_spin_lock_irqsave(&rp->lock, flags);
+ raw_spin_lock_irqsave_nested(&rp->lock, flags, SINGLE_DEPTH_NESTING);
if (!hlist_empty(&rp->free_instances)) {
ri = hlist_entry(rp->free_instances.first,
struct kretprobe_instance, hlist);
@@ -1961,7 +1961,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
ri->task = current;
if (rp->entry_handler && rp->entry_handler(ri, regs)) {
- raw_spin_lock_irqsave(&rp->lock, flags);
+ raw_spin_lock_irqsave_nested(&rp->lock, flags, SINGLE_DEPTH_NESTING);
hlist_add_head(&ri->hlist, &rp->free_instances);
raw_spin_unlock_irqrestore(&rp->lock, flags);
return 0;
@@ -1971,7 +1971,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
/* XXX(hch): why is there no hlist_move_head? */
INIT_HLIST_NODE(&ri->hlist);
- kretprobe_table_lock(hash, &flags);
+ kretprobe_table_lock(hash, &flags, SINGLE_DEPTH_NESTING);
hlist_add_head(&ri->hlist, &kretprobe_inst_table[hash]);
kretprobe_table_unlock(hash, &flags);
} else {
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 9:36 ` peterz
@ 2020-09-02 13:19 ` Masami Hiramatsu
2020-09-02 13:42 ` peterz
0 siblings, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-02 13:19 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Wed, 2 Sep 2020 11:36:13 +0200
peterz@infradead.org wrote:
> On Wed, Sep 02, 2020 at 05:17:55PM +0900, Masami Hiramatsu wrote:
>
> > > Ok, but then lockdep will yell at you if you have that enabled and run
> > > the unoptimized things.
> >
> > Oh, does it warn for all spinlock things in kprobes if it is unoptimized?
> > Hmm, it has to be noted in the documentation.
>
> Lockdep will warn about spinlocks used in NMI context that are also used
> outside NMI context.
OK, but raw_spin_lock_irqsave() will not involve lockdep, correct?
> Now, for the kretprobe that kprobe_busy flag prevents the actual
> recursion self-deadlock, but lockdep isn't smart enough to see that.
>
> One way around this might be to use SINGLE_DEPTH_NESTING for locks when
> we use them from INT3 context. That way they'll have a different class
> and lockdep will not see the recursion.
Hmm, so lockdep warns only when it detects the spinlock in NMI context,
and int3 is now always NMI, thus all spinlock (except raw_spinlock?)
in kprobe handlers should get warned, right?
I have tested this series up to [16/21] with optprobe disabled, but
I haven't see the lockdep warnings.
>
> pre_handler_kretprobe() is always called from INT3, right?
No, not always, it can be called from optprobe (same as original code
context) or ftrace handler.
But if you set 0 to /proc/sys/debug/kprobe_optimization, and compile
the kernel without function tracer, it should always be called from
INT3.
>
> Something like the below might then work...
OK, but I would like to reproduce the lockdep warning on this for
confirmation.
Thank you,
>
> ---
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 287b263c9cb9..b78f4fe08e86 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1255,11 +1255,11 @@ __acquires(hlist_lock)
> NOKPROBE_SYMBOL(kretprobe_hash_lock);
>
> static void kretprobe_table_lock(unsigned long hash,
> - unsigned long *flags)
> + unsigned long *flags, int nesting)
> __acquires(hlist_lock)
> {
> raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
> - raw_spin_lock_irqsave(hlist_lock, *flags);
> + raw_spin_lock_irqsave_nested(hlist_lock, *flags, nesting);
> }
> NOKPROBE_SYMBOL(kretprobe_table_lock);
>
> @@ -1326,7 +1326,7 @@ void kprobe_flush_task(struct task_struct *tk)
> INIT_HLIST_HEAD(&empty_rp);
> hash = hash_ptr(tk, KPROBE_HASH_BITS);
> head = &kretprobe_inst_table[hash];
> - kretprobe_table_lock(hash, &flags);
> + kretprobe_table_lock(hash, &flags, 0);
> hlist_for_each_entry_safe(ri, tmp, head, hlist) {
> if (ri->task == tk)
> recycle_rp_inst(ri, &empty_rp);
> @@ -1361,7 +1361,7 @@ static void cleanup_rp_inst(struct kretprobe *rp)
>
> /* No race here */
> for (hash = 0; hash < KPROBE_TABLE_SIZE; hash++) {
> - kretprobe_table_lock(hash, &flags);
> + kretprobe_table_lock(hash, &flags, 0);
> head = &kretprobe_inst_table[hash];
> hlist_for_each_entry_safe(ri, next, head, hlist) {
> if (ri->rp == rp)
> @@ -1950,7 +1950,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>
> /* TODO: consider to only swap the RA after the last pre_handler fired */
> hash = hash_ptr(current, KPROBE_HASH_BITS);
> - raw_spin_lock_irqsave(&rp->lock, flags);
> + raw_spin_lock_irqsave_nested(&rp->lock, flags, SINGLE_DEPTH_NESTING);
> if (!hlist_empty(&rp->free_instances)) {
> ri = hlist_entry(rp->free_instances.first,
> struct kretprobe_instance, hlist);
> @@ -1961,7 +1961,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
> ri->task = current;
>
> if (rp->entry_handler && rp->entry_handler(ri, regs)) {
> - raw_spin_lock_irqsave(&rp->lock, flags);
> + raw_spin_lock_irqsave_nested(&rp->lock, flags, SINGLE_DEPTH_NESTING);
> hlist_add_head(&ri->hlist, &rp->free_instances);
> raw_spin_unlock_irqrestore(&rp->lock, flags);
> return 0;
> @@ -1971,7 +1971,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>
> /* XXX(hch): why is there no hlist_move_head? */
> INIT_HLIST_NODE(&ri->hlist);
> - kretprobe_table_lock(hash, &flags);
> + kretprobe_table_lock(hash, &flags, SINGLE_DEPTH_NESTING);
> hlist_add_head(&ri->hlist, &kretprobe_inst_table[hash]);
> kretprobe_table_unlock(hash, &flags);
> } else {
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 13:19 ` Masami Hiramatsu
@ 2020-09-02 13:42 ` peterz
2020-09-03 1:39 ` Masami Hiramatsu
0 siblings, 1 reply; 51+ messages in thread
From: peterz @ 2020-09-02 13:42 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
On Wed, Sep 02, 2020 at 10:19:26PM +0900, Masami Hiramatsu wrote:
> On Wed, 2 Sep 2020 11:36:13 +0200
> peterz@infradead.org wrote:
>
> > On Wed, Sep 02, 2020 at 05:17:55PM +0900, Masami Hiramatsu wrote:
> >
> > > > Ok, but then lockdep will yell at you if you have that enabled and run
> > > > the unoptimized things.
> > >
> > > Oh, does it warn for all spinlock things in kprobes if it is unoptimized?
> > > Hmm, it has to be noted in the documentation.
> >
> > Lockdep will warn about spinlocks used in NMI context that are also used
> > outside NMI context.
>
> OK, but raw_spin_lock_irqsave() will not involve lockdep, correct?
It will. The distinction between spin_lock and raw_spin_lock is only
that raw_spin_lock stays a spinlock on PREEMPT_RT, while spin_lock will
turn into a (PI) mutex in that case.
But both will call into lockdep. Unlike local_irq_disable() and
raw_local_irq_disable(), where the latter will not. Yes your prefixes
are a mess :/
> > Now, for the kretprobe that kprobe_busy flag prevents the actual
> > recursion self-deadlock, but lockdep isn't smart enough to see that.
> >
> > One way around this might be to use SINGLE_DEPTH_NESTING for locks when
> > we use them from INT3 context. That way they'll have a different class
> > and lockdep will not see the recursion.
>
> Hmm, so lockdep warns only when it detects the spinlock in NMI context,
> and int3 is now always NMI, thus all spinlock (except raw_spinlock?)
> in kprobe handlers should get warned, right?
> I have tested this series up to [16/21] with optprobe disabled, but
> I haven't see the lockdep warnings.
There's a bug, that might make it miss it. I have a patch. I'll send it
shortly.
> > pre_handler_kretprobe() is always called from INT3, right?
>
> No, not always, it can be called from optprobe (same as original code
> context) or ftrace handler.
> But if you set 0 to /proc/sys/debug/kprobe_optimization, and compile
> the kernel without function tracer, it should always be called from
> INT3.
D'oh, ofcourse! Arguably I should make the optprobe context NMI like
too.. but that's for another day.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 13:42 ` peterz
@ 2020-09-03 1:39 ` Masami Hiramatsu
2020-09-03 2:02 ` Masami Hiramatsu
2020-09-08 10:37 ` peterz
0 siblings, 2 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-03 1:39 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Wed, 2 Sep 2020 15:42:52 +0200
peterz@infradead.org wrote:
> On Wed, Sep 02, 2020 at 10:19:26PM +0900, Masami Hiramatsu wrote:
> > On Wed, 2 Sep 2020 11:36:13 +0200
> > peterz@infradead.org wrote:
> >
> > > On Wed, Sep 02, 2020 at 05:17:55PM +0900, Masami Hiramatsu wrote:
> > >
> > > > > Ok, but then lockdep will yell at you if you have that enabled and run
> > > > > the unoptimized things.
> > > >
> > > > Oh, does it warn for all spinlock things in kprobes if it is unoptimized?
> > > > Hmm, it has to be noted in the documentation.
> > >
> > > Lockdep will warn about spinlocks used in NMI context that are also used
> > > outside NMI context.
> >
> > OK, but raw_spin_lock_irqsave() will not involve lockdep, correct?
>
> It will. The distinction between spin_lock and raw_spin_lock is only
> that raw_spin_lock stays a spinlock on PREEMPT_RT, while spin_lock will
> turn into a (PI) mutex in that case.
>
> But both will call into lockdep. Unlike local_irq_disable() and
> raw_local_irq_disable(), where the latter will not. Yes your prefixes
> are a mess :/
Yeah, that's really confusing...
> > > Now, for the kretprobe that kprobe_busy flag prevents the actual
> > > recursion self-deadlock, but lockdep isn't smart enough to see that.
> > >
> > > One way around this might be to use SINGLE_DEPTH_NESTING for locks when
> > > we use them from INT3 context. That way they'll have a different class
> > > and lockdep will not see the recursion.
> >
> > Hmm, so lockdep warns only when it detects the spinlock in NMI context,
> > and int3 is now always NMI, thus all spinlock (except raw_spinlock?)
> > in kprobe handlers should get warned, right?
> > I have tested this series up to [16/21] with optprobe disabled, but
> > I haven't see the lockdep warnings.
>
> There's a bug, that might make it miss it. I have a patch. I'll send it
> shortly.
OK, I've confirmed that the lockdep warns on kretprobe from INT3
with your fix. Of course make it lockless then warning is gone.
But even without the lockless patch, this warning can be false-positive
because we prohibit nested kprobe call, right?
If the kprobe user handler uses a spinlock, the spinlock is used
only in that handler (and in the context between kprobe_busy_begin/end),
it will be safe since the spinlock is not nested.
But if the spinlock is shared with other context, it will be dangerous
because it can be interrupted by NMI (including INT3). This also applied
to the function which is called from kprobe user handlers, thus user
has to take care of it.
BTW, what would you think about setting NMI count between kprobe_busy_begin/end?
>
> > > pre_handler_kretprobe() is always called from INT3, right?
> >
> > No, not always, it can be called from optprobe (same as original code
> > context) or ftrace handler.
> > But if you set 0 to /proc/sys/debug/kprobe_optimization, and compile
> > the kernel without function tracer, it should always be called from
> > INT3.
>
> D'oh, ofcourse! Arguably I should make the optprobe context NMI like
> too.. but that's for another day.
Hmm, we still have kprobe-on-ftrace. Would you consider we will
make it NMI like too? (and what the ftrace does for this)
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-03 1:39 ` Masami Hiramatsu
@ 2020-09-03 2:02 ` Masami Hiramatsu
2020-09-07 17:44 ` Frank Ch. Eigler
2020-09-08 10:37 ` peterz
1 sibling, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-03 2:02 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: peterz, Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Thu, 3 Sep 2020 10:39:54 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:
> OK, I've confirmed that the lockdep warns on kretprobe from INT3
> with your fix. Of course make it lockless then warning is gone.
> But even without the lockless patch, this warning can be false-positive
> because we prohibit nested kprobe call, right?
>
> If the kprobe user handler uses a spinlock, the spinlock is used
> only in that handler (and in the context between kprobe_busy_begin/end),
> it will be safe since the spinlock is not nested.
> But if the spinlock is shared with other context, it will be dangerous
> because it can be interrupted by NMI (including INT3). This also applied
> to the function which is called from kprobe user handlers, thus user
> has to take care of it.
Sorry, for noticing this point, I Cc'd to systemtap. Is systemtap taking
care of spinlock too?
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-03 2:02 ` Masami Hiramatsu
@ 2020-09-07 17:44 ` Frank Ch. Eigler
2020-09-08 2:55 ` Masami Hiramatsu
0 siblings, 1 reply; 51+ messages in thread
From: Frank Ch. Eigler @ 2020-09-07 17:44 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: peterz, Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
Masami Hiramatsu <mhiramat@kernel.org> writes:
> Sorry, for noticing this point, I Cc'd to systemtap. Is systemtap taking
> care of spinlock too?
On PRREMPT_RT configurations, systemtap uses the raw_spinlock_t
types/functions, to keep its probe handlers as atomic as we can make them.
- FChE
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-07 17:44 ` Frank Ch. Eigler
@ 2020-09-08 2:55 ` Masami Hiramatsu
0 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-08 2:55 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: peterz, Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Mon, 07 Sep 2020 13:44:19 -0400
fche@redhat.com (Frank Ch. Eigler) wrote:
> Masami Hiramatsu <mhiramat@kernel.org> writes:
>
> > Sorry, for noticing this point, I Cc'd to systemtap. Is systemtap taking
> > care of spinlock too?
>
> On PRREMPT_RT configurations, systemtap uses the raw_spinlock_t
> types/functions, to keep its probe handlers as atomic as we can make them.
OK, if the lock is only used in the probe handlers, there should be
no problem. Even if a probe hits in the NMI which happens in another
kprobe handler, the probe does not call its handler (because we don't
support nested kprobes* yet).
But maybe you'll get warnings if you enable the lockdep.
* https://lkml.kernel.org/r/158894789510.14896.13461271606820304664.stgit@devnote2
It seems that we need more work for the nested kprobes.
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-03 1:39 ` Masami Hiramatsu
2020-09-03 2:02 ` Masami Hiramatsu
@ 2020-09-08 10:37 ` peterz
2020-09-08 11:15 ` Eddy_Wu
2020-09-08 15:09 ` Masami Hiramatsu
1 sibling, 2 replies; 51+ messages in thread
From: peterz @ 2020-09-08 10:37 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Thu, Sep 03, 2020 at 10:39:54AM +0900, Masami Hiramatsu wrote:
> > There's a bug, that might make it miss it. I have a patch. I'll send it
> > shortly.
>
> OK, I've confirmed that the lockdep warns on kretprobe from INT3
> with your fix.
I'm now trying and failing to reproduce.... I can't seem to make it use
int3 today. It seems to want to use ftrace or refuses everything. I'm
probably doing it wrong.
Could you verify the below actually works? It's on top of the first 16
patches.
> Of course make it lockless then warning is gone.
> But even without the lockless patch, this warning can be false-positive
> because we prohibit nested kprobe call, right?
Yes, because the actual nesting is avoided by kprobe_busy, but lockdep
can't tell. Lockdep sees a regular lock user and an in-nmi lock user and
figures that's a bad combination.
---
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1241,48 +1241,47 @@ void recycle_rp_inst(struct kretprobe_in
}
NOKPROBE_SYMBOL(recycle_rp_inst);
-void kretprobe_hash_lock(struct task_struct *tsk,
- struct hlist_head **head, unsigned long *flags)
-__acquires(hlist_lock)
-{
- unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
- raw_spinlock_t *hlist_lock;
-
- *head = &kretprobe_inst_table[hash];
- hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_lock_irqsave(hlist_lock, *flags);
-}
-NOKPROBE_SYMBOL(kretprobe_hash_lock);
-
static void kretprobe_table_lock(unsigned long hash,
unsigned long *flags)
__acquires(hlist_lock)
{
raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_lock_irqsave(hlist_lock, *flags);
+ /*
+ * HACK, due to kprobe_busy we'll never actually recurse, make NMI
+ * context use a different lock class to avoid it reporting.
+ */
+ raw_spin_lock_irqsave_nested(hlist_lock, *flags, !!in_nmi());
}
NOKPROBE_SYMBOL(kretprobe_table_lock);
-void kretprobe_hash_unlock(struct task_struct *tsk,
- unsigned long *flags)
+static void kretprobe_table_unlock(unsigned long hash,
+ unsigned long *flags)
__releases(hlist_lock)
{
+ raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
+ raw_spin_unlock_irqrestore(hlist_lock, *flags);
+}
+NOKPROBE_SYMBOL(kretprobe_table_unlock);
+
+void kretprobe_hash_lock(struct task_struct *tsk,
+ struct hlist_head **head, unsigned long *flags)
+__acquires(hlist_lock)
+{
unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
- raw_spinlock_t *hlist_lock;
- hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_unlock_irqrestore(hlist_lock, *flags);
+ *head = &kretprobe_inst_table[hash];
+ kretprobe_table_lock(hash, flags);
}
-NOKPROBE_SYMBOL(kretprobe_hash_unlock);
+NOKPROBE_SYMBOL(kretprobe_hash_lock);
-static void kretprobe_table_unlock(unsigned long hash,
- unsigned long *flags)
+void kretprobe_hash_unlock(struct task_struct *tsk,
+ unsigned long *flags)
__releases(hlist_lock)
{
- raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
- raw_spin_unlock_irqrestore(hlist_lock, *flags);
+ unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
+ kretprobe_table_unlock(hash, flags);
}
-NOKPROBE_SYMBOL(kretprobe_table_unlock);
+NOKPROBE_SYMBOL(kretprobe_hash_unlock);
struct kprobe kprobe_busy = {
.addr = (void *) get_kprobe,
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-08 10:37 ` peterz
@ 2020-09-08 11:15 ` Eddy_Wu
2020-09-08 11:33 ` peterz
2020-09-08 15:09 ` Masami Hiramatsu
1 sibling, 1 reply; 51+ messages in thread
From: Eddy_Wu @ 2020-09-08 11:15 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, x86, davem, rostedt, naveen.n.rao,
anil.s.keshavamurthy, linux-arch, cameron, oleg, will, paulmck,
systemtap
> From: peterz@infradead.org <peterz@infradead.org>
>
> I'm now trying and failing to reproduce.... I can't seem to make it use
> int3 today. It seems to want to use ftrace or refuses everything. I'm
> probably doing it wrong.
>
You can turn off CONFIG_KPROBES_ON_FTRACE (and also sysctl debug.kprobes-optimization) to make it use int3 handler
TREND MICRO EMAIL NOTICE
The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system.
For details about what personal information we collect and why, please see our Privacy Notice on our website at: Read privacy policy<http://www.trendmicro.com/privacy>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-08 11:15 ` Eddy_Wu
@ 2020-09-08 11:33 ` peterz
0 siblings, 0 replies; 51+ messages in thread
From: peterz @ 2020-09-08 11:33 UTC (permalink / raw)
To: Eddy_Wu
Cc: Ingo Molnar, linux-kernel, x86, davem, rostedt, naveen.n.rao,
anil.s.keshavamurthy, linux-arch, cameron, oleg, will, paulmck,
systemtap
On Tue, Sep 08, 2020 at 11:15:14AM +0000, Eddy_Wu@trendmicro.com wrote:
> > From: peterz@infradead.org <peterz@infradead.org>
> >
> > I'm now trying and failing to reproduce.... I can't seem to make it use
> > int3 today. It seems to want to use ftrace or refuses everything. I'm
> > probably doing it wrong.
> >
>
> You can turn off CONFIG_KPROBES_ON_FTRACE (and also sysctl
> debug.kprobes-optimization) to make it use int3 handler
I'm fairly sure I used the sysctl like in your original reproducer.
I'll try again later.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-08 10:37 ` peterz
2020-09-08 11:15 ` Eddy_Wu
@ 2020-09-08 15:09 ` Masami Hiramatsu
2020-09-09 5:28 ` Masami Hiramatsu
1 sibling, 1 reply; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-08 15:09 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Tue, 8 Sep 2020 12:37:36 +0200
peterz@infradead.org wrote:
> On Thu, Sep 03, 2020 at 10:39:54AM +0900, Masami Hiramatsu wrote:
>
> > > There's a bug, that might make it miss it. I have a patch. I'll send it
> > > shortly.
> >
> > OK, I've confirmed that the lockdep warns on kretprobe from INT3
> > with your fix.
>
> I'm now trying and failing to reproduce.... I can't seem to make it use
> int3 today. It seems to want to use ftrace or refuses everything. I'm
> probably doing it wrong.
Ah, yes. For using the INT3, we need to disable CONFIG_FUNCTION_TRACER,
or enable CONFIG_KPROBE_EVENTS_ON_NOTRACE and put a kretprobe on a notrace
function :)
>
> Could you verify the below actually works? It's on top of the first 16
> patches.
Sure. (BTW, you mean the first 15 patches, since kretprobe_hash_lock()
becomes static by 16th patch ?)
Here is the result. I got same warning with or without your patch.
I have built the kernel with CONFIG_FUNCTION_TRACER=n and CONFIG_KPROBES_SANITY_TEST=y.
[ 0.269235] PCI: Using configuration type 1 for base access
[ 0.272171] Kprobe smoke test: started
[ 0.281125]
[ 0.281126] ================================
[ 0.281126] WARNING: inconsistent lock state
[ 0.281126] 5.9.0-rc2+ #101 Not tainted
[ 0.281126] --------------------------------
[ 0.281127] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[ 0.281127] swapper/0/1 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 0.281127] ffffffff8228c778 (&rp->lock){....}-{2:2}, at: pre_handler_kretprobe+0x2b/0x1b0
[ 0.281128] {INITIAL USE} state was registered at:
[ 0.281129] lock_acquire+0x9e/0x3c0
[ 0.281129] _raw_spin_lock+0x2f/0x40
[ 0.281129] recycle_rp_inst+0x48/0x90
[ 0.281129] __kretprobe_trampoline_handler+0x15d/0x1e0
[ 0.281130] trampoline_handler+0x43/0x60
[ 0.281130] kretprobe_trampoline+0x2a/0x50
[ 0.281130] kretprobe_trampoline+0x0/0x50
[ 0.281130] init_kprobes+0x1b6/0x1d5
[ 0.281130] do_one_initcall+0x5c/0x315
[ 0.281131] kernel_init_freeable+0x21a/0x25f
[ 0.281131] kernel_init+0x9/0x104
[ 0.281131] ret_from_fork+0x22/0x30
[ 0.281131] irq event stamp: 25978
[ 0.281132] hardirqs last enabled at (25977): [<ffffffff8108a0f7>] queue_delayed_work_on+0x57/0x90
[ 0.281132] hardirqs last disabled at (25978): [<ffffffff818df778>] exc_int3+0x48/0x140
[ 0.281132] softirqs last enabled at (25964): [<ffffffff81c00389>] __do_softirq+0x389/0x48b
[ 0.281133] softirqs last disabled at (25957): [<ffffffff81a00f42>] asm_call_on_stack+0x12/0x20
[ 0.281133]
[ 0.281133] other info that might help us debug this:
[ 0.281133] Possible unsafe locking scenario:
[ 0.281134]
[ 0.281134] CPU0
[ 0.281134] ----
[ 0.281134] lock(&rp->lock);
[ 0.281135] <Interrupt>
[ 0.281135] lock(&rp->lock);
[ 0.281136]
[ 0.281136] *** DEADLOCK ***
[ 0.281136]
[ 0.281136] no locks held by swapper/0/1.
[ 0.281136]
[ 0.281137] stack backtrace:
[ 0.281137] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc2+ #101
[ 0.281137] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
[ 0.281137] Call Trace:
[ 0.281138] dump_stack+0x81/0xba
[ 0.281138] print_usage_bug.cold+0x195/0x19e
[ 0.281138] lock_acquire+0x314/0x3c0
[ 0.281138] ? __text_poke+0x2db/0x400
[ 0.281139] ? pre_handler_kretprobe+0x2b/0x1b0
[ 0.281139] _raw_spin_lock_irqsave+0x3a/0x50
[ 0.281139] ? pre_handler_kretprobe+0x2b/0x1b0
[ 0.281139] pre_handler_kretprobe+0x2b/0x1b0
[ 0.281139] ? stop_machine_from_inactive_cpu+0x120/0x120
[ 0.281140] aggr_pre_handler+0x5f/0xd0
[ 0.281140] kprobe_int3_handler+0x10f/0x160
[ 0.281140] exc_int3+0x52/0x140
[ 0.281140] asm_exc_int3+0x31/0x40
[ 0.281141] RIP: 0010:kprobe_target+0x1/0x10
[ 0.281141] Code: 65 48 33 04 25 28 00 00 00 75 10 48 81 c4 90 00 00 00 44 89 e0 41 5c 5d c3 0f 0b e8 a
[ 0.281141] RSP: 0000:ffffc90000013e48 EFLAGS: 00000246
[ 0.281142] RAX: ffffffff81142130 RBX: 0000000000000001 RCX: ffffc90000013cec
[ 0.281142] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 00000000f3eb0b20
[ 0.281142] RBP: ffffc90000013e68 R08: 0000000000000000 R09: 0000000000000001
[ 0.281143] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 0.281143] R13: ffffffff8224c2b0 R14: ffffffff8255cf60 R15: 0000000000000000
[ 0.281143] ? stop_machine_from_inactive_cpu+0x120/0x120
[ 0.281143] ? kprobe_target+0x1/0x10
[ 0.281144] ? stop_machine_from_inactive_cpu+0x120/0x120
[ 0.281144] ? init_test_probes.cold+0x2e3/0x391
[ 0.281144] init_kprobes+0x1b6/0x1d5
[ 0.281144] ? debugfs_kprobe_init+0xa9/0xa9
[ 0.281145] do_one_initcall+0x5c/0x315
[ 0.281145] ? rcu_read_lock_sched_held+0x4a/0x80
[ 0.281145] kernel_init_freeable+0x21a/0x25f
[ 0.281145] ? rest_init+0x23c/0x23c
[ 0.281145] kernel_init+0x9/0x104
[ 0.281146] ret_from_fork+0x22/0x30
[ 0.281282] Kprobe smoke test: passed successfully
>
> > Of course make it lockless then warning is gone.
> > But even without the lockless patch, this warning can be false-positive
> > because we prohibit nested kprobe call, right?
>
> Yes, because the actual nesting is avoided by kprobe_busy, but lockdep
> can't tell. Lockdep sees a regular lock user and an in-nmi lock user and
> figures that's a bad combination.
Thanks for confirmation!
>
>
> ---
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1241,48 +1241,47 @@ void recycle_rp_inst(struct kretprobe_in
> }
> NOKPROBE_SYMBOL(recycle_rp_inst);
>
> -void kretprobe_hash_lock(struct task_struct *tsk,
> - struct hlist_head **head, unsigned long *flags)
> -__acquires(hlist_lock)
> -{
> - unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
> - raw_spinlock_t *hlist_lock;
> -
> - *head = &kretprobe_inst_table[hash];
> - hlist_lock = kretprobe_table_lock_ptr(hash);
> - raw_spin_lock_irqsave(hlist_lock, *flags);
> -}
> -NOKPROBE_SYMBOL(kretprobe_hash_lock);
> -
> static void kretprobe_table_lock(unsigned long hash,
> unsigned long *flags)
> __acquires(hlist_lock)
> {
> raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
> - raw_spin_lock_irqsave(hlist_lock, *flags);
> + /*
> + * HACK, due to kprobe_busy we'll never actually recurse, make NMI
> + * context use a different lock class to avoid it reporting.
> + */
> + raw_spin_lock_irqsave_nested(hlist_lock, *flags, !!in_nmi());
> }
> NOKPROBE_SYMBOL(kretprobe_table_lock);
>
> -void kretprobe_hash_unlock(struct task_struct *tsk,
> - unsigned long *flags)
> +static void kretprobe_table_unlock(unsigned long hash,
> + unsigned long *flags)
> __releases(hlist_lock)
> {
> + raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
> + raw_spin_unlock_irqrestore(hlist_lock, *flags);
> +}
> +NOKPROBE_SYMBOL(kretprobe_table_unlock);
> +
> +void kretprobe_hash_lock(struct task_struct *tsk,
> + struct hlist_head **head, unsigned long *flags)
> +__acquires(hlist_lock)
> +{
> unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
> - raw_spinlock_t *hlist_lock;
>
> - hlist_lock = kretprobe_table_lock_ptr(hash);
> - raw_spin_unlock_irqrestore(hlist_lock, *flags);
> + *head = &kretprobe_inst_table[hash];
> + kretprobe_table_lock(hash, flags);
> }
> -NOKPROBE_SYMBOL(kretprobe_hash_unlock);
> +NOKPROBE_SYMBOL(kretprobe_hash_lock);
>
> -static void kretprobe_table_unlock(unsigned long hash,
> - unsigned long *flags)
> +void kretprobe_hash_unlock(struct task_struct *tsk,
> + unsigned long *flags)
> __releases(hlist_lock)
> {
> - raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
> - raw_spin_unlock_irqrestore(hlist_lock, *flags);
> + unsigned long hash = hash_ptr(tsk, KPROBE_HASH_BITS);
> + kretprobe_table_unlock(hash, flags);
> }
> -NOKPROBE_SYMBOL(kretprobe_table_unlock);
> +NOKPROBE_SYMBOL(kretprobe_hash_unlock);
>
> struct kprobe kprobe_busy = {
> .addr = (void *) get_kprobe,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-08 15:09 ` Masami Hiramatsu
@ 2020-09-09 5:28 ` Masami Hiramatsu
0 siblings, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-09 5:28 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: peterz, Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck, systemtap
On Wed, 9 Sep 2020 00:09:23 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > Of course make it lockless then warning is gone.
> > > But even without the lockless patch, this warning can be false-positive
> > > because we prohibit nested kprobe call, right?
> >
> > Yes, because the actual nesting is avoided by kprobe_busy, but lockdep
> > can't tell. Lockdep sees a regular lock user and an in-nmi lock user and
> > figures that's a bad combination.
Hmm, what about introducing new LOCK_USED_KPROBE bit, which will be set
if the lock is accessed when the current_kprobe is set (including kprobe_busy)?
This means it is in the kprobe user-handler context. If we access the lock always
in the kprobes context, it is never nested.
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v5 00/21] kprobes: Unify kretprobe trampoline handlers and make kretprobe lockless
2020-09-02 7:02 ` peterz
2020-09-02 8:17 ` Masami Hiramatsu
@ 2020-09-11 2:32 ` Masami Hiramatsu
1 sibling, 0 replies; 51+ messages in thread
From: Masami Hiramatsu @ 2020-09-11 2:32 UTC (permalink / raw)
To: peterz
Cc: Ingo Molnar, linux-kernel, Eddy_Wu, x86, davem, rostedt,
naveen.n.rao, anil.s.keshavamurthy, linux-arch, cameron, oleg,
will, paulmck
Hi Peter and Ingo,
On Wed, 2 Sep 2020 09:02:26 +0200
peterz@infradead.org wrote:
> On Wed, Sep 02, 2020 at 09:37:39AM +0900, Masami Hiramatsu wrote:
> > On Tue, 1 Sep 2020 21:08:08 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > On Sat, Aug 29, 2020 at 09:59:49PM +0900, Masami Hiramatsu wrote:
> > > > Masami Hiramatsu (16):
> > > > kprobes: Add generic kretprobe trampoline handler
> > > > x86/kprobes: Use generic kretprobe trampoline handler
> > > > arm: kprobes: Use generic kretprobe trampoline handler
> > > > arm64: kprobes: Use generic kretprobe trampoline handler
> > > > arc: kprobes: Use generic kretprobe trampoline handler
> > > > csky: kprobes: Use generic kretprobe trampoline handler
> > > > ia64: kprobes: Use generic kretprobe trampoline handler
> > > > mips: kprobes: Use generic kretprobe trampoline handler
> > > > parisc: kprobes: Use generic kretprobe trampoline handler
> > > > powerpc: kprobes: Use generic kretprobe trampoline handler
> > > > s390: kprobes: Use generic kretprobe trampoline handler
> > > > sh: kprobes: Use generic kretprobe trampoline handler
> > > > sparc: kprobes: Use generic kretprobe trampoline handler
> > > > kprobes: Remove NMI context check
> > > > kprobes: Free kretprobe_instance with rcu callback
> > > > kprobes: Make local used functions static
> > > >
> > > > Peter Zijlstra (5):
> > > > llist: Add nonatomic __llist_add() and __llist_dell_all()
> > > > kprobes: Remove kretprobe hash
> > > > asm-generic/atomic: Add try_cmpxchg() fallbacks
> > > > freelist: Lock less freelist
> > > > kprobes: Replace rp->free_instance with freelist
> > >
> > > This looks good to me, do you want me to merge them through -tip? If so,
> > > do we want to try and get them in this release still?
> >
> > Yes, thanks. For the kretprobe missing issue, we will need the first half
> > (up to "kprobes: Remove NMI context check"), so we can split the series
> > if someone think the lockless is still immature.
>
> Ok, but then lockdep will yell at you if you have that enabled and run
> the unoptimized things.
>
> > > Ingo, opinions? This basically fixes a regression cauesd by
> > >
> > > 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
So what would you think of this? I saw the unification part of this series
on the tip/master, but lockless part is not there. This might still keep
lockdep to warn on kretprobes if we disable CONFIG_FUNCTION_TRACER and
optprobe.
If we make the kretprobe lockless, we will remove all locks from in-kernel
kprobe handlers. So at least upstream user will be happy.
Or, do we fix lockdep warning on the spinlocks in kprobe handlers first?
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 51+ messages in thread