linux-kernel.vger.kernel.org archive mirror
* Detecting kprobes generated code addresses
@ 2016-12-22  6:42 Josh Poimboeuf
  2016-12-25  3:13 ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2016-12-22  6:42 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: linux-kernel, x86

Hi Masami,

I would like to make __kernel_text_address() be able to detect whether
an address belongs to code which was generated by kprobes.  As far as I
can tell, that information seems to be in the 'pages' lists of
kprobe_insn_slots and kprobe_optinsn_slots.  But they seem to be
protected by mutexes.  Do you know if there's a sleep-free way to access
that information?
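
The check being asked for boils down to an address-range membership test over the pages that hold generated code. A minimal userspace sketch of that test (the `insn_page` struct, the demo addresses, and the function name are illustrative stand-ins, not the kernel's types):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a kprobe insn page: a base address and a length. */
struct insn_page { uintptr_t base; size_t len; };

static const struct insn_page demo_pages[] = {
	{ 0x1000, 4096 },	/* stands in for a kprobe_insn_slots page */
	{ 0x9000, 4096 },	/* stands in for a kprobe_optinsn_slots page */
};

/* The membership test: is addr inside any page of generated code? */
bool is_generated_code_addr(uintptr_t addr)
{
	for (size_t i = 0; i < sizeof(demo_pages) / sizeof(demo_pages[0]); i++)
		if (addr >= demo_pages[i].base &&
		    addr <  demo_pages[i].base + demo_pages[i].len)
			return true;
	return false;
}
```

The hard part in the kernel is not this loop but walking the page list without sleeping, which is what the rest of the thread works out.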

-- 
Josh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Detecting kprobes generated code addresses
  2016-12-22  6:42 Detecting kprobes generated code addresses Josh Poimboeuf
@ 2016-12-25  3:13 ` Masami Hiramatsu
  2016-12-25  6:16   ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-25  3:13 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: linux-kernel, x86

On Thu, 22 Dec 2016 00:42:19 -0600
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> Hi Masami,
> 
> I would like to make __kernel_text_address() be able to detect whether
> an address belongs to code which was generated by kprobes.  As far as I
> can tell, that information seems to be in the 'pages' lists of
> kprobe_insn_slots and kprobe_optinsn_slots.  But they seem to be
> protected by mutexes.

Right. It is currently protected by a mutex because it may
trigger page allocation. But I think that is easy to fix :)

>  Do you know if there's a sleep-free way to access
> that information?

Hmm, no, I haven't found one yet.

Thanks,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: Detecting kprobes generated code addresses
  2016-12-25  3:13 ` Masami Hiramatsu
@ 2016-12-25  6:16   ` Masami Hiramatsu
  2016-12-26  4:30     ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-25  6:16 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Josh Poimboeuf, linux-kernel, x86

On Sun, 25 Dec 2016 12:13:20 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Thu, 22 Dec 2016 00:42:19 -0600
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > Hi Masami,
> > 
> > I would like to make __kernel_text_address() be able to detect whether
> > an address belongs to code which was generated by kprobes.  As far as I
> > can tell, that information seems to be in the 'pages' lists of
> > kprobe_insn_slots and kprobe_optinsn_slots.  But they seem to be
> > protected by mutexes.
> 
> Right. It is currently protected by a mutex because it may
> trigger page allocation. But I think that is easy to fix :)

Hmm, IMHO, we should add a dummy (auto-generated) symbol
for the optprobe trampoline code to kallsyms so that
__kernel_text_address() automatically returns true for it.
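
The kallsyms route would make the ordinary symbol lookup cover the trampoline: any address falling inside a registered symbol's [start, start + size) range resolves, so text-address checks succeed. A toy model of that lookup (the table entries and names are invented; kallsyms itself stores compressed names and uses its own search):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol-table entry: start address, size, name. */
struct sym { uintptr_t start; size_t size; const char *name; };

/* Demo table; the second entry plays the "dummy" trampoline symbol. */
static const struct sym demo_syms[] = {
	{ 0x1000, 0x200, "func_a" },
	{ 0x2000, 0x100, "optinsn_trampoline_0" },
};

/* Return the symbol covering addr, or NULL if none does. */
static const struct sym *sym_lookup(uintptr_t addr)
{
	for (size_t i = 0; i < sizeof(demo_syms) / sizeof(demo_syms[0]); i++)
		if (addr >= demo_syms[i].start &&
		    addr <  demo_syms[i].start + demo_syms[i].size)
			return &demo_syms[i];
	return NULL;
}

/* An address counts as text if some symbol covers it. */
bool text_addr(uintptr_t addr)
{
	return sym_lookup(addr) != NULL;
}
```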

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: Detecting kprobes generated code addresses
  2016-12-25  6:16   ` Masami Hiramatsu
@ 2016-12-26  4:30     ` Masami Hiramatsu
  2016-12-26 14:50       ` [PATCH tip/master] kprobes: extable: Identify kprobes' insn-slots as kernel text area Masami Hiramatsu
                         ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-26  4:30 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Josh Poimboeuf, linux-kernel, x86

On Sun, 25 Dec 2016 15:16:00 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Sun, 25 Dec 2016 12:13:20 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Thu, 22 Dec 2016 00:42:19 -0600
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > > Hi Masami,
> > > 
> > > I would like to make __kernel_text_address() be able to detect whether
> > > an address belongs to code which was generated by kprobes.  As far as I
> > > can tell, that information seems to be in the 'pages' lists of
> > > kprobe_insn_slots and kprobe_optinsn_slots.  But they seem to be
> > > protected by mutexes.
> > 
> > Right. It is currently protected by a mutex because it may
> > trigger page allocation. But I think that is easy to fix :)
> 
> Hmm, IMHO, we should add a dummy (auto-generated) symbol
> for the optprobe trampoline code to kallsyms so that
> __kernel_text_address() automatically returns true for it.

Sorry, I reconsidered this idea and concluded it was overkill.
As ftrace does, it should be enough to add a check routine
to __kernel_text_address().
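
For reference, __kernel_text_address() with the ftrace check is essentially a chain of range predicates, and the proposal is to chain one more onto it. A toy userspace model (every range and helper body here is made up for illustration; the real helpers consult section bounds, module lists, and trampoline tables):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the real range checks (hypothetical ranges). */
static bool core_kernel_text(uintptr_t a)       { return a >= 0x1000 && a < 0x2000; }
static bool is_module_text_address(uintptr_t a) { return a >= 0x3000 && a < 0x4000; }
static bool is_ftrace_trampoline(uintptr_t a)   { return a >= 0x5000 && a < 0x5100; }
static bool is_kprobe_insn_slot(uintptr_t a)    { return a >= 0x6000 && a < 0x6100; }

/* The ftrace-style approach: try each known text region in turn. */
int kernel_text_address_model(uintptr_t addr)
{
	if (core_kernel_text(addr))
		return 1;
	if (is_module_text_address(addr))
		return 1;
	if (is_ftrace_trampoline(addr))
		return 1;
	if (is_kprobe_insn_slot(addr))	/* the check this thread adds */
		return 1;
	return 0;
}
```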

Thanks,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* [PATCH tip/master] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26  4:30     ` Masami Hiramatsu
@ 2016-12-26 14:50       ` Masami Hiramatsu
  2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
  2016-12-27  6:14       ` [PATCH tip/master v3] " Masami Hiramatsu
  2 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-26 14:50 UTC (permalink / raw)
  To: Ingo Molnar, Josh Poimboeuf
  Cc: Masami Hiramatsu, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

Make __kernel_text_address()/kernel_text_address() return
true if the given address is on a kprobe's instruction slot,
which is generated by kprobes as trampoline code.
This helps stacktraces determine whether an address is in a
text area or not.

To implement this without sleeping in is_kprobe_*_slot(),
this also converts the insn_cache page list into an RCU list.
That may increase the processing delay (not the processing
time) of garbage slot collection, because freeing a page from
the list now has to wait for an additional RCU grace period.
However, since this is not a hot path, the extra delay should
be acceptable.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>

---
 Hi Josh, could you check whether this patch fixes your
 issue? It will enable the unwinder code to validate return
 addresses by using __kernel_text_address() again.
---
 include/linux/kprobes.h |   14 +++++++++
 kernel/extable.c        |    9 +++++-
 kernel/kprobes.c        |   70 ++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 74 insertions(+), 19 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8f68490..400c5c1 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -281,6 +281,9 @@ struct kprobe_insn_cache {
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
+/* sleep-less address checking routine  */
+extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
+				unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -294,6 +297,11 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 {									\
 	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
 }									\
+									\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}
 
 DEFINE_INSN_CACHE_OPS(insn);
 
@@ -330,7 +338,11 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
-
+#else	/* CONFIG_OPTPROBES */
+static inline bool is_kprobe_optinsn_slot(unsigned long addr)
+{
+	return false;
+}
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
diff --git a/kernel/extable.c b/kernel/extable.c
index e820cce..81c9633 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
+#include <linux/kprobes.h>
 
 #include <asm/sections.h>
 #include <asm/uaccess.h>
@@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_ftrace_trampoline(addr))
 		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
 	 * Give those symbols a chance to be printed in
@@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_module_text_address(addr))
 		return 1;
-	return is_ftrace_trampoline(addr);
+	if (is_ftrace_trampoline(addr))
+		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d630954..1bd1c17 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	/* Since the slot array is not protected by rcu, we need a mutex */
 	mutex_lock(&c->mutex);
  retry:
-	list_for_each_entry(kip, &c->pages, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (kip->nused < slots_per_page(c)) {
 			int i;
 			for (i = 0; i < slots_per_page(c); i++) {
@@ -167,6 +169,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 			WARN_ON(1);
 		}
 	}
+	rcu_read_unlock();
 
 	/* If there are any garbage slots, collect it and try again. */
 	if (c->nr_garbage && collect_garbage_slots(c) == 0)
@@ -193,13 +196,15 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->nused = 1;
 	kip->ngarbage = 0;
 	kip->cache = c;
-	list_add(&kip->list, &c->pages);
+	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
 out:
 	mutex_unlock(&c->mutex);
 	return slot;
 }
 
+
+
 /* Return 1 if all garbages are collected, otherwise 0. */
 static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 {
@@ -213,7 +218,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
-			list_del(&kip->list);
+			list_del_rcu(&kip->list);
+			synchronize_rcu();
 			kip->cache->free(kip->insns);
 			kfree(kip);
 		}
@@ -248,29 +254,59 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 		      kprobe_opcode_t *slot, int dirty)
 {
 	struct kprobe_insn_page *kip;
+	long idx;
 
 	mutex_lock(&c->mutex);
-	list_for_each_entry(kip, &c->pages, list) {
-		long idx = ((long)slot - (long)kip->insns) /
-				(c->insn_size * sizeof(kprobe_opcode_t));
-		if (idx >= 0 && idx < slots_per_page(c)) {
-			WARN_ON(kip->slot_used[idx] != SLOT_USED);
-			if (dirty) {
-				kip->slot_used[idx] = SLOT_DIRTY;
-				kip->ngarbage++;
-				if (++c->nr_garbage > slots_per_page(c))
-					collect_garbage_slots(c);
-			} else
-				collect_one_slot(kip, idx);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		idx = ((long)slot - (long)kip->insns) /
+			(c->insn_size * sizeof(kprobe_opcode_t));
+		if (idx >= 0 && idx < slots_per_page(c))
 			goto out;
-		}
 	}
-	/* Could not free this slot. */
+	/* Could not find this slot. */
 	WARN_ON(1);
+	kip = NULL;
 out:
+	rcu_read_unlock();
+	/* Mark and sweep: this may sleep */
+	if (kip) {
+		/* Check double free */
+		WARN_ON(kip->slot_used[idx] != SLOT_USED);
+		if (dirty) {
+			kip->slot_used[idx] = SLOT_DIRTY;
+			kip->ngarbage++;
+			if (++c->nr_garbage > slots_per_page(c))
+				collect_garbage_slots(c);
+		} else
+			collect_one_slot(kip, idx);
+	}
 	mutex_unlock(&c->mutex);
 }
 
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	struct kprobe_insn_page *kip;
+	bool ret = false;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if (addr >= (unsigned long)kip->insns &&
+		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
+			ret = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_OPTPROBES
 /* For optimized_kprobe buffer */
 struct kprobe_insn_cache kprobe_optinsn_slots = {

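In the rewritten __free_insn_slot() above, the owning page is located purely by pointer arithmetic: the candidate slot index is the byte offset from the page base divided by the slot size, and the slot belongs to the page only if that index falls within [0, slots_per_page). A standalone sketch of that computation (parameter names and sizes are illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Slot index of `slot` within a page starting at `page_base`,
 * where each slot holds insn_size opcodes of opcode_size bytes.
 * A slot below page_base yields a negative index.
 */
long slot_index(uintptr_t slot, uintptr_t page_base,
		size_t insn_size, size_t opcode_size)
{
	return ((long)slot - (long)page_base) /
	       (long)(insn_size * opcode_size);
}

/* The bounds check the kernel loop uses to pick the owning page. */
int slot_on_page(long idx, long slots_per_page)
{
	return idx >= 0 && idx < slots_per_page;
}
```

Because the division is signed, addresses on earlier pages produce negative indices and addresses on later pages produce indices at or beyond slots_per_page, so each page's bounds check matches only its own slots.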

* [PATCH tip/master v2] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26  4:30     ` Masami Hiramatsu
  2016-12-26 14:50       ` [PATCH tip/master] kprobes: extable: Identify kprobes' insn-slots as kernel text area Masami Hiramatsu
@ 2016-12-26 15:34       ` Masami Hiramatsu
  2016-12-26 17:21         ` kbuild test robot
                           ` (2 more replies)
  2016-12-27  6:14       ` [PATCH tip/master v3] " Masami Hiramatsu
  2 siblings, 3 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-26 15:34 UTC (permalink / raw)
  To: Ingo Molnar, Josh Poimboeuf
  Cc: Masami Hiramatsu, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

Make __kernel_text_address()/kernel_text_address() return
true if the given address is on a kprobe's instruction slot,
which is generated by kprobes as trampoline code.
This helps stacktraces determine whether an address is in a
text area or not.

To implement this without sleeping in is_kprobe_*_slot(),
this also converts the insn_cache page list into an RCU list.
That may increase the processing delay (not the processing
time) of garbage slot collection, because freeing a page from
the list now has to wait for an additional RCU grace period.
However, since this is not a hot path, the extra delay should
be acceptable.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>

---
 V2: Fix build error when CONFIG_KPROBES=n

 Hi Josh, could you check whether this patch fixes your
 issue? It will enable the unwinder code to validate return
 addresses by using __kernel_text_address() again.
---
 include/linux/kprobes.h |   22 ++++++++++++++-
 kernel/extable.c        |    9 +++++-
 kernel/kprobes.c        |   70 ++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 82 insertions(+), 19 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8f68490..f0496b0 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -281,6 +281,9 @@ struct kprobe_insn_cache {
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
+/* sleep-less address checking routine  */
+extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
+				unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -294,6 +297,11 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 {									\
 	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
 }									\
+									\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}
 
 DEFINE_INSN_CACHE_OPS(insn);
 
@@ -330,7 +338,6 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
-
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
@@ -481,6 +488,19 @@ static inline int enable_jprobe(struct jprobe *jp)
 	return enable_kprobe(&jp->kp);
 }
 
+#ifndef CONFIG_KPROBES
+static inline bool is_kprobe_insn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+#ifndef CONFIG_OPTPROBES
+static inline bool is_kprobe_optinsn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_KPROBES
 /*
  * Blacklist ganerating macro. Specify functions which is not probed
diff --git a/kernel/extable.c b/kernel/extable.c
index e820cce..81c9633 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
+#include <linux/kprobes.h>
 
 #include <asm/sections.h>
 #include <asm/uaccess.h>
@@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_ftrace_trampoline(addr))
 		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
 	 * Give those symbols a chance to be printed in
@@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_module_text_address(addr))
 		return 1;
-	return is_ftrace_trampoline(addr);
+	if (is_ftrace_trampoline(addr))
+		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d630954..1bd1c17 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	/* Since the slot array is not protected by rcu, we need a mutex */
 	mutex_lock(&c->mutex);
  retry:
-	list_for_each_entry(kip, &c->pages, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (kip->nused < slots_per_page(c)) {
 			int i;
 			for (i = 0; i < slots_per_page(c); i++) {
@@ -167,6 +169,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 			WARN_ON(1);
 		}
 	}
+	rcu_read_unlock();
 
 	/* If there are any garbage slots, collect it and try again. */
 	if (c->nr_garbage && collect_garbage_slots(c) == 0)
@@ -193,13 +196,15 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->nused = 1;
 	kip->ngarbage = 0;
 	kip->cache = c;
-	list_add(&kip->list, &c->pages);
+	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
 out:
 	mutex_unlock(&c->mutex);
 	return slot;
 }
 
+
+
 /* Return 1 if all garbages are collected, otherwise 0. */
 static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 {
@@ -213,7 +218,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
-			list_del(&kip->list);
+			list_del_rcu(&kip->list);
+			synchronize_rcu();
 			kip->cache->free(kip->insns);
 			kfree(kip);
 		}
@@ -248,29 +254,59 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 		      kprobe_opcode_t *slot, int dirty)
 {
 	struct kprobe_insn_page *kip;
+	long idx;
 
 	mutex_lock(&c->mutex);
-	list_for_each_entry(kip, &c->pages, list) {
-		long idx = ((long)slot - (long)kip->insns) /
-				(c->insn_size * sizeof(kprobe_opcode_t));
-		if (idx >= 0 && idx < slots_per_page(c)) {
-			WARN_ON(kip->slot_used[idx] != SLOT_USED);
-			if (dirty) {
-				kip->slot_used[idx] = SLOT_DIRTY;
-				kip->ngarbage++;
-				if (++c->nr_garbage > slots_per_page(c))
-					collect_garbage_slots(c);
-			} else
-				collect_one_slot(kip, idx);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		idx = ((long)slot - (long)kip->insns) /
+			(c->insn_size * sizeof(kprobe_opcode_t));
+		if (idx >= 0 && idx < slots_per_page(c))
 			goto out;
-		}
 	}
-	/* Could not free this slot. */
+	/* Could not find this slot. */
 	WARN_ON(1);
+	kip = NULL;
 out:
+	rcu_read_unlock();
+	/* Mark and sweep: this may sleep */
+	if (kip) {
+		/* Check double free */
+		WARN_ON(kip->slot_used[idx] != SLOT_USED);
+		if (dirty) {
+			kip->slot_used[idx] = SLOT_DIRTY;
+			kip->ngarbage++;
+			if (++c->nr_garbage > slots_per_page(c))
+				collect_garbage_slots(c);
+		} else
+			collect_one_slot(kip, idx);
+	}
 	mutex_unlock(&c->mutex);
 }
 
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	struct kprobe_insn_page *kip;
+	bool ret = false;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if (addr >= (unsigned long)kip->insns &&
+		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
+			ret = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_OPTPROBES
 /* For optimized_kprobe buffer */
 struct kprobe_insn_cache kprobe_optinsn_slots = {


* Re: [PATCH tip/master v2] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
@ 2016-12-26 17:21         ` kbuild test robot
  2016-12-26 17:46         ` kbuild test robot
  2016-12-27  6:13         ` Masami Hiramatsu
  2 siblings, 0 replies; 23+ messages in thread
From: kbuild test robot @ 2016-12-26 17:21 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: kbuild-all, Ingo Molnar, Josh Poimboeuf, Masami Hiramatsu,
	linux-kernel, Peter Zijlstra, Ananth N Mavinakayanahalli,
	Thomas Gleixner, H . Peter Anvin, Andrey Konovalov


Hi Masami,

[auto build test ERROR on tip/master]

url:    https://github.com/0day-ci/linux/commits/Masami-Hiramatsu/kprobes-extable-Identify-kprobes-insn-slots-as-kernel-text-area/20161226-233830
config: sparc64-defconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   kernel/built-in.o: In function `__kernel_text_address':
>> (.text+0x202b0): undefined reference to `kprobe_insn_slots'
   kernel/built-in.o: In function `__kernel_text_address':
>> (.text+0x202b8): undefined reference to `__is_insn_slot_addr'
   kernel/built-in.o: In function `__kernel_text_address':
   (.text+0x202bc): undefined reference to `kprobe_insn_slots'
   kernel/built-in.o: In function `kernel_text_address':
   (.text+0x20374): undefined reference to `kprobe_insn_slots'
   kernel/built-in.o: In function `kernel_text_address':
   (.text+0x2037c): undefined reference to `__is_insn_slot_addr'
   kernel/built-in.o: In function `kernel_text_address':
   (.text+0x20380): undefined reference to `kprobe_insn_slots'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 17350 bytes --]


* Re: [PATCH tip/master v2] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
  2016-12-26 17:21         ` kbuild test robot
@ 2016-12-26 17:46         ` kbuild test robot
  2016-12-27  6:13         ` Masami Hiramatsu
  2 siblings, 0 replies; 23+ messages in thread
From: kbuild test robot @ 2016-12-26 17:46 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: kbuild-all, Ingo Molnar, Josh Poimboeuf, Masami Hiramatsu,
	linux-kernel, Peter Zijlstra, Ananth N Mavinakayanahalli,
	Thomas Gleixner, H . Peter Anvin, Andrey Konovalov


Hi Masami,

[auto build test ERROR on tip/master]

url:    https://github.com/0day-ci/linux/commits/Masami-Hiramatsu/kprobes-extable-Identify-kprobes-insn-slots-as-kernel-text-area/20161226-233830
config: sh-r7785rp_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sh 

All errors (new ones prefixed by >>):

   kernel/built-in.o: In function `init_kernel_text':
>> kernel/extable.c:64: undefined reference to `__is_insn_slot_addr'
>> kernel/extable.c:64: undefined reference to `kprobe_insn_slots'
   kernel/built-in.o: In function `kernel_text_address':
   kernel/extable.c:134: undefined reference to `__is_insn_slot_addr'
   kernel/extable.c:134: undefined reference to `kprobe_insn_slots'

vim +64 kernel/extable.c

^1da177e Linus Torvalds 2005-04-16  58  		e = search_module_extables(addr);
^1da177e Linus Torvalds 2005-04-16  59  	return e;
^1da177e Linus Torvalds 2005-04-16  60  }
^1da177e Linus Torvalds 2005-04-16  61  
4a44bac1 Ingo Molnar    2009-03-19  62  static inline int init_kernel_text(unsigned long addr)
4a44bac1 Ingo Molnar    2009-03-19  63  {
4a44bac1 Ingo Molnar    2009-03-19 @64  	if (addr >= (unsigned long)_sinittext &&
5ecbe3c3 Helge Deller   2013-11-28  65  	    addr < (unsigned long)_einittext)
4a44bac1 Ingo Molnar    2009-03-19  66  		return 1;
4a44bac1 Ingo Molnar    2009-03-19  67  	return 0;

:::::: The code at line 64 was first introduced by commit
:::::: 4a44bac1f98223ed77e47bf3b42fcfd10cddd85f symbols, stacktrace: look up init symbols after module symbols

:::::: TO: Ingo Molnar <mingo@elte.hu>
:::::: CC: Ingo Molnar <mingo@elte.hu>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 15772 bytes --]


* Re: [PATCH tip/master v2] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
  2016-12-26 17:21         ` kbuild test robot
  2016-12-26 17:46         ` kbuild test robot
@ 2016-12-27  6:13         ` Masami Hiramatsu
  2 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-27  6:13 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Josh Poimboeuf, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

Oops, I found I missed an rcu_read_unlock on the fast path...
I'll send v3.
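
The bug pattern here is a lookup loop that returns early while still inside the RCU read-side critical section, so rcu_read_unlock() is never reached. A toy model with a depth counter standing in for the RCU read lock shows the leak, and the usual fix of funneling every exit through one unlock site:

```c
#include <stdbool.h>

/* Toy lock-depth counter standing in for the RCU read-side section. */
static int read_lock_depth;
static void toy_read_lock(void)   { read_lock_depth++; }
static void toy_read_unlock(void) { read_lock_depth--; }

/* Buggy pattern: an early return inside the locked region leaks the lock. */
bool scan_buggy(const int *v, int n, int key)
{
	toy_read_lock();
	for (int i = 0; i < n; i++)
		if (v[i] == key)
			return true;	/* BUG: unlock never runs */
	toy_read_unlock();
	return false;
}

/* Fixed pattern: record the result, break, and unlock on the single exit. */
bool scan_fixed(const int *v, int n, int key)
{
	bool found = false;

	toy_read_lock();
	for (int i = 0; i < n; i++)
		if (v[i] == key) {
			found = true;
			break;
		}
	toy_read_unlock();
	return found;
}

/* Run both scans once on a hit; only the buggy one leaves depth elevated. */
int demo_leak(void)
{
	int v[3] = { 1, 2, 3 };

	scan_buggy(v, 3, 2);	/* early return: depth becomes 1 */
	scan_fixed(v, 3, 2);	/* balanced: depth unchanged */
	return read_lock_depth;
}
```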

Thanks,


On Tue, 27 Dec 2016 00:34:20 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Make __kernel_text_address()/kernel_text_address() return
> true if the given address is on a kprobe's instruction slot,
> which is generated by kprobes as trampoline code.
> This helps stacktraces determine whether an address is in a
> text area or not.
> 
> To implement this without sleeping in is_kprobe_*_slot(),
> this also converts the insn_cache page list into an RCU list.
> That may increase the processing delay (not the processing
> time) of garbage slot collection, because freeing a page from
> the list now has to wait for an additional RCU grace period.
> However, since this is not a hot path, the extra delay should
> be acceptable.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> 
> ---
>  V2: Fix build error when CONFIG_KPROBES=n
> 
>  Hi Josh, could you check whether this patch fixes your
>  issue? It will enable the unwinder code to validate return
>  addresses by using __kernel_text_address() again.
> ---
>  include/linux/kprobes.h |   22 ++++++++++++++-
>  kernel/extable.c        |    9 +++++-
>  kernel/kprobes.c        |   70 ++++++++++++++++++++++++++++++++++++-----------
>  3 files changed, 82 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index 8f68490..f0496b0 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -281,6 +281,9 @@ struct kprobe_insn_cache {
>  extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
>  extern void __free_insn_slot(struct kprobe_insn_cache *c,
>  			     kprobe_opcode_t *slot, int dirty);
> +/* sleep-less address checking routine  */
> +extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
> +				unsigned long addr);
>  
>  #define DEFINE_INSN_CACHE_OPS(__name)					\
>  extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
> @@ -294,6 +297,11 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
>  {									\
>  	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
>  }									\
> +									\
> +static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
> +{									\
> +	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
> +}
>  
>  DEFINE_INSN_CACHE_OPS(insn);
>  
> @@ -330,7 +338,6 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
>  					     int write, void __user *buffer,
>  					     size_t *length, loff_t *ppos);
>  #endif
> -
>  #endif /* CONFIG_OPTPROBES */
>  #ifdef CONFIG_KPROBES_ON_FTRACE
>  extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> @@ -481,6 +488,19 @@ static inline int enable_jprobe(struct jprobe *jp)
>  	return enable_kprobe(&jp->kp);
>  }
>  
> +#ifndef CONFIG_KPROBES
> +static inline bool is_kprobe_insn_slot(unsigned long addr)
> +{
> +	return false;
> +}
> +#endif
> +#ifndef CONFIG_OPTPROBES
> +static inline bool is_kprobe_optinsn_slot(unsigned long addr)
> +{
> +	return false;
> +}
> +#endif
> +
>  #ifdef CONFIG_KPROBES
>  /*
>   * Blacklist ganerating macro. Specify functions which is not probed
> diff --git a/kernel/extable.c b/kernel/extable.c
> index e820cce..81c9633 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -20,6 +20,7 @@
>  #include <linux/module.h>
>  #include <linux/mutex.h>
>  #include <linux/init.h>
> +#include <linux/kprobes.h>
>  
>  #include <asm/sections.h>
>  #include <asm/uaccess.h>
> @@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
>  		return 1;
>  	if (is_ftrace_trampoline(addr))
>  		return 1;
> +	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> +		return 1;
>  	/*
>  	 * There might be init symbols in saved stacktraces.
>  	 * Give those symbols a chance to be printed in
> @@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
>  		return 1;
>  	if (is_module_text_address(addr))
>  		return 1;
> -	return is_ftrace_trampoline(addr);
> +	if (is_ftrace_trampoline(addr))
> +		return 1;
> +	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> +		return 1;
> +	return 0;
>  }
>  
>  /*
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d630954..1bd1c17 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
>  	struct kprobe_insn_page *kip;
>  	kprobe_opcode_t *slot = NULL;
>  
> +	/* Since the slot array is not protected by rcu, we need a mutex */
>  	mutex_lock(&c->mutex);
>   retry:
> -	list_for_each_entry(kip, &c->pages, list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(kip, &c->pages, list) {
>  		if (kip->nused < slots_per_page(c)) {
>  			int i;
>  			for (i = 0; i < slots_per_page(c); i++) {
> @@ -167,6 +169,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
>  			WARN_ON(1);
>  		}
>  	}
> +	rcu_read_unlock();
>  
>  	/* If there are any garbage slots, collect it and try again. */
>  	if (c->nr_garbage && collect_garbage_slots(c) == 0)
> @@ -193,13 +196,15 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
>  	kip->nused = 1;
>  	kip->ngarbage = 0;
>  	kip->cache = c;
> -	list_add(&kip->list, &c->pages);
> +	list_add_rcu(&kip->list, &c->pages);
>  	slot = kip->insns;
>  out:
>  	mutex_unlock(&c->mutex);
>  	return slot;
>  }
>  
> +
> +
>  /* Return 1 if all garbages are collected, otherwise 0. */
>  static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
>  {
> @@ -213,7 +218,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
>  		 * next time somebody inserts a probe.
>  		 */
>  		if (!list_is_singular(&kip->list)) {
> -			list_del(&kip->list);
> +			list_del_rcu(&kip->list);
> +			synchronize_rcu();
>  			kip->cache->free(kip->insns);
>  			kfree(kip);
>  		}
> @@ -248,29 +254,59 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
>  		      kprobe_opcode_t *slot, int dirty)
>  {
>  	struct kprobe_insn_page *kip;
> +	long idx;
>  
>  	mutex_lock(&c->mutex);
> -	list_for_each_entry(kip, &c->pages, list) {
> -		long idx = ((long)slot - (long)kip->insns) /
> -				(c->insn_size * sizeof(kprobe_opcode_t));
> -		if (idx >= 0 && idx < slots_per_page(c)) {
> -			WARN_ON(kip->slot_used[idx] != SLOT_USED);
> -			if (dirty) {
> -				kip->slot_used[idx] = SLOT_DIRTY;
> -				kip->ngarbage++;
> -				if (++c->nr_garbage > slots_per_page(c))
> -					collect_garbage_slots(c);
> -			} else
> -				collect_one_slot(kip, idx);
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(kip, &c->pages, list) {
> +		idx = ((long)slot - (long)kip->insns) /
> +			(c->insn_size * sizeof(kprobe_opcode_t));
> +		if (idx >= 0 && idx < slots_per_page(c))
>  			goto out;
> -		}
>  	}
> -	/* Could not free this slot. */
> +	/* Could not find this slot. */
>  	WARN_ON(1);
> +	kip = NULL;
>  out:
> +	rcu_read_unlock();
> +	/* Mark and sweep: this may sleep */
> +	if (kip) {
> +		/* Check double free */
> +		WARN_ON(kip->slot_used[idx] != SLOT_USED);
> +		if (dirty) {
> +			kip->slot_used[idx] = SLOT_DIRTY;
> +			kip->ngarbage++;
> +			if (++c->nr_garbage > slots_per_page(c))
> +				collect_garbage_slots(c);
> +		} else
> +			collect_one_slot(kip, idx);
> +	}
>  	mutex_unlock(&c->mutex);
>  }
>  
> +/*
> + * Check given address is on the page of kprobe instruction slots.
> + * This will be used for checking whether the address on a stack
> + * is on a text area or not.
> + */
> +bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
> +{
> +	struct kprobe_insn_page *kip;
> +	bool ret = false;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(kip, &c->pages, list) {
> +		if (addr >= (unsigned long)kip->insns &&
> +		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
> +			ret = true;
> +			break;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return ret;
> +}
> +
>  #ifdef CONFIG_OPTPROBES
>  /* For optimized_kprobe buffer */
>  struct kprobe_insn_cache kprobe_optinsn_slots = {
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-26  4:30     ` Masami Hiramatsu
  2016-12-26 14:50       ` [PATCH tip/master] kprobes: extable: Identify kprobes' insn-slots as kernel text area Masami Hiramatsu
  2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
@ 2016-12-27  6:14       ` Masami Hiramatsu
  2017-01-03 10:54         ` Peter Zijlstra
  2 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2016-12-27  6:14 UTC (permalink / raw)
  To: Ingo Molnar, Josh Poimboeuf
  Cc: Masami Hiramatsu, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

Make __kernel_text_address()/kernel_text_address() return
true if the given address is on a kprobe's instruction slot,
which is generated by kprobes as trampoline code.
This can help stacktraces to determine whether the address
is in a text area or not.

To implement this without any sleep in is_kprobe_*_slot(),
this also converts the insn_cache page list into an RCU list.
It may increase the delay (not the processing time) of garbage
slot collection, because freeing a page from the list now has
to wait for an additional RCU grace period. However, since this
is not a hot path, the extra delay should be acceptable.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>

---
 V3:
  - Fix build error on archs which don't need insn_slot
     (e.g. sh, sparc).
  - Fix a missed rcu_read_unlock() in fast path of __get_insn_slot.
---
 include/linux/kprobes.h |   30 ++++++++++++++++++++
 kernel/extable.c        |    9 +++++-
 kernel/kprobes.c        |   69 +++++++++++++++++++++++++++++++++++------------
 3 files changed, 89 insertions(+), 19 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8f68490..16ddfb8 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -278,9 +278,13 @@ struct kprobe_insn_cache {
 	int nr_garbage;
 };
 
+#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
+/* sleep-less address checking routine  */
+extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
+				unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -294,6 +298,18 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 {									\
 	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
 }									\
+									\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}
+#else /* __ARCH_WANT_KPROBES_INSN_SLOT */
+#define DEFINE_INSN_CACHE_OPS(__name)					\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return 0;							\
+}
+#endif
 
 DEFINE_INSN_CACHE_OPS(insn);
 
@@ -330,7 +346,6 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
-
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
@@ -481,6 +496,19 @@ static inline int enable_jprobe(struct jprobe *jp)
 	return enable_kprobe(&jp->kp);
 }
 
+#ifndef CONFIG_KPROBES
+static inline bool is_kprobe_insn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+#ifndef CONFIG_OPTPROBES
+static inline bool is_kprobe_optinsn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_KPROBES
 /*
  * Blacklist ganerating macro. Specify functions which is not probed
diff --git a/kernel/extable.c b/kernel/extable.c
index e820cce..81c9633 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
+#include <linux/kprobes.h>
 
 #include <asm/sections.h>
 #include <asm/uaccess.h>
@@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_ftrace_trampoline(addr))
 		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
 	 * Give those symbols a chance to be printed in
@@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_module_text_address(addr))
 		return 1;
-	return is_ftrace_trampoline(addr);
+	if (is_ftrace_trampoline(addr))
+		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d630954..be41f6d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	/* Since the slot array is not protected by rcu, we need a mutex */
 	mutex_lock(&c->mutex);
  retry:
-	list_for_each_entry(kip, &c->pages, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (kip->nused < slots_per_page(c)) {
 			int i;
 			for (i = 0; i < slots_per_page(c); i++) {
@@ -159,6 +161,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 					kip->slot_used[i] = SLOT_USED;
 					kip->nused++;
 					slot = kip->insns + (i * c->insn_size);
+					rcu_read_unlock();
 					goto out;
 				}
 			}
@@ -167,6 +170,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 			WARN_ON(1);
 		}
 	}
+	rcu_read_unlock();
 
 	/* If there are any garbage slots, collect it and try again. */
 	if (c->nr_garbage && collect_garbage_slots(c) == 0)
@@ -193,7 +197,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->nused = 1;
 	kip->ngarbage = 0;
 	kip->cache = c;
-	list_add(&kip->list, &c->pages);
+	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
 out:
 	mutex_unlock(&c->mutex);
@@ -213,7 +217,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
-			list_del(&kip->list);
+			list_del_rcu(&kip->list);
+			synchronize_rcu();
 			kip->cache->free(kip->insns);
 			kfree(kip);
 		}
@@ -248,29 +253,59 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 		      kprobe_opcode_t *slot, int dirty)
 {
 	struct kprobe_insn_page *kip;
+	long idx;
 
 	mutex_lock(&c->mutex);
-	list_for_each_entry(kip, &c->pages, list) {
-		long idx = ((long)slot - (long)kip->insns) /
-				(c->insn_size * sizeof(kprobe_opcode_t));
-		if (idx >= 0 && idx < slots_per_page(c)) {
-			WARN_ON(kip->slot_used[idx] != SLOT_USED);
-			if (dirty) {
-				kip->slot_used[idx] = SLOT_DIRTY;
-				kip->ngarbage++;
-				if (++c->nr_garbage > slots_per_page(c))
-					collect_garbage_slots(c);
-			} else
-				collect_one_slot(kip, idx);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		idx = ((long)slot - (long)kip->insns) /
+			(c->insn_size * sizeof(kprobe_opcode_t));
+		if (idx >= 0 && idx < slots_per_page(c))
 			goto out;
-		}
 	}
-	/* Could not free this slot. */
+	/* Could not find this slot. */
 	WARN_ON(1);
+	kip = NULL;
 out:
+	rcu_read_unlock();
+	/* Mark and sweep: this may sleep */
+	if (kip) {
+		/* Check double free */
+		WARN_ON(kip->slot_used[idx] != SLOT_USED);
+		if (dirty) {
+			kip->slot_used[idx] = SLOT_DIRTY;
+			kip->ngarbage++;
+			if (++c->nr_garbage > slots_per_page(c))
+				collect_garbage_slots(c);
+		} else
+			collect_one_slot(kip, idx);
+	}
 	mutex_unlock(&c->mutex);
 }
 
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	struct kprobe_insn_page *kip;
+	bool ret = false;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if (addr >= (unsigned long)kip->insns &&
+		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
+			ret = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_OPTPROBES
 /* For optimized_kprobe buffer */
 struct kprobe_insn_cache kprobe_optinsn_slots = {

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2016-12-27  6:14       ` [PATCH tip/master v3] " Masami Hiramatsu
@ 2017-01-03 10:54         ` Peter Zijlstra
  2017-01-04  5:06           ` Masami Hiramatsu
  2017-01-04 14:21           ` [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots " Steven Rostedt
  0 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2017-01-03 10:54 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Steven Rostedt

On Tue, Dec 27, 2016 at 03:14:10PM +0900, Masami Hiramatsu wrote:

> diff --git a/kernel/extable.c b/kernel/extable.c
> index e820cce..81c9633 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c

> @@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
>  		return 1;
>  	if (is_module_text_address(addr))
>  		return 1;
> -	return is_ftrace_trampoline(addr);
> +	if (is_ftrace_trampoline(addr))
> +		return 1;
> +	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> +		return 1;
> +	return 0;
>  }

> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d630954..be41f6d 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c

> +/*
> + * Check given address is on the page of kprobe instruction slots.
> + * This will be used for checking whether the address on a stack
> + * is on a text area or not.
> + */
> +bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
> +{
> +	struct kprobe_insn_page *kip;
> +	bool ret = false;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(kip, &c->pages, list) {
> +		if (addr >= (unsigned long)kip->insns &&
> +		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
> +			ret = true;
> +			break;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return ret;
> +}

How many entries should one expect on that list? I spent quite a bit of
time reducing the cost of is_module_text_address() a while back and see
that both ftrace (which actually needs this to be fast) and now
kprobes have linear list walks in here.

I'm assuming the ftrace thing to be mostly empty, since I never saw it
on my benchmarks back then, but it is something Steve should look at I
suppose.

Similarly, the changelog here should include some talk about worst case
costs.

FWIW, see commit: 93c2e105f6bc ("module: Optimize __module_address() using a latched RB-tree")

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-03 10:54         ` Peter Zijlstra
@ 2017-01-04  5:06           ` Masami Hiramatsu
  2017-01-04 10:01             ` Peter Zijlstra
  2017-01-04 14:21           ` [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots " Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-04  5:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Steven Rostedt

On Tue, 3 Jan 2017 11:54:02 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Dec 27, 2016 at 03:14:10PM +0900, Masami Hiramatsu wrote:
> 
> > diff --git a/kernel/extable.c b/kernel/extable.c
> > index e820cce..81c9633 100644
> > --- a/kernel/extable.c
> > +++ b/kernel/extable.c
> 
> > @@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
> >  		return 1;
> >  	if (is_module_text_address(addr))
> >  		return 1;
> > -	return is_ftrace_trampoline(addr);
> > +	if (is_ftrace_trampoline(addr))
> > +		return 1;
> > +	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> > +		return 1;
> > +	return 0;
> >  }
> 
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index d630954..be41f6d 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> 
> > +/*
> > + * Check given address is on the page of kprobe instruction slots.
> > + * This will be used for checking whether the address on a stack
> > + * is on a text area or not.
> > + */
> > +bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
> > +{
> > +	struct kprobe_insn_page *kip;
> > +	bool ret = false;
> > +
> > +	rcu_read_lock();
> > +	list_for_each_entry_rcu(kip, &c->pages, list) {
> > +		if (addr >= (unsigned long)kip->insns &&
> > +		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
> > +			ret = true;
> > +			break;
> > +		}
> > +	}
> > +	rcu_read_unlock();
> > +
> > +	return ret;
> > +}
> 
> How many entries should one expect on that list? I spent quite a bit of
> time reducing the cost of is_module_text_address() a while back and see
> that both ftrace (which actually needs this to be fast) and now
> kprobes have linear list walks in here.

It depends on how many probes are used and optimized. However, in most
cases there should be only one entry (unless the user defines more than 32
optimized probes on x86, which in my experience is a very rare case. :) )

> I'm assuming the ftrace thing to be mostly empty, since I never saw it
> on my benchmarks back then, but it is something Steve should look at I
> suppose.
> 
> Similarly, the changelog here should include some talk about worst case
> costs.

Would you have any good benchmark to measure it?

Thanks,

> 
> FWIW, see commit: 93c2e105f6bc ("module: Optimize __module_address() using a latched RB-tree")


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-04  5:06           ` Masami Hiramatsu
@ 2017-01-04 10:01             ` Peter Zijlstra
  2017-01-08  4:22               ` Masami Hiramatsu
  2017-01-08 14:58               ` [PATCH tip/master v4] " Masami Hiramatsu
  0 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2017-01-04 10:01 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Steven Rostedt

On Wed, Jan 04, 2017 at 02:06:04PM +0900, Masami Hiramatsu wrote:
> On Tue, 3 Jan 2017 11:54:02 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:

> > How many entries should one expect on that list? I spent quite a bit of
> > time reducing the cost of is_module_text_address() a while back and see
> > that both ftrace (which actually needs this to be fast) and now
> > kprobes have linear list walks in here.
> 
> It depends on how many probes are used and optimized. However, in most
> cases there should be only one entry (unless the user defines more than 32
> optimized probes on x86, which in my experience is a very rare case. :) )

OK, that's good :-)

> > I'm assuming the ftrace thing to be mostly empty, since I never saw it
> > on my benchmarks back then, but it is something Steve should look at I
> > suppose.
> > 
> > Similarly, the changelog here should include some talk about worst case
> > costs.
> 
> Would you have any good benchmark to measure it?

Not trivially so; what I did was cobble together a debugfs file that
measures the average of the PMI time in perf_sample_event_took(), and a
module that has a 10 deep callchain around a while(1) loop. Then perf
record with callchains for a few seconds.

Generating the callchain does the unwinder thing and ends up calling
is_kernel_address() lots.

The case I worked on was 0 modules vs 100+ modules in a distro build,
which was fairly obviously painful back then, since
is_module_text_address() used a linear lookup.

I'm not sure I still have all those bits, but I can dig around a bit if
you're interested.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-03 10:54         ` Peter Zijlstra
  2017-01-04  5:06           ` Masami Hiramatsu
@ 2017-01-04 14:21           ` Steven Rostedt
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2017-01-04 14:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

On Tue, 3 Jan 2017 11:54:02 +0100
Peter Zijlstra <peterz@infradead.org> wrote:


> How many entries should one expect on that list? I spent quite a bit of
> time reducing the cost of is_module_text_address() a while back and see
> that both ftrace (which actually needs this to be fast) and now
> kprobes have linear list walks in here.
> 
> I'm assuming the ftrace thing to be mostly empty, since I never saw it
> on my benchmarks back then, but it is something Steve should look at I
> suppose.

Yeah, that do_for_each_ftrace_op() loop iterates all the users that
have connected to function tracing. Which in your case is probably at
most 2 (one for ftrace and one for perf). You could get more by
creating instances and enabling function tracing there too. Oh, and the
stack tracer could add its own.

Hmm, with the addition of live kernel patching, this list could get
larger. As you can have one per updated function. But heh, that's just
part of the overhead for live patching ;-)

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-04 10:01             ` Peter Zijlstra
@ 2017-01-08  4:22               ` Masami Hiramatsu
  2017-01-08 12:31                 ` Masami Hiramatsu
  2017-01-08 14:58               ` [PATCH tip/master v4] " Masami Hiramatsu
  1 sibling, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-08  4:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Steven Rostedt

On Wed, 4 Jan 2017 11:01:02 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Jan 04, 2017 at 02:06:04PM +0900, Masami Hiramatsu wrote:
> > On Tue, 3 Jan 2017 11:54:02 +0100
> > Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > How many entries should one expect on that list? I spend quite a bit of
> > > time reducing the cost of is_module_text_address() a while back and see
> > > that both ftrace (which actually needs this to be fast) and now
> > > kprobes have linear list walks in here.
> > 
> > It depends on how many probes are used and optimized. However, in most
> > cases there should be only one entry (unless the user defines more than 32
> > optimized probes on x86, which in my experience is a very rare case. :) )
> 
> OK, that's good :-)

OK, I'll add the above information to the patch description.

> 
> > > I'm assuming the ftrace thing to be mostly empty, since I never saw it
> > > on my benchmarks back then, but it is something Steve should look at I
> > > suppose.
> > > 
> > > Similarly, the changelog here should include some talk about worst case
> > > costs.
> > 
> > Would you have any good benchmark to measure it?
> 
> Not trivially so; what I did was cobble together a debugfs file that
> measures the average of the PMI time in perf_sample_event_took(), and a
> module that has a 10 deep callchain around a while(1) loop. Then perf
> record with callchains for a few seconds.
> 
> Generating the callchain does the unwinder thing and ends up calling
> is_kernel_address() lots.
> 
> The case I worked on was 0 modules vs 100+ modules in a distro build,
> which was fairly obviously painful back then, since
> is_module_text_address() used a linear lookup.
> 
> I'm not sure I still have all those bits, but I can dig around a bit if
> you're interested.

Hmm, I tried a similar thing (a test module that makes a 10-deep recursive
call in a loop and saves a stack trace) on KVM, but got very unstable
results.
Maybe it needs to run on a bare-metal machine.

Thanks,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-08  4:22               ` Masami Hiramatsu
@ 2017-01-08 12:31                 ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-08 12:31 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Josh Poimboeuf, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 5592 bytes --]

Hello,

On Sun, 8 Jan 2017 13:22:42 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > Would you have any good benchmark to measure it?
> > 
> > Not trivially so; what I did was cobble together a debugfs file that
> > measures the average of the PMI time in perf_sample_event_took(), and a
> > module that has a 10 deep callchain around a while(1) loop. Then perf
> > record with callchains for a few seconds.
> > 
> > Generating the callchain does the unwinder thing and ends up calling
> > is_kernel_address() lots.
> > 
> > The case I worked on was 0 modules vs 100+ modules in a distro build,
> > which was fairly obviously painful back then, since
> > is_module_text_address() used a linear lookup.
> > 
> > I'm not sure I still have all those bits, but I can dig around a bit if
> > you're interested.
> 
> Hmm, I tried a similar thing (a test module that makes a 10-deep recursive
> call in a loop and saves a stack trace) on KVM, but got very unstable
> results.
> Maybe it needs to run on a bare-metal machine.

On a bare-metal machine, I used the attached module to measure the
actual performance degradation.

Benchmarks with no kprobes:
$ lsmod | wc -l
123

$ for i in {0..10}; do sudo insmod ./unwind_bench.ko && sudo rmmod unwind_bench;done

[  245.639150] Start benchmarking.
[  245.639353] End benchmarking. Elapsed: 620579 cycles, loop count:1000
[  277.481621] Start benchmarking.
[  277.481821] End benchmarking. Elapsed: 609523 cycles, loop count:1000
[  302.210924] Start benchmarking.
[  302.211077] End benchmarking. Elapsed: 441272 cycles, loop count:1000
[  302.242931] Start benchmarking.
[  302.243070] End benchmarking. Elapsed: 401676 cycles, loop count:1000
[  302.270682] Start benchmarking.
[  302.270762] End benchmarking. Elapsed: 242776 cycles, loop count:1000
[  302.294656] Start benchmarking.
[  302.294737] End benchmarking. Elapsed: 247352 cycles, loop count:1000
[  302.318517] Start benchmarking.
[  302.318600] End benchmarking. Elapsed: 255048 cycles, loop count:1000
[  302.344519] Start benchmarking.
[  302.344599] End benchmarking. Elapsed: 242392 cycles, loop count:1000
[  302.369450] Start benchmarking.
[  302.369530] End benchmarking. Elapsed: 242628 cycles, loop count:1000
[  302.399517] Start benchmarking.
[  302.399605] End benchmarking. Elapsed: 268736 cycles, loop count:1000
[  302.423508] Start benchmarking.
[  302.423587] End benchmarking. Elapsed: 242068 cycles, loop count:1000
[  302.451282] Start benchmarking.
[  302.451361] End benchmarking. Elapsed: 242272 cycles, loop count:1000
[  302.480656] Start benchmarking.
[  302.480736] End benchmarking. Elapsed: 243552 cycles, loop count:1000

Benchmarks with 106 disabled kprobes (only 89 of them optimized):

# grep "[tT] audit_*" /proc/kallsyms | while read a b c d; do echo p $c+5 >> kprobe_events ;done 
# cat ../kprobes/list  | wc -l
106
# echo 1 > events/enable 
# grep OPTIMIZED ../kprobes/list  | wc -l
89

[ 1037.065151] Start benchmarking.
[ 1037.065500] End benchmarking. Elapsed: 1067196 cycles, loop count:1000
[ 1037.113893] Start benchmarking.
[ 1037.114766] End benchmarking. Elapsed: 2667646 cycles, loop count:1000
[ 1037.149251] Start benchmarking.
[ 1037.149382] End benchmarking. Elapsed: 402500 cycles, loop count:1000
[ 1037.180287] Start benchmarking.
[ 1037.180368] End benchmarking. Elapsed: 246000 cycles, loop count:1000
[ 1037.208416] Start benchmarking.
[ 1037.208503] End benchmarking. Elapsed: 266768 cycles, loop count:1000
[ 1037.236328] Start benchmarking.
[ 1037.236407] End benchmarking. Elapsed: 242396 cycles, loop count:1000
[ 1037.270207] Start benchmarking.
[ 1037.270288] End benchmarking. Elapsed: 246384 cycles, loop count:1000
[ 1037.302086] Start benchmarking.
[ 1037.302174] End benchmarking. Elapsed: 266796 cycles, loop count:1000
[ 1037.330165] Start benchmarking.
[ 1037.330245] End benchmarking. Elapsed: 242888 cycles, loop count:1000
[ 1037.358207] Start benchmarking.
[ 1037.358287] End benchmarking. Elapsed: 243288 cycles, loop count:1000
[ 1037.388134] Start benchmarking.
[ 1037.388213] End benchmarking. Elapsed: 240936 cycles, loop count:1000

FYI, I also tried the same benchmark without the patch, but the results
were very unstable.
Benchmarks without the patch:

[ 4583.832368] Start benchmarking.
[ 4583.832943] End benchmarking. Elapsed: 1758512 cycles, loop count:1000
[ 4712.636096] Start benchmarking.
[ 4712.636712] End benchmarking. Elapsed: 1881112 cycles, loop count:1000
[ 4712.668672] Start benchmarking.
[ 4712.669254] End benchmarking. Elapsed: 1777584 cycles, loop count:1000
[ 4712.696150] Start benchmarking.
[ 4712.696925] End benchmarking. Elapsed: 2324292 cycles, loop count:1000
[ 4712.727850] Start benchmarking.
[ 4712.727988] End benchmarking. Elapsed: 422768 cycles, loop count:1000
[ 4712.756108] Start benchmarking.
[ 4712.756684] End benchmarking. Elapsed: 1755676 cycles, loop count:1000
[ 4712.780775] Start benchmarking.
[ 4712.781347] End benchmarking. Elapsed: 1749872 cycles, loop count:1000
[ 4712.811910] Start benchmarking.
[ 4712.812510] End benchmarking. Elapsed: 1825972 cycles, loop count:1000
[ 4712.844676] Start benchmarking.
[ 4712.845270] End benchmarking. Elapsed: 1815532 cycles, loop count:1000
[ 4712.876057] Start benchmarking.
[ 4712.876203] End benchmarking. Elapsed: 421700 cycles, loop count:1000
[ 4712.908149] Start benchmarking.
[ 4712.908741] End benchmarking. Elapsed: 1777924 cycles, loop count:1000
[ 4712.943254] Start benchmarking.
[ 4712.943823] End benchmarking. Elapsed: 1754012 cycles, loop count:1000

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

[-- Attachment #2: unwind_bench.c --]
[-- Type: text/plain, Size: 992 bytes --]

#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/timex.h>
#include <linux/stacktrace.h>

#define MAX_LOOP 1000
#define MAX_CALL 10
#define MAX_STACK 32

unsigned long stack_entries[MAX_STACK];

void stackdump(void)
{
	struct stack_trace st;

	st.nr_entries = 0;	/* must be zeroed: save_stack_trace() appends from here */
	st.max_entries = MAX_STACK;
	st.entries = stack_entries;

	save_stack_trace(&st);
}

void alpha(unsigned long call)
{
	if (call)
		alpha(--call);
	else
		stackdump();
}

static void bench_loop(void)
{
	u64 st, ed;
	unsigned long count;
	unsigned long flags;

	pr_info("Start benchmarking.\n");
	local_irq_save(flags);
	st = get_cycles();
	for (count = 0; count < MAX_LOOP; count++)
		alpha(MAX_CALL);
	ed = get_cycles();
	local_irq_restore(flags);
	pr_info("End benchmarking. Elapsed: %lu cycles, loop count:%lu\n",
		(unsigned long)(ed - st), count);
}

static int __init bench_init(void)
{
	bench_loop();
	return 0;
}

static void __exit bench_exit(void)
{
}

module_init(bench_init)
module_exit(bench_exit)
MODULE_LICENSE("GPL");

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-04 10:01             ` Peter Zijlstra
  2017-01-08  4:22               ` Masami Hiramatsu
@ 2017-01-08 14:58               ` Masami Hiramatsu
  2017-01-09 17:36                 ` Josh Poimboeuf
  2017-01-14  9:56                 ` [tip:perf/core] kprobes, extable: Identify kprobes trampolines " tip-bot for Masami Hiramatsu
  1 sibling, 2 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-08 14:58 UTC (permalink / raw)
  To: Ingo Molnar, Josh Poimboeuf
  Cc: Masami Hiramatsu, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov

Make __kernel_text_address()/kernel_text_address() return
true if the given address is on a kprobe's instruction slot,
which is generated by kprobes as trampoline code.
This can help stacktraces to determine whether the address
is in a text area or not.

To implement this without any sleep in is_kprobe_*_slot(),
this also converts the insn_cache page list into an RCU list.
It may increase the delay (not the processing time) of garbage
slot collection, because freeing a page from the list now has
to wait for an additional RCU grace period. However, since this
is not a hot path, the extra delay should be acceptable.

Note: this adds a small overhead to stack unwinders because it
adds two checks to __kernel_text_address(). However, the impact
should be very small: the kprobe_insn_pages list has one entry
per 256 probes (on x86; on arm/arm64 one entry covers 1024
probes), and kprobe_optinsn_pages has one entry per 32 probes
(on x86). In most use cases the number of kprobe events is fewer
than 20, which means is_kprobe_*_slot() will check just one entry.
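
For what it's worth, the per-page figures above can be sanity-checked with
quick arithmetic. The slot sizes below are assumptions back-derived from the
256- and 32-slots-per-page numbers quoted in this changelog, not values read
from the kernel headers:

```python
# Rough check of the slots-per-page figures quoted above (x86, 4 KiB pages).
# Both slot sizes are assumptions inferred from this changelog's numbers.
PAGE_SIZE = 4096

insn_slot_size = 16      # assumed bytes per regular kprobe insn slot
optinsn_slot_size = 128  # assumed bytes per optimized-probe (optprobe) slot

insn_slots_per_page = PAGE_SIZE // insn_slot_size        # 256
optinsn_slots_per_page = PAGE_SIZE // optinsn_slot_size  # 32

# With ~20 probes, both caches fit in a single page, so each
# is_kprobe_*_slot() call walks exactly one list entry.
print(insn_slots_per_page, optinsn_slots_per_page)
```

So even a system with a few dozen probes keeps both page lists at one entry,
matching the "just one entry" claim above.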

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>

---
 V4:
  - Add a performance note on patch description.
  - Update for the latest tip/master.
---
 include/linux/kprobes.h |   30 ++++++++++++++++++++
 kernel/extable.c        |    9 +++++-
 kernel/kprobes.c        |   69 +++++++++++++++++++++++++++++++++++------------
 3 files changed, 89 insertions(+), 19 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8f68490..16ddfb8 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -278,9 +278,13 @@ struct kprobe_insn_cache {
 	int nr_garbage;
 };
 
+#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
+/* sleep-less address checking routine  */
+extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
+				unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -294,6 +298,18 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 {									\
 	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
 }									\
+									\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}
+#else /* __ARCH_WANT_KPROBES_INSN_SLOT */
+#define DEFINE_INSN_CACHE_OPS(__name)					\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return 0;							\
+}
+#endif
 
 DEFINE_INSN_CACHE_OPS(insn);
 
@@ -330,7 +346,6 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
-
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
@@ -481,6 +496,19 @@ static inline int enable_jprobe(struct jprobe *jp)
 	return enable_kprobe(&jp->kp);
 }
 
+#ifndef CONFIG_KPROBES
+static inline bool is_kprobe_insn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+#ifndef CONFIG_OPTPROBES
+static inline bool is_kprobe_optinsn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_KPROBES
 /*
  * Blacklist ganerating macro. Specify functions which is not probed
diff --git a/kernel/extable.c b/kernel/extable.c
index e3beec4..e135947 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
+#include <linux/kprobes.h>
 
 #include <asm/sections.h>
 #include <linux/uaccess.h>
@@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_ftrace_trampoline(addr))
 		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
 	 * Give those symbols a chance to be printed in
@@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_module_text_address(addr))
 		return 1;
-	return is_ftrace_trampoline(addr);
+	if (is_ftrace_trampoline(addr))
+		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4346010..e7e3f18 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	/* Since the slot array is not protected by rcu, we need a mutex */
 	mutex_lock(&c->mutex);
  retry:
-	list_for_each_entry(kip, &c->pages, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (kip->nused < slots_per_page(c)) {
 			int i;
 			for (i = 0; i < slots_per_page(c); i++) {
@@ -159,6 +161,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 					kip->slot_used[i] = SLOT_USED;
 					kip->nused++;
 					slot = kip->insns + (i * c->insn_size);
+					rcu_read_unlock();
 					goto out;
 				}
 			}
@@ -167,6 +170,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 			WARN_ON(1);
 		}
 	}
+	rcu_read_unlock();
 
 	/* If there are any garbage slots, collect it and try again. */
 	if (c->nr_garbage && collect_garbage_slots(c) == 0)
@@ -193,7 +197,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->nused = 1;
 	kip->ngarbage = 0;
 	kip->cache = c;
-	list_add(&kip->list, &c->pages);
+	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
 out:
 	mutex_unlock(&c->mutex);
@@ -213,7 +217,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
-			list_del(&kip->list);
+			list_del_rcu(&kip->list);
+			synchronize_rcu();
 			kip->cache->free(kip->insns);
 			kfree(kip);
 		}
@@ -248,29 +253,59 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 		      kprobe_opcode_t *slot, int dirty)
 {
 	struct kprobe_insn_page *kip;
+	long idx;
 
 	mutex_lock(&c->mutex);
-	list_for_each_entry(kip, &c->pages, list) {
-		long idx = ((long)slot - (long)kip->insns) /
-				(c->insn_size * sizeof(kprobe_opcode_t));
-		if (idx >= 0 && idx < slots_per_page(c)) {
-			WARN_ON(kip->slot_used[idx] != SLOT_USED);
-			if (dirty) {
-				kip->slot_used[idx] = SLOT_DIRTY;
-				kip->ngarbage++;
-				if (++c->nr_garbage > slots_per_page(c))
-					collect_garbage_slots(c);
-			} else
-				collect_one_slot(kip, idx);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		idx = ((long)slot - (long)kip->insns) /
+			(c->insn_size * sizeof(kprobe_opcode_t));
+		if (idx >= 0 && idx < slots_per_page(c))
 			goto out;
-		}
 	}
-	/* Could not free this slot. */
+	/* Could not find this slot. */
 	WARN_ON(1);
+	kip = NULL;
 out:
+	rcu_read_unlock();
+	/* Mark and sweep: this may sleep */
+	if (kip) {
+		/* Check double free */
+		WARN_ON(kip->slot_used[idx] != SLOT_USED);
+		if (dirty) {
+			kip->slot_used[idx] = SLOT_DIRTY;
+			kip->ngarbage++;
+			if (++c->nr_garbage > slots_per_page(c))
+				collect_garbage_slots(c);
+		} else
+			collect_one_slot(kip, idx);
+	}
 	mutex_unlock(&c->mutex);
 }
 
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	struct kprobe_insn_page *kip;
+	bool ret = false;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if (addr >= (unsigned long)kip->insns &&
+		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
+			ret = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_OPTPROBES
 /* For optimized_kprobe buffer */
 struct kprobe_insn_cache kprobe_optinsn_slots = {


* Re: [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-08 14:58               ` [PATCH tip/master v4] " Masami Hiramatsu
@ 2017-01-09 17:36                 ` Josh Poimboeuf
  2017-01-10  1:11                   ` Masami Hiramatsu
  2017-01-10  8:59                   ` Peter Zijlstra
  2017-01-14  9:56                 ` [tip:perf/core] kprobes, extable: Identify kprobes trampolines " tip-bot for Masami Hiramatsu
  1 sibling, 2 replies; 23+ messages in thread
From: Josh Poimboeuf @ 2017-01-09 17:36 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Andy Lutomirski

On Sun, Jan 08, 2017 at 11:58:09PM +0900, Masami Hiramatsu wrote:
> Make __kernel_text_address()/kernel_text_address() return
> true if the given address is on a kprobe's instruction slot,
> which is generated by kprobes as trampoline code.
> This helps stack traces determine whether an address is in
> a text area or not.
> 
> To implement this without sleeping in is_kprobe_*_slot(),
> this also turns the insn_cache page list into an RCU list. This
> may increase the processing delay (not the processing time) of
> garbage slot collection, because freeing a page from the list
> now requires waiting for an additional RCU grace period.
> However, since this is not a hot path, the delay should not matter.
> 
> Note: this can add a small overhead to stack unwinders because
> it adds 2 checks to __kernel_text_address(). However, the
> impact should be very small: the kprobe_insn_pages list has 1 entry
> per 256 probes (on x86; on arm/arm64 it is 1 entry per 1024 probes),
> and kprobe_optinsn_pages has 1 entry per 32 probes (on x86).
> In most use cases the number of kprobe events is less
> than 20, which means is_kprobe_*_slot() will check just 1 entry.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks for doing this Masami!  I verified that it works:

  Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>

I suspect that BPF generated code also has the same issue, so I'll leave
the unwinder warning disabled for now.

BTW, I think we'll have more problems with generated code if/when we
move to an x86 DWARF unwinder, because it won't have any idea how to
unwind past generated code.  Long term I wonder if it would make sense
to create some kind of framework for creating or registering generated
code, so we can solve these types of problems in a single place.

-- 
Josh


* Re: [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-09 17:36                 ` Josh Poimboeuf
@ 2017-01-10  1:11                   ` Masami Hiramatsu
  2017-01-10  8:59                   ` Peter Zijlstra
  1 sibling, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-10  1:11 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, linux-kernel, Peter Zijlstra,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Andy Lutomirski

On Mon, 9 Jan 2017 11:36:48 -0600
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Sun, Jan 08, 2017 at 11:58:09PM +0900, Masami Hiramatsu wrote:
> > Make __kernel_text_address()/kernel_text_address() return
> > true if the given address is on a kprobe's instruction slot,
> > which is generated by kprobes as trampoline code.
> > This helps stack traces determine whether an address is in
> > a text area or not.
> > 
> > To implement this without sleeping in is_kprobe_*_slot(),
> > this also turns the insn_cache page list into an RCU list. This
> > may increase the processing delay (not the processing time) of
> > garbage slot collection, because freeing a page from the list
> > now requires waiting for an additional RCU grace period.
> > However, since this is not a hot path, the delay should not matter.
> > 
> > Note: this can add a small overhead to stack unwinders because
> > it adds 2 checks to __kernel_text_address(). However, the
> > impact should be very small: the kprobe_insn_pages list has 1 entry
> > per 256 probes (on x86; on arm/arm64 it is 1 entry per 1024 probes),
> > and kprobe_optinsn_pages has 1 entry per 32 probes (on x86).
> > In most use cases the number of kprobe events is less
> > than 20, which means is_kprobe_*_slot() will check just 1 entry.
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> 
> Thanks for doing this Masami!  I verified that it works:
> 
>   Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>

Thanks!

> 
> I suspect that BPF generated code also has the same issue, so I'll leave
> the unwinder warning disabled for now.
> 
> BTW, I think we'll have more problems with generated code if/when we
> move to an x86 DWARF unwinder, because it won't have any idea how to
> unwind past generated code.  Long term I wonder if it would make sense
> to create some kind of framework for creating or registering generated
> code, so we can solve these types of problems in a single place.

Agreed. I think that can also help us to make the generated code RO :).

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-09 17:36                 ` Josh Poimboeuf
  2017-01-10  1:11                   ` Masami Hiramatsu
@ 2017-01-10  8:59                   ` Peter Zijlstra
  2017-01-10 21:42                     ` Josh Poimboeuf
  1 sibling, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2017-01-10  8:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Masami Hiramatsu, Ingo Molnar, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Andy Lutomirski

On Mon, Jan 09, 2017 at 11:36:48AM -0600, Josh Poimboeuf wrote:
> BTW, I think we'll have more problems with generated code if/when we
> move to an x86 DWARF unwinder, because it won't have any idea how to
> unwind past generated code.  Long term I wonder if it would make sense
> to create some kind of framework for creating or registering generated
> code, so we can solve these types of problems in a single place.

Yes, this seems like a good idea. Maybe we could pull the rbtree thing
from modules and make that a more generic interface for code
registration.


* Re: [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-10  8:59                   ` Peter Zijlstra
@ 2017-01-10 21:42                     ` Josh Poimboeuf
  2017-01-11  9:57                       ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Josh Poimboeuf @ 2017-01-10 21:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Ingo Molnar, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Andy Lutomirski

On Tue, Jan 10, 2017 at 09:59:23AM +0100, Peter Zijlstra wrote:
> On Mon, Jan 09, 2017 at 11:36:48AM -0600, Josh Poimboeuf wrote:
> > BTW, I think we'll have more problems with generated code if/when we
> > move to an x86 DWARF unwinder, because it won't have any idea how to
> > unwind past generated code.  Long term I wonder if it would make sense
> > to create some kind of framework for creating or registering generated
> > code, so we can solve these types of problems in a single place.
> 
> Yes, this seems like a good idea. Maybe we could pull the rbtree thing
> from modules and make that a more generic interface for code
> registration.

Yeah, a generic fast lookup for module+generated code could be useful.
It would also be nice to associate names with generated functions and
integrate that with kallsyms.  For DWARF we would also need a way to
associate CFI metadata with address ranges.

-- 
Josh


* Re: [PATCH tip/master v4] kprobes: extable: Identify kprobes' insn-slots as kernel text area
  2017-01-10 21:42                     ` Josh Poimboeuf
@ 2017-01-11  9:57                       ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-01-11  9:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Masami Hiramatsu, Ingo Molnar, linux-kernel,
	Ananth N Mavinakayanahalli, Thomas Gleixner, H . Peter Anvin,
	Andrey Konovalov, Andy Lutomirski

On Tue, 10 Jan 2017 15:42:07 -0600
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Tue, Jan 10, 2017 at 09:59:23AM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 09, 2017 at 11:36:48AM -0600, Josh Poimboeuf wrote:
> > > BTW, I think we'll have more problems with generated code if/when we
> > > move to an x86 DWARF unwinder, because it won't have any idea how to
> > > unwind past generated code.  Long term I wonder if it would make sense
> > > to create some kind of framework for creating or registering generated
> > > code, so we can solve these types of problems in a single place.
> > 
> > Yes, this seems like a good idea. Maybe we could pull the rbtree thing
> > from modules and make that a more generic interface for code
> > registration.
> 
> Yeah, a generic fast lookup for module+generated code could be useful.
> It would also be nice to associate names with generated functions and
> integrate that with kallsyms.  For DWARF we would also need a way to
> associate CFI metadata with address ranges.

From the kprobe point of view, kallsyms and DWARF seem a bit overkill.
At least for kprobe insn slots, I think one symbol per page instead of
one per entry would be better, and CFI should also be generated for each call.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* [tip:perf/core] kprobes, extable: Identify kprobes trampolines as kernel text area
  2017-01-08 14:58               ` [PATCH tip/master v4] " Masami Hiramatsu
  2017-01-09 17:36                 ` Josh Poimboeuf
@ 2017-01-14  9:56                 ` tip-bot for Masami Hiramatsu
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Masami Hiramatsu @ 2017-01-14  9:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, mingo, ananth, mhiramat, andreyknvl, jpoimboe, torvalds,
	linux-kernel, acme, alexander.shishkin, jolsa, tglx, akpm, hpa

Commit-ID:  5b485629ba0d5d027880769ff467c587b24b4bde
Gitweb:     http://git.kernel.org/tip/5b485629ba0d5d027880769ff467c587b24b4bde
Author:     Masami Hiramatsu <mhiramat@kernel.org>
AuthorDate: Sun, 8 Jan 2017 23:58:09 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 08:38:05 +0100

kprobes, extable: Identify kprobes trampolines as kernel text area

Improve __kernel_text_address()/kernel_text_address() to return
true if the given address is on a kprobe's instruction slot
trampoline.

This can help stacktraces to determine the address is on a
text area or not.

To implement this atomically in is_kprobe_*_slot(), also change
the insn_cache page list to an RCU list.

This changes timings a bit (it delays page freeing to the RCU garbage
collection phase), but none of that is in the hot path.

Note: this change can add small overhead to stack unwinders because
it adds 2 additional checks to __kernel_text_address(). However, the
impact should be very small, because kprobe_insn_pages list has 1 entry
per 256 probes(on x86, on arm/arm64 it will be 1024 probes),
and kprobe_optinsn_pages has 1 entry per 32 probes(on x86).
In most use cases, the number of kprobe events may be less
than 20, which means that is_kprobe_*_slot() will check just one entry.

Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/148388747896.6869.6354262871751682264.stgit@devbox
[ Improved the changelog and coding style. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/kprobes.h | 30 +++++++++++++++++++-
 kernel/extable.c        |  9 +++++-
 kernel/kprobes.c        | 73 ++++++++++++++++++++++++++++++++++++-------------
 3 files changed, 91 insertions(+), 21 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 8f68490..16ddfb8 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -278,9 +278,13 @@ struct kprobe_insn_cache {
 	int nr_garbage;
 };
 
+#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
 extern kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c);
 extern void __free_insn_slot(struct kprobe_insn_cache *c,
 			     kprobe_opcode_t *slot, int dirty);
+/* sleep-less address checking routine  */
+extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
+				unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -294,6 +298,18 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 {									\
 	__free_insn_slot(&kprobe_##__name##_slots, slot, dirty);	\
 }									\
+									\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}
+#else /* __ARCH_WANT_KPROBES_INSN_SLOT */
+#define DEFINE_INSN_CACHE_OPS(__name)					\
+static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
+{									\
+	return 0;							\
+}
+#endif
 
 DEFINE_INSN_CACHE_OPS(insn);
 
@@ -330,7 +346,6 @@ extern int proc_kprobes_optimization_handler(struct ctl_table *table,
 					     int write, void __user *buffer,
 					     size_t *length, loff_t *ppos);
 #endif
-
 #endif /* CONFIG_OPTPROBES */
 #ifdef CONFIG_KPROBES_ON_FTRACE
 extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
@@ -481,6 +496,19 @@ static inline int enable_jprobe(struct jprobe *jp)
 	return enable_kprobe(&jp->kp);
 }
 
+#ifndef CONFIG_KPROBES
+static inline bool is_kprobe_insn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+#ifndef CONFIG_OPTPROBES
+static inline bool is_kprobe_optinsn_slot(unsigned long addr)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_KPROBES
 /*
  * Blacklist ganerating macro. Specify functions which is not probed
diff --git a/kernel/extable.c b/kernel/extable.c
index e3beec4..e135947 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
+#include <linux/kprobes.h>
 
 #include <asm/sections.h>
 #include <linux/uaccess.h>
@@ -104,6 +105,8 @@ int __kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_ftrace_trampoline(addr))
 		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
 	 * Give those symbols a chance to be printed in
@@ -123,7 +126,11 @@ int kernel_text_address(unsigned long addr)
 		return 1;
 	if (is_module_text_address(addr))
 		return 1;
-	return is_ftrace_trampoline(addr);
+	if (is_ftrace_trampoline(addr))
+		return 1;
+	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
+		return 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4346010..ebb4dad 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -149,9 +149,11 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	struct kprobe_insn_page *kip;
 	kprobe_opcode_t *slot = NULL;
 
+	/* Since the slot array is not protected by rcu, we need a mutex */
 	mutex_lock(&c->mutex);
  retry:
-	list_for_each_entry(kip, &c->pages, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (kip->nused < slots_per_page(c)) {
 			int i;
 			for (i = 0; i < slots_per_page(c); i++) {
@@ -159,6 +161,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 					kip->slot_used[i] = SLOT_USED;
 					kip->nused++;
 					slot = kip->insns + (i * c->insn_size);
+					rcu_read_unlock();
 					goto out;
 				}
 			}
@@ -167,6 +170,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 			WARN_ON(1);
 		}
 	}
+	rcu_read_unlock();
 
 	/* If there are any garbage slots, collect it and try again. */
 	if (c->nr_garbage && collect_garbage_slots(c) == 0)
@@ -193,7 +197,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->nused = 1;
 	kip->ngarbage = 0;
 	kip->cache = c;
-	list_add(&kip->list, &c->pages);
+	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
 out:
 	mutex_unlock(&c->mutex);
@@ -213,7 +217,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
-			list_del(&kip->list);
+			list_del_rcu(&kip->list);
+			synchronize_rcu();
 			kip->cache->free(kip->insns);
 			kfree(kip);
 		}
@@ -235,8 +240,7 @@ static int collect_garbage_slots(struct kprobe_insn_cache *c)
 			continue;
 		kip->ngarbage = 0;	/* we will collect all garbages */
 		for (i = 0; i < slots_per_page(c); i++) {
-			if (kip->slot_used[i] == SLOT_DIRTY &&
-			    collect_one_slot(kip, i))
+			if (kip->slot_used[i] == SLOT_DIRTY && collect_one_slot(kip, i))
 				break;
 		}
 	}
@@ -248,29 +252,60 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 		      kprobe_opcode_t *slot, int dirty)
 {
 	struct kprobe_insn_page *kip;
+	long idx;
 
 	mutex_lock(&c->mutex);
-	list_for_each_entry(kip, &c->pages, list) {
-		long idx = ((long)slot - (long)kip->insns) /
-				(c->insn_size * sizeof(kprobe_opcode_t));
-		if (idx >= 0 && idx < slots_per_page(c)) {
-			WARN_ON(kip->slot_used[idx] != SLOT_USED);
-			if (dirty) {
-				kip->slot_used[idx] = SLOT_DIRTY;
-				kip->ngarbage++;
-				if (++c->nr_garbage > slots_per_page(c))
-					collect_garbage_slots(c);
-			} else
-				collect_one_slot(kip, idx);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		idx = ((long)slot - (long)kip->insns) /
+			(c->insn_size * sizeof(kprobe_opcode_t));
+		if (idx >= 0 && idx < slots_per_page(c))
 			goto out;
-		}
 	}
-	/* Could not free this slot. */
+	/* Could not find this slot. */
 	WARN_ON(1);
+	kip = NULL;
 out:
+	rcu_read_unlock();
+	/* Mark and sweep: this may sleep */
+	if (kip) {
+		/* Check double free */
+		WARN_ON(kip->slot_used[idx] != SLOT_USED);
+		if (dirty) {
+			kip->slot_used[idx] = SLOT_DIRTY;
+			kip->ngarbage++;
+			if (++c->nr_garbage > slots_per_page(c))
+				collect_garbage_slots(c);
+		} else {
+			collect_one_slot(kip, idx);
+		}
+	}
 	mutex_unlock(&c->mutex);
 }
 
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	struct kprobe_insn_page *kip;
+	bool ret = false;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if (addr >= (unsigned long)kip->insns &&
+		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
+			ret = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_OPTPROBES
 /* For optimized_kprobe buffer */
 struct kprobe_insn_cache kprobe_optinsn_slots = {


end of thread, other threads:[~2017-01-14  9:59 UTC | newest]

Thread overview: 23+ messages
2016-12-22  6:42 Detecting kprobes generated code addresses Josh Poimboeuf
2016-12-25  3:13 ` Masami Hiramatsu
2016-12-25  6:16   ` Masami Hiramatsu
2016-12-26  4:30     ` Masami Hiramatsu
2016-12-26 14:50       ` [PATCH tip/master] kprobes: extable: Identify kprobes' insn-slots as kernel text area Masami Hiramatsu
2016-12-26 15:34       ` [PATCH tip/master v2] " Masami Hiramatsu
2016-12-26 17:21         ` kbuild test robot
2016-12-26 17:46         ` kbuild test robot
2016-12-27  6:13         ` Masami Hiramatsu
2016-12-27  6:14       ` [PATCH tip/master v3] " Masami Hiramatsu
2017-01-03 10:54         ` Peter Zijlstra
2017-01-04  5:06           ` Masami Hiramatsu
2017-01-04 10:01             ` Peter Zijlstra
2017-01-08  4:22               ` Masami Hiramatsu
2017-01-08 12:31                 ` Masami Hiramatsu
2017-01-08 14:58               ` [PATCH tip/master v4] " Masami Hiramatsu
2017-01-09 17:36                 ` Josh Poimboeuf
2017-01-10  1:11                   ` Masami Hiramatsu
2017-01-10  8:59                   ` Peter Zijlstra
2017-01-10 21:42                     ` Josh Poimboeuf
2017-01-11  9:57                       ` Masami Hiramatsu
2017-01-14  9:56                 ` [tip:perf/core] kprobes, extable: Identify kprobes trampolines " tip-bot for Masami Hiramatsu
2017-01-04 14:21           ` [PATCH tip/master v3] kprobes: extable: Identify kprobes' insn-slots " Steven Rostedt
