All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] jump label: core updates
@ 2010-10-01 21:23 Jason Baron
  2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:23 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

Hi,

Here are some updates to the jump label code (against -tip tree) found during
testing and review.

After this series, I'm not aware of any outstanding issues.

thanks,

-Jason

Jason Baron (5):
  jump label: fix module __init section race
  jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex
  jump label: add register_jump_label_key/unregister_jump_label_key
  jump label: move jump table to r/w section
  jump label: add docs

 Documentation/jump-label.txt      |  142 +++++++++++++++++++++++
 include/asm-generic/vmlinux.lds.h |   14 +--
 include/linux/jump_label.h        |    9 ++
 kernel/jump_label.c               |  222 +++++++++++++++++++++++++++++++-----
 kernel/kprobes.c                  |    6 +
 kernel/tracepoint.c               |   34 ++++++
 lib/dynamic_debug.c               |   20 ++++
 7 files changed, 406 insertions(+), 41 deletions(-)
 create mode 100644 Documentation/jump-label.txt


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] jump label: fix module __init section race
  2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
@ 2010-10-01 21:23 ` Jason Baron
  2010-10-02  8:58   ` Masami Hiramatsu
  2010-10-30 10:39   ` [tip:perf/urgent] jump label: Fix " tip-bot for Jason Baron
  2010-10-01 21:23 ` [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex Jason Baron
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:23 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

Jump label uses is_module_text_address() to ensure that the module
__init sections are valid before updating them. However, between the
check for a valid module __init section and the subsequent jump
label update, the module's __init section could be freed out from under
us.

We fix this potential race by adding a notifier callback to the
MODULE_STATE_LIVE state. This notifier is called *after* the __init
section has been run but before it is going to be freed. In the
callback, the jump label code zeros the key value for any __init jump
code within the module, and we add a check for a non-zero key value when
we update jump labels. In this way we require no additional data
structures.

Thanks to Mathieu Desnoyers for pointing out this race condition.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 kernel/jump_label.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 40 insertions(+), 1 deletions(-)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 7be868b..e2fad92 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -168,7 +168,8 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 			count = e_module->nr_entries;
 			iter = e_module->table;
 			while (count--) {
-				if (kernel_text_address(iter->code))
+				if (iter->key &&
+						kernel_text_address(iter->code))
 					arch_jump_label_transform(iter, type);
 				iter++;
 			}
@@ -366,6 +367,39 @@ static void remove_jump_label_module(struct module *mod)
 	}
 }
 
+static void remove_module_init(struct module *mod)
+{
+	struct hlist_head *head;
+	struct hlist_node *node, *node_next, *module_node, *module_node_next;
+	struct jump_label_entry *e;
+	struct jump_label_module_entry *e_module;
+	struct jump_entry *iter;
+	int i, count;
+
+	/* if the module doesn't have jump label entries, just return */
+	if (!mod->num_jump_entries)
+		return;
+
+	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
+		head = &jump_label_table[i];
+		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
+			hlist_for_each_entry_safe(e_module, module_node,
+						  module_node_next,
+						  &(e->modules), hlist) {
+				if (e_module->mod != mod)
+					continue;
+				count = e_module->nr_entries;
+				iter = e_module->table;
+				while (count--) {
+					if (within_module_init(iter->code, mod))
+						iter->key = 0;
+					iter++;
+				}
+			}
+		}
+	}
+}
+
 static int
 jump_label_module_notify(struct notifier_block *self, unsigned long val,
 			 void *data)
@@ -386,6 +420,11 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
 		remove_jump_label_module(mod);
 		mutex_unlock(&jump_label_mutex);
 		break;
+	case MODULE_STATE_LIVE:
+		mutex_lock(&jump_label_mutex);
+		remove_module_init(mod);
+		mutex_unlock(&jump_label_mutex);
+		break;
 	}
 	return ret;
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex
  2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
  2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
@ 2010-10-01 21:23 ` Jason Baron
  2010-10-02  9:00   ` Masami Hiramatsu
  2010-10-30 10:40   ` [tip:perf/urgent] " tip-bot for Jason Baron
  2010-10-01 21:23 ` [PATCH 3/5] jump label: add register_jump_label_key/unregister_jump_label_key Jason Baron
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:23 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

register_kprobe() downs the 'text_mutex' and then calls
jump_label_text_reserved(), which downs the 'jump_label_mutex'.
However, the jump label code takes those mutexes in the reverse
order.

Fix by requiring the caller of jump_label_text_reserved() to do
the jump label locking via the newly added: jump_label_lock(),
jump_label_unlock(). Currently, kprobes is the only user
of jump_label_text_reserved().


Signed-off-by: Jason Baron <jbaron@redhat.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
---
 include/linux/jump_label.h |    5 +++++
 kernel/jump_label.c        |   33 +++++++++++++++++++++------------
 kernel/kprobes.c           |    6 ++++++
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index b72cd9f..cf213d1 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -18,6 +18,8 @@ struct module;
 extern struct jump_entry __start___jump_table[];
 extern struct jump_entry __stop___jump_table[];
 
+extern void jump_label_lock(void);
+extern void jump_label_unlock(void);
 extern void arch_jump_label_transform(struct jump_entry *entry,
 				 enum jump_label_type type);
 extern void arch_jump_label_text_poke_early(jump_label_t addr);
@@ -59,6 +61,9 @@ static inline int jump_label_text_reserved(void *start, void *end)
 	return 0;
 }
 
+static inline void jump_label_lock(void) {}
+static inline void jump_label_unlock(void) {}
+
 #endif
 
 #endif
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index e2fad92..2add1a7 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -39,6 +39,16 @@ struct jump_label_module_entry {
 	struct module *mod;
 };
 
+void jump_label_lock(void)
+{
+	mutex_lock(&jump_label_mutex);
+}
+
+void jump_label_unlock(void)
+{
+	mutex_unlock(&jump_label_mutex);
+}
+
 static int jump_label_cmp(const void *a, const void *b)
 {
 	const struct jump_entry *jea = a;
@@ -152,7 +162,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 	struct jump_label_module_entry *e_module;
 	int count;
 
-	mutex_lock(&jump_label_mutex);
+	jump_label_lock();
 	entry = get_jump_label_entry((jump_label_t)key);
 	if (entry) {
 		count = entry->nr_entries;
@@ -175,7 +185,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 			}
 		}
 	}
-	mutex_unlock(&jump_label_mutex);
+	jump_label_unlock();
 }
 
 static int addr_conflict(struct jump_entry *entry, void *start, void *end)
@@ -232,6 +242,7 @@ out:
  * overlaps with any of the jump label patch addresses. Code
  * that wants to modify kernel text should first verify that
  * it does not overlap with any of the jump label addresses.
+ * Caller must hold jump_label_mutex.
  *
  * returns 1 if there is an overlap, 0 otherwise
  */
@@ -242,7 +253,6 @@ int jump_label_text_reserved(void *start, void *end)
 	struct jump_entry *iter_stop = __start___jump_table;
 	int conflict = 0;
 
-	mutex_lock(&jump_label_mutex);
 	iter = iter_start;
 	while (iter < iter_stop) {
 		if (addr_conflict(iter, start, end)) {
@@ -257,7 +267,6 @@ int jump_label_text_reserved(void *start, void *end)
 	conflict = module_conflict(start, end);
 #endif
 out:
-	mutex_unlock(&jump_label_mutex);
 	return conflict;
 }
 
@@ -268,7 +277,7 @@ static __init int init_jump_label(void)
 	struct jump_entry *iter_stop = __stop___jump_table;
 	struct jump_entry *iter;
 
-	mutex_lock(&jump_label_mutex);
+	jump_label_lock();
 	ret = build_jump_label_hashtable(__start___jump_table,
 					 __stop___jump_table);
 	iter = iter_start;
@@ -276,7 +285,7 @@ static __init int init_jump_label(void)
 		arch_jump_label_text_poke_early(iter->code);
 		iter++;
 	}
-	mutex_unlock(&jump_label_mutex);
+	jump_label_unlock();
 	return ret;
 }
 early_initcall(init_jump_label);
@@ -409,21 +418,21 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
 
 	switch (val) {
 	case MODULE_STATE_COMING:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		ret = add_jump_label_module(mod);
 		if (ret)
 			remove_jump_label_module(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	case MODULE_STATE_GOING:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		remove_jump_label_module(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	case MODULE_STATE_LIVE:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		remove_module_init(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	}
 	return ret;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index ec4210c..d45d72e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1145,13 +1145,16 @@ int __kprobes register_kprobe(struct kprobe *p)
 		return ret;
 
 	preempt_disable();
+	jump_label_lock();
 	if (!kernel_text_address((unsigned long) p->addr) ||
 	    in_kprobes_functions((unsigned long) p->addr) ||
 	    ftrace_text_reserved(p->addr, p->addr) ||
 	    jump_label_text_reserved(p->addr, p->addr)) {
 		preempt_enable();
+		jump_label_unlock();
 		return -EINVAL;
 	}
+	jump_label_unlock();
 
 	/* User can pass only KPROBE_FLAG_DISABLED to register_kprobe */
 	p->flags &= KPROBE_FLAG_DISABLED;
@@ -1186,6 +1189,8 @@ int __kprobes register_kprobe(struct kprobe *p)
 	INIT_LIST_HEAD(&p->list);
 	mutex_lock(&kprobe_mutex);
 
+	jump_label_lock(); /* needed to call jump_label_text_reserved() */
+
 	get_online_cpus();	/* For avoiding text_mutex deadlock. */
 	mutex_lock(&text_mutex);
 
@@ -1213,6 +1218,7 @@ int __kprobes register_kprobe(struct kprobe *p)
 out:
 	mutex_unlock(&text_mutex);
 	put_online_cpus();
+	jump_label_unlock();
 	mutex_unlock(&kprobe_mutex);
 
 	if (probed_mod)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] jump label: add register_jump_label_key/unregister_jump_label_key
  2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
  2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
  2010-10-01 21:23 ` [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex Jason Baron
@ 2010-10-01 21:23 ` Jason Baron
  2010-10-01 21:23 ` [PATCH 4/5] jump label: move jump table to r/w section Jason Baron
  2010-10-01 21:24 ` [PATCH 5/5] jump label: add docs Jason Baron
  4 siblings, 0 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:23 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

There are two cases where jump labels didn't match the state
transitions of tracepoints.

1)

A tracepoint is enabled in the core kernel for say kmalloc and
then a module is loaded which has a kmalloc call. We currently
would miss enabling the jump label for the kmalloc in the module.
This is addressed by associating an 'enabled' field with each
jump label. Now, when a new module is loaded we check if any of
the jump labels in that module need to be enabled. If so, we enable
them.

2)

If a tracepoint is defined in the core kernel code, but the usage
of the tracepoint is confined to a module, the current jump label
code does not create a entry in its table until the module is
loaded. Thus, if the tracepoint is enabled before the module
is loaded, we would miss the enablement of the jump label.

I'm not sure if there any tracepoints which currently fall into
this category (the bkl could fall into this category at some point
if its only used in modules). However, I do think its an important
case to address to make sure that jump label behave in a consistent
way with how consumers of the tracepoints might expect.

This case is implemented by introducing:

void register_jump_label_key(unsigned long key);
void unregister_jump_label_key(unsigned long key);

So basically any jump label that we want to use in the system must
first be registered, then it can be enabled/disabled, and then
finally it can be unregistered. For core kernel jump labels, I would
only expect them to be registered and never unregistered. However,
a jump label may be unregistred when modules are removed.

Although, this introduces some more work for consumers wanting
to use jump labels, the tracepoint and dynamic debug consumer code
seems fairly contained, at least to me.

Signed-off-by: Jason Baron <jbaron@redhat.com>
---
 include/linux/jump_label.h |    4 +
 kernel/jump_label.c        |  158 +++++++++++++++++++++++++++++++++++++-------
 kernel/tracepoint.c        |   34 ++++++++++
 lib/dynamic_debug.c        |   20 ++++++
 4 files changed, 193 insertions(+), 23 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index cf213d1..382cd23 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -26,6 +26,8 @@ extern void arch_jump_label_text_poke_early(jump_label_t addr);
 extern void jump_label_update(unsigned long key, enum jump_label_type type);
 extern void jump_label_apply_nops(struct module *mod);
 extern int jump_label_text_reserved(void *start, void *end);
+void register_jump_label_key(unsigned long key);
+void unregister_jump_label_key(unsigned long key);
 
 #define enable_jump_label(key) \
 	jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE);
@@ -63,6 +65,8 @@ static inline int jump_label_text_reserved(void *start, void *end)
 
 static inline void jump_label_lock(void) {}
 static inline void jump_label_unlock(void) {}
+static inline void register_jump_label_key(unsigned long key) {}
+static inline void unregister_jump_label_key(unsigned long key) {}
 
 #endif
 
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 2add1a7..4e52e8b 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -19,6 +19,7 @@
 #define JUMP_LABEL_HASH_BITS 6
 #define JUMP_LABEL_TABLE_SIZE (1 << JUMP_LABEL_HASH_BITS)
 static struct hlist_head jump_label_table[JUMP_LABEL_TABLE_SIZE];
+static int registered;
 
 /* mutex to protect coming/going of the the jump_label table */
 static DEFINE_MUTEX(jump_label_mutex);
@@ -26,17 +27,19 @@ static DEFINE_MUTEX(jump_label_mutex);
 struct jump_label_entry {
 	struct hlist_node hlist;
 	struct jump_entry *table;
-	int nr_entries;
 	/* hang modules off here */
 	struct hlist_head modules;
 	unsigned long key;
+	__u32 nr_entries : 30;
+	__u32 registered : 1;
+	__u32 enabled : 1;
 };
 
 struct jump_label_module_entry {
 	struct hlist_node hlist;
 	struct jump_entry *table;
-	int nr_entries;
 	struct module *mod;
+	int nr_entries;
 };
 
 void jump_label_lock(void)
@@ -108,6 +111,8 @@ add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table)
 	e->key = key;
 	e->table = table;
 	e->nr_entries = nr_entries;
+	e->enabled = 0;
+	e->registered = 0;
 	INIT_HLIST_HEAD(&(e->modules));
 	hlist_add_head(&e->hlist, head);
 	return e;
@@ -163,28 +168,43 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 	int count;
 
 	jump_label_lock();
+	if (unlikely(!registered)) {
+		WARN(1, KERN_ERR "jump label: update before activation!\n");
+		goto out;
+	}
 	entry = get_jump_label_entry((jump_label_t)key);
-	if (entry) {
-		count = entry->nr_entries;
-		iter = entry->table;
+	if (unlikely(!entry)) {
+		WARN(1, KERN_ERR "jump label: updating unkown key: %lu\n", key);
+		goto out;
+	}
+	if (unlikely(!entry->registered)) {
+		WARN(1, KERN_ERR "jump label: update without register:"
+			" %lu\n", key);
+		goto out;
+	}
+	if (type == JUMP_LABEL_ENABLE)
+		entry->enabled = 1;
+	else
+		entry->enabled = 0;
+	count = entry->nr_entries;
+	iter = entry->table;
+	while (count--) {
+		if (kernel_text_address(iter->code))
+			arch_jump_label_transform(iter, type);
+		iter++;
+	}
+	/* eanble/disable jump labels in modules */
+	hlist_for_each_entry(e_module, module_node, &(entry->modules),
+						hlist) {
+		count = e_module->nr_entries;
+		iter = e_module->table;
 		while (count--) {
-			if (kernel_text_address(iter->code))
+			if (iter->key && kernel_text_address(iter->code))
 				arch_jump_label_transform(iter, type);
 			iter++;
 		}
-		/* eanble/disable jump labels in modules */
-		hlist_for_each_entry(e_module, module_node, &(entry->modules),
-							hlist) {
-			count = e_module->nr_entries;
-			iter = e_module->table;
-			while (count--) {
-				if (iter->key &&
-						kernel_text_address(iter->code))
-					arch_jump_label_transform(iter, type);
-				iter++;
-			}
-		}
 	}
+out:
 	jump_label_unlock();
 }
 
@@ -197,6 +217,63 @@ static int addr_conflict(struct jump_entry *entry, void *start, void *end)
 	return 0;
 }
 
+/***
+ * register_jump_label_key - register a jump label key
+ * @key -  key value associated with a a jump label
+ *
+ * Code that wants to use a jump label must first register the key before
+ * before calling jump_label_update on it.
+ */
+
+void register_jump_label_key(unsigned long key)
+{
+	struct jump_label_entry *entry;
+
+	jump_label_lock();
+	entry = get_jump_label_entry((jump_label_t)key);
+	if (!entry)
+		entry = add_jump_label_entry((jump_label_t)key, 0, NULL);
+	if (!entry)
+		goto out;
+	if (entry->registered)
+		WARN(1, KERN_ERR "jump label: re-register key: %lu !\n", key);
+	else
+		entry->registered = 1;
+out:
+	jump_label_unlock();
+}
+EXPORT_SYMBOL_GPL(register_jump_label_key);
+
+/***
+ * unregister_jump_label_key - unregister a jump label key
+ * @key -  key value associated with a a jump label
+ *
+ * Code that no longer needs to make use of jump label (such as a module)
+ * needs to call this to clean up state.
+ */
+
+void unregister_jump_label_key(unsigned long key)
+{
+	struct jump_label_entry *entry;
+
+	jump_label_lock();
+	entry = get_jump_label_entry((jump_label_t)key);
+	if (!entry) {
+		jump_label_unlock();
+		WARN(1, KERN_ERR "jump label: remove unknown key: %lu\n", key);
+		return;
+	}
+	if (entry->nr_entries || (!hlist_empty(&entry->modules)))
+		WARN(1, KERN_ERR "jump label: remove used key: %lu\n", key);
+	else {
+		entry->registered = 0;
+		hlist_del(&entry->hlist);
+		kfree(entry);
+	}
+	jump_label_unlock();
+}
+EXPORT_SYMBOL_GPL(unregister_jump_label_key);
+
 #ifdef CONFIG_MODULES
 
 static int module_conflict(void *start, void *end)
@@ -285,6 +362,7 @@ static __init int init_jump_label(void)
 		arch_jump_label_text_poke_early(iter->code);
 		iter++;
 	}
+	registered = 1;
 	jump_label_unlock();
 	return ret;
 }
@@ -309,6 +387,41 @@ add_jump_label_module_entry(struct jump_label_entry *entry,
 	return e;
 }
 
+static void update_jump_label_module(struct module *mod)
+{
+	struct hlist_head *head;
+	struct hlist_node *node, *node_next, *module_node, *module_node_next;
+	struct jump_label_entry *e;
+	struct jump_label_module_entry *e_module;
+	struct jump_entry *iter;
+	int i, count;
+
+	/* if the module doesn't have jump label entries, just return */
+	if (!mod->num_jump_entries)
+		return;
+
+	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
+		head = &jump_label_table[i];
+		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
+			if (!e->enabled)
+				continue;
+			hlist_for_each_entry_safe(e_module, module_node,
+						  module_node_next,
+						  &(e->modules), hlist) {
+				if (e_module->mod != mod)
+					continue;
+				count = e_module->nr_entries;
+				iter = e_module->table;
+				while (count--) {
+					arch_jump_label_transform(iter,
+							JUMP_LABEL_ENABLE);
+					iter++;
+				}
+			}
+		}
+	}
+}
+
 static int add_jump_label_module(struct module *mod)
 {
 	struct jump_entry *iter, *iter_begin;
@@ -342,6 +455,9 @@ static int add_jump_label_module(struct module *mod)
 		if (IS_ERR(module_entry))
 			return PTR_ERR(module_entry);
 	}
+	/* update new entries to the correct state */
+	update_jump_label_module(mod);
+
 	return 0;
 }
 
@@ -368,10 +484,6 @@ static void remove_jump_label_module(struct module *mod)
 					kfree(e_module);
 				}
 			}
-			if (hlist_empty(&e->modules) && (e->nr_entries == 0)) {
-				hlist_del(&e->hlist);
-				kfree(e);
-			}
 		}
 	}
 }
@@ -463,7 +575,7 @@ void jump_label_apply_nops(struct module *mod)
 
 struct notifier_block jump_label_module_nb = {
 	.notifier_call = jump_label_module_notify,
-	.priority = 0,
+	.priority = 1, /* higher than tracepoints */
 };
 
 static __init int init_jump_label_module(void)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index d6073a5..00e4074 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -575,6 +575,33 @@ void tracepoint_iter_reset(struct tracepoint_iter *iter)
 }
 EXPORT_SYMBOL_GPL(tracepoint_iter_reset);
 
+
+static void jump_label_reg_unreg(struct tracepoint *begin,
+				 struct tracepoint *end, int reg)
+{
+	struct tracepoint *iter;
+	if (!begin)
+		return;
+
+	mutex_lock(&tracepoints_mutex);
+	for (iter = begin; iter < end; iter++) {
+		if (reg)
+			register_jump_label_key((unsigned long)&iter->state);
+		else
+			unregister_jump_label_key((unsigned long)&iter->state);
+	}
+	mutex_unlock(&tracepoints_mutex);
+}
+#define jump_label_register(begin, end) jump_label_reg_unreg(begin, end, 1);
+#define jump_label_unregister(begin, end) jump_label_reg_unreg(begin, end, 0);
+
+static int __init init_jump_label_register(void)
+{
+	jump_label_register(__start___tracepoints, __stop___tracepoints);
+	return 0;
+}
+core_initcall(init_jump_label_register);
+
 #ifdef CONFIG_MODULES
 
 int tracepoint_module_notify(struct notifier_block *self,
@@ -584,9 +611,16 @@ int tracepoint_module_notify(struct notifier_block *self,
 
 	switch (val) {
 	case MODULE_STATE_COMING:
+		jump_label_register(mod->tracepoints,
+			mod->tracepoints + mod->num_tracepoints);
+		tracepoint_update_probe_range(mod->tracepoints,
+			mod->tracepoints + mod->num_tracepoints);
+		break;
 	case MODULE_STATE_GOING:
 		tracepoint_update_probe_range(mod->tracepoints,
 			mod->tracepoints + mod->num_tracepoints);
+		jump_label_unregister(mod->tracepoints,
+			mod->tracepoints + mod->num_tracepoints);
 		break;
 	}
 	return 0;
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index e925c7b..5ddcb3f 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -79,6 +79,24 @@ static char *ddebug_describe_flags(struct _ddebug *dp, char *buf,
 	return buf;
 }
 
+static void jump_label_reg_unreg(struct ddebug_table *dt, int reg)
+{
+	int i;
+	if (!dt)
+		return;
+
+	for (i = 0 ; i < dt->num_ddebugs ; i++) {
+		struct _ddebug *dp = &dt->ddebugs[i];
+		if (reg)
+			register_jump_label_key((unsigned long)&dp->enabled);
+		else
+			unregister_jump_label_key((unsigned long)&dp->enabled);
+	}
+}
+
+#define jump_label_register(dt) jump_label_reg_unreg(dt, 1);
+#define jump_label_unregister(dt) jump_label_reg_unreg(dt, 0);
+
 /*
  * Search the tables for _ddebug's which match the given
  * `query' and apply the `flags' and `mask' to them.  Tells
@@ -636,6 +654,7 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n,
 
 	mutex_lock(&ddebug_lock);
 	list_add_tail(&dt->link, &ddebug_tables);
+	jump_label_register(dt);
 	mutex_unlock(&ddebug_lock);
 
 	if (verbose)
@@ -647,6 +666,7 @@ EXPORT_SYMBOL_GPL(ddebug_add_module);
 
 static void ddebug_table_free(struct ddebug_table *dt)
 {
+	jump_label_unregister(dt);
 	list_del_init(&dt->link);
 	kfree(dt->mod_name);
 	kfree(dt);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] jump label: move jump table to r/w section
  2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
                   ` (2 preceding siblings ...)
  2010-10-01 21:23 ` [PATCH 3/5] jump label: add register_jump_label_key/unregister_jump_label_key Jason Baron
@ 2010-10-01 21:23 ` Jason Baron
  2010-10-01 21:24 ` [PATCH 5/5] jump label: add docs Jason Baron
  4 siblings, 0 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:23 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

Since we are writing the jump table it should be be in R/W kernel
section. Move it to DATA_DATA

Signed-off-by: Jason Baron <jbaron@redhat.com>
---
 include/asm-generic/vmlinux.lds.h |   14 ++++----------
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ef2af99..24e2daf 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -160,6 +160,10 @@
 	VMLINUX_SYMBOL(__start___tracepoints) = .;			\
 	*(__tracepoints)						\
 	VMLINUX_SYMBOL(__stop___tracepoints) = .;			\
+	. = ALIGN(8);							\
+	VMLINUX_SYMBOL(__start___jump_table) = .;			\
+	*(__jump_table)							\
+	VMLINUX_SYMBOL(__stop___jump_table) = .;			\
 	/* implement dynamic printk debug */				\
 	. = ALIGN(8);							\
 	VMLINUX_SYMBOL(__start___verbose) = .;                          \
@@ -220,8 +224,6 @@
 									\
 	BUG_TABLE							\
 									\
-	JUMP_TABLE							\
-									\
 	/* PCI quirks */						\
 	.pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {	\
 		VMLINUX_SYMBOL(__start_pci_fixups_early) = .;		\
@@ -565,14 +567,6 @@
 #define BUG_TABLE
 #endif
 
-#define JUMP_TABLE							\
-	. = ALIGN(8);							\
-	__jump_table : AT(ADDR(__jump_table) - LOAD_OFFSET) {		\
-		VMLINUX_SYMBOL(__start___jump_table) = .;		\
-		*(__jump_table)						\
-		VMLINUX_SYMBOL(__stop___jump_table) = .;		\
-	}
-
 #ifdef CONFIG_PM_TRACE
 #define TRACEDATA							\
 	. = ALIGN(4);							\
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] jump label: add docs
  2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
                   ` (3 preceding siblings ...)
  2010-10-01 21:23 ` [PATCH 4/5] jump label: move jump table to r/w section Jason Baron
@ 2010-10-01 21:24 ` Jason Baron
  4 siblings, 0 replies; 14+ messages in thread
From: Jason Baron @ 2010-10-01 21:24 UTC (permalink / raw)
  To: rostedt, mingo
  Cc: mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, vgoyal, sam, tony,
	ddaney, linux-kernel

Add jump label docs as: Documentation/jump-label.txt

Also, fix documentation spelling error in jump_label.c

Signed-off-by: Jason Baron <jbaron@redhat.com>
---
 Documentation/jump-label.txt |  142 ++++++++++++++++++++++++++++++++++++++++++
 kernel/jump_label.c          |    2 +-
 2 files changed, 143 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/jump-label.txt

diff --git a/Documentation/jump-label.txt b/Documentation/jump-label.txt
new file mode 100644
index 0000000..d26df30
--- /dev/null
+++ b/Documentation/jump-label.txt
@@ -0,0 +1,142 @@
+			Jump Label
+			----------
+
+By: Jason Baron <jbaron@redhat.com>
+
+
+1) Motivation
+
+
+Currently, tracepoints are implemented using a conditional. The conditional
+check requires checking a global variable for each tracepoint. Although
+the overhead of this check is small, it increases when the memory cache comes
+under pressure (memory cache lines for these global variables may be shared
+with other memory accesses). As we increase the number of tracepoints in the
+kernel this overhead may become more of an issue. In addition, tracepoints are
+often dormant (disabled) and provide no direct kernel functionality. Thus, it
+is highly desirable to reduce their impact as much as possible. Although
+tracepoints are the original motivation for this work, other kernel code paths
+should be able to make use of the jump label optimization.
+
+
+2) Jump label description/usage
+
+
+gcc (v4.5) adds a new 'asm goto' statement that allows branching to a label.
+http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01556.html
+
+Thus, this patch set introduces an architecture specific 'JUMP_LABEL()' macro as
+follows (x86):
+
+# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+
+# define JUMP_LABEL(key, label)					\
+	do {                                                    \
+		asm goto("1:"                                   \
+			JUMP_LABEL_INITIAL_NOP			\
+                        ".pushsection __jump_table,  \"a\" \n\t"\
+                        _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \
+                        ".popsection \n\t"                      \
+                        : :  "i" (key) :  : label);		\
+        } while (0)
+
+
+For architectures that have not yet introduced jump label support it's simply:
+
+#define JUMP_LABEL(key, label)			\
+	if (unlikely(*key))			\
+		goto label;
+
+which then can be used as:
+
+	....
+        JUMP_LABEL(key, trace_label);
+        printk("not doing tracing\n");
+	return;
+trace_label:
+        printk("doing tracing: %d\n", file);
+	....
+
+The 'key' argument is thus a pointer to a conditional argument that can be used
+if the optimization is not enabled. Otherwise, this address serves as a unique
+key to identify the particular instance of the jump label.
+
+Thus, when tracing is disabled, we simply have a no-op followed by a jump around
+the dormant (disabled) tracing code. The 'JUMP_LABEL()' macro produces a
+'jump_table', which has the following format:
+
+[instruction address] [jump target] [key]
+
+Thus, to enable a tracepoint, we simply patch the 'instruction address' with
+a jump to the 'jump target.'
+
+The calls to enable and disable a jump label are: enable_jump_label(key) and
+disable_jump_label(key).
+
+
+3) Architecture interface
+
+
+There are a few functions and macros that architectures must implement in order
+to take advantage of this optimization. As previously mentioned, if there is no
+architecture support, we simply fall back to a traditional, load, test, and
+jump sequence.
+
+* add "HAVE_ARCH_JUMP_LABEL" to arch/<arch>/Kconfig to indicate support
+
+* #define JUMP_LABEL_NOP_SIZE, arch/x86/include/asm/jump_label.h
+
+* #define "JUMP_LABEL(key, label)", arch/x86/include/asm/jump_label.h
+
+* void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type)
+	see: arch/x86/kernel/jump_label.c
+
+* void arch_jump_label_text_poke_early(jump_label_t addr)
+	see: arch/x86/kernel/jump_label.c
+
+* finally add a definition for "struct jump_entry".
+	see: arch/x86/include/asm/jump_label.h
+
+
+4) Jump label analysis (x86)
+
+
+I've tested the performance of using 'get_cycles()' calls around the
+tracepoint call sites. For an Intel Core 2 Quad cpu (in cycles, averages):
+
+		idle		after tbench run
+		----		----------------
+old code	 32		  88
+new code	  2		   4
+
+
+The performance improvement can be reproduced reliably on both Intel and AMD
+hardware.
+
+In terms of code analysis, the current code for the disabled case is a 'cmpl'
+followed by a 'je' around the tracepoint code. So:
+
+cmpl - 83 3d 0e 77 87 00 00 - 7 bytes
+je   - 74 3e                - 2 bytes
+
+total of 9 instruction bytes.
+
+The new code is a 'nopl' followed by a 'jmp'. Thus:
+
+nopl - 0f 1f 44 00 00 - 5 bytes
+jmp  - eb 3e          - 2 bytes
+
+total of 7 instruction bytes.
+
+So, the new code also accounts for 2 less bytes in the instruction cache per tracepoint.
+
+
+5) Acknowledgements
+
+
+Thanks to Roland McGrath and Richard Henderson for helping come up with the
+initial 'asm goto' and jump label design.
+
+Thanks to Mathieu Desnoyers and H. Peter Anvin for calling attention to this
+issue, and outlining the requirements of a solution. Mathieu also implemented a
+solution in the form of the "Immediate Values" work.
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 4e52e8b..e30f29d 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -193,7 +193,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 			arch_jump_label_transform(iter, type);
 		iter++;
 	}
-	/* eanble/disable jump labels in modules */
+	/* enable/disable jump labels in modules */
 	hlist_for_each_entry(e_module, module_node, &(entry->modules),
 						hlist) {
 		count = e_module->nr_entries;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] jump label: fix module __init section race
  2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
@ 2010-10-02  8:58   ` Masami Hiramatsu
  2010-10-06 13:00     ` Steven Rostedt
  2010-10-30 10:39   ` [tip:perf/urgent] jump label: Fix " tip-bot for Jason Baron
  1 sibling, 1 reply; 14+ messages in thread
From: Masami Hiramatsu @ 2010-10-02  8:58 UTC (permalink / raw)
  To: Jason Baron
  Cc: rostedt, mingo, mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	fweisbec, avi, davem, vgoyal, sam, tony, ddaney, linux-kernel,
	2nddept-manager

(2010/10/02 6:23), Jason Baron wrote:
> Jump label uses is_module_text_address() to ensure that the module
> __init sections are valid before updating them. However, between the
> check for a valid module __init section and the subsequent jump
> label update, the module's __init section could be freed out from under
> us.
> 
> We fix this potential race by adding a notifier callback to the
> MODULE_STATE_LIVE state. This notifier is called *after* the __init
> section has been run but before it is going to be freed. In the
> callback, the jump label code zeros the key value for any __init jump
> code within the module, and we add a check for a non-zero key value when
> we update jump labels. In this way we require no additional data
> structures.
> 
> Thanks to Mathieu Desnoyers for pointing out this race condition.
> 
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  kernel/jump_label.c |   41 ++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 40 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/jump_label.c b/kernel/jump_label.c
> index 7be868b..e2fad92 100644
> --- a/kernel/jump_label.c
> +++ b/kernel/jump_label.c
> @@ -168,7 +168,8 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
>  			count = e_module->nr_entries;
>  			iter = e_module->table;
>  			while (count--) {
> -				if (kernel_text_address(iter->code))
> +				if (iter->key &&
> +						kernel_text_address(iter->code))
>  					arch_jump_label_transform(iter, type);
>  				iter++;
>  			}
> @@ -366,6 +367,39 @@ static void remove_jump_label_module(struct module *mod)
>  	}
>  }
>  
> +static void remove_module_init(struct module *mod)

Hi Jason,

Just a comment, I prefer remove_jump_label_module_init() than this name,
because remove_module_init is too general.

Thank you,


-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex
  2010-10-01 21:23 ` [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex Jason Baron
@ 2010-10-02  9:00   ` Masami Hiramatsu
  2010-10-30 10:40   ` [tip:perf/urgent] " tip-bot for Jason Baron
  1 sibling, 0 replies; 14+ messages in thread
From: Masami Hiramatsu @ 2010-10-02  9:00 UTC (permalink / raw)
  To: Jason Baron
  Cc: rostedt, mingo, mathieu.desnoyers, hpa, tglx, andi, roland, rth,
	fweisbec, avi, davem, vgoyal, sam, tony, ddaney, linux-kernel,
	2nddept-manager

(2010/10/02 6:23), Jason Baron wrote:
> register_kprobe() downs the 'text_mutex' and then calls
> jump_label_text_reserved(), which downs the 'jump_label_mutex'.
> However, the jump label code takes those mutexes in the reverse
> order.
> 
> Fix by requiring the caller of jump_label_text_reserved() to do
> the jump label locking via the newly added: jump_label_lock(),
> jump_label_unlock(). Currently, kprobes is the only user
> of jump_label_text_reserved().
> 

Looks good for me;)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Thank you,

> 
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> Reported-by: Ingo Molnar <mingo@elte.hu>
> ---
>  include/linux/jump_label.h |    5 +++++
>  kernel/jump_label.c        |   33 +++++++++++++++++++++------------
>  kernel/kprobes.c           |    6 ++++++
>  3 files changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> index b72cd9f..cf213d1 100644
> --- a/include/linux/jump_label.h
> +++ b/include/linux/jump_label.h
> @@ -18,6 +18,8 @@ struct module;
>  extern struct jump_entry __start___jump_table[];
>  extern struct jump_entry __stop___jump_table[];
>  
> +extern void jump_label_lock(void);
> +extern void jump_label_unlock(void);
>  extern void arch_jump_label_transform(struct jump_entry *entry,
>  				 enum jump_label_type type);
>  extern void arch_jump_label_text_poke_early(jump_label_t addr);
> @@ -59,6 +61,9 @@ static inline int jump_label_text_reserved(void *start, void *end)
>  	return 0;
>  }
>  
> +static inline void jump_label_lock(void) {}
> +static inline void jump_label_unlock(void) {}
> +
>  #endif
>  
>  #endif
> diff --git a/kernel/jump_label.c b/kernel/jump_label.c
> index e2fad92..2add1a7 100644
> --- a/kernel/jump_label.c
> +++ b/kernel/jump_label.c
> @@ -39,6 +39,16 @@ struct jump_label_module_entry {
>  	struct module *mod;
>  };
>  
> +void jump_label_lock(void)
> +{
> +	mutex_lock(&jump_label_mutex);
> +}
> +
> +void jump_label_unlock(void)
> +{
> +	mutex_unlock(&jump_label_mutex);
> +}
> +
>  static int jump_label_cmp(const void *a, const void *b)
>  {
>  	const struct jump_entry *jea = a;
> @@ -152,7 +162,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
>  	struct jump_label_module_entry *e_module;
>  	int count;
>  
> -	mutex_lock(&jump_label_mutex);
> +	jump_label_lock();
>  	entry = get_jump_label_entry((jump_label_t)key);
>  	if (entry) {
>  		count = entry->nr_entries;
> @@ -175,7 +185,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
>  			}
>  		}
>  	}
> -	mutex_unlock(&jump_label_mutex);
> +	jump_label_unlock();
>  }
>  
>  static int addr_conflict(struct jump_entry *entry, void *start, void *end)
> @@ -232,6 +242,7 @@ out:
>   * overlaps with any of the jump label patch addresses. Code
>   * that wants to modify kernel text should first verify that
>   * it does not overlap with any of the jump label addresses.
> + * Caller must hold jump_label_mutex.
>   *
>   * returns 1 if there is an overlap, 0 otherwise
>   */
> @@ -242,7 +253,6 @@ int jump_label_text_reserved(void *start, void *end)
>  	struct jump_entry *iter_stop = __start___jump_table;
>  	int conflict = 0;
>  
> -	mutex_lock(&jump_label_mutex);
>  	iter = iter_start;
>  	while (iter < iter_stop) {
>  		if (addr_conflict(iter, start, end)) {
> @@ -257,7 +267,6 @@ int jump_label_text_reserved(void *start, void *end)
>  	conflict = module_conflict(start, end);
>  #endif
>  out:
> -	mutex_unlock(&jump_label_mutex);
>  	return conflict;
>  }
>  
> @@ -268,7 +277,7 @@ static __init int init_jump_label(void)
>  	struct jump_entry *iter_stop = __stop___jump_table;
>  	struct jump_entry *iter;
>  
> -	mutex_lock(&jump_label_mutex);
> +	jump_label_lock();
>  	ret = build_jump_label_hashtable(__start___jump_table,
>  					 __stop___jump_table);
>  	iter = iter_start;
> @@ -276,7 +285,7 @@ static __init int init_jump_label(void)
>  		arch_jump_label_text_poke_early(iter->code);
>  		iter++;
>  	}
> -	mutex_unlock(&jump_label_mutex);
> +	jump_label_unlock();
>  	return ret;
>  }
>  early_initcall(init_jump_label);
> @@ -409,21 +418,21 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
>  
>  	switch (val) {
>  	case MODULE_STATE_COMING:
> -		mutex_lock(&jump_label_mutex);
> +		jump_label_lock();
>  		ret = add_jump_label_module(mod);
>  		if (ret)
>  			remove_jump_label_module(mod);
> -		mutex_unlock(&jump_label_mutex);
> +		jump_label_unlock();
>  		break;
>  	case MODULE_STATE_GOING:
> -		mutex_lock(&jump_label_mutex);
> +		jump_label_lock();
>  		remove_jump_label_module(mod);
> -		mutex_unlock(&jump_label_mutex);
> +		jump_label_unlock();
>  		break;
>  	case MODULE_STATE_LIVE:
> -		mutex_lock(&jump_label_mutex);
> +		jump_label_lock();
>  		remove_module_init(mod);
> -		mutex_unlock(&jump_label_mutex);
> +		jump_label_unlock();
>  		break;
>  	}
>  	return ret;
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index ec4210c..d45d72e 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1145,13 +1145,16 @@ int __kprobes register_kprobe(struct kprobe *p)
>  		return ret;
>  
>  	preempt_disable();
> +	jump_label_lock();
>  	if (!kernel_text_address((unsigned long) p->addr) ||
>  	    in_kprobes_functions((unsigned long) p->addr) ||
>  	    ftrace_text_reserved(p->addr, p->addr) ||
>  	    jump_label_text_reserved(p->addr, p->addr)) {
>  		preempt_enable();
> +		jump_label_unlock();
>  		return -EINVAL;
>  	}
> +	jump_label_unlock();
>  
>  	/* User can pass only KPROBE_FLAG_DISABLED to register_kprobe */
>  	p->flags &= KPROBE_FLAG_DISABLED;
> @@ -1186,6 +1189,8 @@ int __kprobes register_kprobe(struct kprobe *p)
>  	INIT_LIST_HEAD(&p->list);
>  	mutex_lock(&kprobe_mutex);
>  
> +	jump_label_lock(); /* needed to call jump_label_text_reserved() */
> +
>  	get_online_cpus();	/* For avoiding text_mutex deadlock. */
>  	mutex_lock(&text_mutex);
>  
> @@ -1213,6 +1218,7 @@ int __kprobes register_kprobe(struct kprobe *p)
>  out:
>  	mutex_unlock(&text_mutex);
>  	put_online_cpus();
> +	jump_label_unlock();
>  	mutex_unlock(&kprobe_mutex);
>  
>  	if (probed_mod)


-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] jump label: fix module __init section race
  2010-10-02  8:58   ` Masami Hiramatsu
@ 2010-10-06 13:00     ` Steven Rostedt
  2010-10-06 15:41       ` Jason Baron
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2010-10-06 13:00 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jason Baron, mingo, mathieu.desnoyers, hpa, tglx, andi, roland,
	rth, fweisbec, avi, davem, vgoyal, sam, tony, ddaney,
	linux-kernel, 2nddept-manager

On Sat, 2010-10-02 at 17:58 +0900, Masami Hiramatsu wrote:

> > +static void remove_module_init(struct module *mod)
> 
> Hi Jason,
> 
> Just a comment, I prefer remove_jump_label_module_init() than this name,
> because remove_module_init is too general.

This is probably not too big of a deal since it is static, but the name
change may make it easier for etags and ctags users.

Jason, are you OK if I pull in this patch and make the change myself?

-- Steve



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] jump label: fix module __init section race
  2010-10-06 13:00     ` Steven Rostedt
@ 2010-10-06 15:41       ` Jason Baron
  2010-10-06 15:46         ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Baron @ 2010-10-06 15:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, mingo, mathieu.desnoyers, hpa, tglx, andi,
	roland, rth, fweisbec, avi, davem, vgoyal, sam, tony, ddaney,
	linux-kernel, 2nddept-manager

On Wed, Oct 06, 2010 at 09:00:50AM -0400, Steven Rostedt wrote:
> On Sat, 2010-10-02 at 17:58 +0900, Masami Hiramatsu wrote:
> 
> > > +static void remove_module_init(struct module *mod)
> > 
> > Hi Jason,
> > 
> > Just a comment, I prefer remove_jump_label_module_init() than this name,
> > because remove_module_init is too general.
> 
> This is probably not too big of a deal since it is static, but the name
> change may make it easier for etags and ctags users.
> 
> Jason, are you OK if I pull in this patch and make the change myself?
> 
> -- Steve
> 

fine with me. I had a more specific function name and then shortened it
because it seemed too long, either way is fine with me.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] jump label: fix module __init section race
  2010-10-06 15:41       ` Jason Baron
@ 2010-10-06 15:46         ` Steven Rostedt
  2010-10-07  1:56           ` Masami Hiramatsu
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2010-10-06 15:46 UTC (permalink / raw)
  To: Jason Baron
  Cc: Masami Hiramatsu, mingo, mathieu.desnoyers, hpa, tglx, andi,
	roland, rth, fweisbec, avi, davem, vgoyal, sam, tony, ddaney,
	linux-kernel, 2nddept-manager

On Wed, 2010-10-06 at 11:41 -0400, Jason Baron wrote:
> On Wed, Oct 06, 2010 at 09:00:50AM -0400, Steven Rostedt wrote:
> > On Sat, 2010-10-02 at 17:58 +0900, Masami Hiramatsu wrote:
> > 
> > > > +static void remove_module_init(struct module *mod)
> > > 
> > > Hi Jason,
> > > 
> > > Just a comment, I prefer remove_jump_label_module_init() than this name,
> > > because remove_module_init is too general.
> > 
> > This is probably not too big of a deal since it is static, but the name
> > change may make it easier for etags and ctags users.
> > 
> > Jason, are you OK if I pull in this patch and make the change myself?
> > 
> > -- Steve
> > 
> 
> fine with me. I had a more specific function name and then shortened it
> because it seemed too long, either way is fine with me.
> 

Maybe call it jl_remove_module_init()?  For static functions, I usually
use abbreviations. As long as it isn't too confusing.

-- Steve



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] jump label: fix module __init section race
  2010-10-06 15:46         ` Steven Rostedt
@ 2010-10-07  1:56           ` Masami Hiramatsu
  0 siblings, 0 replies; 14+ messages in thread
From: Masami Hiramatsu @ 2010-10-07  1:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jason Baron, mingo, mathieu.desnoyers, hpa, tglx, andi, roland,
	rth, fweisbec, avi, davem, vgoyal, sam, tony, ddaney,
	linux-kernel, 2nddept-manager

(2010/10/07 0:46), Steven Rostedt wrote:
> On Wed, 2010-10-06 at 11:41 -0400, Jason Baron wrote:
>> On Wed, Oct 06, 2010 at 09:00:50AM -0400, Steven Rostedt wrote:
>>> On Sat, 2010-10-02 at 17:58 +0900, Masami Hiramatsu wrote:
>>>
>>>>> +static void remove_module_init(struct module *mod)
>>>>
>>>> Hi Jason,
>>>>
>>>> Just a comment, I prefer remove_jump_label_module_init() than this name,
>>>> because remove_module_init is too general.
>>>
>>> This is probably not too big of a deal since it is static, but the name
>>> change may make it easier for etags and ctags users.
>>>
>>> Jason, are you OK if I pull in this patch and make the change myself?
>>>
>>> -- Steve
>>>
>>
>> fine with me. I had a more specific function name and then shortened it
>> because it seemed too long, either way is fine with me.
>>
> 
> Maybe call it jl_remove_module_init()?  For static functions, I usually
> use abbreviations. As long as it isn't too confusing.
> 

Ah, that's fine for me too :)

-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:perf/urgent] jump label: Fix module __init section race
  2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
  2010-10-02  8:58   ` Masami Hiramatsu
@ 2010-10-30 10:39   ` tip-bot for Jason Baron
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Jason Baron @ 2010-10-30 10:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, masami.hiramatsu.pt, mathieu.desnoyers,
	rostedt, jbaron, tglx

Commit-ID:  b842f8faf6c7dc2005c6a70631c1a91bac02f180
Gitweb:     http://git.kernel.org/tip/b842f8faf6c7dc2005c6a70631c1a91bac02f180
Author:     Jason Baron <jbaron@redhat.com>
AuthorDate: Fri, 1 Oct 2010 17:23:41 -0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 28 Oct 2010 09:17:02 -0400

jump label: Fix module __init section race

Jump label uses is_module_text_address() to ensure that the module
__init sections are valid before updating them. However, between the
check for a valid module __init section and the subsequent jump
label update, the module's __init section could be freed out from under
us.

We fix this potential race by adding a notifier callback to the
MODULE_STATE_LIVE state. This notifier is called *after* the __init
section has been run but before it is going to be freed. In the
callback, the jump label code zeros the key value for any __init jump
code within the module, and we add a check for a non-zero key value when
we update jump labels. In this way we require no additional data
structures.

Thanks to Mathieu Desnoyers for pointing out this race condition.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <c6f037b7598777668025ceedd9294212fd95fa34.1285965957.git.jbaron@redhat.com>

[ Renamed remove_module_init() to remove_jump_label_module_init()
  as suggested by Masami Hiramatsu. ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/jump_label.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 40 insertions(+), 1 deletions(-)

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 7be868b..be9e105 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -168,7 +168,8 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 			count = e_module->nr_entries;
 			iter = e_module->table;
 			while (count--) {
-				if (kernel_text_address(iter->code))
+				if (iter->key &&
+						kernel_text_address(iter->code))
 					arch_jump_label_transform(iter, type);
 				iter++;
 			}
@@ -366,6 +367,39 @@ static void remove_jump_label_module(struct module *mod)
 	}
 }
 
+static void remove_jump_label_module_init(struct module *mod)
+{
+	struct hlist_head *head;
+	struct hlist_node *node, *node_next, *module_node, *module_node_next;
+	struct jump_label_entry *e;
+	struct jump_label_module_entry *e_module;
+	struct jump_entry *iter;
+	int i, count;
+
+	/* if the module doesn't have jump label entries, just return */
+	if (!mod->num_jump_entries)
+		return;
+
+	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
+		head = &jump_label_table[i];
+		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
+			hlist_for_each_entry_safe(e_module, module_node,
+						  module_node_next,
+						  &(e->modules), hlist) {
+				if (e_module->mod != mod)
+					continue;
+				count = e_module->nr_entries;
+				iter = e_module->table;
+				while (count--) {
+					if (within_module_init(iter->code, mod))
+						iter->key = 0;
+					iter++;
+				}
+			}
+		}
+	}
+}
+
 static int
 jump_label_module_notify(struct notifier_block *self, unsigned long val,
 			 void *data)
@@ -386,6 +420,11 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
 		remove_jump_label_module(mod);
 		mutex_unlock(&jump_label_mutex);
 		break;
+	case MODULE_STATE_LIVE:
+		mutex_lock(&jump_label_mutex);
+		remove_jump_label_module_init(mod);
+		mutex_unlock(&jump_label_mutex);
+		break;
 	}
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [tip:perf/urgent] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex
  2010-10-01 21:23 ` [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex Jason Baron
  2010-10-02  9:00   ` Masami Hiramatsu
@ 2010-10-30 10:40   ` tip-bot for Jason Baron
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Jason Baron @ 2010-10-30 10:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, masami.hiramatsu.pt, rostedt, jbaron,
	tglx, mingo

Commit-ID:  91bad2f8d3057482b9afb599f14421b007136960
Gitweb:     http://git.kernel.org/tip/91bad2f8d3057482b9afb599f14421b007136960
Author:     Jason Baron <jbaron@redhat.com>
AuthorDate: Fri, 1 Oct 2010 17:23:48 -0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 28 Oct 2010 09:17:40 -0400

jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex

register_kprobe() downs the 'text_mutex' and then calls
jump_label_text_reserved(), which downs the 'jump_label_mutex'.
However, the jump label code takes those mutexes in the reverse
order.

Fix by requiring the caller of jump_label_text_reserved() to do
the jump label locking via the newly added: jump_label_lock(),
jump_label_unlock(). Currently, kprobes is the only user
of jump_label_text_reserved().

Reported-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <759032c48d5e30c27f0bba003d09bffa8e9f28bb.1285965957.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/jump_label.h |    5 +++++
 kernel/jump_label.c        |   33 +++++++++++++++++++++------------
 kernel/kprobes.c           |    6 ++++++
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index b67cb18..1947a12 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -18,6 +18,8 @@ struct module;
 extern struct jump_entry __start___jump_table[];
 extern struct jump_entry __stop___jump_table[];
 
+extern void jump_label_lock(void);
+extern void jump_label_unlock(void);
 extern void arch_jump_label_transform(struct jump_entry *entry,
 				 enum jump_label_type type);
 extern void arch_jump_label_text_poke_early(jump_label_t addr);
@@ -59,6 +61,9 @@ static inline int jump_label_text_reserved(void *start, void *end)
 	return 0;
 }
 
+static inline void jump_label_lock(void) {}
+static inline void jump_label_unlock(void) {}
+
 #endif
 
 #define COND_STMT(key, stmt)					\
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index be9e105..12cce78 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -39,6 +39,16 @@ struct jump_label_module_entry {
 	struct module *mod;
 };
 
+void jump_label_lock(void)
+{
+	mutex_lock(&jump_label_mutex);
+}
+
+void jump_label_unlock(void)
+{
+	mutex_unlock(&jump_label_mutex);
+}
+
 static int jump_label_cmp(const void *a, const void *b)
 {
 	const struct jump_entry *jea = a;
@@ -152,7 +162,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 	struct jump_label_module_entry *e_module;
 	int count;
 
-	mutex_lock(&jump_label_mutex);
+	jump_label_lock();
 	entry = get_jump_label_entry((jump_label_t)key);
 	if (entry) {
 		count = entry->nr_entries;
@@ -175,7 +185,7 @@ void jump_label_update(unsigned long key, enum jump_label_type type)
 			}
 		}
 	}
-	mutex_unlock(&jump_label_mutex);
+	jump_label_unlock();
 }
 
 static int addr_conflict(struct jump_entry *entry, void *start, void *end)
@@ -232,6 +242,7 @@ out:
  * overlaps with any of the jump label patch addresses. Code
  * that wants to modify kernel text should first verify that
  * it does not overlap with any of the jump label addresses.
+ * Caller must hold jump_label_mutex.
  *
  * returns 1 if there is an overlap, 0 otherwise
  */
@@ -242,7 +253,6 @@ int jump_label_text_reserved(void *start, void *end)
 	struct jump_entry *iter_stop = __start___jump_table;
 	int conflict = 0;
 
-	mutex_lock(&jump_label_mutex);
 	iter = iter_start;
 	while (iter < iter_stop) {
 		if (addr_conflict(iter, start, end)) {
@@ -257,7 +267,6 @@ int jump_label_text_reserved(void *start, void *end)
 	conflict = module_conflict(start, end);
 #endif
 out:
-	mutex_unlock(&jump_label_mutex);
 	return conflict;
 }
 
@@ -268,7 +277,7 @@ static __init int init_jump_label(void)
 	struct jump_entry *iter_stop = __stop___jump_table;
 	struct jump_entry *iter;
 
-	mutex_lock(&jump_label_mutex);
+	jump_label_lock();
 	ret = build_jump_label_hashtable(__start___jump_table,
 					 __stop___jump_table);
 	iter = iter_start;
@@ -276,7 +285,7 @@ static __init int init_jump_label(void)
 		arch_jump_label_text_poke_early(iter->code);
 		iter++;
 	}
-	mutex_unlock(&jump_label_mutex);
+	jump_label_unlock();
 	return ret;
 }
 early_initcall(init_jump_label);
@@ -409,21 +418,21 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
 
 	switch (val) {
 	case MODULE_STATE_COMING:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		ret = add_jump_label_module(mod);
 		if (ret)
 			remove_jump_label_module(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	case MODULE_STATE_GOING:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		remove_jump_label_module(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	case MODULE_STATE_LIVE:
-		mutex_lock(&jump_label_mutex);
+		jump_label_lock();
 		remove_jump_label_module_init(mod);
-		mutex_unlock(&jump_label_mutex);
+		jump_label_unlock();
 		break;
 	}
 	return ret;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 99865c3..9437e14 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1146,13 +1146,16 @@ int __kprobes register_kprobe(struct kprobe *p)
 		return ret;
 
 	preempt_disable();
+	jump_label_lock();
 	if (!kernel_text_address((unsigned long) p->addr) ||
 	    in_kprobes_functions((unsigned long) p->addr) ||
 	    ftrace_text_reserved(p->addr, p->addr) ||
 	    jump_label_text_reserved(p->addr, p->addr)) {
 		preempt_enable();
+		jump_label_unlock();
 		return -EINVAL;
 	}
+	jump_label_unlock();
 
 	/* User can pass only KPROBE_FLAG_DISABLED to register_kprobe */
 	p->flags &= KPROBE_FLAG_DISABLED;
@@ -1187,6 +1190,8 @@ int __kprobes register_kprobe(struct kprobe *p)
 	INIT_LIST_HEAD(&p->list);
 	mutex_lock(&kprobe_mutex);
 
+	jump_label_lock(); /* needed to call jump_label_text_reserved() */
+
 	get_online_cpus();	/* For avoiding text_mutex deadlock. */
 	mutex_lock(&text_mutex);
 
@@ -1214,6 +1219,7 @@ int __kprobes register_kprobe(struct kprobe *p)
 out:
 	mutex_unlock(&text_mutex);
 	put_online_cpus();
+	jump_label_unlock();
 	mutex_unlock(&kprobe_mutex);
 
 	if (probed_mod)

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-10-30 10:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-01 21:23 [PATCH 0/5] jump label: core updates Jason Baron
2010-10-01 21:23 ` [PATCH 1/5] jump label: fix module __init section race Jason Baron
2010-10-02  8:58   ` Masami Hiramatsu
2010-10-06 13:00     ` Steven Rostedt
2010-10-06 15:41       ` Jason Baron
2010-10-06 15:46         ` Steven Rostedt
2010-10-07  1:56           ` Masami Hiramatsu
2010-10-30 10:39   ` [tip:perf/urgent] jump label: Fix " tip-bot for Jason Baron
2010-10-01 21:23 ` [PATCH 2/5] jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex Jason Baron
2010-10-02  9:00   ` Masami Hiramatsu
2010-10-30 10:40   ` [tip:perf/urgent] " tip-bot for Jason Baron
2010-10-01 21:23 ` [PATCH 3/5] jump label: add register_jump_label_key/unregister_jump_label_key Jason Baron
2010-10-01 21:23 ` [PATCH 4/5] jump label: move jump table to r/w section Jason Baron
2010-10-01 21:24 ` [PATCH 5/5] jump label: add docs Jason Baron

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.