linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix kallsyms/insmod/rmmod race
@ 2005-01-17 16:27 David Howells
  2005-01-18  2:20 ` Rusty Russell
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: David Howells @ 2005-01-17 16:27 UTC (permalink / raw)
  To: rusty, akpm, torvalds; +Cc: linux-kernel, linuxppc64-dev


The attached patch fixes a race between kallsyms and insmod/rmmod.

The problem is this:

 (1) The various kallsyms functions poke around in the module list without any
     locking so that they can be called from the oops handler.

 (2) Although insmod and rmmod use locks to exclude each other, these have no
     effect on the kallsyms function.

 (3) Although rmmod modifies the module state with the machine "stopped", it
     hasn't removed the metadata from the module metadata list, meaning that
     as soon as the machine is "restarted", the metadata can be observed by
     kallsyms.

     It's not possible to say that an item in that list should be ignored if
     it's state is marked as inactive - you can't get at the state information
     because you can't trust the metadata in which it is embedded.

     Furthermore, list linkage information is embedded in the metadata too, so
     you can't trust that either...

 (4) kallsyms may be walking the module list without a lock whilst either
     insmod or rmmod are busy changing it. insmod probably isn't a problem
     since nothing is going a way, but rmmod is as it's deleting an entry.

 (5) Therefore nothing that uses these functions can in any way trust any
     pointers to "static" data (such as module symbol names or module names)
     that are returned.

 (6) On ppc64 the problems are exacerbated since the hypervisor may reschedule
     bits of the kernel, making operations that appear adjacent occur a long
     time apart.

This patch fixes the race by only linking/unlinking modules into/from the
master module list with the machine in the "stopped" state. This means that
any "static" information can be trusted as far as the next kernel reschedule
on any given CPU without the need to hold any locks.

However, I'm not sure how this is affected by preemption. I suspect more work
may need to be done in that case, but I'm not entirely sure.

This also means that rmmod has to bump the machine into the stopped state
twice... but since that shouldn't be a common operation, I don't think that's
a problem.

Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat kallsyms-race-2611rc1.diff
 kallsyms.c |   16 ++++++++++++++--
 module.c   |   35 ++++++++++++++++++++++++++++-------
 2 files changed, 42 insertions(+), 9 deletions(-)

diff -uNrp linux-2.6.11-rc1/kernel/kallsyms.c linux-2.6.11-rc1-kallsyms/kernel/kallsyms.c
--- linux-2.6.11-rc1/kernel/kallsyms.c	2005-01-12 19:09:18.000000000 +0000
+++ linux-2.6.11-rc1-kallsyms/kernel/kallsyms.c	2005-01-17 15:33:55.000000000 +0000
@@ -139,13 +139,20 @@ unsigned long kallsyms_lookup_name(const
 	return module_kallsyms_lookup_name(name);
 }
 
-/* Lookup an address.  modname is set to NULL if it's in the kernel. */
+/*
+ * Lookup an address
+ * - modname is set to NULL if it's in the kernel
+ * - we guarantee that the returned name is valid until we reschedule even if
+ *   it resides in a module
+ * - we also guarantee that modname will be valid until rescheduled
+ */
 const char *kallsyms_lookup(unsigned long addr,
 			    unsigned long *symbolsize,
 			    unsigned long *offset,
 			    char **modname, char *namebuf)
 {
 	unsigned long i, low, high, mid;
+	const char *msym;
 
 	/* This kernel should never had been booted. */
 	BUG_ON(!kallsyms_addresses);
@@ -196,7 +203,12 @@ const char *kallsyms_lookup(unsigned lon
 		return namebuf;
 	}
 
-	return module_address_lookup(addr, symbolsize, offset, modname);
+	/* see if it's in a module */
+	msym = module_address_lookup(addr, symbolsize, offset, modname);
+	if (msym)
+		return strncpy(namebuf, msym, KSYM_NAME_LEN);
+
+	return NULL;
 }
 
 /* Replace "%s" in format with address, or returns -errno. */
diff -uNrp linux-2.6.11-rc1/kernel/module.c linux-2.6.11-rc1-kallsyms/kernel/module.c
--- linux-2.6.11-rc1/kernel/module.c	2005-01-12 19:09:18.000000000 +0000
+++ linux-2.6.11-rc1-kallsyms/kernel/module.c	2005-01-17 15:31:42.000000000 +0000
@@ -1072,14 +1072,24 @@ static void mod_kobject_remove(struct mo
 	kobject_unregister(&mod->mkobj.kobj);
 }
 
+/*
+ * unlink the module with the whole machine is stopped with interrupts off
+ * - this defends against kallsyms not taking locks
+ */
+static inline int __unlink_module(void *_mod)
+{
+	struct module *mod = _mod;
+	spin_lock(&modlist_lock);
+	list_del(&mod->list);
+	spin_unlock(&modlist_lock);
+	return 0;
+}
+
 /* Free a module, remove from lists, etc (must hold module mutex). */
 static void free_module(struct module *mod)
 {
 	/* Delete from various lists */
-	spin_lock_irq(&modlist_lock);
-	list_del(&mod->list);
-	spin_unlock_irq(&modlist_lock);
-
+	stop_machine_run(__unlink_module, mod, NR_CPUS);
 	remove_sect_attrs(mod);
 	mod_kobject_remove(mod);
 
@@ -1732,6 +1742,19 @@ static struct module *load_module(void _
 	goto free_hdr;
 }
 
+/*
+ * link the module with the whole machine is stopped with interrupts off
+ * - this defends against kallsyms not taking locks
+ */
+static inline int __link_module(void *_mod)
+{
+	struct module *mod = _mod;
+	spin_lock(&modlist_lock);
+	list_add(&mod->list, &modules);
+	spin_unlock(&modlist_lock);
+	return 0;
+}
+
 /* This is where the real work happens */
 asmlinkage long
 sys_init_module(void __user *umod,
@@ -1766,9 +1789,7 @@ sys_init_module(void __user *umod,
 
 	/* Now sew it into the lists.  They won't access us, since
            strong_try_module_get() will fail. */
-	spin_lock_irq(&modlist_lock);
-	list_add(&mod->list, &modules);
-	spin_unlock_irq(&modlist_lock);
+	stop_machine_run(__link_module, mod, NR_CPUS);
 
 	/* Drop lock so they can recurse */
 	up(&module_mutex);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix kallsyms/insmod/rmmod race
  2005-01-17 16:27 [PATCH] Fix kallsyms/insmod/rmmod race David Howells
@ 2005-01-18  2:20 ` Rusty Russell
  2005-01-18 19:44 ` David Howells
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Rusty Russell @ 2005-01-18  2:20 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, lkml - Kernel Mailing List,
	linuxppc64-dev

On Mon, 2005-01-17 at 16:27 +0000, David Howells wrote:
> The attached patch fixes a race between kallsyms and insmod/rmmod.

Hi David,

	The more I looked at this, the more I warmed to it.  I've known for a
while that people are using kallsyms not for OOPS (eg. /proc/$$/wchan),
so we should provide a "grabs locks" version, but this solution gets
around that nicely, while making life more certain for the oops case,
too.

Good work!
Rusty.
-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix kallsyms/insmod/rmmod race
  2005-01-17 16:27 [PATCH] Fix kallsyms/insmod/rmmod race David Howells
  2005-01-18  2:20 ` Rusty Russell
@ 2005-01-18 19:44 ` David Howells
  2005-01-27 14:02 ` David Howells
  2005-01-27 14:08 ` [PATCH] Fix kallsyms/insmod/rmmod race [try #2] David Howells
  3 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2005-01-18 19:44 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Linus Torvalds, lkml - Kernel Mailing List,
	linuxppc64-dev


Rusty Russell <rusty@rustcorp.com.au> wrote:

> 	The more I looked at this, the more I warmed to it.  I've known for a
> while that people are using kallsyms not for OOPS (eg. /proc/$$/wchan),
> so we should provide a "grabs locks" version, but this solution gets
> around that nicely, while making life more certain for the oops case,
> too.


Hmmm... though it works on i386 SMP, it doesn't, however, seem to work on
ppc64 SMP:-/

My pSeries box seems to think that it can't find any symbols from previously
loaded modules, and my Power5 box is quite happy to load modules that depend
on other modules but panics because it can't mount its root fs.

This is very odd, because the patch is simple enough. Is there anything
obvious I've missed that you can see? Or maybe I'm just misunderstanding how
stop_machine_run() works... maybe it can't be called during initialisation.

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix kallsyms/insmod/rmmod race
  2005-01-17 16:27 [PATCH] Fix kallsyms/insmod/rmmod race David Howells
  2005-01-18  2:20 ` Rusty Russell
  2005-01-18 19:44 ` David Howells
@ 2005-01-27 14:02 ` David Howells
  2005-01-27 14:08 ` [PATCH] Fix kallsyms/insmod/rmmod race [try #2] David Howells
  3 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2005-01-27 14:02 UTC (permalink / raw)
  Cc: Rusty Russell, Andrew Morton, linuxppc64-dev, Linus Torvalds,
	lkml - Kernel Mailing List

David Howells <dhowells@redhat.com> wrote:

> Rusty Russell <rusty@rustcorp.com.au> wrote:
> 
> > 	The more I looked at this, the more I warmed to it.  I've known for a
> > while that people are using kallsyms not for OOPS (eg. /proc/$$/wchan),
> > so we should provide a "grabs locks" version, but this solution gets
> > around that nicely, while making life more certain for the oops case,
> > too.
> 
> Hmmm... though it works on i386 SMP, it doesn't, however, seem to work on
> ppc64 SMP:-/
> 
> My pSeries box seems to think that it can't find any symbols from previously
> loaded modules, and my Power5 box is quite happy to load modules that depend
> on other modules but panics because it can't mount its root fs.

Turns out that the patch works. Userspace was being bad. The stripped down
shell running as init (pid #1) wasn't taking into account that it would get
notification of kernel threads exiting when it called wait(), and so ended up
trying to load several modules at once, some of which required dependency
modules loading first.

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Fix kallsyms/insmod/rmmod race [try #2]
  2005-01-17 16:27 [PATCH] Fix kallsyms/insmod/rmmod race David Howells
                   ` (2 preceding siblings ...)
  2005-01-27 14:02 ` David Howells
@ 2005-01-27 14:08 ` David Howells
  2005-01-28  0:42   ` Rusty Russell
  3 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2005-01-27 14:08 UTC (permalink / raw)
  To: rusty, akpm, torvalds, linuxppc64-dev, linux-kernel


The attached patch fixes a race between kallsyms and insmod/rmmod.

The problem is this:

 (1) The various kallsyms functions poke around in the module list without any
     locking so that they can be called from the oops handler.

 (2) Although insmod and rmmod use locks to exclude each other, these have no
     effect on the kallsyms function.

 (3) Although rmmod modifies the module state with the machine "stopped", it
     hasn't removed the metadata from the module metadata list, meaning that
     as soon as the machine is "restarted", the metadata can be observed by
     kallsyms.

     It's not possible to say that an item in that list should be ignored if
     it's state is marked as inactive - you can't get at the state information
     because you can't trust the metadata in which it is embedded.

     Furthermore, list linkage information is embedded in the metadata too, so
     you can't trust that either...

 (4) kallsyms may be walking the module list without a lock whilst either
     insmod or rmmod are busy changing it. insmod probably isn't a problem
     since nothing is going a way, but rmmod is as it's deleting an entry.

 (5) Therefore nothing that uses these functions can in any way trust any
     pointers to "static" data (such as module symbol names or module names)
     that are returned.

 (6) On ppc64 the problems are exacerbated since the hypervisor may reschedule
     bits of the kernel, making operations that appear adjacent occur a long
     time apart.

This patch fixes the race by only linking/unlinking modules into/from the
master module list with the machine in the "stopped" state. This means that
any "static" information can be trusted as far as the next kernel reschedule
on any given CPU without the need to hold any locks.

However, I'm not sure how this is affected by preemption. I suspect more work
may need to be done in that case, but I'm not entirely sure.

This also means that rmmod has to bump the machine into the stopped state
twice... but since that shouldn't be a common operation, I don't think that's
a problem.

I've amended this patch to not get spinlocks whilst in the machine locked
state - there's no point as nothing else can be holding spinlocks.

Signed-Off-By: David Howells <dhowells@redhat.com>
---
warthog>diffstat kallsyms-race-2611rc1.diff
 kallsyms.c |   16 ++++++++++++++--
 module.c   |   31 ++++++++++++++++++++++++-------
 2 files changed, 38 insertions(+), 9 deletions(-)

diff -uNrp linux-2.6.11-rc1/kernel/kallsyms.c linux-2.6.11-rc1-kallsyms/kernel/kallsyms.c
--- linux-2.6.11-rc1/kernel/kallsyms.c	2005-01-12 19:09:18.000000000 +0000
+++ linux-2.6.11-rc1-kallsyms/kernel/kallsyms.c	2005-01-17 15:33:55.000000000 +0000
@@ -139,13 +139,20 @@ unsigned long kallsyms_lookup_name(const
 	return module_kallsyms_lookup_name(name);
 }
 
-/* Lookup an address.  modname is set to NULL if it's in the kernel. */
+/*
+ * Lookup an address
+ * - modname is set to NULL if it's in the kernel
+ * - we guarantee that the returned name is valid until we reschedule even if
+ *   it resides in a module
+ * - we also guarantee that modname will be valid until rescheduled
+ */
 const char *kallsyms_lookup(unsigned long addr,
 			    unsigned long *symbolsize,
 			    unsigned long *offset,
 			    char **modname, char *namebuf)
 {
 	unsigned long i, low, high, mid;
+	const char *msym;
 
 	/* This kernel should never had been booted. */
 	BUG_ON(!kallsyms_addresses);
@@ -196,7 +203,12 @@ const char *kallsyms_lookup(unsigned lon
 		return namebuf;
 	}
 
-	return module_address_lookup(addr, symbolsize, offset, modname);
+	/* see if it's in a module */
+	msym = module_address_lookup(addr, symbolsize, offset, modname);
+	if (msym)
+		return strncpy(namebuf, msym, KSYM_NAME_LEN);
+
+	return NULL;
 }
 
 /* Replace "%s" in format with address, or returns -errno. */
diff -uNrp linux-2.6.11-rc1/kernel/module.c linux-2.6.11-rc1-kallsyms/kernel/module.c
--- linux-2.6.11-rc1/kernel/module.c	2005-01-12 19:09:18.000000000 +0000
+++ linux-2.6.11-rc1-kallsyms/kernel/module.c	2005-01-27 14:06:22.857054758 +0000
@@ -1072,14 +1072,22 @@ static void mod_kobject_remove(struct mo
 	kobject_unregister(&mod->mkobj.kobj);
 }
 
+/*
+ * unlink the module with the whole machine is stopped with interrupts off
+ * - this defends against kallsyms not taking locks
+ */
+static inline int __unlink_module(void *_mod)
+{
+	struct module *mod = _mod;
+	list_del(&mod->list);
+	return 0;
+}
+
 /* Free a module, remove from lists, etc (must hold module mutex). */
 static void free_module(struct module *mod)
 {
 	/* Delete from various lists */
-	spin_lock_irq(&modlist_lock);
-	list_del(&mod->list);
-	spin_unlock_irq(&modlist_lock);
-
+	stop_machine_run(__unlink_module, mod, NR_CPUS);
 	remove_sect_attrs(mod);
 	mod_kobject_remove(mod);
 
@@ -1732,6 +1740,17 @@ static struct module *load_module(void _
 	goto free_hdr;
 }
 
+/*
+ * link the module with the whole machine is stopped with interrupts off
+ * - this defends against kallsyms not taking locks
+ */
+static inline int __link_module(void *_mod)
+{
+	struct module *mod = _mod;
+	list_add(&mod->list, &modules);
+	return 0;
+}
+
 /* This is where the real work happens */
 asmlinkage long
 sys_init_module(void __user *umod,
@@ -1766,9 +1785,7 @@ sys_init_module(void __user *umod,
 
 	/* Now sew it into the lists.  They won't access us, since
            strong_try_module_get() will fail. */
-	spin_lock_irq(&modlist_lock);
-	list_add(&mod->list, &modules);
-	spin_unlock_irq(&modlist_lock);
+	stop_machine_run(__link_module, mod, NR_CPUS);
 
 	/* Drop lock so they can recurse */
 	up(&module_mutex);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix kallsyms/insmod/rmmod race [try #2]
  2005-01-27 14:08 ` [PATCH] Fix kallsyms/insmod/rmmod race [try #2] David Howells
@ 2005-01-28  0:42   ` Rusty Russell
  0 siblings, 0 replies; 6+ messages in thread
From: Rusty Russell @ 2005-01-28  0:42 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, linuxppc64-dev,
	lkml - Kernel Mailing List

On Thu, 2005-01-27 at 14:08 +0000, David Howells wrote:
> Signed-Off-By: David Howells <dhowells@redhat.com>

Excellent.  Thanks David!

Rusty.
-- 

A bad analogy is like a leaky screwdriver -- Richard Braakman


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-01-28  1:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-17 16:27 [PATCH] Fix kallsyms/insmod/rmmod race David Howells
2005-01-18  2:20 ` Rusty Russell
2005-01-18 19:44 ` David Howells
2005-01-27 14:02 ` David Howells
2005-01-27 14:08 ` [PATCH] Fix kallsyms/insmod/rmmod race [try #2] David Howells
2005-01-28  0:42   ` Rusty Russell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).