All of lore.kernel.org
 help / color / mirror / Atom feed
* [Regression] Crash in load_module() while freeing args
@ 2010-05-25 21:00 Rafael J. Wysocki
  2010-05-25 22:54 ` Rafael J. Wysocki
  0 siblings, 1 reply; 71+ messages in thread
From: Rafael J. Wysocki @ 2010-05-25 21:00 UTC (permalink / raw)
  To: LKML; +Cc: Linus Torvalds, Andrew Morton, Rusty Russell

Hi,

Recent -git kernels crash for me quite often during boot, reporting that kernel
paging request could not be satisfied at load_module+0x18e4, which
corresponds to line 2483 of kernel/module.c for the kernel in question (however
the same call trace says IP is at precpu_modfree+0x1 at the moment of the
crash, which seems to point to line 2481 of the same file).  It doesn't happen
every time, but it does happen often enough to be annoying and the call
trace is always exactly the same.

I'm going to revert post-2.6.34 commits that touch kernel/module.c directly
and see what happens.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-25 21:00 [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
@ 2010-05-25 22:54 ` Rafael J. Wysocki
  2010-05-25 23:47   ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rafael J. Wysocki @ 2010-05-25 22:54 UTC (permalink / raw)
  To: Rusty Russell; +Cc: LKML, Linus Torvalds, Andrew Morton, Brandon Philips

On Tuesday 25 May 2010, Rafael J. Wysocki wrote:
> Hi,
> 
> Recent -git kernels crash for me quite often during boot, reporting that kernel
> paging request could not be satisfied at load_module+0x18e4, which
> corresponds to line 2483 of kernel/module.c for the kernel in question (however
> the same call trace says IP is at precpu_modfree+0x1 at the moment of the
> crash, which seems to point to line 2481 of the same file).  It doesn't happen
> every time, but it does happen often enough to be annoying and the call
> trace is always exactly the same.
> 
> I'm going to revert post-2.6.34 commits that touch kernel/module.c directly
> and see what happens.

I'm not able to reproduce the issue with the following commit reverted:

commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Wed May 19 17:33:39 2010 -0600

    module: drop the lock while waiting for module to complete initialization.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-25 22:54 ` Rafael J. Wysocki
@ 2010-05-25 23:47   ` Linus Torvalds
  2010-05-26  8:00     ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-05-25 23:47 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Rusty Russell, LKML, Andrew Morton, Brandon Philips



On Wed, 26 May 2010, Rafael J. Wysocki wrote:
>
> I'm not able to reproduce the issue with the following commit reverted:
> 
> commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> Author: Rusty Russell <rusty@rustcorp.com.au>
> Date:   Wed May 19 17:33:39 2010 -0600
> 
>     module: drop the lock while waiting for module to complete initialization.

Hmm. That does seem to be buggy. We can't just drop and re-take the lock: 
that may make sense _internally_ as far as resolve_symbol() itself is 
concerned, but the caller will its own local variables, and some of those 
will no longer be valid if the lock was dropped. 

That commit also changes the return value semantics of "use_module()", 
which is an exported interface where the only users seem to be 
out-of-kernel (the only in-kernel use is in kernel/module.c itself). That 
seems like a really really bad idea too.

So I think reverting it is definitely the right thing to do. The commit 
seems fundamentally broken. And having modules do request_module() in 
their init functions has always been invalid anyway, so that excuse 
doesn't really seem to be a reason to do anything crazy like this either.

Rewriting the logic to
 - not drop the lock
 - not change the return semantics of an exported interface
 - just make 'resolve_symbol()' fail if the module isn't fully loaded
would seem to be a more reasonable approach, no?

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-25 23:47   ` Linus Torvalds
@ 2010-05-26  8:00     ` Rusty Russell
  2010-05-26 11:57       ` Rusty Russell
  2010-05-26 15:41       ` [Regression] Crash in load_module() while freeing args Linus Torvalds
  0 siblings, 2 replies; 71+ messages in thread
From: Rusty Russell @ 2010-05-26  8:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, LKML, Andrew Morton, Brandon Philips, Jon Masters

On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
> 
> On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> >
> > I'm not able to reproduce the issue with the following commit reverted:
> > 
> > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > Author: Rusty Russell <rusty@rustcorp.com.au>
> > Date:   Wed May 19 17:33:39 2010 -0600
> > 
> >     module: drop the lock while waiting for module to complete initialization.
> 
> Hmm. That does seem to be buggy. We can't just drop and re-take the lock: 
> that may make sense _internally_ as far as resolve_symbol() itself is 
> concerned, but the caller will its own local variables, and some of those 
> will no longer be valid if the lock was dropped. 

Well, yes, obviously I missed something :(  I'll look at it tonight after
Arabella is asleep.

> That commit also changes the return value semantics of "use_module()", 
> which is an exported interface where the only users seem to be 
> out-of-kernel (the only in-kernel use is in kernel/module.c itself). That 
> seems like a really really bad idea too.

The kprobes guys: they were cc'd about the change.

> So I think reverting it is definitely the right thing to do. The commit 
> seems fundamentally broken.

> And having modules do request_module() in 
> their init functions has always been invalid anyway, so that excuse 
> doesn't really seem to be a reason to do anything crazy like this either.

I'd have to look back through the pre-git history, but we've dropped the
lock around the initfn for a long time now because people wanted to do odd
things (ISTR it sucked when modules oopsed on load, too).

So then we have the problem that crc32 is finished its init and needs the
lock back, and bnx2x which needs crc32 is waiting for it.  We could just
fail bnx2x; and in fact, we did prior to this patch (we timeout) and it breaks
network on booting on some box according to Brandon:

http://www.mail-archive.com/linux-crypto@vger.kernel.org/msg04331.html

This *used* not to be a problem, because userspace placed locks on
modules and so it would never try to load bnx2x until crc32 was loaded.
ISTR a mention that Jon removed that...

> Rewriting the logic to
>  - not drop the lock
>  - not change the return semantics of an exported interface
>  - just make 'resolve_symbol()' fail if the module isn't fully loaded
> would seem to be a more reasonable approach, no?

Sure, then userspace needs to change :(

Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-26  8:00     ` Rusty Russell
@ 2010-05-26 11:57       ` Rusty Russell
  2010-05-26 22:56         ` Rafael J. Wysocki
  2010-05-26 15:41       ` [Regression] Crash in load_module() while freeing args Linus Torvalds
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-05-26 11:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Wed, 26 May 2010 05:30:58 pm Rusty Russell wrote:
> On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
> > 
> > On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> > >
> > > I'm not able to reproduce the issue with the following commit reverted:
> > > 
> > > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > > Author: Rusty Russell <rusty@rustcorp.com.au>
> > > Date:   Wed May 19 17:33:39 2010 -0600
> > > 
> > >     module: drop the lock while waiting for module to complete initialization.
> > 
> > Hmm. That does seem to be buggy. We can't just drop and re-take the lock: 
> > that may make sense _internally_ as far as resolve_symbol() itself is 
> > concerned, but the caller will its own local variables, and some of those 
> > will no longer be valid if the lock was dropped. 
> 
> Well, yes, obviously I missed something :(  I'll look at it tonight after
> Arabella is asleep.

See if you can spot it (I acked the patch, so I can't point fingers):

 free_core:
	module_free(mod, mod->module_core);
	/* mod will be freed with core. Don't access it beyond this line! */
 free_percpu:
	percpu_modfree(mod);

Only a year after Masami fixed that and added the comment, too :(

I suspect that the increased parallelism enabled by this patch uncovered this
bug.  Does this fix it?

(Side note: the locking should be simplified.  No code before simplify_symbols
actually needs the lock, so we should grab it just for that, then again at the
end.  We use kobjects to protect us from multiple loads as a side-effect, but
we should move that registration to the end).

Subject: module: fix reference to mod->percpu after freeing module.

The comment about the mod being freed is self-explanatory, but neither
Tejun nor I read it.  This bug was introduced in 259354deaa, after it
had previously been fixed in 6e2b75740b.  How embarrassing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2031,6 +2031,7 @@ static noinline struct module *load_modu
 	long err = 0;
 	void *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long symoffs, stroffs, *strmap;
+	void __percpu *percpu;
 
 	mm_segment_t old_fs;
 
@@ -2175,6 +2176,8 @@ static noinline struct module *load_modu
 			goto free_mod;
 		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
 	}
+	/* Keep this around for failure path. */
+	percpu = mod_percpu(mod);
 
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
@@ -2480,7 +2483,7 @@ static noinline struct module *load_modu
 	module_free(mod, mod->module_core);
 	/* mod will be freed with core. Don't access it beyond this line! */
  free_percpu:
-	percpu_modfree(mod);
+	free_percpu(percpu);
  free_mod:
 	kfree(args);
 	kfree(strmap);

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-26  8:00     ` Rusty Russell
  2010-05-26 11:57       ` Rusty Russell
@ 2010-05-26 15:41       ` Linus Torvalds
  1 sibling, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-26 15:41 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Rafael J. Wysocki, LKML, Andrew Morton, Brandon Philips, Jon Masters



On Wed, 26 May 2010, Rusty Russell wrote:
> 
> So then we have the problem that crc32 is finished its init and needs the
> lock back, and bnx2x which needs crc32 is waiting for it.  We could just
> fail bnx2x; and in fact, we did prior to this patch (we timeout) and it breaks
> network on booting on some box according to Brandon:

So the sane thing to do seems to simply not get that mutex_lock after a 
successful init. Which probably implies having a new lock for the actual 
low-level symbol table stuff.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-26 11:57       ` Rusty Russell
@ 2010-05-26 22:56         ` Rafael J. Wysocki
  2010-05-26 23:07           ` Linus Torvalds
  2010-05-27  5:26           ` Rusty Russell
  0 siblings, 2 replies; 71+ messages in thread
From: Rafael J. Wysocki @ 2010-05-26 22:56 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Wednesday 26 May 2010, Rusty Russell wrote:
> On Wed, 26 May 2010 05:30:58 pm Rusty Russell wrote:
> > On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
> > > 
> > > On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> > > >
> > > > I'm not able to reproduce the issue with the following commit reverted:
> > > > 
> > > > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > > > Author: Rusty Russell <rusty@rustcorp.com.au>
> > > > Date:   Wed May 19 17:33:39 2010 -0600
> > > > 
> > > >     module: drop the lock while waiting for module to complete initialization.
> > > 
> > > Hmm. That does seem to be buggy. We can't just drop and re-take the lock: 
> > > that may make sense _internally_ as far as resolve_symbol() itself is 
> > > concerned, but the caller will its own local variables, and some of those 
> > > will no longer be valid if the lock was dropped. 
> > 
> > Well, yes, obviously I missed something :(  I'll look at it tonight after
> > Arabella is asleep.
> 
> See if you can spot it (I acked the patch, so I can't point fingers):
> 
>  free_core:
> 	module_free(mod, mod->module_core);
> 	/* mod will be freed with core. Don't access it beyond this line! */
>  free_percpu:
> 	percpu_modfree(mod);
> 
> Only a year after Masami fixed that and added the comment, too :(
> 
> I suspect that the increased parallelism enabled by this patch uncovered this
> bug.  Does this fix it?

Since the commit has been reverted, do you still want me to test this patch?
Quite frankly I'd prefer to test a complete replacement for that commit on top
of current -git.

Rafael


> (Side note: the locking should be simplified.  No code before simplify_symbols
> actually needs the lock, so we should grab it just for that, then again at the
> end.  We use kobjects to protect us from multiple loads as a side-effect, but
> we should move that registration to the end).
> 
> Subject: module: fix reference to mod->percpu after freeing module.
> 
> The comment about the mod being freed is self-explanatory, but neither
> Tejun nor I read it.  This bug was introduced in 259354deaa, after it
> had previously been fixed in 6e2b75740b.  How embarrassing.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Masami Hiramatsu <mhiramat@redhat.com>
> 
> diff --git a/kernel/module.c b/kernel/module.c
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2031,6 +2031,7 @@ static noinline struct module *load_modu
>  	long err = 0;
>  	void *ptr = NULL; /* Stops spurious gcc warning */
>  	unsigned long symoffs, stroffs, *strmap;
> +	void __percpu *percpu;
>  
>  	mm_segment_t old_fs;
>  
> @@ -2175,6 +2176,8 @@ static noinline struct module *load_modu
>  			goto free_mod;
>  		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
>  	}
> +	/* Keep this around for failure path. */
> +	percpu = mod_percpu(mod);
>  
>  	/* Determine total sizes, and put offsets in sh_entsize.  For now
>  	   this is done generically; there doesn't appear to be any
> @@ -2480,7 +2483,7 @@ static noinline struct module *load_modu
>  	module_free(mod, mod->module_core);
>  	/* mod will be freed with core. Don't access it beyond this line! */
>   free_percpu:
> -	percpu_modfree(mod);
> +	free_percpu(percpu);
>   free_mod:
>  	kfree(args);
>  	kfree(strmap);
> 
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-26 22:56         ` Rafael J. Wysocki
@ 2010-05-26 23:07           ` Linus Torvalds
  2010-05-27  5:26           ` Rusty Russell
  1 sibling, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-26 23:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rusty Russell, LKML, Andrew Morton, Brandon Philips, Jon Masters,
	Tejun Heo, Masami Hiramatsu



On Thu, 27 May 2010, Rafael J. Wysocki wrote:
> 
> Since the commit has been reverted, do you still want me to test this patch?
> Quite frankly I'd prefer to test a complete replacement for that commit on top
> of current -git.

I'm not re-applying it with the pointless semantic changes that are 
visible to modules. It doesn't matter if they were informed, if it means 
that they'll then just have some odd version dependency and add crazy 
"#if LINUX_VERSION" tests that aren't even exact.

I also wonder exactly what that module_mutex() actually ends up 
protecting. 99% of load_module() seems to be stuff that is purely about 
local issues. Maybe we could tighten the actual lock section to just the
parts that actually expose the vmalloc'ed area to others?

It's generally pointless releasing a lock in the middle - that just makes 
the lock even less valid. If it's valid to just release the lock (without 
some retry logic starting everything from the beginning), it likely the 
lock shouldn't have been held there in the first place.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-26 22:56         ` Rafael J. Wysocki
  2010-05-26 23:07           ` Linus Torvalds
@ 2010-05-27  5:26           ` Rusty Russell
  2010-05-27 18:46             ` Brandon Philips
  2010-05-27 21:57             ` [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
  1 sibling, 2 replies; 71+ messages in thread
From: Rusty Russell @ 2010-05-27  5:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> On Wednesday 26 May 2010, Rusty Russell wrote:
> > I suspect that the increased parallelism enabled by this patch uncovered this
> > bug.  Does this fix it?
> 
> Since the commit has been reverted, do you still want me to test this patch?
> Quite frankly I'd prefer to test a complete replacement for that commit on top
> of current -git.

OK, combo meal deal below, against Linus' latest.  I'd really appreciate
a report, since AFAIK you're the only one hitting it, and only when that
other (now reverted) patch was applied.

As an side to Brandon: I can see how my patch fixed an explicit request_module
inside module_init (that's how I tested it).  I can't see how we have a
problem with an implicit dependency such as bne2x->crc32.  Modules go into
the live state without retaking the lock.  If you can still reproduce that
now Linus has reverted, I'm afraid we need to dig deeper...

Thanks,
Rusty.

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -563,33 +563,26 @@ int use_module(struct module *a, struct 
 	struct module_use *use;
 	int no_warn, err;
 
-	if (b == NULL || already_uses(a, b)) return 1;
+	if (b == NULL || already_uses(a, b))
+		return 0;
 
 	/* If we're interrupted or time out, we fail. */
-	if (wait_event_interruptible_timeout(
-		    module_wq, (err = strong_try_module_get(b)) != -EBUSY,
-		    30 * HZ) <= 0) {
-		printk("%s: gave up waiting for init of module %s.\n",
-		       a->name, b->name);
-		return 0;
-	}
-
-	/* If strong_try_module_get() returned a different error, we fail. */
+	err = strong_try_module_get(b);
 	if (err)
-		return 0;
+		return err;
 
 	DEBUGP("Allocating new usage for %s.\n", a->name);
 	use = kmalloc(sizeof(*use), GFP_ATOMIC);
 	if (!use) {
 		printk("%s: out of memory loading\n", a->name);
 		module_put(b);
-		return 0;
+		return -ENOMEM;
 	}
 
 	use->module_which_uses = a;
 	list_add(&use->list, &b->modules_which_use_me);
 	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
-	return 1;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(use_module);
 
@@ -882,7 +875,7 @@ static inline void module_unload_free(st
 
 int use_module(struct module *a, struct module *b)
 {
-	return strong_try_module_get(b) == 0;
+	return strong_try_module_get(b);
 }
 EXPORT_SYMBOL_GPL(use_module);
 
@@ -1053,17 +1046,39 @@ static const struct kernel_symbol *resol
 	struct module *owner;
 	const struct kernel_symbol *sym;
 	const unsigned long *crc;
-
+	DEFINE_WAIT(wait);
+	int err;
+	long timeleft = 30 * HZ;
+
+again:
 	sym = find_symbol(name, &owner, &crc,
 			  !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
-	/* use_module can fail due to OOM,
-	   or module initialization or unloading */
-	if (sym) {
-		if (!check_version(sechdrs, versindex, name, mod, crc, owner)
-		    || !use_module(mod, owner))
-			sym = NULL;
+	if (!sym)
+		return NULL;
+
+	if (!check_version(sechdrs, versindex, name, mod, crc, owner))
+		return NULL;
+
+	prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
+	err = use_module(mod, owner);
+	if (likely(!err) || err != -EBUSY || signal_pending(current)) {
+		finish_wait(&module_wq, &wait);
+		return err ? NULL : sym;
 	}
-	return sym;
+
+	/* Module is still loading.  Drop lock and wait. */
+	mutex_unlock(&module_mutex);
+	timeleft = schedule_timeout(timeleft);
+	mutex_lock(&module_mutex);
+	finish_wait(&module_wq, &wait);
+
+	/* Module might be gone entirely, or replaced.  Re-lookup. */
+	if (timeleft)
+		goto again;
+
+	printk(KERN_WARNING "%s: gave up waiting for init of module %s.\n",
+	       mod->name, owner->name);
+	return NULL;
 }
 
 /*
@@ -2014,6 +2029,7 @@ static noinline struct module *load_modu
 	long err = 0;
 	void *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long symoffs, stroffs, *strmap;
+	void __percpu *percpu;
 
 	mm_segment_t old_fs;
 
@@ -2158,6 +2174,8 @@ static noinline struct module *load_modu
 			goto free_mod;
 		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
 	}
+	/* Keep this around for failure path. */
+	percpu = mod_percpu(mod);
 
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
@@ -2463,7 +2481,7 @@ static noinline struct module *load_modu
 	module_free(mod, mod->module_core);
 	/* mod will be freed with core. Don't access it beyond this line! */
  free_percpu:
-	percpu_modfree(mod);
+	free_percpu(percpu);
  free_mod:
 	kfree(args);
 	kfree(strmap);



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-27  5:26           ` Rusty Russell
@ 2010-05-27 18:46             ` Brandon Philips
  2010-05-31  9:40               ` Rusty Russell
  2010-05-27 21:57             ` [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
  1 sibling, 1 reply; 71+ messages in thread
From: Brandon Philips @ 2010-05-27 18:46 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Andrew Morton,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On 14:56 Thu 27 May 2010, Rusty Russell wrote:
> On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > bug.  Does this fix it?
> > 
> > Since the commit has been reverted, do you still want me to test this patch?
> > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > of current -git.
> 
> OK, combo meal deal below, against Linus' latest.  I'd really appreciate
> a report, since AFAIK you're the only one hitting it, and only when that
> other (now reverted) patch was applied.

I tested this patch on my machine on top of Linus's latest and it
fixes the issue. Without the patch and using Linus's latest I
reproduce the original issue:

[   60.836022] bnx2: gave up waiting for init of module libcrc32c.
[   60.847997] bnx2: Unknown symbol crc32c

Note: Again, since I don't have bnx2x hardware I forced bnx2.ko to
depend on libcrc32c as bnx2x does:
  http://ifup.org/~philips/review/bnx2-hack-to-use-libcrc32c.patch

> As an side to Brandon: I can see how my patch fixed an explicit
> request_module inside module_init (that's how I tested it).  I can't
> see how we have a problem with an implicit dependency such as
> bne2x->crc32.  Modules go into the live state without retaking the
> lock.

libcrc32c is doing an explicit request_module inside of its
module_init. Follow the call chain in libcrc32c_mod_init()

libcrc32c_mod_init
 crypto_alloc_shash
  crypto_alloc_tfm
   crypto_find_alg
    crypto_alg_mod_lookup
     crypto_larval_lookup
      request_module

Cheers,

	Brandon

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-27  5:26           ` Rusty Russell
  2010-05-27 18:46             ` Brandon Philips
@ 2010-05-27 21:57             ` Rafael J. Wysocki
  2010-05-31  7:54               ` Rusty Russell
  2010-05-31 10:23               ` [PATCH] module: fix reference to mod->percpu after freeing module Rusty Russell
  1 sibling, 2 replies; 71+ messages in thread
From: Rafael J. Wysocki @ 2010-05-27 21:57 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Thursday 27 May 2010, Rusty Russell wrote:
> On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > bug.  Does this fix it?
> > 
> > Since the commit has been reverted, do you still want me to test this patch?
> > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > of current -git.
> 
> OK, combo meal deal below, against Linus' latest.  I'd really appreciate
> a report, since AFAIK you're the only one hitting it, and only when that
> other (now reverted) patch was applied.

I cannot reproduce the crash with the patch below.

Thanks,
Rafael


> As an side to Brandon: I can see how my patch fixed an explicit request_module
> inside module_init (that's how I tested it).  I can't see how we have a
> problem with an implicit dependency such as bne2x->crc32.  Modules go into
> the live state without retaking the lock.  If you can still reproduce that
> now Linus has reverted, I'm afraid we need to dig deeper...
> 
> Thanks,
> Rusty.
> 
> diff --git a/kernel/module.c b/kernel/module.c
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -563,33 +563,26 @@ int use_module(struct module *a, struct 
>  	struct module_use *use;
>  	int no_warn, err;
>  
> -	if (b == NULL || already_uses(a, b)) return 1;
> +	if (b == NULL || already_uses(a, b))
> +		return 0;
>  
>  	/* If we're interrupted or time out, we fail. */
> -	if (wait_event_interruptible_timeout(
> -		    module_wq, (err = strong_try_module_get(b)) != -EBUSY,
> -		    30 * HZ) <= 0) {
> -		printk("%s: gave up waiting for init of module %s.\n",
> -		       a->name, b->name);
> -		return 0;
> -	}
> -
> -	/* If strong_try_module_get() returned a different error, we fail. */
> +	err = strong_try_module_get(b);
>  	if (err)
> -		return 0;
> +		return err;
>  
>  	DEBUGP("Allocating new usage for %s.\n", a->name);
>  	use = kmalloc(sizeof(*use), GFP_ATOMIC);
>  	if (!use) {
>  		printk("%s: out of memory loading\n", a->name);
>  		module_put(b);
> -		return 0;
> +		return -ENOMEM;
>  	}
>  
>  	use->module_which_uses = a;
>  	list_add(&use->list, &b->modules_which_use_me);
>  	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
> -	return 1;
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(use_module);
>  
> @@ -882,7 +875,7 @@ static inline void module_unload_free(st
>  
>  int use_module(struct module *a, struct module *b)
>  {
> -	return strong_try_module_get(b) == 0;
> +	return strong_try_module_get(b);
>  }
>  EXPORT_SYMBOL_GPL(use_module);
>  
> @@ -1053,17 +1046,39 @@ static const struct kernel_symbol *resol
>  	struct module *owner;
>  	const struct kernel_symbol *sym;
>  	const unsigned long *crc;
> -
> +	DEFINE_WAIT(wait);
> +	int err;
> +	long timeleft = 30 * HZ;
> +
> +again:
>  	sym = find_symbol(name, &owner, &crc,
>  			  !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
> -	/* use_module can fail due to OOM,
> -	   or module initialization or unloading */
> -	if (sym) {
> -		if (!check_version(sechdrs, versindex, name, mod, crc, owner)
> -		    || !use_module(mod, owner))
> -			sym = NULL;
> +	if (!sym)
> +		return NULL;
> +
> +	if (!check_version(sechdrs, versindex, name, mod, crc, owner))
> +		return NULL;
> +
> +	prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
> +	err = use_module(mod, owner);
> +	if (likely(!err) || err != -EBUSY || signal_pending(current)) {
> +		finish_wait(&module_wq, &wait);
> +		return err ? NULL : sym;
>  	}
> -	return sym;
> +
> +	/* Module is still loading.  Drop lock and wait. */
> +	mutex_unlock(&module_mutex);
> +	timeleft = schedule_timeout(timeleft);
> +	mutex_lock(&module_mutex);
> +	finish_wait(&module_wq, &wait);
> +
> +	/* Module might be gone entirely, or replaced.  Re-lookup. */
> +	if (timeleft)
> +		goto again;
> +
> +	printk(KERN_WARNING "%s: gave up waiting for init of module %s.\n",
> +	       mod->name, owner->name);
> +	return NULL;
>  }
>  
>  /*
> @@ -2014,6 +2029,7 @@ static noinline struct module *load_modu
>  	long err = 0;
>  	void *ptr = NULL; /* Stops spurious gcc warning */
>  	unsigned long symoffs, stroffs, *strmap;
> +	void __percpu *percpu;
>  
>  	mm_segment_t old_fs;
>  
> @@ -2158,6 +2174,8 @@ static noinline struct module *load_modu
>  			goto free_mod;
>  		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
>  	}
> +	/* Keep this around for failure path. */
> +	percpu = mod_percpu(mod);
>  
>  	/* Determine total sizes, and put offsets in sh_entsize.  For now
>  	   this is done generically; there doesn't appear to be any
> @@ -2463,7 +2481,7 @@ static noinline struct module *load_modu
>  	module_free(mod, mod->module_core);
>  	/* mod will be freed with core. Don't access it beyond this line! */
>   free_percpu:
> -	percpu_modfree(mod);
> +	free_percpu(percpu);
>   free_mod:
>  	kfree(args);
>  	kfree(strmap);
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-27 21:57             ` [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
@ 2010-05-31  7:54               ` Rusty Russell
  2010-05-31 10:23               ` [PATCH] module: fix reference to mod->percpu after freeing module Rusty Russell
  1 sibling, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-05-31  7:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Fri, 28 May 2010 07:27:28 am Rafael J. Wysocki wrote:
> On Thursday 27 May 2010, Rusty Russell wrote:
> > On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > > bug.  Does this fix it?
> > > 
> > > Since the commit has been reverted, do you still want me to test this patch?
> > > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > > of current -git.
> > 
> > OK, combo meal deal below, against Linus' latest.  I'd really appreciate
> > a report, since AFAIK you're the only one hitting it, and only when that
> > other (now reverted) patch was applied.
> 
> I cannot reproduce the crash with the patch below.
> 
> Thanks,
> Rafael

Great, thanks for testing!  Though the patch which uncovered this has been
reverted, it could still bite someone.

Linus, I'll send the fix separately, then rework the other fix with clearer
locking and more love for the kprobes guys...

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Regression] Crash in load_module() while freeing args
  2010-05-27 18:46             ` Brandon Philips
@ 2010-05-31  9:40               ` Rusty Russell
  2010-05-31 12:00                 ` [PATCH 0/2] kernel/module.c locking changes Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-05-31  9:40 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Andrew Morton,
	Jon Masters, Tejun Heo, Masami Hiramatsu

On Fri, 28 May 2010 04:16:02 am Brandon Philips wrote:
> On 14:56 Thu 27 May 2010, Rusty Russell wrote:
> > On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > > bug.  Does this fix it?
> > > 
> > > Since the commit has been reverted, do you still want me to test this patch?
> > > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > > of current -git.
> > 
> > OK, combo meal deal below, against Linus' latest.  I'd really appreciate
> > a report, since AFAIK you're the only one hitting it, and only when that
> > other (now reverted) patch was applied.
> 
> I tested this patch on my machine on top of Linus's latest and it
> fixes the issue. Without the patch and using Linus's latest I
> reproduce the original issue:
> 
> [   60.836022] bnx2: gave up waiting for init of module libcrc32c.
> [   60.847997] bnx2: Unknown symbol crc32c
> 
> Note: Again, since I don't have bnx2x hardware I forced bnx2.ko to
> depend on libcrc32c as bnx2x does:
>   http://ifup.org/~philips/review/bnx2-hack-to-use-libcrc32c.patch
> 
> > As an side to Brandon: I can see how my patch fixed an explicit
> > request_module inside module_init (that's how I tested it).  I can't
> > see how we have a problem with an implicit dependency such as
> > bne2x->crc32.  Modules go into the live state without retaking the
> > lock.
> 
> libcrc32c is doing an explicit request_module inside of its
> module_init. Follow the call chain in libcrc32c_mod_init()

Thanks for confirmation, I figured that must be the case as I pondered
it on the weekend after sending my query.

Linus didn't like dropping the lock, so I'll create a more ambitious patch
which reduces the lock coverage to those places which really need it.

Cheers,
Rusty.



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH] module: fix reference to mod->percpu after freeing module.
  2010-05-27 21:57             ` [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
  2010-05-31  7:54               ` Rusty Russell
@ 2010-05-31 10:23               ` Rusty Russell
  2010-05-31 10:25                 ` Tejun Heo
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-05-31 10:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, LKML, Andrew Morton, Brandon Philips,
	Jon Masters, Tejun Heo, Masami Hiramatsu

Rafael sees a sometimes crash at precpu_modfree from kernel/module.c; it
only occurred with another (since-reverted) patch, but that patch simply
changed timing to uncover this bug, it was otherwise unrelated.

The comment about the mod being freed is self-explanatory, but neither
Tejun nor I read it.  This bug was introduced in 259354deaa, after it
had previously been fixed in 6e2b75740b.  How embarrassing.

Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
---
 kernel/module.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2031,6 +2031,7 @@ static noinline struct module *load_modu
 	long err = 0;
 	void *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long symoffs, stroffs, *strmap;
+	void __percpu *percpu;
 
 	mm_segment_t old_fs;
 
@@ -2175,6 +2176,8 @@ static noinline struct module *load_modu
 			goto free_mod;
 		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
 	}
+	/* Keep this around for failure path. */
+	percpu = mod_percpu(mod);
 
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
@@ -2480,7 +2483,7 @@ static noinline struct module *load_modu
 	module_free(mod, mod->module_core);
 	/* mod will be freed with core. Don't access it beyond this line! */
  free_percpu:
-	percpu_modfree(mod);
+	free_percpu(percpu);
  free_mod:
 	kfree(args);
 	kfree(strmap);

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH] module: fix reference to mod->percpu after freeing module.
  2010-05-31 10:23               ` [PATCH] module: fix reference to mod->percpu after freeing module Rusty Russell
@ 2010-05-31 10:25                 ` Tejun Heo
  0 siblings, 0 replies; 71+ messages in thread
From: Tejun Heo @ 2010-05-31 10:25 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, Rafael J. Wysocki, LKML, Andrew Morton,
	Brandon Philips, Jon Masters, Masami Hiramatsu

On 05/31/2010 12:23 PM, Rusty Russell wrote:
> Rafael sees a sometimes crash at precpu_modfree from kernel/module.c; it
> only occurred with another (since-reverted) patch, but that patch simply
> changed timing to uncover this bug, it was otherwise unrelated.
> 
> The comment about the mod being freed is self-explanatory, but neither
> Tejun nor I read it.  This bug was introduced in 259354deaa, after it
> had previously been fixed in 6e2b75740b.  How embarrassing.
> 
> Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Masami Hiramatsu <mhiramat@redhat.com>
> Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>

Embarrassingly-Acked-by: Tejun Heo <tj@kernel.org>

Thanks. :-)

-- 
tejun

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 0/2] kernel/module.c locking changes
  2010-05-31  9:40               ` Rusty Russell
@ 2010-05-31 12:00                 ` Rusty Russell
  2010-05-31 12:01                   ` [PATCH 1/2] module: make locking more fine-grained Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-05-31 12:00 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Andrew Morton,
	Jon Masters, Tejun Heo, Masami Hiramatsu

In response to Linus' almost-rational dislike of the lock-drop-and-regrab
patch, I've dug out and updated the fine-grained locking patch for module.c.
Simple testing here seems to work, and it might have a good effect on boot
times too (its original target).

Pull request for those who find that easier:

The following changes since commit 67a3e12b05e055c0415c556a315a3d3eb637e29e:
  Linus Torvalds (1):
        Linux 2.6.35-rc1

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6.git modules

Rusty Russell (3):
      module: fix reference to mod->percpu after freeing module.
      module: make locking more fine-grained.
      module: fix bne2 "gave up waiting for init of module libcrc32c"

 kernel/module.c |  151 +++++++++++++++++++++++++++++++++++-------------------
 1 files changed, 98 insertions(+), 53 deletions(-)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 1/2] module: make locking more fine-grained.
  2010-05-31 12:00                 ` [PATCH 0/2] kernel/module.c locking changes Rusty Russell
@ 2010-05-31 12:01                   ` Rusty Russell
  2010-05-31 12:02                     ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
  2010-06-01  5:38                     ` [PATCH 1/2] module: make locking more fine-grained Américo Wang
  0 siblings, 2 replies; 71+ messages in thread
From: Rusty Russell @ 2010-05-31 12:01 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Andrew Morton,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

Kay Sievers <kay.sievers@vrfy.org> reports that we still have some
contention over module loading which is slowing boot.

Linus also disliked a previous "drop lock and regrab" patch to fix the
bne2 "gave up waiting for init of module libcrc32c" message.

This is more ambitious in that we only grab the lock where we need it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Brandon Philips <brandon@ifup.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/module.c |   57 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -90,7 +90,8 @@ static DECLARE_WAIT_QUEUE_HEAD(module_wq
 
 static BLOCKING_NOTIFIER_HEAD(module_notify_list);
 
-/* Bounds of module allocation, for speeding __module_address */
+/* Bounds of module allocation, for speeding __module_address.
+ * Protected by module_mutex. */
 static unsigned long module_addr_min = -1UL, module_addr_max = 0;
 
 int register_module_notifier(struct notifier_block * nb)
@@ -329,7 +330,7 @@ static bool find_symbol_in_section(const
 }
 
 /* Find a symbol and return it, along with, (optional) crc and
- * (optional) module which owns it */
+ * (optional) module which owns it.  Needs preempt disabled or module_mutex. */
 const struct kernel_symbol *find_symbol(const char *name,
 					struct module **owner,
 					const unsigned long **crc,
@@ -557,7 +558,7 @@ static int already_uses(struct module *a
 	return 0;
 }
 
-/* Module a uses b */
+/* Module a uses b: caller needs module_mutex() */
 int use_module(struct module *a, struct module *b)
 {
 	struct module_use *use;
@@ -598,6 +599,7 @@ static void module_unload_free(struct mo
 {
 	struct module *i;
 
+	mutex_lock(&module_mutex);
 	list_for_each_entry(i, &modules, list) {
 		struct module_use *use;
 
@@ -613,6 +615,7 @@ static void module_unload_free(struct mo
 			}
 		}
 	}
+	mutex_unlock(&module_mutex);
 }
 
 #ifdef CONFIG_MODULE_FORCE_UNLOAD
@@ -665,6 +668,7 @@ static int try_stop_module(struct module
 		synchronize_sched();
 		return 0;
 	}
+	mutex_unlock(&module_mutex);
 }
 
 unsigned int module_refcount(struct module *mod)
@@ -783,9 +787,13 @@ SYSCALL_DEFINE2(delete_module, const cha
 	/* Store the name of the last unloaded module for diagnostic purposes */
 	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
 	ddebug_remove_module(mod->name);
+
+	/* free_module doesn't want module_mutex held by caller. */
+	mutex_unlock(&module_mutex);
 	free_module(mod);
-
- out:
+	goto out_stop;
+
+out:
 	mutex_unlock(&module_mutex);
 	return ret;
 }
@@ -1001,6 +1009,8 @@ static inline int check_modstruct_versio
 {
 	const unsigned long *crc;
 
+	/* Since this should be found in kernel (which can't be removed),
+	 * no locking is necessary. */
 	if (!find_symbol(MODULE_SYMBOL_PREFIX "module_layout", NULL,
 			 &crc, true, false))
 		BUG();
@@ -1043,8 +1053,7 @@ static inline int same_magic(const char 
 }
 #endif /* CONFIG_MODVERSIONS */
 
-/* Resolve a symbol for this module.  I.e. if we find one, record usage.
-   Must be holding module_mutex. */
+/* Resolve a symbol for this module.  I.e. if we find one, record usage. */
 static const struct kernel_symbol *resolve_symbol(Elf_Shdr *sechdrs,
 						  unsigned int versindex,
 						  const char *name,
@@ -1054,6 +1063,7 @@ static const struct kernel_symbol *resol
 	const struct kernel_symbol *sym;
 	const unsigned long *crc;
 
+	mutex_lock(&module_mutex);
 	sym = find_symbol(name, &owner, &crc,
 			  !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
 	/* use_module can fail due to OOM,
@@ -1063,6 +1073,7 @@ static const struct kernel_symbol *resol
 		    || !use_module(mod, owner))
 			sym = NULL;
 	}
+	mutex_unlock(&module_mutex);
 	return sym;
 }
 
@@ -1436,13 +1447,15 @@ static int __unlink_module(void *_mod)
 	return 0;
 }
 
-/* Free a module, remove from lists, etc (must hold module_mutex). */
+/* Free a module, remove from lists, etc. */
 static void free_module(struct module *mod)
 {
 	trace_module_free(mod);
 
 	/* Delete from various lists */
+	mutex_lock(&module_mutex);
 	stop_machine(__unlink_module, mod, NULL);
+	mutex_unlock(&module_mutex);
 	remove_notes_attrs(mod);
 	remove_sect_attrs(mod);
 	mod_kobject_remove(mod);
@@ -1514,7 +1527,14 @@ static int verify_export_symbols(struct 
 
 	for (i = 0; i < ARRAY_SIZE(arr); i++) {
 		for (s = arr[i].sym; s < arr[i].sym + arr[i].num; s++) {
-			if (find_symbol(s->name, &owner, NULL, true, false)) {
+			const struct kernel_symbol *sym;
+
+			/* Stopping preemption makes find_symbol safe. */
+			preempt_disable();
+			sym = find_symbol(s->name, &owner, NULL, true, false);
+			preempt_enable();
+
+			if (sym) {
 				printk(KERN_ERR
 				       "%s: exports duplicate symbol %s"
 				       " (owned by %s)\n",
@@ -1960,11 +1980,13 @@ static void *module_alloc_update_bounds(
 	void *ret = module_alloc(size);
 
 	if (ret) {
+		mutex_lock(&module_mutex);
 		/* Update module bounds. */
 		if ((unsigned long)ret < module_addr_min)
 			module_addr_min = (unsigned long)ret;
 		if ((unsigned long)ret + size > module_addr_max)
 			module_addr_max = (unsigned long)ret + size;
+		mutex_unlock(&module_mutex);
 	}
 	return ret;
 }
@@ -2426,7 +2448,9 @@ static noinline struct module *load_modu
 	 * function to insert in a way safe to concurrent readers.
 	 * The mutex protects against concurrent writers.
 	 */
+	mutex_lock(&module_mutex);
 	list_add_rcu(&mod->list, &modules);
+	mutex_unlock(&module_mutex);
 
 	err = parse_args(mod->name, mod->args, mod->kp, mod->num_kp, NULL);
 	if (err < 0)
@@ -2447,8 +2471,10 @@ static noinline struct module *load_modu
 	return mod;
 
  unlink:
+	mutex_lock(&module_mutex);
 	/* Unlink carefully: kallsyms could be walking list. */
 	list_del_rcu(&mod->list);
+	mutex_unlock(&module_mutex);
 	synchronize_sched();
 	module_arch_cleanup(mod);
  cleanup:
@@ -2502,19 +2528,10 @@ SYSCALL_DEFINE3(init_module, void __user
 	if (!capable(CAP_SYS_MODULE) || modules_disabled)
 		return -EPERM;
 
-	/* Only one module load at a time, please */
-	if (mutex_lock_interruptible(&module_mutex) != 0)
-		return -EINTR;
-
 	/* Do all the hard work */
 	mod = load_module(umod, len, uargs);
-	if (IS_ERR(mod)) {
-		mutex_unlock(&module_mutex);
+	if (IS_ERR(mod))
 		return PTR_ERR(mod);
-	}
-
-	/* Drop lock so they can recurse */
-	mutex_unlock(&module_mutex);
 
 	blocking_notifier_call_chain(&module_notify_list,
 			MODULE_STATE_COMING, mod);
@@ -2531,9 +2548,7 @@ SYSCALL_DEFINE3(init_module, void __user
 		module_put(mod);
 		blocking_notifier_call_chain(&module_notify_list,
 					     MODULE_STATE_GOING, mod);
-		mutex_lock(&module_mutex);
 		free_module(mod);
-		mutex_unlock(&module_mutex);
 		wake_up(&module_wq);
 		return ret;
 	}

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 12:01                   ` [PATCH 1/2] module: make locking more fine-grained Rusty Russell
@ 2010-05-31 12:02                     ` Rusty Russell
  2010-05-31 16:48                       ` Andrew Morton
  2010-06-01  5:38                     ` [PATCH 1/2] module: make locking more fine-grained Américo Wang
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-05-31 12:02 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Andrew Morton,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

Problem: it's hard to avoid an init routine stumbling over a
request_module these days.  And it's not clear it's always a bad idea:
for example, a module like kvm with dynamic dependencies on kvm-intel
or kvm-amd would be neater if it could simply request_module the right
one.

In this particular case, it's libcrc32c:

	libcrc32c_mod_init
	 crypto_alloc_shash
	  crypto_alloc_tfm
	   crypto_find_alg
	    crypto_alg_mod_lookup
	     crypto_larval_lookup
	      request_module

If another module is waiting for libcrc32c to finish initializing
(ie. bne2 depends on libcrc32c) then it does so holding the module
lock, and our request_module() can't make progress until that is
released.

Waiting without the lock isn't all that hard: we just need to pass the
-EBUSY up the call chain so we can sleep where we don't hold the lock.
Error reporting is a bit trickier: we need to copy the name of the
unfinished module before releasing the lock.

Other notes:
1) This also fixes a theoretical issue where a weak dependency would allow
   symbol version mismatches to be ignored.
2) We rename use_module to ref_module to make life easier for the only
   external user (the out-of-tree ksplice patches).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Brandon Philips <brandon@ifup.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Abbot <tabbott@ksplice.com>
---
 kernel/module.c |   91 ++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 59 insertions(+), 32 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -559,40 +559,33 @@ static int already_uses(struct module *a
 }
 
 /* Module a uses b: caller needs module_mutex() */
-int use_module(struct module *a, struct module *b)
+int ref_module(struct module *a, struct module *b)
 {
 	struct module_use *use;
 	int no_warn, err;
 
-	if (b == NULL || already_uses(a, b)) return 1;
+	if (b == NULL || already_uses(a, b))
+		return 0;
 
 	/* If we're interrupted or time out, we fail. */
-	if (wait_event_interruptible_timeout(
-		    module_wq, (err = strong_try_module_get(b)) != -EBUSY,
-		    30 * HZ) <= 0) {
-		printk("%s: gave up waiting for init of module %s.\n",
-		       a->name, b->name);
-		return 0;
-	}
-
-	/* If strong_try_module_get() returned a different error, we fail. */
+	err = strong_try_module_get(b);
 	if (err)
-		return 0;
+		return err;
 
 	DEBUGP("Allocating new usage for %s.\n", a->name);
 	use = kmalloc(sizeof(*use), GFP_ATOMIC);
 	if (!use) {
 		printk("%s: out of memory loading\n", a->name);
 		module_put(b);
-		return 0;
+		return -ENOMEM;
 	}
 
 	use->module_which_uses = a;
 	list_add(&use->list, &b->modules_which_use_me);
 	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
-	return 1;
+	return 0;
 }
-EXPORT_SYMBOL_GPL(use_module);
+EXPORT_SYMBOL_GPL(ref_module);
 
 /* Clear the unload stuff of the module. */
 static void module_unload_free(struct module *mod)
@@ -888,11 +881,11 @@ static inline void module_unload_free(st
 {
 }
 
-int use_module(struct module *a, struct module *b)
+int ref_module(struct module *a, struct module *b)
 {
-	return strong_try_module_get(b) == 0;
+	return strong_try_module_get(b);
 }
-EXPORT_SYMBOL_GPL(use_module);
+EXPORT_SYMBOL_GPL(ref_module);
 
 static inline void module_unload_init(struct module *mod)
 {
@@ -1057,26 +1050,58 @@ static inline int same_magic(const char 
 static const struct kernel_symbol *resolve_symbol(Elf_Shdr *sechdrs,
 						  unsigned int versindex,
 						  const char *name,
-						  struct module *mod)
+						  struct module *mod,
+						  char ownername[])
 {
 	struct module *owner;
 	const struct kernel_symbol *sym;
 	const unsigned long *crc;
+	int err;
 
 	mutex_lock(&module_mutex);
 	sym = find_symbol(name, &owner, &crc,
 			  !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
-	/* use_module can fail due to OOM,
-	   or module initialization or unloading */
-	if (sym) {
-		if (!check_version(sechdrs, versindex, name, mod, crc, owner)
-		    || !use_module(mod, owner))
-			sym = NULL;
+	if (!sym)
+		goto unlock;
+
+	if (!check_version(sechdrs, versindex, name, mod, crc, owner)) {
+		sym = ERR_PTR(-EINVAL);
+		goto getname;
 	}
+
+	err = ref_module(mod, owner);
+	if (err) {
+		sym = ERR_PTR(err);
+		goto getname;
+	}
+
+getname:
+	/* We must make copy under the lock if we failed to get ref. */
+	strncpy(ownername, module_name(owner), MODULE_NAME_LEN);
+unlock:
 	mutex_unlock(&module_mutex);
 	return sym;
 }
 
+static const struct kernel_symbol *resolve_symbol_wait(Elf_Shdr *sechdrs,
+						       unsigned int versindex,
+						       const char *name,
+						       struct module *mod)
+{
+	const struct kernel_symbol *ksym;
+	char ownername[MODULE_NAME_LEN];
+
+	if (wait_event_interruptible_timeout(module_wq,
+			IS_ERR(ksym = resolve_symbol(sechdrs, versindex, name,
+						     mod, ownername)) &&
+			PTR_ERR(ksym) == -EBUSY,
+					     30 * HZ) <= 0) {
+		printk(KERN_WARNING "%s: gave up waiting for init of module %s.\n",
+		       mod->name, ownername);
+	}
+	return ksym;
+}
+
 /*
  * /sys/module/foo/sections stuff
  * J. Corbet <corbet@lwn.net>
@@ -1578,21 +1603,23 @@ static int simplify_symbols(Elf_Shdr *se
 			break;
 
 		case SHN_UNDEF:
-			ksym = resolve_symbol(sechdrs, versindex,
-					      strtab + sym[i].st_name, mod);
+			ksym = resolve_symbol_wait(sechdrs, versindex,
+						   strtab + sym[i].st_name,
+						   mod);
 			/* Ok if resolved.  */
-			if (ksym) {
+			if (ksym && !IS_ERR(ksym)) {
 				sym[i].st_value = ksym->value;
 				break;
 			}
 
 			/* Ok if weak.  */
-			if (ELF_ST_BIND(sym[i].st_info) == STB_WEAK)
+			if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK)
 				break;
 
-			printk(KERN_WARNING "%s: Unknown symbol %s\n",
-			       mod->name, strtab + sym[i].st_name);
-			ret = -ENOENT;
+			printk(KERN_WARNING "%s: Unknown symbol %s (err %li)\n",
+			       mod->name, strtab + sym[i].st_name,
+			       PTR_ERR(ksym));
+			ret = PTR_ERR(ksym) ?: -ENOENT;
 			break;
 
 		default:

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 12:02                     ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
@ 2010-05-31 16:48                       ` Andrew Morton
  2010-05-31 18:19                         ` Linus Torvalds
  2010-06-01  1:04                         ` Rusty Russell
  0 siblings, 2 replies; 71+ messages in thread
From: Andrew Morton @ 2010-05-31 16:48 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Rafael J. Wysocki, Linus Torvalds, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Mon, 31 May 2010 21:32:27 +0930 Rusty Russell <rusty@rustcorp.com.au> wrote:

> Problem: it's hard to avoid an init routine stumbling over a
> request_module these days.  And it's not clear it's always a bad idea:
> for example, a module like kvm with dynamic dependencies on kvm-intel
> or kvm-amd would be neater if it could simply request_module the right
> one.
> 
> In this particular case, it's libcrc32c:
> 
> 	libcrc32c_mod_init
> 	 crypto_alloc_shash
> 	  crypto_alloc_tfm
> 	   crypto_find_alg
> 	    crypto_alg_mod_lookup
> 	     crypto_larval_lookup
> 	      request_module
> 
> If another module is waiting for libcrc32c to finish initializing
> (ie. bne2 depends on libcrc32c) then it does so holding the module
> lock, and our request_module() can't make progress until that is
> released.
> 
> Waiting without the lock isn't all that hard: we just need to pass the
> -EBUSY up the call chain so we can sleep where we don't hold the lock.
> Error reporting is a bit trickier: we need to copy the name of the
> unfinished module before releasing the lock.

Who's returning -EBUSY?  request_module()?  If so, are you requiring
that all code which might call request_module() be correctly
propagating error codes back?  Please spell this all out?

Because I keep on coming across code which does

	if (foo() < 0)
		return -EWHATEVER;

or
		return -1;

I try to stamp it out, but they have me outnumbered.

I think transgressions are sufficiently rare that the patch will be OK.
Plus we needed to fix transgressors anyway.

After your changes, what would be the observable effects if this code
encountered a return-value-corruptor?


Also, I bet there are drivers which return -EBUSY from their
module_init() functions if the hardware's in an unexpected state.  What
happens?


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 16:48                       ` Andrew Morton
@ 2010-05-31 18:19                         ` Linus Torvalds
  2010-05-31 20:15                           ` Linus Torvalds
  2010-06-01  1:21                           ` Rusty Russell
  2010-06-01  1:04                         ` Rusty Russell
  1 sibling, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-31 18:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rusty Russell, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Mon, 31 May 2010, Andrew Morton wrote:
> 
> Who's returning -EBUSY?  request_module()?  If so, are you requiring
> that all code which might call request_module() be correctly
> propagating error codes back?  Please spell this all out?

The problem roughly as follows:

	task 1				task 2
	------				------

	request_module("crc32c")
	gets module_mutex
	...
	drops module_mutex in otder to run "init"
					request_module("bne2");
					gets module_mutex
					wants to link to crc32:
					use_module("bne2", "crc32c")
					.. 
					strong_try_module_get() returns -EBUSY
					because it's not initialized yet
	calls libcrc32c_mod_init
	 ...
          request_module(optimized crc32c)
	  waits for module_mutex
	  that is held by the bne2 loading

					.. gives up .. BOOM ..
					releases module_mutex
					returns error
	finishes successfully


because the module locking is pure and utter crap. It uses one hug lock 
that it tries to hold for a long time, rather than protecting just the 
parts it needs.

Rusty's fix is to just drop the lock around use_module(), and it seems to 
work. It's may be right for 'use_module()', but totally wrong from a 
conceptual locking standpoint, though - dropping the lock in the middle of 
module loading may well "work", but who the hell knows what it really 
results in?

IOW, it's one of those "this works, but it's very wrong" things. It makes 
the whole module_mutex pretty much a random thing with even less semantics 
than it has now. Right now it has some clear area that it protects - the 
area may be too _big_, but at least it makes some amount of sense. 

The proper fix would appear to be to actually fix locking, which probably 
implies turning most of "module_mutex" into a spinlock that protects just 
the _real_ critical sections. I don't think there are any real blocking 
things except for that "wait for another module to load" case, which is 
obviously exactly where we cannot hold the lock in the first place.

So rather than having one large area that gets protected but then dropping 
the lock in random places, we should probably just have lots of small 
areas that are clearly defined and protected.

And a _lot_ of the module loading doesn't need any locks at all. Much of 
the real work is probably totally private, ie loading the actual module 
and setting things up before we really expose it. We hold that big lock 
over a ridiculously large area right now (basically _all_ of module 
loading except the actual init sequence for a module).

> Also, I bet there are drivers which return -EBUSY from their
> module_init() functions if the hardware's in an unexpected state.  What
> happens?

Nothing. See above: this is a special case, and it's really just about 
strong_try_module_get() returning EBUSY for one special reason.

It's entirely possible that an interim fix (if we can't just fix the 
locking) is to _not_ use "strong_try_module_get()" at all, but instead 
just use "try_module_get()", and then after we've dropped the 
module_mutex, but _before_ we call the "init" function for the module, we 
wait for all the modules that this module depends on.

IOW, we'd link to other modules _before_ they are necessarily initialized 
(their symbol tables will be as initialized as they are going to be), but 
then before we call our own initialization routines we make sure that the 
modules we linked to have finished theirs.

Doesn't that sound like the logical thing to do? And it wouldn't change 
any locking.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 18:19                         ` Linus Torvalds
@ 2010-05-31 20:15                           ` Linus Torvalds
  2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
  2010-06-01  1:57                             ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
  2010-06-01  1:21                           ` Rusty Russell
  1 sibling, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-31 20:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rusty Russell, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Mon, 31 May 2010, Linus Torvalds wrote:
> 
> It's entirely possible that an interim fix (if we can't just fix the 
> locking) is to _not_ use "strong_try_module_get()" at all, but instead 
> just use "try_module_get()", and then after we've dropped the 
> module_mutex, but _before_ we call the "init" function for the module, we 
> wait for all the modules that this module depends on.
> 
> IOW, we'd link to other modules _before_ they are necessarily initialized 
> (their symbol tables will be as initialized as they are going to be), but 
> then before we call our own initialization routines we make sure that the 
> modules we linked to have finished theirs.
> 
> Doesn't that sound like the logical thing to do? And it wouldn't change 
> any locking.

Ok, this is a two-patch series to do exactly that. It's totally untested, 
although I _have_ booted the result, and tested that module loading and 
unloading works. But I haven't tested the race condition, obviously.

It looks sane, and quite frankly, I think it's a much better design than 
what we have now. I also note that the old code was broken for the case of 
not having CONFIG_MODULE_UNLOAD - I didn't fix it, but it should be easy 
to do (basically, we should do the whole module dependency list regardless 
of whether we can unload modules or not - we _do_ need it in order to get 
that whole case of waiting for them to load right).

The first patch doesn't really change anything, it just cleans things up 
and introduces the two-way module usage list that the second patch needs 
in order to wait for modules that we use to initialize.

Comments? Brandon - does this work for you? I may well have missed 
something.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-05-31 20:15                           ` Linus Torvalds
@ 2010-05-31 20:16                             ` Linus Torvalds
  2010-05-31 20:17                               ` [PATCH 2/2] module: wait for other modules after dropping the module_mutex Linus Torvalds
                                                 ` (2 more replies)
  2010-06-01  1:57                             ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
  1 sibling, 3 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-31 20:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rusty Russell, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 31 May 2010 12:19:37 -0700
Subject: [PATCH 1/2] Make the module 'usage' lists be two-way

When adding a module that depends on another one, we used to create a
one-way list of "modules_which_use_me", so that module unloading could
see who needs a module.

It's actually quite simple to make that list go both ways: so that we
not only can see "who uses me", but also see a list of modules that are
"used by me".

In fact, we always wanted that list in "module_unload_free()": when we
unload a module, we want to also release all the other modules that are
used by that module.  But because we didn't have that list, we used to
first iterate over all modules, and then iterate over each "used by me"
list of that module.

By making the list two-way, we simplify module_unload_free(), and it
allows for some trivial fixes later too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/module.h |    4 ++-
 kernel/module.c        |   86 +++++++++++++++++++++++++++--------------------
 2 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 6914fca..680db9e 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -359,7 +359,9 @@ struct module
 
 #ifdef CONFIG_MODULE_UNLOAD
 	/* What modules depend on me? */
-	struct list_head modules_which_use_me;
+	struct list_head source_list;
+	/* What modules do I depend on? */
+	struct list_head target_list;
 
 	/* Who is waiting for us to be unloaded */
 	struct task_struct *waiter;
diff --git a/kernel/module.c b/kernel/module.c
index d806e00..82ae1e4 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -523,7 +523,8 @@ static void module_unload_init(struct module *mod)
 {
 	int cpu;
 
-	INIT_LIST_HEAD(&mod->modules_which_use_me);
+	INIT_LIST_HEAD(&mod->source_list);
+	INIT_LIST_HEAD(&mod->target_list);
 	for_each_possible_cpu(cpu) {
 		per_cpu_ptr(mod->refptr, cpu)->incs = 0;
 		per_cpu_ptr(mod->refptr, cpu)->decs = 0;
@@ -538,8 +539,9 @@ static void module_unload_init(struct module *mod)
 /* modules using other modules */
 struct module_use
 {
-	struct list_head list;
-	struct module *module_which_uses;
+	struct list_head source_list;
+	struct list_head target_list;
+	struct module *source, *target;
 };
 
 /* Does a already use b? */
@@ -547,8 +549,8 @@ static int already_uses(struct module *a, struct module *b)
 {
 	struct module_use *use;
 
-	list_for_each_entry(use, &b->modules_which_use_me, list) {
-		if (use->module_which_uses == a) {
+	list_for_each_entry(use, &b->source_list, source_list) {
+		if (use->source == a) {
 			DEBUGP("%s uses %s!\n", a->name, b->name);
 			return 1;
 		}
@@ -557,11 +559,38 @@ static int already_uses(struct module *a, struct module *b)
 	return 0;
 }
 
+/*
+ * Module a uses b
+ *  - we add 'a' as a "source", 'b' as a "target" of module use
+ *  - the module_use is added to the list of 'b' sources (so
+ *    'b' can walk the list to see who sourced them, and of 'a'
+ *    targets (so 'a' can see what modules it targets).
+ */
+static int add_module_usage(struct module *a, struct module *b)
+{
+	int no_warn;
+	struct module_use *use;
+
+	DEBUGP("Allocating new usage for %s.\n", a->name);
+	use = kmalloc(sizeof(*use), GFP_ATOMIC);
+	if (!use) {
+		printk("%s: out of memory loading\n", a->name);
+		module_put(b);
+		return 0;
+	}
+
+	use->source = a;
+	use->target = b;
+	list_add(&use->source_list, &b->source_list);
+	list_add(&use->target_list, &a->target_list);
+	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
+	return 1;
+}
+
 /* Module a uses b */
 int use_module(struct module *a, struct module *b)
 {
-	struct module_use *use;
-	int no_warn, err;
+	int err;
 
 	if (b == NULL || already_uses(a, b)) return 1;
 
@@ -578,40 +607,23 @@ int use_module(struct module *a, struct module *b)
 	if (err)
 		return 0;
 
-	DEBUGP("Allocating new usage for %s.\n", a->name);
-	use = kmalloc(sizeof(*use), GFP_ATOMIC);
-	if (!use) {
-		printk("%s: out of memory loading\n", a->name);
-		module_put(b);
-		return 0;
-	}
-
-	use->module_which_uses = a;
-	list_add(&use->list, &b->modules_which_use_me);
-	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
-	return 1;
+	return add_module_usage(a, b);
 }
 EXPORT_SYMBOL_GPL(use_module);
 
 /* Clear the unload stuff of the module. */
 static void module_unload_free(struct module *mod)
 {
-	struct module *i;
+	struct module_use *use, *tmp;
 
-	list_for_each_entry(i, &modules, list) {
-		struct module_use *use;
-
-		list_for_each_entry(use, &i->modules_which_use_me, list) {
-			if (use->module_which_uses == mod) {
-				DEBUGP("%s unusing %s\n", mod->name, i->name);
-				module_put(i);
-				list_del(&use->list);
-				kfree(use);
-				sysfs_remove_link(i->holders_dir, mod->name);
-				/* There can be at most one match. */
-				break;
-			}
-		}
+	list_for_each_entry_safe(use, tmp, &mod->target_list, target_list) {
+		struct module *i = use->target;
+		DEBUGP("%s unusing %s\n", mod->name, i->name);
+		module_put(i);
+		list_del(&use->source_list);
+		list_del(&use->target_list);
+		kfree(use);
+		sysfs_remove_link(i->holders_dir, mod->name);
 	}
 }
 
@@ -735,7 +747,7 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
 		goto out;
 	}
 
-	if (!list_empty(&mod->modules_which_use_me)) {
+	if (!list_empty(&mod->source_list)) {
 		/* Other modules depend on us: get rid of them first. */
 		ret = -EWOULDBLOCK;
 		goto out;
@@ -799,9 +811,9 @@ static inline void print_unload_info(struct seq_file *m, struct module *mod)
 
 	/* Always include a trailing , so userspace can differentiate
            between this and the old multi-field proc format. */
-	list_for_each_entry(use, &mod->modules_which_use_me, list) {
+	list_for_each_entry(use, &mod->source_list, source_list) {
 		printed_something = 1;
-		seq_printf(m, "%s,", use->module_which_uses->name);
+		seq_printf(m, "%s,", use->source->name);
 	}
 
 	if (mod->init != NULL && mod->exit == NULL) {
-- 
1.7.1.227.gb676f


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 2/2] module: wait for other modules after dropping the module_mutex
  2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
@ 2010-05-31 20:17                               ` Linus Torvalds
  2010-06-01  1:37                               ` [PATCH 1/2] Make the module 'usage' lists be two-way Rusty Russell
  2010-06-01  2:44                               ` Américo Wang
  2 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-05-31 20:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rusty Russell, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 31 May 2010 13:09:21 -0700
Subject: [PATCH 2/2] module: wait for other modules after dropping the module_mutex

Instead of doing a "use_module()" when linking to other modules, we do a
simpler link_module() that does not require the dependent modules to be
initialized.  We then wait for all dependent modules _after_ we have
dropped the module_mutex lock, which avoids a deadlock.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

This is the patch that should hopefully fix Brandon's deadlock. But I 
haven't really tested that case - so who knows?

 kernel/module.c |  105 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 82ae1e4..c95f97e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -611,6 +611,54 @@ int use_module(struct module *a, struct module *b)
 }
 EXPORT_SYMBOL_GPL(use_module);
 
+/* Module a links to b */
+static int link_module(struct module *a, struct module *b)
+{
+	if (b == NULL || already_uses(a, b))
+		return 1;
+	if (!try_module_get(b))
+		return 0;
+	return add_module_usage(a, b);
+}
+
+/*
+ * Are all modules LIVE? Return 0 (ok) if so.
+ * Return -EINVAL if any are going, and EAGAIN
+ * if there are modules still initializing.
+ */
+static int all_modules_ok(struct module *mod)
+{
+	int err = 0;
+	struct module_use *use;
+
+	mutex_lock(&module_mutex);
+	list_for_each_entry(use, &mod->target_list, target_list) {
+		struct module *target = use->target;
+		switch (target->state) {
+		case MODULE_STATE_LIVE:
+			break;
+		case MODULE_STATE_COMING:
+			if (!err)
+				err = -EAGAIN;
+			break;
+		default:
+			err = -EINVAL;
+		}
+	}
+	mutex_unlock(&module_mutex);
+	return err;
+}
+
+/* Wait for dependent modules to finish */
+static int module_load_wait(struct module *mod)
+{
+	int err;
+
+	if (wait_event_interruptible_timeout(module_wq, (err = all_modules_ok(mod)) != -EAGAIN, 30 * HZ) <= 0)
+		printk("%s: gave up waiting for init of modules.\n", mod->name);
+	return err;
+}
+
 /* Clear the unload stuff of the module. */
 static void module_unload_free(struct module *mod)
 {
@@ -898,6 +946,21 @@ int use_module(struct module *a, struct module *b)
 }
 EXPORT_SYMBOL_GPL(use_module);
 
+/* FIXME! This is wrong. We should do the full link_module. Otherwise
+ * we'll fail if a module we depend on just happens to be loading */
+static int link_module(struct module *a, struct module *b)
+{
+	return strong_try_module_get(b) == 0;
+}
+
+/* Wait for dependent modules to finish. See the FIXME on link_module()
+ * above: we've required that the modules were initialized too early,
+ * so there is nothing to wait on now (nor do we maintain any lists) */
+static int module_load_wait(struct module *mod)
+{
+	return 0;
+}
+
 static inline void module_unload_init(struct module *mod)
 {
 }
@@ -1068,11 +1131,11 @@ static const struct kernel_symbol *resolve_symbol(Elf_Shdr *sechdrs,
 
 	sym = find_symbol(name, &owner, &crc,
 			  !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
-	/* use_module can fail due to OOM,
+	/* link_module can fail due to OOM,
 	   or module initialization or unloading */
 	if (sym) {
 		if (!check_version(sechdrs, versindex, name, mod, crc, owner)
-		    || !use_module(mod, owner))
+		    || !link_module(mod, owner))
 			sym = NULL;
 	}
 	return sym;
@@ -2528,6 +2591,14 @@ SYSCALL_DEFINE3(init_module, void __user *, umod,
 	/* Drop lock so they can recurse */
 	mutex_unlock(&module_mutex);
 
+	/*
+	 * Now that we've released the module lock, wait for dependent
+	 * modules to finish loading
+	 */
+	ret = module_load_wait(mod);
+	if (ret)
+		goto unload;
+
 	blocking_notifier_call_chain(&module_notify_list,
 			MODULE_STATE_COMING, mod);
 
@@ -2535,20 +2606,8 @@ SYSCALL_DEFINE3(init_module, void __user *, umod,
 	/* Start the module */
 	if (mod->init != NULL)
 		ret = do_one_initcall(mod->init);
-	if (ret < 0) {
-		/* Init routine failed: abort.  Try to protect us from
-                   buggy refcounters. */
-		mod->state = MODULE_STATE_GOING;
-		synchronize_sched();
-		module_put(mod);
-		blocking_notifier_call_chain(&module_notify_list,
-					     MODULE_STATE_GOING, mod);
-		mutex_lock(&module_mutex);
-		free_module(mod);
-		mutex_unlock(&module_mutex);
-		wake_up(&module_wq);
-		return ret;
-	}
+	if (ret < 0)
+		goto unload;
 	if (ret > 0) {
 		printk(KERN_WARNING
 "%s: '%s'->init suspiciously returned %d, it should follow 0/-E convention\n"
@@ -2583,6 +2642,20 @@ SYSCALL_DEFINE3(init_module, void __user *, umod,
 	mutex_unlock(&module_mutex);
 
 	return 0;
+
+unload:
+	/* Initialization failed: abort.  Try to protect us from
+	   buggy refcounters. */
+	mod->state = MODULE_STATE_GOING;
+	synchronize_sched();
+	module_put(mod);
+	blocking_notifier_call_chain(&module_notify_list,
+				     MODULE_STATE_GOING, mod);
+	mutex_lock(&module_mutex);
+	free_module(mod);
+	mutex_unlock(&module_mutex);
+	wake_up(&module_wq);
+	return ret;
 }
 
 static inline int within(unsigned long addr, void *start, unsigned long size)
-- 
1.7.1.227.gb676f


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 16:48                       ` Andrew Morton
  2010-05-31 18:19                         ` Linus Torvalds
@ 2010-06-01  1:04                         ` Rusty Russell
  1 sibling, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  1:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Brandon Philips, Rafael J. Wysocki, Linus Torvalds, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 02:18:34 am Andrew Morton wrote:
> On Mon, 31 May 2010 21:32:27 +0930 Rusty Russell <rusty@rustcorp.com.au> wrote:
> 
> > Problem: it's hard to avoid an init routine stumbling over a
> > request_module these days.  And it's not clear it's always a bad idea:
> > for example, a module like kvm with dynamic dependencies on kvm-intel
> > or kvm-amd would be neater if it could simply request_module the right
> > one.
> > 
> > In this particular case, it's libcrc32c:
> > 
> > 	libcrc32c_mod_init
> > 	 crypto_alloc_shash
> > 	  crypto_alloc_tfm
> > 	   crypto_find_alg
> > 	    crypto_alg_mod_lookup
> > 	     crypto_larval_lookup
> > 	      request_module
> > 
> > If another module is waiting for libcrc32c to finish initializing
> > (ie. bne2 depends on libcrc32c) then it does so holding the module
> > lock, and our request_module() can't make progress until that is
> > released.
> > 
> > Waiting without the lock isn't all that hard: we just need to pass the
> > -EBUSY up the call chain so we can sleep where we don't hold the lock.
> > Error reporting is a bit trickier: we need to copy the name of the
> > unfinished module before releasing the lock.
> 
> Who's returning -EBUSY?  request_module()?  If so, are you requiring
> that all code which might call request_module() be correctly
> propagating error codes back?  Please spell this all out?

Sorry if I was unclear.  It's all inside module.c.  Here's the updated
commentry:

 If another module is waiting inside resolve_symbol() for libcrc32c to
 finish initializing (ie. bne2 depends on libcrc32c) then it does so
 holding the module lock, and our request_module() can't make progress
 until that is released.

 Waiting inside resolve_symbol() without the lock isn't all that hard:
 we just need to pass the -EBUSY up the call chain so we can sleep
 where we don't hold the lock.  Error reporting is a bit trickier: we
 need to copy the name of the unfinished module before releasing the
 lock.

Hope that helps,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 18:19                         ` Linus Torvalds
  2010-05-31 20:15                           ` Linus Torvalds
@ 2010-06-01  1:21                           ` Rusty Russell
  2010-06-01  3:24                             ` Linus Torvalds
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  1:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 03:49:14 am Linus Torvalds wrote:
> because the module locking is pure and utter crap. It uses one hug lock 
> that it tries to hold for a long time, rather than protecting just the 
> parts it needs.

Originally it was a big lock around module loading.  That was simple, and
simple is good.

Sure, that simplicity has been eroded, but "crap" is harsh.
 
> Rusty's fix is to just drop the lock around use_module(), and it seems to 
> work. It's may be right for 'use_module()', but totally wrong from a 
> conceptual locking standpoint, though - dropping the lock in the middle of 
> module loading may well "work", but who the hell knows what it really 
> results in?

I do.  If I didn't think so, I wouldn't have pushed the patch.

> IOW, it's one of those "this works, but it's very wrong" things. It makes 
> the whole module_mutex pretty much a random thing with even less semantics 
> than it has now. Right now it has some clear area that it protects - the 
> area may be too _big_, but at least it makes some amount of sense.

See, this I agree with, but you could have said this in far fewer words and
much more politely.

As posted, I had a patch to clean up the locking.  Seems you ignored it.

> It's entirely possible that an interim fix (if we can't just fix the 
> locking) is to _not_ use "strong_try_module_get()" at all, but instead 
> just use "try_module_get()", and then after we've dropped the 
> module_mutex, but _before_ we call the "init" function for the module, we 
> wait for all the modules that this module depends on.

No, those modules could still fail init.

> Doesn't that sound like the logical thing to do? And it wouldn't change 
> any locking.

No, it sounds wrong, complex and fundamentally broken.

Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
  2010-05-31 20:17                               ` [PATCH 2/2] module: wait for other modules after dropping the module_mutex Linus Torvalds
@ 2010-06-01  1:37                               ` Rusty Russell
  2010-06-01  3:42                                 ` Rusty Russell
  2010-06-01  2:44                               ` Américo Wang
  2 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  1:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 05:46:23 am Linus Torvalds wrote:
> 
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 31 May 2010 12:19:37 -0700
> Subject: [PATCH 1/2] Make the module 'usage' lists be two-way
> 
> When adding a module that depends on another one, we used to create a
> one-way list of "modules_which_use_me", so that module unloading could
> see who needs a module.
> 
> It's actually quite simple to make that list go both ways: so that we
> not only can see "who uses me", but also see a list of modules that are
> "used by me".

Thanks Linus, this is a nice cleanup.  One minor comment:

> +static int add_module_usage(struct module *a, struct module *b)
> +{
> +	int no_warn;
> +	struct module_use *use;

These days I tend to use bool for functions which return 1/0 like this.
(Older functions here I haven't churned, but I like it for new functions).
I've frobbed that and put this in my patch queue.

I'll submit it to Linus in the next merge window :)

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-05-31 20:15                           ` Linus Torvalds
  2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
@ 2010-06-01  1:57                             ` Rusty Russell
  2010-06-01  3:40                               ` Linus Torvalds
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  1:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 05:45:37 am Linus Torvalds wrote:
> 
> basically, we should do the whole module dependency list regardless 
> of whether we can unload modules or not 

Why?

Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
  2010-05-31 20:17                               ` [PATCH 2/2] module: wait for other modules after dropping the module_mutex Linus Torvalds
  2010-06-01  1:37                               ` [PATCH 1/2] Make the module 'usage' lists be two-way Rusty Russell
@ 2010-06-01  2:44                               ` Américo Wang
  2010-06-01  3:51                                 ` Linus Torvalds
  2 siblings, 1 reply; 71+ messages in thread
From: Américo Wang @ 2010-06-01  2:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Rusty Russell, Brandon Philips, Rafael J. Wysocki,
	LKML, Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Mon, May 31, 2010 at 01:16:23PM -0700, Linus Torvalds wrote:
>
>From: Linus Torvalds <torvalds@linux-foundation.org>
>Date: Mon, 31 May 2010 12:19:37 -0700
>Subject: [PATCH 1/2] Make the module 'usage' lists be two-way
>
>When adding a module that depends on another one, we used to create a
>one-way list of "modules_which_use_me", so that module unloading could
>see who needs a module.
>
>It's actually quite simple to make that list go both ways: so that we
>not only can see "who uses me", but also see a list of modules that are
>"used by me".
>
>In fact, we always wanted that list in "module_unload_free()": when we
>unload a module, we want to also release all the other modules that are
>used by that module.  But because we didn't have that list, we used to
>first iterate over all modules, and then iterate over each "used by me"
>list of that module.
>
>By making the list two-way, we simplify module_unload_free(), and it
>allows for some trivial fixes later too.
>
>Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Hi, Linus,

> 
> /* Clear the unload stuff of the module. */
> static void module_unload_free(struct module *mod)
> {
>-	struct module *i;
>+	struct module_use *use, *tmp;
> 
>-	list_for_each_entry(i, &modules, list) {
>-		struct module_use *use;
>-
>-		list_for_each_entry(use, &i->modules_which_use_me, list) {
>-			if (use->module_which_uses == mod) {
>-				DEBUGP("%s unusing %s\n", mod->name, i->name);
>-				module_put(i);
>-				list_del(&use->list);
>-				kfree(use);
>-				sysfs_remove_link(i->holders_dir, mod->name);
>-				/* There can be at most one match. */
>-				break;
>-			}
>-		}
>+	list_for_each_entry_safe(use, tmp, &mod->target_list, target_list) {
>+		struct module *i = use->target;
>+		DEBUGP("%s unusing %s\n", mod->name, i->name);
>+		module_put(i);
>+		list_del(&use->source_list);
>+		list_del(&use->target_list);
>+		kfree(use);
>+		sysfs_remove_link(i->holders_dir, mod->name);


I think it's nice to have a remove_module_usage() here, since we already
have add_module_usage().

Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  1:21                           ` Rusty Russell
@ 2010-06-01  3:24                             ` Linus Torvalds
  2010-06-01  5:22                               ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  3:24 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Rusty Russell wrote:
> 
> Sure, that simplicity has been eroded, but "crap" is harsh.

Crap is crap. Crap in locking is _especially_ dangerous.

> > module loading may well "work", but who the hell knows what it really 
> > results in?
> 
> I do.  If I didn't think so, I wouldn't have pushed the patch.

I'm sorry, but that's simply not good enough. We do not do ad-hoc locking 
"just becuse it works". Locking is too damn easy to get wrong, and "one 
person knows how the locking works" is not an excuse for anything at all.

> See, this I agree with, but you could have said this in far fewer words and
> much more politely.

Why? Why does "locking is crap" need any politeness? And why is it wrong 
to explain at length exactly _why_ our module locking is a piece of sh*t?

So explain why I should be more polite, or more terse?

> As posted, I had a patch to clean up the locking.  Seems you ignored it.

Umm. That patch was not the original one that I objected to, is it?

I do agree with your 1/2, btw, the one you posted under protest after I 
pointed out that the locking was crap. I was just explaining _why_ the 
locking was crap, and what the problem was to Andrew. 

I happen to also think that my solution to the problem is actually better 
and more straightforward than your one is. 

> > It's entirely possible that an interim fix (if we can't just fix the 
> > locking) is to _not_ use "strong_try_module_get()" at all, but instead 
> > just use "try_module_get()", and then after we've dropped the 
> > module_mutex, but _before_ we call the "init" function for the module, we 
> > wait for all the modules that this module depends on.
> 
> No, those modules could still fail init.

Umm. Which I took care of.

> > Doesn't that sound like the logical thing to do? And it wouldn't change 
> > any locking.
> 
> No, it sounds wrong, complex and fundamentally broken.

An dby "complex and fundamentally broken" you obviously mean "simpler and 
cleaner than the series posted by yourself". Or what?

I posted the damn series. Your next email even claims that my first one in 
the series (the bigger one) was a nice cleanup. You didn't comment on the 
second one, apparently because you're too embarrassed to admit you were 
wrong, and it wasn't all that 'wrong, complex, and fundamentally broken' 
to begin with. 

It was pretty damn straightforward, with no subtleties at all, in fact. 
Wasn't it?

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  1:57                             ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
@ 2010-06-01  3:40                               ` Linus Torvalds
  2010-06-01  4:27                                 ` Linus Torvalds
  2010-06-01  5:19                                 ` Rusty Russell
  0 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  3:40 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Rusty Russell wrote:

> On Tue, 1 Jun 2010 05:45:37 am Linus Torvalds wrote:
> > 
> > basically, we should do the whole module dependency list regardless 
> > of whether we can unload modules or not 
> 
> Why?

Because the current non-CONFIG_MODULE_UNLOAD code is currently broken, and 
it was broken exactly because the code had two totally different paths and 
totally different logic. And one part simply missed the case.

We'd be much better off having as much of the logic shared as possible. 
No?

Your 2/2 actually fixed that, because it moved the broken 
wait_event_interruptible_timeout() out of the (non-shared) use_module() 
into the (shared) resolve_symbol_wait(). But even that seemed to be almost 
accidental, and seemed to be more about the fact that now the locking 
rules required it (if you wanted to wait without holding the lock), rather 
than anything else.

In contrast, wouldn't it be nice if we just made the waiting be a separate 
event entirely? And have the !MODULE_UNLOAD case also able to share the 
/proc/module format, for example? 

You do realize - although maybe you don't - that right now, even with your 
2/2 case, the !MODULE_UNLOAD case seems to get the module ref-counts 
wrong. I didn't look through it all, but it _seems_ to increment the 
ref-count for every symbol it resolves. 

What happens if the module load fails in the middle (say, because some 
symbols referred to module 'a' that was loaded successfully, and others 
refer to module 'b' that had some probelsm)? I think the !MODULE_UNLOAD 
case is _broken_ again.

Why?

Take a small guess. It's broken - even with your 2/2 - exactly because the 
!MODULE_UNLOAD case does something fundamentally different from the 
regular case. It increments the module counts for each symbol it looks up, 
exactly BECAUSE IT DOESN'T TRACK MODULE DEPENDENCIES!

Which means that it then cannot clean up properly afterwards if an error 
happens in the middle.

So I'd suggest we should just track those module dependencies, and share 
more of the code and the logic. Because it looks to me like not sharing it 
continually results in bugs.

No?

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-06-01  1:37                               ` [PATCH 1/2] Make the module 'usage' lists be two-way Rusty Russell
@ 2010-06-01  3:42                                 ` Rusty Russell
  2010-06-01  4:00                                   ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  3:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 11:07:34 am Rusty Russell wrote:
> On Tue, 1 Jun 2010 05:46:23 am Linus Torvalds wrote:
> > +static int add_module_usage(struct module *a, struct module *b)
> > +{
> > +	int no_warn;
> > +	struct module_use *use;
> 
> These days I tend to use bool for functions which return 1/0 like this.

Actually, I changed it to -ENOMEM/0 when I applied it on top of the other
fixes.

But this is ugly:

> +       use = kmalloc(sizeof(*use), GFP_ATOMIC);
> +       if (!use) {
> +               printk("%s: out of memory loading\n", a->name);
> +               module_put(b);
> +               return 0;

The module_get is in the caller, but the module_put is here on failure.
Don't half split-out a function like this.  I also added KERN_WARNING to the
printk and fixed the Subject line.

Result below (rebased on top of the locking cleanup).  Here's the
updated git tree, too:

	git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Thanks,
Rusty.

Subject: module: Make the 'usage' lists be two-way
Date: Mon, 31 May 2010 12:19:37 -0700
From: Linus Torvalds <torvalds@linux-foundation.org>

When adding a module that depends on another one, we used to create a
one-way list of "modules_which_use_me", so that module unloading could
see who needs a module.

It's actually quite simple to make that list go both ways: so that we
not only can see "who uses me", but also see a list of modules that are
"used by me".

In fact, we always wanted that list in "module_unload_free()": when we
unload a module, we want to also release all the other modules that are
used by that module.  But because we didn't have that list, we used to
first iterate over all modules, and then iterate over each "used by me"
list of that module.

By making the list two-way, we simplify module_unload_free(), and it
allows for some trivial fixes later too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned & rebased)
---
 include/linux/module.h |    4 +-
 kernel/module.c        |   86 ++++++++++++++++++++++++++++---------------------
 2 files changed, 53 insertions(+), 37 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -359,7 +359,9 @@ struct module
 
 #ifdef CONFIG_MODULE_UNLOAD
 	/* What modules depend on me? */
-	struct list_head modules_which_use_me;
+	struct list_head source_list;
+	/* What modules do I depend on? */
+	struct list_head target_list;
 
 	/* Who is waiting for us to be unloaded */
 	struct task_struct *waiter;
diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -524,7 +524,8 @@ static void module_unload_init(struct mo
 {
 	int cpu;
 
-	INIT_LIST_HEAD(&mod->modules_which_use_me);
+	INIT_LIST_HEAD(&mod->source_list);
+	INIT_LIST_HEAD(&mod->target_list);
 	for_each_possible_cpu(cpu) {
 		per_cpu_ptr(mod->refptr, cpu)->incs = 0;
 		per_cpu_ptr(mod->refptr, cpu)->decs = 0;
@@ -539,8 +540,9 @@ static void module_unload_init(struct mo
 /* modules using other modules */
 struct module_use
 {
-	struct list_head list;
-	struct module *module_which_uses;
+	struct list_head source_list;
+	struct list_head target_list;
+	struct module *source, *target;
 };
 
 /* Does a already use b? */
@@ -548,8 +550,8 @@ static int already_uses(struct module *a
 {
 	struct module_use *use;
 
-	list_for_each_entry(use, &b->modules_which_use_me, list) {
-		if (use->module_which_uses == a) {
+	list_for_each_entry(use, &b->source_list, source_list) {
+		if (use->source == a) {
 			DEBUGP("%s uses %s!\n", a->name, b->name);
 			return 1;
 		}
@@ -558,11 +560,37 @@ static int already_uses(struct module *a
 	return 0;
 }
 
+/*
+ * Module a uses b
+ *  - we add 'a' as a "source", 'b' as a "target" of module use
+ *  - the module_use is added to the list of 'b' sources (so
+ *    'b' can walk the list to see who sourced them), and of 'a'
+ *    targets (so 'a' can see what modules it targets).
+ */
+static int add_module_usage(struct module *a, struct module *b)
+{
+	int no_warn;
+	struct module_use *use;
+
+	DEBUGP("Allocating new usage for %s.\n", a->name);
+	use = kmalloc(sizeof(*use), GFP_ATOMIC);
+	if (!use) {
+		printk(KERN_WARNING "%s: out of memory loading\n", a->name);
+		return -ENOMEM;
+	}
+
+	use->source = a;
+	use->target = b;
+	list_add(&use->source_list, &b->source_list);
+	list_add(&use->target_list, &a->target_list);
+	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
+	return 0;
+}
+
 /* Module a uses b: caller needs module_mutex() */
 int ref_module(struct module *a, struct module *b)
 {
-	struct module_use *use;
-	int no_warn, err;
+	int err;
 
 	if (b == NULL || already_uses(a, b))
 		return 0;
@@ -572,41 +600,27 @@ int ref_module(struct module *a, struct 
 	if (err)
 		return err;
 
-	DEBUGP("Allocating new usage for %s.\n", a->name);
-	use = kmalloc(sizeof(*use), GFP_ATOMIC);
-	if (!use) {
-		printk("%s: out of memory loading\n", a->name);
+	err = add_module_usage(a, b);
+	if (err)
 		module_put(b);
-		return -ENOMEM;
-	}
-
-	use->module_which_uses = a;
-	list_add(&use->list, &b->modules_which_use_me);
-	no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
-	return 0;
+	return err;
 }
 EXPORT_SYMBOL_GPL(ref_module);
 
 /* Clear the unload stuff of the module. */
 static void module_unload_free(struct module *mod)
 {
-	struct module *i;
+	struct module_use *use, *tmp;
 
 	mutex_lock(&module_mutex);
-	list_for_each_entry(i, &modules, list) {
-		struct module_use *use;
-
-		list_for_each_entry(use, &i->modules_which_use_me, list) {
-			if (use->module_which_uses == mod) {
-				DEBUGP("%s unusing %s\n", mod->name, i->name);
-				module_put(i);
-				list_del(&use->list);
-				kfree(use);
-				sysfs_remove_link(i->holders_dir, mod->name);
-				/* There can be at most one match. */
-				break;
-			}
-		}
+	list_for_each_entry_safe(use, tmp, &mod->target_list, target_list) {
+		struct module *i = use->target;
+		DEBUGP("%s unusing %s\n", mod->name, i->name);
+		module_put(i);
+		list_del(&use->source_list);
+		list_del(&use->target_list);
+		kfree(use);
+		sysfs_remove_link(i->holders_dir, mod->name);
 	}
 	mutex_unlock(&module_mutex);
 }
@@ -731,7 +745,7 @@ SYSCALL_DEFINE2(delete_module, const cha
 		goto out;
 	}
 
-	if (!list_empty(&mod->modules_which_use_me)) {
+	if (!list_empty(&mod->source_list)) {
 		/* Other modules depend on us: get rid of them first. */
 		ret = -EWOULDBLOCK;
 		goto out;
@@ -796,9 +810,9 @@ static inline void print_unload_info(str
 
 	/* Always include a trailing , so userspace can differentiate
            between this and the old multi-field proc format. */
-	list_for_each_entry(use, &mod->modules_which_use_me, list) {
+	list_for_each_entry(use, &mod->source_list, source_list) {
 		printed_something = 1;
-		seq_printf(m, "%s,", use->module_which_uses->name);
+		seq_printf(m, "%s,", use->source->name);
 	}
 
 	if (mod->init != NULL && mod->exit == NULL) {

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-06-01  2:44                               ` Américo Wang
@ 2010-06-01  3:51                                 ` Linus Torvalds
  0 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  3:51 UTC (permalink / raw)
  To: Américo Wang
  Cc: Andrew Morton, Rusty Russell, Brandon Philips, Rafael J. Wysocki,
	LKML, Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Américo Wang wrote:
> >+	list_for_each_entry_safe(use, tmp, &mod->target_list, target_list) {
> >+		struct module *i = use->target;
> >+		DEBUGP("%s unusing %s\n", mod->name, i->name);
> >+		module_put(i);
> >+		list_del(&use->source_list);
> >+		list_del(&use->target_list);
> >+		kfree(use);
> >+		sysfs_remove_link(i->holders_dir, mod->name);
> 
> I think it's nice to have a remove_module_usage() here, since we already
> have add_module_usage().

I agree that those five lines could easily be a function of their own. It 
would make sense and be symmetric. That said, I didn't do it because there 
was just a single use place, and it was so simple.

Also, if you do turn it into a function, it's a bit dubious what the 
proper calling convention would be. The clean interface would be to just 
pass in "use", since you can figure out both the source and target from 
there. At the same time, it's a bit sad to re-load "mod" from 
"use->source", when we had it explicitly and started from it.  But maybe 
that doesn't really matter.

One thing I react to now that I look at it again - I think we should also 
move the "module_put(i)" to the end. In fact I thought I did it when I 
moved the code around, but clearly I hadn't.

It won't really matter as-is (we hold the module_mutex, so "i" isn't going 
away), and it's what the old code used to do, but we should probably aim 
for more of a ref-counted worldview where you don't put the module until 
after it's not used again. And we clearly still use "i" after the 
module_put() above in the sysfs_remove_link thing. So that's a bit ugly 
too.

I suspect the module code could do with quite a bit of a cleanup. The 
primary target of that patch was a different issue, though.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-06-01  3:42                                 ` Rusty Russell
@ 2010-06-01  4:00                                   ` Linus Torvalds
  2010-06-01  4:05                                     ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  4:00 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Rusty Russell wrote:
> 
> But this is ugly:
> 
> > +       use = kmalloc(sizeof(*use), GFP_ATOMIC);
> > +       if (!use) {
> > +               printk("%s: out of memory loading\n", a->name);
> > +               module_put(b);
> > +               return 0;
> 
> The module_get is in the caller, but the module_put is here on failure.
> Don't half split-out a function like this.

I agree. That happened as part of moving the code around mostly 
mechanically, but you're right, that fixup is better done in the caller 
that did the get.

Also, looking at it, I don't think that should be GFP_ATOMIC. I wonder why 
it is. I don't think we should have recursion issues with memory freeing 
needing new modules due to IO/filesystem accesses, but maybe there are 
cases like that. But again, that was just moving old code around.

And with the old "use_module()" having done a wait, we can't have had 
people calling this from atomic contexts. So I wonder where that 
GFP_ATOMIC comes from. It goes all the way back to the original in-kernel 
module loader code in 2002 according to git.

Oh. And back then, it was inside a "modlist_lock". And that lock is long 
gone, but the GFP_ATOMIC remains.

Of course, it's a small data structure, and there aren't many of them, so 
nobody would ever notice. It's just an oddity right now.

Anyway, modified patch looks fine to me.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] Make the module 'usage' lists be two-way
  2010-06-01  4:00                                   ` Linus Torvalds
@ 2010-06-01  4:05                                     ` Linus Torvalds
  0 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  4:05 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Mon, 31 May 2010, Linus Torvalds wrote:
> 
> Oh. And back then, it was inside a "modlist_lock". And that lock is long 
> gone, but the GFP_ATOMIC remains.

Actually, keep it that way. Because I think that with your locking 
cleanup, we could actually turn the module_mutex back into a spinlock.

And the thing is, we don't necessarily want it to be a spinlock per se, 
but at the same time, I think that would be a good sanity test. If we can 
turn it into a spinlock without triggering the "might_sleep()" debugging 
code, that means that we are only holding it over real critical regions.

So while I don't think it needs to be a spinlock, it would be a nice added 
sanity check if it were one. We have all this helper debugging code to 
verify that spinlocked regions don't do "bad" things.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  3:40                               ` Linus Torvalds
@ 2010-06-01  4:27                                 ` Linus Torvalds
  2010-06-01  5:19                                 ` Rusty Russell
  1 sibling, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01  4:27 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Mon, 31 May 2010, Linus Torvalds wrote:
> 
> Which means that it then cannot clean up properly afterwards if an error 
> happens in the middle.

Oh, I guess it doesn't matter. For the non-unload case, we don't even do 
any module counts, so the fixup ends up being a no-op. So with the error 
handling fixed (which your 2/2 did), I guess it's actually ok. I still 
think it would be nice to have /proc/modules show the module dependencies 
regardless of whether they can be unloaded or not, but that's really just 
an oddity, maybe not a "bug" per se.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  3:40                               ` Linus Torvalds
  2010-06-01  4:27                                 ` Linus Torvalds
@ 2010-06-01  5:19                                 ` Rusty Russell
  2010-06-02  3:15                                   ` Rusty Russell
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  5:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 01:10:36 pm Linus Torvalds wrote:
> 
> On Tue, 1 Jun 2010, Rusty Russell wrote:
> 
> > On Tue, 1 Jun 2010 05:45:37 am Linus Torvalds wrote:
> > > 
> > > basically, we should do the whole module dependency list regardless 
> > > of whether we can unload modules or not 
> > 
> > Why?
> 
> Because the current non-CONFIG_MODULE_UNLOAD code is currently broken, and 
> it was broken exactly because the code had two totally different paths and 
> totally different logic. And one part simply missed the case.
> 
> We'd be much better off having as much of the logic shared as possible. 
> No?
> 
> Your 2/2 actually fixed that, because it moved the broken 
> wait_event_interruptible_timeout() out of the (non-shared) use_module() 
> into the (shared) resolve_symbol_wait(). But even that seemed to be almost 
> accidental, and seemed to be more about the fact that now the locking 
> rules required it (if you wanted to wait without holding the lock), rather 
> than anything else.

!CONFIG_MODULE_UNLOAD still does the old "fail don't wait" behavior.

So yes, moving the waiting into common code was a win for consistency,
either with your patch or mine.

> So I'd suggest we should just track those module dependencies, and share 
> more of the code and the logic. Because it looks to me like not sharing it 
> continually results in bugs.
> 
> No?

I wonder if we should just get rid of !CONFIG_UNLOAD then?  I have a soft spot
for it because it keeps us honest and shows how much shit is there simply for
our poor man's pagable kernel.

Let me compile up a kernel with and without and see what it's really doing
to us...

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  3:24                             ` Linus Torvalds
@ 2010-06-01  5:22                               ` Rusty Russell
  2010-06-01 14:58                                 ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  5:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 12:54:18 pm Linus Torvalds wrote:
> I'm sorry, but that's simply not good enough. We do not do ad-hoc locking 
> "just becuse it works". Locking is too damn easy to get wrong, and "one 
> person knows how the locking works" is not an excuse for anything at all.

You asked who the hell knew what the results were.

You couldn't figure it out, so you shotgunned an innocent patch, which papered
over the real problem.

> > See, this I agree with, but you could have said this in far fewer words and
> > much more politely.
> 
> Why? Why does "locking is crap" need any politeness? And why is it wrong 
> to explain at length exactly _why_ our module locking is a piece of sh*t?
> 
> So explain why I should be more polite, or more terse?

How about this:

"I hate this patch.  It makes already-ugly locking in module.c awful, and I
can't see that it's correct.  Why not just reduce the locking to cover
the minimum needed?"

Getting email from you is the least fun part of kernel hacking, and that
sucks.  It's never "this code sucks, let's make it better", and always
"your code sucks and you're wrong" :(

> > As posted, I had a patch to clean up the locking.  Seems you ignored it.
> 
> Umm. That patch was not the original one that I objected to, is it?

Nope, but you didn't respond to it either.

> I do agree with your 1/2, btw, the one you posted under protest after I 
> pointed out that the locking was crap. I was just explaining _why_ the 
> locking was crap, and what the problem was to Andrew. 

Ugly as dropping the lock was, it was in some ways less ambitious than
either patch.  I understand that you disagree.

> I happen to also think that my solution to the problem is actually better 
> and more straightforward than your one is.

I don't think so, for several reasons.

(1) You no longer print out which module you gave up waiting for.
(2) You now need that code complexity for !CONFIG_MODULE_UNLOAD.
(3) It's obviously *not* straightforward otherwise you'd see why it's buggy.
(4) The fast booting nutters want more parallelism in module loading anyway.
(5) The locking *is* getting hairy (we drop it once already), and my patch
    makes that more explicit and clearer.

> > No, those modules could still fail init.
> 
> Umm. Which I took care of.

No, you didn't.

Prior to your patch, noone could get a reference on a module before it had
done init.  So when init fails, we simply free the module.

Now, you've changed that with your grab-and-check-later code.  And you can't
fix it for the !CONFIG_MODULE_UNLOAD, because there are no reference counts
for that case.

> I posted the damn series. Your next email even claims that my first one in 
> the series (the bigger one) was a nice cleanup.

I was being polite.  In practice, it added lines of code for no real win,
but hey, you obviously needed some positive feedback or something.

> You didn't comment on the 
> second one, apparently because you're too embarrassed to admit you were 
> wrong, and it wasn't all that 'wrong, complex, and fundamentally broken' 
> to begin with. 

No, Linus.  Difficult as it is, I always try to frankly and publicly admit
when I was wrong.  I'm sure you've experienced this in the past.

I didn't respond to it because you'd said "if we can't fix the
locking, we should try...".

I fixed the locking, so what gain in pointing out your mistakes?

Hope that helps,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] module: make locking more fine-grained.
  2010-05-31 12:01                   ` [PATCH 1/2] module: make locking more fine-grained Rusty Russell
  2010-05-31 12:02                     ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
@ 2010-06-01  5:38                     ` Américo Wang
  2010-06-01  5:55                       ` Rusty Russell
  1 sibling, 1 reply; 71+ messages in thread
From: Américo Wang @ 2010-06-01  5:38 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Rafael J. Wysocki, Linus Torvalds, LKML,
	Andrew Morton, Jon Masters, Tejun Heo, Masami Hiramatsu,
	Kay Sievers

On Mon, May 31, 2010 at 09:31:42PM +0930, Rusty Russell wrote:
>@@ -783,9 +787,13 @@ SYSCALL_DEFINE2(delete_module, const cha
> 	/* Store the name of the last unloaded module for diagnostic purposes */
> 	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> 	ddebug_remove_module(mod->name);
>+
>+	/* free_module doesn't want module_mutex held by caller. */
>+	mutex_unlock(&module_mutex);
> 	free_module(mod);
>-
>- out:
>+	goto out_stop;

As Stephen found in linux-next, this line doesn't pass compiling.
I think this line should be just deleted.

>+
>+out:
> 	mutex_unlock(&module_mutex);
> 	return ret;
> }


Thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 1/2] module: make locking more fine-grained.
  2010-06-01  5:38                     ` [PATCH 1/2] module: make locking more fine-grained Américo Wang
@ 2010-06-01  5:55                       ` Rusty Russell
  0 siblings, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-01  5:55 UTC (permalink / raw)
  To: Américo Wang
  Cc: Brandon Philips, Rafael J. Wysocki, Linus Torvalds, LKML,
	Andrew Morton, Jon Masters, Tejun Heo, Masami Hiramatsu,
	Kay Sievers

On Tue, 1 Jun 2010 03:08:42 pm Américo Wang wrote:
> On Mon, May 31, 2010 at 09:31:42PM +0930, Rusty Russell wrote:
> >@@ -783,9 +787,13 @@ SYSCALL_DEFINE2(delete_module, const cha
> > 	/* Store the name of the last unloaded module for diagnostic purposes */
> > 	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> > 	ddebug_remove_module(mod->name);
> >+
> >+	/* free_module doesn't want module_mutex held by caller. */
> >+	mutex_unlock(&module_mutex);
> > 	free_module(mod);
> >-
> >- out:
> >+	goto out_stop;
> 
> As Stephen found in linux-next, this line doesn't pass compiling.
> I think this line should be just deleted.

Yep, I didn't commit my fixes to the backport before pushing & posting;
I've done that now.  You can get the usable one from the git tree:

        git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  5:22                               ` Rusty Russell
@ 2010-06-01 14:58                                 ` Linus Torvalds
  2010-06-01 17:53                                   ` Linus Torvalds
  2010-06-02  3:09                                   ` Rusty Russell
  0 siblings, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01 14:58 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Rusty Russell wrote:
>
> > So explain why I should be more polite, or more terse?
> 
> How about this:
> 
> "I hate this patch.  It makes already-ugly locking in module.c awful, and I
> can't see that it's correct.  Why not just reduce the locking to cover
> the minimum needed?"

Umm. Did you read the first few emails I sent out on this thread 
originally? They were actually _politer_ than your suggestion above (no 
crap mentioned), and did exactly what you ask for. Let me quote the one 
where I suggested tightening the lock, for example (along with saying that 
I don't want to see pointless semantic changes):

   I'm not re-applying it with the pointless semantic changes that are 
   visible to modules. It doesn't matter if they were informed, if it means 
   that they'll then just have some odd version dependency and add crazy 
   "#if LINUX_VERSION" tests that aren't even exact.
   
   I also wonder exactly what that module_mutex() actually ends up 
   protecting. 99% of load_module() seems to be stuff that is purely about 
   local issues. Maybe we could tighten the actual lock section to just the
   parts that actually expose the vmalloc'ed area to others?
   
   It's generally pointless releasing a lock in the middle - that just makes 
   the lock even less valid. If it's valid to just release the lock (without 
   some retry logic starting everything from the beginning), it likely the 
   lock shouldn't have been held there in the first place.

See? How bad was that? 

> Getting email from you is the least fun part of kernel hacking, and that
> sucks.  It's never "this code sucks, let's make it better", and always
> "your code sucks and you're wrong" :(

See above. That wasn't what I said. That "you're wrong" was what I got to 
when I got frustrated with you for complaining about me actually trying to 
fix it.

And "crap locking" only came up in the long explanation to Andrew about 
what the original problem was, and why the EBUSY error wasn't a problem. 
And we all know the module locking was/is crap. You must have known it 
too, since you apparently had a "fix up locking" patch from before.

So what was so bad about calling the locking crap? Really?

So yes, we both get frustrated here. But read the thread again, Rusty, and 
look who it is that actually starts complaining about other peoples code. 
I called the locking crap, you're the one who called something "wrong, 
complex and fundamentally broken".

Kettle. Pot. Black.

> > I do agree with your 1/2, btw, the one you posted under protest after I 
> > pointed out that the locking was crap. I was just explaining _why_ the 
> > locking was crap, and what the problem was to Andrew. 
> 
> Ugly as dropping the lock was, it was in some ways less ambitious than
> either patch.  I understand that you disagree.

I agree with "less ambitious". But I'd also call it "papering over bad 
code", or "making bad locking much worse" or "likely introduce new and 
even more subtle problems".

In fact, I might even go so far as using "crap".

Because it really is fundamentally wrong to drop a lock in the middle of 
an operation. It may _work_, but it certainly doesn't make the code 
better.

The thing is, when we have a problem in some code, we should try to make 
sure that the fixed code is _better_ than the old code. Not just fix the 
symptoms. Fix the underlying _reason_ for why the bug happened in the 
first place.

And the underlying reason the bug happened is (I think we both agree) 
simply that the locking was broken. Your patch fixed the symptoms, yes, 
but it did so by making the locking _worse_. That's why I hated it. I 
really don't think it was a good approach. 

And I do think you agree in the end, and know that's true.

> > I happen to also think that my solution to the problem is actually better 
> > and more straightforward than your one is.
> 
> I don't think so, for several reasons.
> 
> (1) You no longer print out which module you gave up waiting for.
> (2) You now need that code complexity for !CONFIG_MODULE_UNLOAD.
> (3) It's obviously *not* straightforward otherwise you'd see why it's buggy.
> (4) The fast booting nutters want more parallelism in module loading anyway.
> (5) The locking *is* getting hairy (we drop it once already), and my patch
>     makes that more explicit and clearer.

I agree with 1-2, and even had a comment about #2 pointing that out, and 
thinking that it would be good for other reasons (ie /proc/modules).

And as to #4, the "wait for everything after the fact" is actually the 
faster approach, although in practice it probably doesn't matter (you'd 
hope that modules depending on other modules that are still busy loading 
is such a rare case that it does _not_ matter whether you wait for them to 
load serially or batched up).

And as to #5, I would not actually suggest that we do "wait later _or_ fix 
the locking", I'd do both. It really isn't a either-or situation, Rusty.

So I don't think those ones matter.

HOWEVER. 

But as to #3, I think you bring up an interesting issue:

> Prior to your patch, noone could get a reference on a module before it had
> done init.  So when init fails, we simply free the module.
> 
> Now, you've changed that with your grab-and-check-later code.  And you can't
> fix it for the !CONFIG_MODULE_UNLOAD, because there are no reference counts
> for that case.

I agree. And you're right, we'd have to move even more code from the 
MODULE_UNLOAD case into !MODULE_UNLOAD to fix it. At which point I do 
agree that if we want to keep !MODULE_UNLOD (and I do think we do), my 
approach really doesn't work. Or we'd basically make !MODULE_UNLOAD a 
pointless exercise where unloading is not allowed, but we still do 
everything else the same way.

So I do agree that your later two-patch series is the way to go.

I would like to note that your original "fix things by dropping the lock" 
patch that I hated so much had this exact bug too. Making this a good 
example of _why_ it's basically always wrong to drop locks in the middle - 
even if you claimed you knew and understood the locking.

So I do hope we can agree to call the module locking "total and utter 
crap", ok? 

And it really wasn't a personal complaint about your prowess. Crap locking 
(or code in general) is crap, but calling the code crap shouldn't be 
something you take personally. Especially not when there are various valid 
historical reasons _why_ it is all crap. 

And yes, my code was crap too. I wrote it for the MODULE_UNLOAD case, and 
only added a comment saying the !unload case was broken, but didn't look 
at just _how_ broken it was. My bad.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01 14:58                                 ` Linus Torvalds
@ 2010-06-01 17:53                                   ` Linus Torvalds
  2010-06-01 23:24                                     ` Brandon Philips
  2010-06-02  3:09                                   ` Rusty Russell
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01 17:53 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Linus Torvalds wrote:
> 
> So I do agree that your later two-patch series is the way to go.

Oh, btw, did we ever get that version tested by the only person who seems 
to see the problem? Brandon?

I'd like to have a Tested-By: line too.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01 17:53                                   ` Linus Torvalds
@ 2010-06-01 23:24                                     ` Brandon Philips
  2010-06-01 23:51                                       ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Brandon Philips @ 2010-06-01 23:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rusty Russell, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On 10:53 Tue 01 Jun 2010, Linus Torvalds wrote:
> On Tue, 1 Jun 2010, Linus Torvalds wrote:
> > So I do agree that your later two-patch series is the way to go.
> 
> Oh, btw, did we ever get that version tested by the only person who seems 
> to see the problem? Brandon?

Sorry for the delay. I broke the fuse on the remote power switch I was
using.

This is the subset of patches that everyone agreed, right?
 git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules
Otherwise what patches should I be applying to 2.6.35-rc1?

When I tested a Kernel with Rusty's modules branch pulled onto
2.6.35-rc1 I got duplicate sysfs path errors:

[    8.985303] WARNING: at fs/sysfs/dir.c:451 sysfs_add_one+0xb2/0xd0()
[    8.998161] Hardware name: Dinar
[    9.004734] sysfs: cannot create duplicate filename '/module/libcrc32c'
[    9.018068] Modules linked in: libcrc32c(+) button ohci_hcd
ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan thermal
processor thermal_sys hwmon ide_pci_generic atiixp ide_core
ata_generic ahci libahci pata_atiixp libata scsi_mod
[    9.062682] Pid: 1588, comm: modprobe Not tainted 2.6.35-rc1-0.7-default #3
[    9.076711] Call Trace:
[    9.081731]  [<ffffffff8116a162>] ? sysfs_add_one+0xb2/0xd0
[    9.092988]  [<ffffffff8104f6aa>] warn_slowpath_common+0x7a/0xd0
[    9.105115]  [<ffffffff8104f7a1>] warn_slowpath_fmt+0x41/0x50
[    9.116718]  [<ffffffff8116a162>] sysfs_add_one+0xb2/0xd0
[    9.127628]  [<ffffffff8116ace5>] create_dir+0x75/0xc0
[    9.138015]  [<ffffffff8116adc6>] sysfs_create_dir+0x96/0xc0
[    9.149447]  [<ffffffff811c5297>] kobject_add_internal+0xe7/0x250
[    9.161744]  [<ffffffff811c5518>] kobject_add_varg+0x38/0x60
[    9.173173]  [<ffffffff811c5593>] kobject_init_and_add+0x53/0x70
[    9.185295]  [<ffffffff811c5160>] ? kset_find_obj+0x40/0x90
[    9.196555]  [<ffffffff8107f193>] mod_sysfs_init+0x83/0xf0
[    9.207639]  [<ffffffff810807ec>] load_module+0xbec/0x1e10
[    9.218723]  [<ffffffff810d7467>] ? handle_mm_fault+0x227/0xaf0
[    9.230672]  [<ffffffff810da47c>] ? vma_link+0xac/0x110
[    9.241236]  [<ffffffff81385168>] ? do_page_fault+0x228/0x410
[    9.252839]  [<ffffffff810dcc19>] ? do_mmap_pgoff+0x359/0x380
[    9.264442]  [<ffffffff81081a6a>] sys_init_module+0x5a/0x220
[    9.275873]  [<ffffffff81002ec2>] system_call_fastpath+0x16/0x1b
[    9.287993] ---[ end trace 2a73785cc061234f ]---
[    9.297350] kobject_add_internal failed for libcrc32c with -EEXIST,
don't try to register things with the same name in the same directory.

To fix this we need to make sure that we only have one copy of a module
going through load_module at a time.  Patch suggestion follows which
boots without errors. I am sure there is a better way to do what it does
   ;)

Full dmesg:
  http://ifup.org/~philips/rusty-module-branch.dmesg

Cheers,

	Brandon

>From 35c6bb7c9a48b3ce53bc4b51871e6bed6e09c80a Mon Sep 17 00:00:00 2001
From: Brandon Philips <bphilips@suse.de>
Date: Wed, 2 Jun 2010 01:13:54 +0200
Subject: [PATCH] module: track module names currently loading

After "make locking more fine-grained" multiple instances of the same module
can reach mod_sysfs_init() and spew -EEXIST from kobject_add_internal().

Track the modules that are currently in load_module() but are not in the
modules list yet so we can kick out duplicate modules early.

NOTE: I didn't use mod->list to track the loading modules because I
couldn't figure out how to unwind from an error in a nice way.

Signed-off-by: Brandon Philips <bphilips@suse.de>
---
 kernel/module.c |   45 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 41 insertions(+), 4 deletions(-)

Index: linux-2.6/kernel/module.c
===================================================================
--- linux-2.6.orig/kernel/module.c
+++ linux-2.6/kernel/module.c
@@ -77,6 +77,7 @@
 DEFINE_MUTEX(module_mutex);
 EXPORT_SYMBOL_GPL(module_mutex);
 static LIST_HEAD(modules);
+static LIST_HEAD(modules_loading);
 #ifdef CONFIG_KGDB_KDB
 struct list_head *kdb_modules = &modules; /* kdb needs the list of modules */
 #endif /* CONFIG_KGDB_KDB */
@@ -2055,6 +2056,23 @@ static inline void kmemleak_load_module(
 }
 #endif
 
+/* Track module names running through load_module but not in modules list */
+struct module_name {
+	struct list_head list;
+	char name[MODULE_NAME_LEN];
+};
+
+static struct module_name *find_module_loading(const char *name)
+{
+	struct module_name *mod_name;
+
+	list_for_each_entry(mod_name, &modules_loading, list) {
+		if (strcmp(mod_name->name, name) == 0)
+			return mod_name;
+	}
+	return NULL;
+}
+
 /* Allocate and load the module: note that size of section 0 is always
    zero, and we rely on this for optional sections. */
 static noinline struct module *load_module(void __user *umod,
@@ -2074,6 +2092,7 @@ static noinline struct module *load_modu
 	void *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long symoffs, stroffs, *strmap;
 	void __percpu *percpu;
+	struct module_name *mod_name;
 
 	mm_segment_t old_fs;
 
@@ -2198,24 +2217,36 @@ static noinline struct module *load_modu
 		goto free_mod;
 	}
 
-	if (find_module(mod->name)) {
-		err = -EEXIST;
+	mod_name = kzalloc(sizeof(*mod_name), GFP_KERNEL);
+	if (!mod_name) {
+		err = -ENOMEM;
 		goto free_mod;
 	}
+	strcpy(mod_name->name, mod->name);
+	INIT_LIST_HEAD(&mod_name->list);
+
+	mutex_lock(&module_mutex);
+	if (find_module(mod->name) || find_module_loading(mod->name)) {
+		mutex_unlock(&module_mutex);
+		err = -EEXIST;
+		goto free_mod_name;
+	}
+	list_add(&mod_name->list, &modules_loading);
+	mutex_unlock(&module_mutex);
 
 	mod->state = MODULE_STATE_COMING;
 
 	/* Allow arches to frob section contents and sizes.  */
 	err = module_frob_arch_sections(hdr, sechdrs, secstrings, mod);
 	if (err < 0)
-		goto free_mod;
+		goto del_loading;
 
 	if (pcpuindex) {
 		/* We have a special allocation for this section. */
 		err = percpu_modalloc(mod, sechdrs[pcpuindex].sh_size,
 				      sechdrs[pcpuindex].sh_addralign);
 		if (err)
-			goto free_mod;
+			goto del_loading;
 		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
 	}
 	/* Keep this around for failure path. */
@@ -2486,6 +2517,8 @@ static noinline struct module *load_modu
 	 * The mutex protects against concurrent writers.
 	 */
 	mutex_lock(&module_mutex);
+	list_del(&mod_name->list);
+	kfree(mod_name);
 	list_add_rcu(&mod->list, &modules);
 	mutex_unlock(&module_mutex);
 
@@ -2530,6 +2563,10 @@ static noinline struct module *load_modu
 	/* mod will be freed with core. Don't access it beyond this line! */
  free_percpu:
 	free_percpu(percpu);
+ del_loading:
+	list_del(&mod_name->list);
+ free_mod_name:
+	kfree(mod_name);
  free_mod:
 	kfree(args);
 	kfree(strmap);

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01 23:24                                     ` Brandon Philips
@ 2010-06-01 23:51                                       ` Linus Torvalds
  2010-06-02  2:10                                         ` Brandon Philips
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-01 23:51 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rusty Russell, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Brandon Philips wrote:
> 
> When I tested a Kernel with Rusty's modules branch pulled onto
> 2.6.35-rc1 I got duplicate sysfs path errors:

Hmm. Yeah, the module_mutex used to be held across the whole "find -> add" 
state, but isn't any more.

> To fix this we need to make sure that we only have one copy of a module
> going through load_module at a time.  Patch suggestion follows which
> boots without errors. I am sure there is a better way to do what it does
>    ;)

I think Rusty may have made the lock a bit _too_ finegrained there, and 
didn't add it to some places that needed it. It looks, for example, like 
PATCH 1/2 actually drops the lock in places where it's needed 
("find_module()" is documented to need it, but now load_module() didn't 
hold it at all when it did the find_module()).

Rather than adding a new "module_loading" list, I think we should be able 
to just use the existing "modules" list, and just fix up the locking a 
bit.

In fact, maybe we could just move the "look up existing module" a bit 
later - optimistically assuming that the module doesn't exist, and then 
just undoing the work if it turns out that we were wrong, just before 
adding ourselves to the list.

A patch something like the appended (TOTALLY UNTESTED!)

		Linus
---
 kernel/module.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a1f46a5..21f7ffa 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2198,11 +2198,6 @@ static noinline struct module *load_module(void __user *umod,
 		goto free_mod;
 	}
 
-	if (find_module(mod->name)) {
-		err = -EEXIST;
-		goto free_mod;
-	}
-
 	mod->state = MODULE_STATE_COMING;
 
 	/* Allow arches to frob section contents and sizes.  */
@@ -2486,6 +2481,13 @@ static noinline struct module *load_module(void __user *umod,
 	 * The mutex protects against concurrent writers.
 	 */
 	mutex_lock(&module_mutex);
+
+	if (find_module(mod->name)) {
+		err = -EEXIST;
+		/* This will also unlock the mutex */
+		goto already_exists;
+	}
+
 	list_add_rcu(&mod->list, &modules);
 	mutex_unlock(&module_mutex);
 
@@ -2511,6 +2513,7 @@ static noinline struct module *load_module(void __user *umod,
 	mutex_lock(&module_mutex);
 	/* Unlink carefully: kallsyms could be walking list. */
 	list_del_rcu(&mod->list);
+ already_exists:
 	mutex_unlock(&module_mutex);
 	synchronize_sched();
 	module_arch_cleanup(mod);

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01 23:51                                       ` Linus Torvalds
@ 2010-06-02  2:10                                         ` Brandon Philips
  2010-06-02  3:03                                           ` Rusty Russell
  2010-06-02  4:35                                           ` Linus Torvalds
  0 siblings, 2 replies; 71+ messages in thread
From: Brandon Philips @ 2010-06-02  2:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rusty Russell, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On 16:51 Tue 01 Jun 2010, Linus Torvalds wrote:
> On Tue, 1 Jun 2010, Brandon Philips wrote:
> > When I tested a Kernel with Rusty's modules branch pulled onto
> > 2.6.35-rc1 I got duplicate sysfs path errors:
> Hmm. Yeah, the module_mutex used to be held across the whole "find -> add" 
> state, but isn't any more.

Right.

> > To fix this we need to make sure that we only have one copy of a
> > module going through load_module at a time.  Patch suggestion
> > follows which boots without errors. I am sure there is a better
> > way to do what it does ;)
> 
> I think Rusty may have made the lock a bit _too_ finegrained there,
> and didn't add it to some places that needed it. It looks, for
> example, like PATCH 1/2 actually drops the lock in places where it's
> needed ("find_module()" is documented to need it, but now
> load_module() didn't hold it at all when it did the find_module()).

Right, I noticed that too and held the lock in the patch I sent.

> Rather than adding a new "module_loading" list, I think we should be
> able to just use the existing "modules" list, and just fix up the
> locking a bit.
>
> In fact, maybe we could just move the "look up existing module" a
> bit later - optimistically assuming that the module doesn't exist,
> and then just undoing the work if it turns out that we were wrong,
> just before adding ourselves to the list.
>
> A patch something like the appended (TOTALLY UNTESTED!)

FWIW, I tried this same idea initially and it breaks because the
kobject EEXIST is coming from mod_sysfs_init() which happens further
up in load_module() before the list_add_rcu().

I also tried the obvious variation of moving the list_add_rcu() up
to where the find_module is but got:

[    5.495549] sd 0:0:1:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[    5.496931] ehci_hcd: Unknown symbol usb_hcd_resume_root_hub (err 0)
[    5.497002] ehci_hcd: Unknown symbol usb_hcd_pci_probe (err 0)
[    5.497070] ehci_hcd: Unknown symbol usb_hcd_unlink_urb_from_ep (err 0)

Feeling a bit like GoldiLocks I gave up and sent the modules_loading
patch to illustrate the issue. :)

I will keep working out all the interdependencies to see if I can get
something to boot without something like the modules_loading list.

Cheers,

	Brandon

> diff --git a/kernel/module.c b/kernel/module.c
> index a1f46a5..21f7ffa 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2198,11 +2198,6 @@ static noinline struct module *load_module(void __user *umod,
>  		goto free_mod;
>  	}
>  
> -	if (find_module(mod->name)) {
> -		err = -EEXIST;
> -		goto free_mod;
> -	}
> -
>  	mod->state = MODULE_STATE_COMING;
>  
>  	/* Allow arches to frob section contents and sizes.  */
> @@ -2486,6 +2481,13 @@ static noinline struct module *load_module(void __user *umod,
>  	 * The mutex protects against concurrent writers.
>  	 */
>  	mutex_lock(&module_mutex);
> +
> +	if (find_module(mod->name)) {
> +		err = -EEXIST;
> +		/* This will also unlock the mutex */
> +		goto already_exists;
> +	}
> +
>  	list_add_rcu(&mod->list, &modules);
>  	mutex_unlock(&module_mutex);
>  
> @@ -2511,6 +2513,7 @@ static noinline struct module *load_module(void __user *umod,
>  	mutex_lock(&module_mutex);
>  	/* Unlink carefully: kallsyms could be walking list. */
>  	list_del_rcu(&mod->list);
> + already_exists:
>  	mutex_unlock(&module_mutex);
>  	synchronize_sched();
>  	module_arch_cleanup(mod);

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  2:10                                         ` Brandon Philips
@ 2010-06-02  3:03                                           ` Rusty Russell
  2010-06-02  4:35                                           ` Linus Torvalds
  1 sibling, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  3:03 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 11:40:29 am Brandon Philips wrote:
> On 16:51 Tue 01 Jun 2010, Linus Torvalds wrote:
> > On Tue, 1 Jun 2010, Brandon Philips wrote:
> > > When I tested a Kernel with Rusty's modules branch pulled onto
> > > 2.6.35-rc1 I got duplicate sysfs path errors:
> > Hmm. Yeah, the module_mutex used to be held across the whole "find -> add" 
> > state, but isn't any more.
> 
> Right.
> 
> > > To fix this we need to make sure that we only have one copy of a
> > > module going through load_module at a time.  Patch suggestion
> > > follows which boots without errors. I am sure there is a better
> > > way to do what it does ;)
> > 
> > I think Rusty may have made the lock a bit _too_ finegrained there,
> > and didn't add it to some places that needed it. It looks, for
> > example, like PATCH 1/2 actually drops the lock in places where it's
> > needed ("find_module()" is documented to need it, but now
> > load_module() didn't hold it at all when it did the find_module()).
> 
> Right, I noticed that too and held the lock in the patch I sent.
> 
> > Rather than adding a new "module_loading" list, I think we should be
> > able to just use the existing "modules" list, and just fix up the
> > locking a bit.
> >
> > In fact, maybe we could just move the "look up existing module" a
> > bit later - optimistically assuming that the module doesn't exist,
> > and then just undoing the work if it turns out that we were wrong,
> > just before adding ourselves to the list.
> >
> > A patch something like the appended (TOTALLY UNTESTED!)
> 
> FWIW, I tried this same idea initially and it breaks because the
> kobject EEXIST is coming from mod_sysfs_init() which happens further
> up in load_module() before the list_add_rcu().

I'll take a look.  Linus was right that my lock reduction was overzealous.

The kobject error should be harmless.

I have a simplified version of your problem (see testing patch); I'll
see what I can fix...

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01 14:58                                 ` Linus Torvalds
  2010-06-01 17:53                                   ` Linus Torvalds
@ 2010-06-02  3:09                                   ` Rusty Russell
  2010-06-02  4:32                                     ` Linus Torvalds
  2010-06-02  4:56                                     ` Linus Torvalds
  1 sibling, 2 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  3:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 12:28:31 am Linus Torvalds wrote:
> 
> On Tue, 1 Jun 2010, Rusty Russell wrote:
> >
> > > So explain why I should be more polite, or more terse?
> > 
> > How about this:
> > 
> > "I hate this patch.  It makes already-ugly locking in module.c awful, and I
> > can't see that it's correct.  Why not just reduce the locking to cover
> > the minimum needed?"
> 
> Umm. Did you read the first few emails I sent out on this thread 
> originally?

OK, this might be worthwhile.  Take the very first mail you sent:

   Hmm. That does seem to be buggy.  We can't just drop and re-take the lock: 
   that may make sense internally as far as resolve_symbol() itself is 
   concerned, but the caller will its own local variables, and some of those 
   will no longer be valid if the lock was dropped. 

The implication here is that that I don't know locking, and that this was
done without thought or care.  In fact, I did worry about it and decided it
was less risky than a locking reduction given my limited cycles (indeed,
it has been slightly).

   That commit also changes the return value semantics of "use_module()", 
   which is an exported interface where the only users seem to be 
   out-of-kernel (the only in-kernel use is in kernel/module.c itself). That 
   seems like a really really bad idea too.

Again with the implication that this was done without consideration.  I keep
that exported as a courtesy to the ksplice team, but I never intended to let
that shackle internal changes even slightly. And I think it's wrong to even
start down that path (though I changed the name for you in the recent patches)

   So I think reverting it is definitely the right thing to do. The commit 
   seems fundamentally broken. And having modules do request_module() in 
   their init functions has always been invalid anyway, so that excuse 
   doesn't really seem to be a reason to do anything crazy like this either.

The code is "fundamentally broken" and "crazy".  The purpose of the patch
was "invalid anyway".

You managed to accuse me four times of being an incompetent and downright
crazy maintainer in your first three paragraphs of the first mail!

> They were actually _politer_ than your suggestion above (no 
> crap mentioned), and did exactly what you ask for.

My example quote entirely lacked accusations of incompetence; the only
implications are:
(1) Hey, I might be wrong, maybe I just didn't spend enough time on it.
(2) Otherwise, here's what I'd do, you're smart enough to run with it if
    it makes sense or explain why I'm barking up the wrong tree.

See Andrew Morton's correspondence for how his "maybe I'm dumb but" tone
is used to brutal effect...

> I would like to note that your original "fix things by dropping the lock" 
> patch that I hated so much had this exact bug too. Making this a good 
> example of _why_ it's basically always wrong to drop locks in the middle - 
> even if you claimed you knew and understood the locking.

And I would like to note that it didn't :)  It grabbed references only on
completed modules.

It still had that find_module race, but the kset error actually it.  The
lock reduction patch needs that fixed.

> So I do hope we can agree to call the module locking "total and utter 
> crap", ok? 

OK.  And I agree that I was too cautious here; I should have taken the time
to fix locking once and for all.  That's much easier to do when there's a
Linus yelling at you about it.

> And yes, my code was crap too. I wrote it for the MODULE_UNLOAD case, and 
> only added a comment saying the !unload case was broken, but didn't look 
> at just _how_ broken it was. My bad.

Oh no, your code was broken for UNLOAD too.

And sloppy; you would have got a checklist reply email if you had mailed it
anonymously :)

Thanks for the consideration,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-01  5:19                                 ` Rusty Russell
@ 2010-06-02  3:15                                   ` Rusty Russell
  0 siblings, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  3:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Tue, 1 Jun 2010 02:49:50 pm Rusty Russell wrote:
> I wonder if we should just get rid of !CONFIG_UNLOAD then?  I have a soft spot
> for it because it keeps us honest and shows how much shit is there simply for
> our poor man's pagable kernel.
> 
> Let me compile up a kernel with and without and see what it's really doing
> to us...

With a distro-style config (copied Ubuntu then held down Enter on oldconfig)
it's:
		Vmlinux	module.o (text/data)	Total module size (text/data)
With unload	8976331	24144/1036		47255565/3298004
Without unload	8962022	20551/732		47176064/3222068
Without modules 8723931

So, we pay 14k for module unload support, or 0.2%, and 0.3% across the
modules themselves.  There's real runtime costs, too, but someone would need
to annotate and see how often we do inc/dec in a real system.

Not sure that last one is exactly comparable, but it looks like module
support is a significant cost...

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  3:09                                   ` Rusty Russell
@ 2010-06-02  4:32                                     ` Linus Torvalds
  2010-06-02  4:56                                     ` Linus Torvalds
  1 sibling, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  4:32 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> > I would like to note that your original "fix things by dropping the lock" 
> > patch that I hated so much had this exact bug too. Making this a good 
> > example of _why_ it's basically always wrong to drop locks in the middle - 
> > even if you claimed you knew and understood the locking.
> 
> And I would like to note that it didn't :)  It grabbed references only on
> completed modules.

Right you are, I misread the patch. You re-did the whole module lookup, 
something that the original code didn't do. And yes, that should be safe.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  2:10                                         ` Brandon Philips
  2010-06-02  3:03                                           ` Rusty Russell
@ 2010-06-02  4:35                                           ` Linus Torvalds
  2010-06-02  4:44                                             ` Linus Torvalds
  2010-06-02  5:52                                             ` Rusty Russell
  1 sibling, 2 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  4:35 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rusty Russell, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Brandon Philips wrote:
> 
> FWIW, I tried this same idea initially and it breaks because the
> kobject EEXIST is coming from mod_sysfs_init() which happens further
> up in load_module() before the list_add_rcu().

Ahh, right. So we'd need to mvoe that down too. As Rusty says, the kobject 
EEXIST warning should be fairly harmless - albeit annoying. Do things 
actually _work_ with that thing apart from the kobject warning?

> I also tried the obvious variation of moving the list_add_rcu() up
> to where the find_module is but got:

Yeah, I don't think moving it up will work, because then the module symbol 
resolver will see itself before having set up the symbols. So I think we 
need to expose it on the modules list only after having done the 
"simplify_symbols()" etc.

I dunno.

Does this work?

(Caveat emptor - I tried to make sure the error cases nest correctly, and 
it all compiles, but it's untested. As usual. Rusty mentioned a "see 
testing patch", but I didn't see it. Maybe he did the same thing)

		Linus
---
 kernel/module.c |   35 +++++++++++++++++++----------------
 1 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a1f46a5..247c0aa 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2198,11 +2198,6 @@ static noinline struct module *load_module(void __user *umod,
 		goto free_mod;
 	}
 
-	if (find_module(mod->name)) {
-		err = -EEXIST;
-		goto free_mod;
-	}
-
 	mod->state = MODULE_STATE_COMING;
 
 	/* Allow arches to frob section contents and sizes.  */
@@ -2293,11 +2288,6 @@ static noinline struct module *load_module(void __user *umod,
 	/* Now we've moved module, initialize linked lists, etc. */
 	module_unload_init(mod);
 
-	/* add kobject, so we can reference it. */
-	err = mod_sysfs_init(mod);
-	if (err)
-		goto free_unload;
-
 	/* Set up license info based on the info section */
 	set_license(mod, get_modinfo(sechdrs, infoindex, "license"));
 
@@ -2486,16 +2476,28 @@ static noinline struct module *load_module(void __user *umod,
 	 * The mutex protects against concurrent writers.
 	 */
 	mutex_lock(&module_mutex);
+
+	if (find_module(mod->name)) {
+		err = -EEXIST;
+		/* This will also unlock the mutex */
+		goto already_exists;
+	}
+
 	list_add_rcu(&mod->list, &modules);
 	mutex_unlock(&module_mutex);
 
+	/* add kobject, so we can reference it. */
+	err = mod_sysfs_init(mod);
+	if (err)
+		goto unlink;
+
 	err = parse_args(mod->name, mod->args, mod->kp, mod->num_kp, NULL);
 	if (err < 0)
-		goto unlink;
+		goto sysfs_uninit;
 
 	err = mod_sysfs_setup(mod, mod->kp, mod->num_kp);
 	if (err < 0)
-		goto unlink;
+		goto sysfs_uninit;
 	add_sect_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
 	add_notes_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
 
@@ -2507,18 +2509,19 @@ static noinline struct module *load_module(void __user *umod,
 	/* Done! */
 	return mod;
 
+ sysfs_uninit:
+	free_modinfo(mod);
+	kobject_del(&mod->mkobj.kobj);
+	kobject_put(&mod->mkobj.kobj);
  unlink:
 	mutex_lock(&module_mutex);
 	/* Unlink carefully: kallsyms could be walking list. */
 	list_del_rcu(&mod->list);
+ already_exists:
 	mutex_unlock(&module_mutex);
 	synchronize_sched();
 	module_arch_cleanup(mod);
  cleanup:
-	free_modinfo(mod);
-	kobject_del(&mod->mkobj.kobj);
-	kobject_put(&mod->mkobj.kobj);
- free_unload:
 	module_unload_free(mod);
 #if defined(CONFIG_MODULE_UNLOAD)
 	free_percpu(mod->refptr);

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  4:35                                           ` Linus Torvalds
@ 2010-06-02  4:44                                             ` Linus Torvalds
  2010-06-02  6:35                                               ` Rusty Russell
  2010-06-02  5:52                                             ` Rusty Russell
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  4:44 UTC (permalink / raw)
  To: Brandon Philips
  Cc: Rusty Russell, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Tue, 1 Jun 2010, Linus Torvalds wrote:
> 
> (Caveat emptor - I tried to make sure the error cases nest correctly, and 
> it all compiles, but it's untested. As usual.

In fact, I didn't nest it right.

The "free_modinfo()" pairs with the "setup_modinfo()" call, and should go 
into the "cleanup" error case, not the "sysfs_uninit" error case. IOW, I 
moved one too many error case cleanup lines.

So in that patch, the "free_modinfo()" call should move back to the 
cleanup case. Like the appended (still untested - I just stared at the 
code some more, rather than do anything as mundane as _test_ it) patch.

It may still not be right, of course. But it might be closer.

(That function _really_ should be peeled like an onion, and split into 
many smaller functions, so that we don't have ten error cases needing 
unwinding. I like "goto error", but at some point you can't see the 
unwinding any more, and that function has passed that point a long time 
ago, I think)

		Linus

---
 kernel/module.c |   33 ++++++++++++++++++---------------
 1 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a1f46a5..135577c 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2198,11 +2198,6 @@ static noinline struct module *load_module(void __user *umod,
 		goto free_mod;
 	}
 
-	if (find_module(mod->name)) {
-		err = -EEXIST;
-		goto free_mod;
-	}
-
 	mod->state = MODULE_STATE_COMING;
 
 	/* Allow arches to frob section contents and sizes.  */
@@ -2293,11 +2288,6 @@ static noinline struct module *load_module(void __user *umod,
 	/* Now we've moved module, initialize linked lists, etc. */
 	module_unload_init(mod);
 
-	/* add kobject, so we can reference it. */
-	err = mod_sysfs_init(mod);
-	if (err)
-		goto free_unload;
-
 	/* Set up license info based on the info section */
 	set_license(mod, get_modinfo(sechdrs, infoindex, "license"));
 
@@ -2486,16 +2476,28 @@ static noinline struct module *load_module(void __user *umod,
 	 * The mutex protects against concurrent writers.
 	 */
 	mutex_lock(&module_mutex);
+
+	if (find_module(mod->name)) {
+		err = -EEXIST;
+		/* This will also unlock the mutex */
+		goto already_exists;
+	}
+
 	list_add_rcu(&mod->list, &modules);
 	mutex_unlock(&module_mutex);
 
+	/* add kobject, so we can reference it. */
+	err = mod_sysfs_init(mod);
+	if (err)
+		goto unlink;
+
 	err = parse_args(mod->name, mod->args, mod->kp, mod->num_kp, NULL);
 	if (err < 0)
-		goto unlink;
+		goto sysfs_uninit;
 
 	err = mod_sysfs_setup(mod, mod->kp, mod->num_kp);
 	if (err < 0)
-		goto unlink;
+		goto sysfs_uninit;
 	add_sect_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
 	add_notes_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
 
@@ -2507,18 +2509,19 @@ static noinline struct module *load_module(void __user *umod,
 	/* Done! */
 	return mod;
 
+ sysfs_uninit:
+	kobject_del(&mod->mkobj.kobj);
+	kobject_put(&mod->mkobj.kobj);
  unlink:
 	mutex_lock(&module_mutex);
 	/* Unlink carefully: kallsyms could be walking list. */
 	list_del_rcu(&mod->list);
+ already_exists:
 	mutex_unlock(&module_mutex);
 	synchronize_sched();
 	module_arch_cleanup(mod);
  cleanup:
 	free_modinfo(mod);
-	kobject_del(&mod->mkobj.kobj);
-	kobject_put(&mod->mkobj.kobj);
- free_unload:
 	module_unload_free(mod);
 #if defined(CONFIG_MODULE_UNLOAD)
 	free_percpu(mod->refptr);

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  3:09                                   ` Rusty Russell
  2010-06-02  4:32                                     ` Linus Torvalds
@ 2010-06-02  4:56                                     ` Linus Torvalds
  2010-06-02  5:52                                       ` Rusty Russell
  1 sibling, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  4:56 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> OK, this might be worthwhile.  Take the very first mail you sent:
> 
>    Hmm. That does seem to be buggy.  We can't just drop and re-take the lock: 
>    that may make sense internally as far as resolve_symbol() itself is 
>    concerned, but the caller will its own local variables, and some of those 
>    will no longer be valid if the lock was dropped. 
> 
> The implication here is that that I don't know locking, and that this was
> done without thought or care.  In fact, I did worry about it and decided it
> was less risky than a locking reduction given my limited cycles (indeed,
> it has been slightly).

Well, part of the context here is that the commit had been bisected as 
being buggy. It turns out the bug was a different issue, but at the same 
time, it very much looked like the locking was simply known a-priori to be 
buggy. No?

And I do agree with the notion that it might have been a "simpler hack for 
a quick fix", and potentially less risky (due to being more targeted to 
the particular problem), when it then causes oopses, the default 
explanation is that the dubious locking trick really was broken. No?

>    That commit also changes the return value semantics of "use_module()", 
>    which is an exported interface where the only users seem to be 
>    out-of-kernel (the only in-kernel use is in kernel/module.c itself). That 
>    seems like a really really bad idea too.
> 
> Again with the implication that this was done without consideration.

No. The implication was that it's wrong to do and a bad idea. Anything 
else is just you reading things into it.

It's an exported interface, and you changed it with no real reason. 
Whether you then talked to users or not is immaterial.

You could easily have just changed the _internal_ thing, and exported the 
unchanged interface.

Even your second version is just very confused. It renames it, but then 
exports the renamed version. What does that help? It still means that the 
external module - for no good reason - needs to have basically a 
source-level version number check.

Why?

It's still unclear to me. No reason for that exported interface change 
seems to exist.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  4:56                                     ` Linus Torvalds
@ 2010-06-02  5:52                                       ` Rusty Russell
  2010-06-02  6:59                                         ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  5:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers,
	Tim Abbott

On Wed, 2 Jun 2010 02:26:52 pm Linus Torvalds wrote:
> 
> On Wed, 2 Jun 2010, Rusty Russell wrote:
> > 
> > OK, this might be worthwhile.  Take the very first mail you sent:
> > 
> >    Hmm. That does seem to be buggy.  We can't just drop and re-take the lock: 
> >    that may make sense internally as far as resolve_symbol() itself is 
> >    concerned, but the caller will its own local variables, and some of those 
> >    will no longer be valid if the lock was dropped. 
> > 
> > The implication here is that that I don't know locking, and that this was
> > done without thought or care.  In fact, I did worry about it and decided it
> > was less risky than a locking reduction given my limited cycles (indeed,
> > it has been slightly).
> 
> Well, part of the context here is that the commit had been bisected as 
> being buggy. It turns out the bug was a different issue, but at the same 
> time, it very much looked like the locking was simply known a-priori to be 
> buggy. No?

Hmm, you're still missing it.  Let me try again.

You weren't the best person to make the call.  That didn't occur to you, did
it?

Subsystem maintainers aren't just patch monkeys, we collect all the metadata
which informs these decisions.  Without that, you wandered in, came to the
obvious conclusion, and were wrong.

Really grating is the arrogant lack of respect for that knowledge combined
with your multiple mistakes since then due to your lack of it.  See below.

> >    That commit also changes the return value semantics of "use_module()", 
> >    which is an exported interface where the only users seem to be 
> >    out-of-kernel (the only in-kernel use is in kernel/module.c itself). That 
> >    seems like a really really bad idea too.
> > 
> > Again with the implication that this was done without consideration.
> 
> No. The implication was that it's wrong to do and a bad idea. Anything 
> else is just you reading things into it.
> 
> It's an exported interface, and you changed it with no real reason. 
> Whether you then talked to users or not is immaterial.
> 
> You could easily have just changed the _internal_ thing, and exported the 
> unchanged interface.

I *genuinely* thought our policy with out-of-tree code was to smother
out-of-tree code with absolute apathy so they'll get sick of riding the
churn.

Still, easy to avoid in this case.

> Even your second version is just very confused. It renames it, but then 
> exports the renamed version. What does that help?

This *almost* looks like an honest "hey, I don't understand this".

But no, "very confused" already assumes you're right.  You do ask if it
helps...

> It still means that the  external module - for no good reason - needs to
> have basically a  source-level version number check.

...and go on to assume it doesn't.  I must be really confused, right?

Or maybe, I know something you don't about the code I maintain.

We have infrastructure to look up symbols dynamically: symbol_get().

Sure, they have to adapt their code, but now they can maintain their own
damn wrapper, *and* they'll discover the change as soon as they try to
build their module against the new kernel.

I'd go so far as to say it's clever...
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  4:35                                           ` Linus Torvalds
  2010-06-02  4:44                                             ` Linus Torvalds
@ 2010-06-02  5:52                                             ` Rusty Russell
  2010-06-02  7:21                                               ` Linus Torvalds
  1 sibling, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  5:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 02:05:10 pm Linus Torvalds wrote:
> 
> On Tue, 1 Jun 2010, Brandon Philips wrote:
> > 
> > FWIW, I tried this same idea initially and it breaks because the
> > kobject EEXIST is coming from mod_sysfs_init() which happens further
> > up in load_module() before the list_add_rcu().
> 
> Ahh, right. So we'd need to mvoe that down too. As Rusty says, the kobject 
> EEXIST warning should be fairly harmless - albeit annoying. Do things 
> actually _work_ with that thing apart from the kobject warning?
> 
> > I also tried the obvious variation of moving the list_add_rcu() up
> > to where the find_module is but got:
> 
> Yeah, I don't think moving it up will work, because then the module symbol 
> resolver will see itself before having set up the symbols. So I think we 
> need to expose it on the modules list only after having done the 
> "simplify_symbols()" etc.
> 
> I dunno.
> 
> Does this work?
> 
> (Caveat emptor - I tried to make sure the error cases nest correctly, and 
> it all compiles, but it's untested. As usual. Rusty mentioned a "see 
> testing patch", but I didn't see it. Maybe he did the same thing)

Yep, oops.  I am testing a similar thing, but a bit more of a cleanup
on the way.

Moved all the sysfs-exposing stuff to the end just after we put in the
list (and thus to after the find check).  In turn spawns a sysfs cleanup
(those funcs exposed for no good reason).  Also, your two-way deps patch
really wins since we need to walk the list to create the links.

Code soon...
Rusty

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  4:44                                             ` Linus Torvalds
@ 2010-06-02  6:35                                               ` Rusty Russell
  2010-06-02  7:45                                                 ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  6:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 02:14:37 pm Linus Torvalds wrote:
> (That function _really_ should be peeled like an onion, and split into 
> many smaller functions, so that we don't have ten error cases needing 
> unwinding. I like "goto error", but at some point you can't see the 
> unwinding any more, and that function has passed that point a long time 
> ago, I think)

Agreed.  Feel free to take a stab at it if you're bored.  Last I tried,
there wasn't an obvious split point which actually reduced the size of the
function.

And here's the tree I'm testing now (it compiles, currently rebasing to
dodge some unrelated -rc1 fs/block weirdness):

The following changes since commit 67a3e12b05e055c0415c556a315a3d3eb637e29e:
  Linus Torvalds (1):
        Linux 2.6.35-rc1

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Eric Dumazet (1):
      module: module_unload_init() cleanup

Linus Torvalds (2):
      module: Make the 'usage' lists be two-way
      module: move find_module check to end

Rusty Russell (6):
      module: fix reference to mod->percpu after freeing module.
      module: fix kdb's illicit use of struct module_use.
      module: move sysfs exposure to end of load_module
      module: Make module sysfs functions private.
      module: make locking more fine-grained.
      module: fix bne2 "gave up waiting for init of module libcrc32c"

 include/linux/module.h      |   44 ++-----
 kernel/debug/kdb/kdb_main.c |   12 +--
 kernel/module.c             |  325 ++++++++++++++++++++++++++++---------------
 3 files changed, 226 insertions(+), 155 deletions(-)

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  5:52                                       ` Rusty Russell
@ 2010-06-02  6:59                                         ` Linus Torvalds
  0 siblings, 0 replies; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  6:59 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Brandon Philips, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers,
	Tim Abbott



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> Hmm, you're still missing it.  Let me try again.
> 
> You weren't the best person to make the call.  That didn't occur to you, did
> it?

Sure, fair enough. 

At the same time, I'm the one who has to make the call. And quite often, 
that call is "this causes more problems than it fixes, we need to revert 
it".

Maybe I was too eager to revert this time. Quite often it ends up the 
other way, where we end up having broken kernels for too long because 
people weren't eager _enough_ to revert commits that had been bisected to 
be troublesome. It's hard to tell beforehand.

And quite often, subsystem maintainers are _way_ too eager to not revert 
the commits they wrote. So I really do end up having to balance that 
force. 

Does it always work out? Nope. 

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  5:52                                             ` Rusty Russell
@ 2010-06-02  7:21                                               ` Linus Torvalds
  2010-06-02 14:06                                                 ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  7:21 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> Moved all the sysfs-exposing stuff to the end just after we put in the
> list (and thus to after the find check).

Yeah, makes more sense that way. No reason to expose anything to sysfs 
early. And splitting it into two patches makes it easier to follow than 
the patch I posted. Ack.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  6:35                                               ` Rusty Russell
@ 2010-06-02  7:45                                                 ` Linus Torvalds
  2010-06-02  8:12                                                   ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  7:45 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> Agreed.  Feel free to take a stab at it if you're bored.  Last I tried,
> there wasn't an obvious split point which actually reduced the size of the
> function.

I'd start from the trivial stuff. There's a fair amount of straight-line 
code that just makes the function hard to read just because you have to 
page up and down so far. Some of it is trivial to just create a helper 
function for. 

IOW, things like this.. Pure code movement to peel off some stuff.

No, this patch on its own doesn't really make anything easier to read, but 
a handful of these kinds of things might bring down what is currently an 
almost 500-line function (yeah, really) to the point where it could 
potentially be just fifty lines (most of which is just calling helper 
functions, and checking an error case).

At that point, it should be possible to see it in one (biggish) window, 
which would make it a _lot_ easier to follow the cleanup cases.

Does this make the function smaller in any _absolute_ sense? No. The patch 
has 6 added lines, because the whole function declaration, braces, empty 
lines, call site. And the code is obviously not going to be smaller. It 
would just be a bit more easy to get an overview of.

Worth it? I dunno. But currently that 'load_module()' thing does end up 
being the function from hell. Trying to figure out the nesting of the 
error cases was a matter of paging up-and-down and trying to remember what 
particular 'goto' target I was looking for. It _should_ be possible to do 
better.

		Linus

---
 kernel/module.c |  124 +++++++++++++++++++++++++++++--------------------------
 1 files changed, 65 insertions(+), 59 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index d4a55f1..165d97e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2106,6 +2106,70 @@ static inline void kmemleak_load_module(struct module *mod, Elf_Ehdr *hdr,
 }
 #endif
 
+static void find_module_sections(struct module *mod, Elf_Ehdr *hdr, Elf_Shdr *sechdrs, char *secstrings)
+{
+	mod->kp = section_objs(hdr, sechdrs, secstrings, "__param",
+			       sizeof(*mod->kp), &mod->num_kp);
+	mod->syms = section_objs(hdr, sechdrs, secstrings, "__ksymtab",
+				 sizeof(*mod->syms), &mod->num_syms);
+	mod->crcs = section_addr(hdr, sechdrs, secstrings, "__kcrctab");
+	mod->gpl_syms = section_objs(hdr, sechdrs, secstrings, "__ksymtab_gpl",
+				     sizeof(*mod->gpl_syms),
+				     &mod->num_gpl_syms);
+	mod->gpl_crcs = section_addr(hdr, sechdrs, secstrings, "__kcrctab_gpl");
+	mod->gpl_future_syms = section_objs(hdr, sechdrs, secstrings,
+					    "__ksymtab_gpl_future",
+					    sizeof(*mod->gpl_future_syms),
+					    &mod->num_gpl_future_syms);
+	mod->gpl_future_crcs = section_addr(hdr, sechdrs, secstrings,
+					    "__kcrctab_gpl_future");
+
+#ifdef CONFIG_UNUSED_SYMBOLS
+	mod->unused_syms = section_objs(hdr, sechdrs, secstrings,
+					"__ksymtab_unused",
+					sizeof(*mod->unused_syms),
+					&mod->num_unused_syms);
+	mod->unused_crcs = section_addr(hdr, sechdrs, secstrings,
+					"__kcrctab_unused");
+	mod->unused_gpl_syms = section_objs(hdr, sechdrs, secstrings,
+					    "__ksymtab_unused_gpl",
+					    sizeof(*mod->unused_gpl_syms),
+					    &mod->num_unused_gpl_syms);
+	mod->unused_gpl_crcs = section_addr(hdr, sechdrs, secstrings,
+					    "__kcrctab_unused_gpl");
+#endif
+#ifdef CONFIG_CONSTRUCTORS
+	mod->ctors = section_objs(hdr, sechdrs, secstrings, ".ctors",
+				  sizeof(*mod->ctors), &mod->num_ctors);
+#endif
+
+#ifdef CONFIG_TRACEPOINTS
+	mod->tracepoints = section_objs(hdr, sechdrs, secstrings,
+					"__tracepoints",
+					sizeof(*mod->tracepoints),
+					&mod->num_tracepoints);
+#endif
+#ifdef CONFIG_EVENT_TRACING
+	mod->trace_events = section_objs(hdr, sechdrs, secstrings,
+					 "_ftrace_events",
+					 sizeof(*mod->trace_events),
+					 &mod->num_trace_events);
+	/*
+	 * This section contains pointers to allocated objects in the trace
+	 * code and not scanning it leads to false positives.
+	 */
+	kmemleak_scan_area(mod->trace_events, sizeof(*mod->trace_events) *
+			   mod->num_trace_events, GFP_KERNEL);
+#endif
+#ifdef CONFIG_FTRACE_MCOUNT_RECORD
+	/* sechdrs[0].sh_size is always zero */
+	mod->ftrace_callsites = section_objs(hdr, sechdrs, secstrings,
+					     "__mcount_loc",
+					     sizeof(*mod->ftrace_callsites),
+					     &mod->num_ftrace_callsites);
+#endif
+}
+
 /* Allocate and load the module: note that size of section 0 is always
    zero, and we rely on this for optional sections. */
 static noinline struct module *load_module(void __user *umod,
@@ -2365,66 +2429,8 @@ static noinline struct module *load_module(void __user *umod,
 
 	/* Now we've got everything in the final locations, we can
 	 * find optional sections. */
-	mod->kp = section_objs(hdr, sechdrs, secstrings, "__param",
-			       sizeof(*mod->kp), &mod->num_kp);
-	mod->syms = section_objs(hdr, sechdrs, secstrings, "__ksymtab",
-				 sizeof(*mod->syms), &mod->num_syms);
-	mod->crcs = section_addr(hdr, sechdrs, secstrings, "__kcrctab");
-	mod->gpl_syms = section_objs(hdr, sechdrs, secstrings, "__ksymtab_gpl",
-				     sizeof(*mod->gpl_syms),
-				     &mod->num_gpl_syms);
-	mod->gpl_crcs = section_addr(hdr, sechdrs, secstrings, "__kcrctab_gpl");
-	mod->gpl_future_syms = section_objs(hdr, sechdrs, secstrings,
-					    "__ksymtab_gpl_future",
-					    sizeof(*mod->gpl_future_syms),
-					    &mod->num_gpl_future_syms);
-	mod->gpl_future_crcs = section_addr(hdr, sechdrs, secstrings,
-					    "__kcrctab_gpl_future");
+	find_module_sections(mod, hdr, sechdrs, secstrings);
 
-#ifdef CONFIG_UNUSED_SYMBOLS
-	mod->unused_syms = section_objs(hdr, sechdrs, secstrings,
-					"__ksymtab_unused",
-					sizeof(*mod->unused_syms),
-					&mod->num_unused_syms);
-	mod->unused_crcs = section_addr(hdr, sechdrs, secstrings,
-					"__kcrctab_unused");
-	mod->unused_gpl_syms = section_objs(hdr, sechdrs, secstrings,
-					    "__ksymtab_unused_gpl",
-					    sizeof(*mod->unused_gpl_syms),
-					    &mod->num_unused_gpl_syms);
-	mod->unused_gpl_crcs = section_addr(hdr, sechdrs, secstrings,
-					    "__kcrctab_unused_gpl");
-#endif
-#ifdef CONFIG_CONSTRUCTORS
-	mod->ctors = section_objs(hdr, sechdrs, secstrings, ".ctors",
-				  sizeof(*mod->ctors), &mod->num_ctors);
-#endif
-
-#ifdef CONFIG_TRACEPOINTS
-	mod->tracepoints = section_objs(hdr, sechdrs, secstrings,
-					"__tracepoints",
-					sizeof(*mod->tracepoints),
-					&mod->num_tracepoints);
-#endif
-#ifdef CONFIG_EVENT_TRACING
-	mod->trace_events = section_objs(hdr, sechdrs, secstrings,
-					 "_ftrace_events",
-					 sizeof(*mod->trace_events),
-					 &mod->num_trace_events);
-	/*
-	 * This section contains pointers to allocated objects in the trace
-	 * code and not scanning it leads to false positives.
-	 */
-	kmemleak_scan_area(mod->trace_events, sizeof(*mod->trace_events) *
-			   mod->num_trace_events, GFP_KERNEL);
-#endif
-#ifdef CONFIG_FTRACE_MCOUNT_RECORD
-	/* sechdrs[0].sh_size is always zero */
-	mod->ftrace_callsites = section_objs(hdr, sechdrs, secstrings,
-					     "__mcount_loc",
-					     sizeof(*mod->ftrace_callsites),
-					     &mod->num_ftrace_callsites);
-#endif
 #ifdef CONFIG_MODVERSIONS
 	if ((mod->num_syms && !mod->crcs)
 	    || (mod->num_gpl_syms && !mod->gpl_crcs)

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  7:45                                                 ` Linus Torvalds
@ 2010-06-02  8:12                                                   ` Linus Torvalds
  2010-06-02  9:07                                                     ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02  8:12 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Linus Torvalds wrote:
> 
> IOW, things like this.. Pure code movement to peel off some stuff.

Here's a second one. It's slightly less trivial - since we now have error 
cases - and equally untested so it may well be totally broken. But it also 
cleans up a bit more, and avoids one of the goto targets, because the 
"move_module()" helper now does both allocations or none.

But now I'm bored and tired, so I think this will be all for tonight. The 
load_module function has gone from ~490 lines to ~370 lines, but that's 
still so many that I don't honestly think this makes any readability 
improvement yet.

But add a few more helper functions (say, one for doing relocations, one 
for doing some of the verification, one for doing the license and 
tainting, one to trivially push the icache flush into its own helper etc), 
and it probably will never fit in 50 lines, but maybe it could be 150.

			Linus

---
 kernel/module.c |  119 ++++++++++++++++++++++++++++++-------------------------
 1 files changed, 65 insertions(+), 54 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 165d97e..72567e6 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2170,6 +2170,67 @@ static void find_module_sections(struct module *mod, Elf_Ehdr *hdr, Elf_Shdr *se
 #endif
 }
 
+static struct module *move_module(struct module *mod, Elf_Ehdr *hdr, Elf_Shdr *sechdrs, char *secstrings, unsigned modindex)
+{
+	int i;
+	void *ptr;
+
+	/* Do the allocs. */
+	ptr = module_alloc_update_bounds(mod->core_size);
+	/*
+	 * The pointer to this block is stored in the module structure
+	 * which is inside the block. Just mark it as not being a
+	 * leak.
+	 */
+	kmemleak_not_leak(ptr);
+	if (!ptr)
+		return ERR_PTR(-ENOMEM);
+
+	memset(ptr, 0, mod->core_size);
+	mod->module_core = ptr;
+
+	ptr = module_alloc_update_bounds(mod->init_size);
+	/*
+	 * The pointer to this block is stored in the module structure
+	 * which is inside the block. This block doesn't need to be
+	 * scanned as it contains data and code that will be freed
+	 * after the module is initialized.
+	 */
+	kmemleak_ignore(ptr);
+	if (!ptr && mod->init_size) {
+		module_free(mod, mod->module_core);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(ptr, 0, mod->init_size);
+	mod->module_init = ptr;
+
+	/* Transfer each section which specifies SHF_ALLOC */
+	DEBUGP("final section addresses:\n");
+	for (i = 0; i < hdr->e_shnum; i++) {
+		void *dest;
+
+		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
+			continue;
+
+		if (sechdrs[i].sh_entsize & INIT_OFFSET_MASK)
+			dest = mod->module_init
+				+ (sechdrs[i].sh_entsize & ~INIT_OFFSET_MASK);
+		else
+			dest = mod->module_core + sechdrs[i].sh_entsize;
+
+		if (sechdrs[i].sh_type != SHT_NOBITS)
+			memcpy(dest, (void *)sechdrs[i].sh_addr,
+			       sechdrs[i].sh_size);
+		/* Update sh_addr to point to copy in image. */
+		sechdrs[i].sh_addr = (unsigned long)dest;
+		DEBUGP("\t0x%lx %s\n", sechdrs[i].sh_addr, secstrings + sechdrs[i].sh_name);
+	}
+	/* Module has been moved. */
+	mod = (void *)sechdrs[modindex].sh_addr;
+	kmemleak_load_module(mod, hdr, sechdrs, secstrings);
+	return mod;
+}
+
 /* Allocate and load the module: note that size of section 0 is always
    zero, and we rely on this for optional sections. */
 static noinline struct module *load_module(void __user *umod,
@@ -2186,7 +2247,6 @@ static noinline struct module *load_module(void __user *umod,
 	unsigned int modindex, versindex, infoindex, pcpuindex;
 	struct module *mod;
 	long err = 0;
-	void *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long symoffs, stroffs, *strmap;
 	void __percpu *percpu;
 
@@ -2338,60 +2398,12 @@ static noinline struct module *load_module(void __user *umod,
 	symoffs = layout_symtab(mod, sechdrs, symindex, strindex, hdr,
 				secstrings, &stroffs, strmap);
 
-	/* Do the allocs. */
-	ptr = module_alloc_update_bounds(mod->core_size);
-	/*
-	 * The pointer to this block is stored in the module structure
-	 * which is inside the block. Just mark it as not being a
-	 * leak.
-	 */
-	kmemleak_not_leak(ptr);
-	if (!ptr) {
-		err = -ENOMEM;
+	/* Allocate and move to the final place */
+	mod = move_module(mod, hdr, sechdrs, secstrings, modindex);
+	if (IS_ERR(mod)) {
+		err = PTR_ERR(mod);
 		goto free_percpu;
 	}
-	memset(ptr, 0, mod->core_size);
-	mod->module_core = ptr;
-
-	ptr = module_alloc_update_bounds(mod->init_size);
-	/*
-	 * The pointer to this block is stored in the module structure
-	 * which is inside the block. This block doesn't need to be
-	 * scanned as it contains data and code that will be freed
-	 * after the module is initialized.
-	 */
-	kmemleak_ignore(ptr);
-	if (!ptr && mod->init_size) {
-		err = -ENOMEM;
-		goto free_core;
-	}
-	memset(ptr, 0, mod->init_size);
-	mod->module_init = ptr;
-
-	/* Transfer each section which specifies SHF_ALLOC */
-	DEBUGP("final section addresses:\n");
-	for (i = 0; i < hdr->e_shnum; i++) {
-		void *dest;
-
-		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
-			continue;
-
-		if (sechdrs[i].sh_entsize & INIT_OFFSET_MASK)
-			dest = mod->module_init
-				+ (sechdrs[i].sh_entsize & ~INIT_OFFSET_MASK);
-		else
-			dest = mod->module_core + sechdrs[i].sh_entsize;
-
-		if (sechdrs[i].sh_type != SHT_NOBITS)
-			memcpy(dest, (void *)sechdrs[i].sh_addr,
-			       sechdrs[i].sh_size);
-		/* Update sh_addr to point to copy in image. */
-		sechdrs[i].sh_addr = (unsigned long)dest;
-		DEBUGP("\t0x%lx %s\n", sechdrs[i].sh_addr, secstrings + sechdrs[i].sh_name);
-	}
-	/* Module has been moved. */
-	mod = (void *)sechdrs[modindex].sh_addr;
-	kmemleak_load_module(mod, hdr, sechdrs, secstrings);
 
 #if defined(CONFIG_MODULE_UNLOAD)
 	mod->refptr = alloc_percpu(struct module_ref);
@@ -2577,7 +2589,6 @@ static noinline struct module *load_module(void __user *umod,
  free_init:
 #endif
 	module_free(mod, mod->module_init);
- free_core:
 	module_free(mod, mod->module_core);
 	/* mod will be freed with core. Don't access it beyond this line! */
  free_percpu:

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  8:12                                                   ` Linus Torvalds
@ 2010-06-02  9:07                                                     ` Rusty Russell
  0 siblings, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-02  9:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 05:42:32 pm Linus Torvalds wrote:
> 
> On Wed, 2 Jun 2010, Linus Torvalds wrote:
> > 
> > IOW, things like this.. Pure code movement to peel off some stuff.
> 
> Here's a second one. It's slightly less trivial - since we now have error 
> cases - and equally untested so it may well be totally broken. But it also 
> cleans up a bit more, and avoids one of the goto targets, because the 
> "move_module()" helper now does both allocations or none.
> 
> But now I'm bored and tired, so I think this will be all for tonight. The 
> load_module function has gone from ~490 lines to ~370 lines, but that's 
> still so many that I don't honestly think this makes any readability 
> improvement yet.

Both applied and I'll play a little more later...

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02  7:21                                               ` Linus Torvalds
@ 2010-06-02 14:06                                                 ` Rusty Russell
  2010-06-02 14:50                                                   ` Linus Torvalds
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-02 14:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Wed, 2 Jun 2010 04:51:46 pm Linus Torvalds wrote:
> 
> On Wed, 2 Jun 2010, Rusty Russell wrote:
> > 
> > Moved all the sysfs-exposing stuff to the end just after we put in the
> > list (and thus to after the find check).
> 
> Yeah, makes more sense that way. No reason to expose anything to sysfs 
> early. And splitting it into two patches makes it easier to follow than 
> the patch I posted. Ack.

Found another locking issue: the code which verifies we don't export over
an existing symbol needs to be atomic wrt. adding us to the list.

This will be in tomorrow's linux-next.

And load_module is down to 259 lines.  The label chain at the end is no
shorter tho :(  I'll leave those cleanups until next merge window.

The following changes since commit aef4b9aaae1decc775778903922bd0075cce7a88:
  Linus Torvalds (1):
        Merge branch 'next' of git://git.kernel.org/.../benh/powerpc

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Eric Dumazet (1):
      module: module_unload_init() cleanup

Linus Torvalds (4):
      module: Make the 'usage' lists be two-way
      module: move find_module check to end
      module: refactor load_module
      module: refactor load_module part 2

Rusty Russell (9):
      module: fix kdb's illicit use of struct module_use.
      module: move sysfs exposure to end of load_module
      module: Make module sysfs functions private.
      module: make locking more fine-grained.
      module: verify_export_symbols under the lock
      module: fix bne2 "gave up waiting for init of module libcrc32c"
      module: refactor load_module part 3
      module: refactor load_module part 4
      module: refactor load_module part 5

 include/linux/module.h      |   44 +--
 kernel/debug/kdb/kdb_main.c |   12 +-
 kernel/module.c             |  899 +++++++++++++++++++++++++------------------
 3 files changed, 547 insertions(+), 408 deletions(-)

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02 14:06                                                 ` Rusty Russell
@ 2010-06-02 14:50                                                   ` Linus Torvalds
  2010-06-03 13:06                                                     ` Rusty Russell
  2010-06-02 16:53                                                   ` Brandon Philips
  2010-06-02 18:01                                                   ` Linus Torvalds
  2 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02 14:50 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> Found another locking issue: the code which verifies we don't export over
> an existing symbol needs to be atomic wrt. adding us to the list.

Yup.

And now that I'm looking at that call-chain (to see if it would make sense 
to use some other more specific lock - doesn't look like it: all the 
readers are using RCU and this is the only writer), I also give you this 
trivial one-liner. It changes each_symbol() to not put that constant array 
on the stack, resulting in changing

        movq    $C.388.31095, %rsi      #, tmp85
        subq    $376, %rsp      #,
        movq    %rdi, %rbx      # fn, fn
        leaq    -208(%rbp), %rdi        #, tmp84
        movq    %rbx, %rdx      # fn,
        rep movsl
        xorl    %esi, %esi      #
        leaq    -208(%rbp), %rdi        #, tmp87
        movq    %r12, %rcx      # data,
        call    each_symbol_in_section.clone.0  #

into

        xorl    %esi, %esi      #
        subq    $216, %rsp      #,
        movq    %rdi, %rbx      # fn, fn  
        movq    $arr.31078, %rdi        #,
        call    each_symbol_in_section.clone.0  #

which is not so much about being obviously shorter and simpler because we 
don't unnecessarily copy that constant array around onto the stack, but 
also about having a much smaller stack footprint (376 vs 216 bytes - see 
the update of 'rsp').

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Btw, feel free to consider all those peeling patches signed-off-on, since 
they seem to work.

> And load_module is down to 259 lines.  The label chain at the end is no
> shorter tho :(  I'll leave those cleanups until next merge window.

I don't think it's necessarily bad to have multiple exit targets (although 
9 is pushing it), if the thing is short enough that you can fairly easily 
see how they work. Not that it's quite there yet.

		Linus
---
 kernel/module.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 01e8874..2602264 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -227,7 +227,7 @@ bool each_symbol(bool (*fn)(const struct symsearch *arr, struct module *owner,
 			    unsigned int symnum, void *data), void *data)
 {
 	struct module *mod;
-	const struct symsearch arr[] = {
+	static const struct symsearch arr[] = {
 		{ __start___ksymtab, __stop___ksymtab, __start___kcrctab,
 		  NOT_GPL_ONLY, false },
 		{ __start___ksymtab_gpl, __stop___ksymtab_gpl,

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02 14:06                                                 ` Rusty Russell
  2010-06-02 14:50                                                   ` Linus Torvalds
@ 2010-06-02 16:53                                                   ` Brandon Philips
  2010-06-02 18:01                                                   ` Linus Torvalds
  2 siblings, 0 replies; 71+ messages in thread
From: Brandon Philips @ 2010-06-02 16:53 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On 23:36 Wed 02 Jun 2010, Rusty Russell wrote:
> On Wed, 2 Jun 2010 04:51:46 pm Linus Torvalds wrote:
> > On Wed, 2 Jun 2010, Rusty Russell wrote:
> > > 
> > > Moved all the sysfs-exposing stuff to the end just after we put in the
> > > list (and thus to after the find check).
> > 
> > Yeah, makes more sense that way. No reason to expose anything to sysfs 
> > early. And splitting it into two patches makes it easier to follow than 
> > the patch I posted. Ack.
> 
> Found another locking issue: the code which verifies we don't export over
> an existing symbol needs to be atomic wrt. adding us to the list.
> 
...
> are available in the git repository at:
> 
>   ssh://master.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

I tested this on the machine with the bnx2/libcrc32c issue and it
works. Feel free to add Tested-by: Brandon Philips <bphilips@suse.de>

Thanks,

	Brandon

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02 14:06                                                 ` Rusty Russell
  2010-06-02 14:50                                                   ` Linus Torvalds
  2010-06-02 16:53                                                   ` Brandon Philips
@ 2010-06-02 18:01                                                   ` Linus Torvalds
  2010-06-03  5:20                                                     ` Rusty Russell
  2 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-02 18:01 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Wed, 2 Jun 2010, Rusty Russell wrote:
> 
> And load_module is down to 259 lines.  The label chain at the end is no
> shorter tho :(  I'll leave those cleanups until next merge window.

Btw, here's a patch that _looks_ large, but it really pretty trivial, and 
sets things up so that it would be way easier to split off pieces of the 
module loading.

The reason it looks large is that it creates a "module_info" structure 
that contains all the module state that we're building up while loading, 
instead of having individual variables for all the indices etc.

So the patch ends up being large, because every "symindex" access instead 
becomes "info.index.sym" etc. That may be a few characters longer, but it 
then means that we can just pass a pointer to that "info" structure 
around. and let all the pieces fill it in very naturally.

As an example of that, the patch also moves the initialization of all 
those convenience variables into a "setup_module_info()" function. And at 
this point it really does become very natural to start to peel off some of 
the error labels and move them into the helper functions - now the 
"truncated" case is gone, and is handled inside that setup function 
instead.

So maybe you don't like this approach, and it does make the variable 
accesses a bit longer, but I don't think unreadably so. And the patch 
really does look big and scary, but there really should be absolutely no 
semantic changes - most of it was a trivial and mindless rename.

In fact, it was so mindless that I on purpose kept the existing helper 
functions looking like this:

-       err = check_modinfo(mod, sechdrs, infoindex, versindex);
+       err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);

rather than changing them to just take the "info" pointer. IOW, a second 
phase (if you think the approach is ok) would change that calling 
convention to just do

	err = check_modinfo(mod, &info);

(and same for "layout_sections()", "layout_symtabs()" etc.) Similarly, 
while right now it makes things _look_ bigger, with things like this:

	versindex = find_sec(hdr, sechdrs, secstrings, "__versions");

becoming

	info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");

in the new "setup_module_info()" function, that's again just a result of 
it being a search-and-replace patch. By using the 'info' pointer, we could 
just change the 'find_sec()' interface so that it ends up being

	info->index.vers = find_sec(info, "__versions");

instead, and then we'd actually have a shorter and more readable line. So 
for a lot of those mindless variable name expansions there's would be room 
for separate cleanups.

I didn't move quite everything in there - if we do this to layout_symtabs, 
for example, we'd want to move the percpu, symoffs, stroffs, *strmap 
variables to be fields in that module_info structure too. But that's a 
much smaller patch, I moved just the really core stuff that is currently 
being set up and used in various parts.

But even in this rough form, it removes close to 70 lines from that 
function (but adds 22 lines overall, of course - the structure definition, 
the helper function declarations and call-sites etc etc).

		Linus

---
 kernel/module.c |  228 ++++++++++++++++++++++++++++++-------------------------
 1 files changed, 125 insertions(+), 103 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 01e8874..5cfe82a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2139,8 +2139,17 @@ static inline void kmemleak_load_module(struct module *mod, Elf_Ehdr *hdr,
 }
 #endif
 
-static int copy_and_check(Elf_Ehdr **hdrp,
-			  const void __user *umod, unsigned long len)
+struct module_info {
+	Elf_Ehdr *hdr;
+	unsigned long len;
+	Elf_Shdr *sechdrs;
+	char *secstrings, *args, *strtab;
+	struct {
+		unsigned int sym, str, mod, vers, info, pcpu;
+	} index;
+};
+
+static int copy_and_check(struct module_info *info, const void __user *umod, unsigned long len)
 {
 	int err;
 	Elf_Ehdr *hdr;
@@ -2150,7 +2159,7 @@ static int copy_and_check(Elf_Ehdr **hdrp,
 
 	/* Suck in entire file: we'll want most of it. */
 	/* vmalloc barfs on "unusual" numbers.  Check here */
-	if (len > 64 * 1024 * 1024 || (hdr = *hdrp = vmalloc(len)) == NULL)
+	if (len > 64 * 1024 * 1024 || (hdr = vmalloc(len)) == NULL)
 		return -ENOMEM;
 
 	if (copy_from_user(hdr, umod, len) != 0) {
@@ -2172,6 +2181,8 @@ static int copy_and_check(Elf_Ehdr **hdrp,
 		err = -ENOEXEC;
 		goto free_hdr;
 	}
+	info->hdr = hdr;
+	info->len = len;
 	return 0;
 
 free_hdr:
@@ -2179,6 +2190,80 @@ free_hdr:
 	return err;
 }
 
+/*
+ * Set up our basic convenience variables (pointers to section headers,
+ * search for module section index etc), and do some basic section
+ * verification.
+ *
+ * Return the temporary module pointer (we'll replace it with the final
+ * one when we move the module sections around).
+ */
+static struct module *setup_module_info(struct module_info *info)
+{
+	unsigned int i;
+	struct module *mod;
+
+	/* Set up the convenience variables */
+	info->sechdrs = (void *)info->hdr + info->hdr->e_shoff;
+	info->secstrings = (void *)info->hdr + info->sechdrs[info->hdr->e_shstrndx].sh_offset;
+	info->sechdrs[0].sh_addr = 0;
+
+	for (i = 1; i < info->hdr->e_shnum; i++) {
+		if (info->sechdrs[i].sh_type != SHT_NOBITS
+		    && info->len < info->sechdrs[i].sh_offset + info->sechdrs[i].sh_size)
+			goto truncated;
+
+		/* Mark all sections sh_addr with their address in the
+		   temporary image. */
+		info->sechdrs[i].sh_addr = (size_t)info->hdr + info->sechdrs[i].sh_offset;
+
+		/* Internal symbols and strings. */
+		if (info->sechdrs[i].sh_type == SHT_SYMTAB) {
+			info->index.sym = i;
+			info->index.str = info->sechdrs[i].sh_link;
+			info->strtab = (char *)info->hdr + info->sechdrs[info->index.str].sh_offset;
+		}
+#ifndef CONFIG_MODULE_UNLOAD
+		/* Don't load .exit sections */
+		if (strstarts(info->secstrings+info->sechdrs[i].sh_name, ".exit"))
+			info->sechdrs[i].sh_flags &= ~(unsigned long)SHF_ALLOC;
+#endif
+	}
+
+	info->index.mod = find_sec(info->hdr, info->sechdrs, info->secstrings,
+			    ".gnu.linkonce.this_module");
+	if (!info->index.mod) {
+		printk(KERN_WARNING "No module found in object\n");
+		return ERR_PTR(-ENOEXEC);
+	}
+	/* This is temporary: point mod into copy of data. */
+	mod = (void *)info->sechdrs[info->index.mod].sh_addr;
+
+	if (info->index.sym == 0) {
+		printk(KERN_WARNING "%s: module has no symbols (stripped?)\n",
+		       mod->name);
+		return ERR_PTR(-ENOEXEC);
+	}
+
+	info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");
+	info->index.info = find_sec(info->hdr, info->sechdrs, info->secstrings, ".modinfo");
+	info->index.pcpu = find_pcpusec(info->hdr, info->sechdrs, info->secstrings);
+
+	/* Don't keep modinfo and version sections. */
+	info->sechdrs[info->index.info].sh_flags &= ~(unsigned long)SHF_ALLOC;
+	info->sechdrs[info->index.vers].sh_flags &= ~(unsigned long)SHF_ALLOC;
+
+	/* Check module struct version now, before we try to use module. */
+	if (!check_modstruct_version(info->sechdrs, info->index.vers, mod))
+		return ERR_PTR(-ENOEXEC);
+
+	return mod;
+
+ truncated:
+	printk(KERN_ERR "Module len %lu truncated\n", info->len);
+	return ERR_PTR(-ENOEXEC);
+}
+
 static int check_modinfo(struct module *mod,
 			 const Elf_Shdr *sechdrs,
 			 unsigned int infoindex, unsigned int versindex)
@@ -2404,13 +2489,7 @@ static noinline struct module *load_module(void __user *umod,
 				  unsigned long len,
 				  const char __user *uargs)
 {
-	Elf_Ehdr *hdr;
-	Elf_Shdr *sechdrs;
-	char *secstrings, *args, *strtab = NULL;
-	unsigned int i;
-	unsigned int symindex = 0;
-	unsigned int strindex = 0;
-	unsigned int modindex, versindex, infoindex, pcpuindex;
+	struct module_info info = { NULL, };
 	struct module *mod;
 	long err;
 	unsigned long symoffs, stroffs, *strmap;
@@ -2419,80 +2498,28 @@ static noinline struct module *load_module(void __user *umod,
 	DEBUGP("load_module: umod=%p, len=%lu, uargs=%p\n",
 	       umod, len, uargs);
 
-	err = copy_and_check(&hdr, umod, len);
+	err = copy_and_check(&info, umod, len);
 	if (err)
 		return ERR_PTR(err);
 
-	/* Convenience variables */
-	sechdrs = (void *)hdr + hdr->e_shoff;
-	secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
-	sechdrs[0].sh_addr = 0;
-
-	for (i = 1; i < hdr->e_shnum; i++) {
-		if (sechdrs[i].sh_type != SHT_NOBITS
-		    && len < sechdrs[i].sh_offset + sechdrs[i].sh_size)
-			goto truncated;
-
-		/* Mark all sections sh_addr with their address in the
-		   temporary image. */
-		sechdrs[i].sh_addr = (size_t)hdr + sechdrs[i].sh_offset;
-
-		/* Internal symbols and strings. */
-		if (sechdrs[i].sh_type == SHT_SYMTAB) {
-			symindex = i;
-			strindex = sechdrs[i].sh_link;
-			strtab = (char *)hdr + sechdrs[strindex].sh_offset;
-		}
-#ifndef CONFIG_MODULE_UNLOAD
-		/* Don't load .exit sections */
-		if (strstarts(secstrings+sechdrs[i].sh_name, ".exit"))
-			sechdrs[i].sh_flags &= ~(unsigned long)SHF_ALLOC;
-#endif
-	}
-
-	modindex = find_sec(hdr, sechdrs, secstrings,
-			    ".gnu.linkonce.this_module");
-	if (!modindex) {
-		printk(KERN_WARNING "No module found in object\n");
-		err = -ENOEXEC;
-		goto free_hdr;
-	}
-	/* This is temporary: point mod into copy of data. */
-	mod = (void *)sechdrs[modindex].sh_addr;
-
-	if (symindex == 0) {
-		printk(KERN_WARNING "%s: module has no symbols (stripped?)\n",
-		       mod->name);
-		err = -ENOEXEC;
-		goto free_hdr;
-	}
-
-	versindex = find_sec(hdr, sechdrs, secstrings, "__versions");
-	infoindex = find_sec(hdr, sechdrs, secstrings, ".modinfo");
-	pcpuindex = find_pcpusec(hdr, sechdrs, secstrings);
-
-	/* Don't keep modinfo and version sections. */
-	sechdrs[infoindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
-	sechdrs[versindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
-
-	/* Check module struct version now, before we try to use module. */
-	if (!check_modstruct_version(sechdrs, versindex, mod)) {
-		err = -ENOEXEC;
+	mod = setup_module_info(&info);
+	if (IS_ERR(mod)) {
+		err = PTR_ERR(mod);
 		goto free_hdr;
 	}
 
-	err = check_modinfo(mod, sechdrs, infoindex, versindex);
+	err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);
 	if (err)
 		goto free_hdr;
 
 	/* Now copy in args */
-	args = strndup_user(uargs, ~0UL >> 1);
-	if (IS_ERR(args)) {
-		err = PTR_ERR(args);
+	info.args = strndup_user(uargs, ~0UL >> 1);
+	if (IS_ERR(info.args)) {
+		err = PTR_ERR(info.args);
 		goto free_hdr;
 	}
 
-	strmap = kzalloc(BITS_TO_LONGS(sechdrs[strindex].sh_size)
+	strmap = kzalloc(BITS_TO_LONGS(info.sechdrs[info.index.str].sh_size)
 			 * sizeof(long), GFP_KERNEL);
 	if (!strmap) {
 		err = -ENOMEM;
@@ -2502,17 +2529,17 @@ static noinline struct module *load_module(void __user *umod,
 	mod->state = MODULE_STATE_COMING;
 
 	/* Allow arches to frob section contents and sizes.  */
-	err = module_frob_arch_sections(hdr, sechdrs, secstrings, mod);
+	err = module_frob_arch_sections(info.hdr, info.sechdrs, info.secstrings, mod);
 	if (err < 0)
 		goto free_mod;
 
-	if (pcpuindex) {
+	if (info.index.pcpu) {
 		/* We have a special allocation for this section. */
-		err = percpu_modalloc(mod, sechdrs[pcpuindex].sh_size,
-				      sechdrs[pcpuindex].sh_addralign);
+		err = percpu_modalloc(mod, info.sechdrs[info.index.pcpu].sh_size,
+				      info.sechdrs[info.index.pcpu].sh_addralign);
 		if (err)
 			goto free_mod;
-		sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
+		info.sechdrs[info.index.pcpu].sh_flags &= ~(unsigned long)SHF_ALLOC;
 	}
 	/* Keep this around for failure path. */
 	percpu = mod_percpu(mod);
@@ -2520,12 +2547,12 @@ static noinline struct module *load_module(void __user *umod,
 	/* Determine total sizes, and put offsets in sh_entsize.  For now
 	   this is done generically; there doesn't appear to be any
 	   special cases for the architectures. */
-	layout_sections(mod, hdr, sechdrs, secstrings);
-	symoffs = layout_symtab(mod, sechdrs, symindex, strindex, hdr,
-				secstrings, &stroffs, strmap);
+	layout_sections(mod, info.hdr, info.sechdrs, info.secstrings);
+	symoffs = layout_symtab(mod, info.sechdrs, info.index.sym, info.index.str, info.hdr,
+				info.secstrings, &stroffs, strmap);
 
 	/* Allocate and move to the final place */
-	mod = move_module(mod, hdr, sechdrs, secstrings, modindex);
+	mod = move_module(mod, info.hdr, info.sechdrs, info.secstrings, info.index.mod);
 	if (IS_ERR(mod)) {
 		err = PTR_ERR(mod);
 		goto free_percpu;
@@ -2538,36 +2565,36 @@ static noinline struct module *load_module(void __user *umod,
 
 	/* Now we've got everything in the final locations, we can
 	 * find optional sections. */
-	find_module_sections(mod, hdr, sechdrs, secstrings);
+	find_module_sections(mod, info.hdr, info.sechdrs, info.secstrings);
 
-	err = check_module_license_and_versions(mod, sechdrs);
+	err = check_module_license_and_versions(mod, info.sechdrs);
 	if (err)
 		goto free_unload;
 
 	/* Set up MODINFO_ATTR fields */
-	setup_modinfo(mod, sechdrs, infoindex);
+	setup_modinfo(mod, info.sechdrs, info.index.info);
 
 	/* Fix up syms, so that st_value is a pointer to location. */
-	err = simplify_symbols(sechdrs, symindex, strtab, versindex, pcpuindex,
+	err = simplify_symbols(info.sechdrs, info.index.sym, info.strtab, info.index.vers, info.index.pcpu,
 			       mod);
 	if (err < 0)
 		goto cleanup;
 
-	err = apply_relocations(mod, hdr, sechdrs, symindex, strindex);
+	err = apply_relocations(mod, info.hdr, info.sechdrs, info.index.sym, info.index.str);
 	if (err < 0)
 		goto cleanup;
 
   	/* Set up and sort exception table */
-	mod->extable = section_objs(hdr, sechdrs, secstrings, "__ex_table",
+	mod->extable = section_objs(info.hdr, info.sechdrs, info.secstrings, "__ex_table",
 				    sizeof(*mod->extable), &mod->num_exentries);
 	sort_extable(mod->extable, mod->extable + mod->num_exentries);
 
 	/* Finally, copy percpu area over. */
-	percpu_modcopy(mod, (void *)sechdrs[pcpuindex].sh_addr,
-		       sechdrs[pcpuindex].sh_size);
+	percpu_modcopy(mod, (void *)info.sechdrs[info.index.pcpu].sh_addr,
+		       info.sechdrs[info.index.pcpu].sh_size);
 
-	add_kallsyms(mod, sechdrs, hdr->e_shnum, symindex, strindex,
-		     symoffs, stroffs, secstrings, strmap);
+	add_kallsyms(mod, info.sechdrs, info.hdr->e_shnum, info.index.sym, info.index.str,
+		     symoffs, stroffs, info.secstrings, strmap);
 	kfree(strmap);
 	strmap = NULL;
 
@@ -2575,19 +2602,19 @@ static noinline struct module *load_module(void __user *umod,
 		struct _ddebug *debug;
 		unsigned int num_debug;
 
-		debug = section_objs(hdr, sechdrs, secstrings, "__verbose",
+		debug = section_objs(info.hdr, info.sechdrs, info.secstrings, "__verbose",
 				     sizeof(*debug), &num_debug);
 		if (debug)
 			dynamic_debug_setup(debug, num_debug);
 	}
 
-	err = module_finalize(hdr, sechdrs, mod);
+	err = module_finalize(info.hdr, info.sechdrs, mod);
 	if (err < 0)
 		goto cleanup;
 
 	flush_module_icache(mod);
 
-	mod->args = args;
+	mod->args = info.args;
 
 	/* Now sew it into the lists so we can get lockdep and oops
 	 * info during argument parsing.  Noone should access us, since
@@ -2618,11 +2645,11 @@ static noinline struct module *load_module(void __user *umod,
 	if (err < 0)
 		goto unlink;
 
-	add_sect_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
-	add_notes_attrs(mod, hdr->e_shnum, secstrings, sechdrs);
+	add_sect_attrs(mod, info.hdr->e_shnum, info.secstrings, info.sechdrs);
+	add_notes_attrs(mod, info.hdr->e_shnum, info.secstrings, info.sechdrs);
 
 	/* Get rid of temporary copy */
-	vfree(hdr);
+	vfree(info.hdr);
 
 	trace_module_load(mod);
 
@@ -2648,16 +2675,11 @@ static noinline struct module *load_module(void __user *umod,
  free_percpu:
 	free_percpu(percpu);
  free_mod:
-	kfree(args);
+	kfree(info.args);
 	kfree(strmap);
  free_hdr:
-	vfree(hdr);
+	vfree(info.hdr);
 	return ERR_PTR(err);
-
- truncated:
-	printk(KERN_ERR "Module len %lu truncated\n", len);
-	err = -ENOEXEC;
-	goto free_hdr;
 }
 
 /* Call module constructors. */

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02 18:01                                                   ` Linus Torvalds
@ 2010-06-03  5:20                                                     ` Rusty Russell
  2010-06-03 16:24                                                       ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-03  5:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Thu, 3 Jun 2010 03:31:06 am Linus Torvalds wrote:
> 
> On Wed, 2 Jun 2010, Rusty Russell wrote:
> > 
> > And load_module is down to 259 lines.  The label chain at the end is no
> > shorter tho :(  I'll leave those cleanups until next merge window.
> 
> Btw, here's a patch that _looks_ large, but it really pretty trivial, and 
> sets things up so that it would be way easier to split off pieces of the 
> module loading.
> 
> The reason it looks large is that it creates a "module_info" structure 
> that contains all the module state that we're building up while loading, 
> instead of having individual variables for all the indices etc.
> 
> So the patch ends up being large, because every "symindex" access instead 
> becomes "info.index.sym" etc. That may be a few characters longer, but it 
> then means that we can just pass a pointer to that "info" structure 
> around. and let all the pieces fill it in very naturally.
> 
> As an example of that, the patch also moves the initialization of all 
> those convenience variables into a "setup_module_info()" function. And at 
> this point it really does become very natural to start to peel off some of 
> the error labels and move them into the helper functions - now the 
> "truncated" case is gone, and is handled inside that setup function 
> instead.
> 
> So maybe you don't like this approach, and it does make the variable 
> accesses a bit longer, but I don't think unreadably so. And the patch 
> really does look big and scary, but there really should be absolutely no 
> semantic changes - most of it was a trivial and mindless rename.
> 
> In fact, it was so mindless that I on purpose kept the existing helper 
> functions looking like this:
> 
> -       err = check_modinfo(mod, sechdrs, infoindex, versindex);
> +       err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);
> 
> rather than changing them to just take the "info" pointer. IOW, a second 
> phase (if you think the approach is ok) would change that calling 
> convention to just do
> 
> 	err = check_modinfo(mod, &info);
> 
> (and same for "layout_sections()", "layout_symtabs()" etc.) Similarly, 
> while right now it makes things _look_ bigger, with things like this:
> 
> 	versindex = find_sec(hdr, sechdrs, secstrings, "__versions");
> 
> becoming
> 
> 	info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");
> 
> in the new "setup_module_info()" function, that's again just a result of 
> it being a search-and-replace patch. By using the 'info' pointer, we could 
> just change the 'find_sec()' interface so that it ends up being
> 
> 	info->index.vers = find_sec(info, "__versions");
> 
> instead, and then we'd actually have a shorter and more readable line. So 
> for a lot of those mindless variable name expansions there's would be room 
> for separate cleanups.
> 
> I didn't move quite everything in there - if we do this to layout_symtabs, 
> for example, we'd want to move the percpu, symoffs, stroffs, *strmap 
> variables to be fields in that module_info structure too. But that's a 
> much smaller patch, I moved just the really core stuff that is currently 
> being set up and used in various parts.
> 
> But even in this rough form, it removes close to 70 lines from that 
> function (but adds 22 lines overall, of course - the structure definition, 
> the helper function declarations and call-sites etc etc).

Applied.  I thought about the same thing but had the same doubts as you.

However, you're right that it has potential.  I'll rename module_info to
load_info if you don't mind tho: contains more semantic punch IMHO.

On top of this, I'm right now closing on another ideal of mine: encapsulate
all the "before we move module" into one function.  That before vs. after
always made me nervous...

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-02 14:50                                                   ` Linus Torvalds
@ 2010-06-03 13:06                                                     ` Rusty Russell
  0 siblings, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-03 13:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Thu, 3 Jun 2010 12:20:48 am Linus Torvalds wrote:
> 
> On Wed, 2 Jun 2010, Rusty Russell wrote:
> > 
> > Found another locking issue: the code which verifies we don't export over
> > an existing symbol needs to be atomic wrt. adding us to the list.
> 
> Yup.
> 
> And now that I'm looking at that call-chain (to see if it would make sense 
> to use some other more specific lock - doesn't look like it: all the 
> readers are using RCU and this is the only writer), I also give you this 
> trivial one-liner. It changes each_symbol() to not put that constant array 
> on the stack, resulting in changing
> 
>         movq    $C.388.31095, %rsi      #, tmp85
>         subq    $376, %rsp      #,
>         movq    %rdi, %rbx      # fn, fn
>         leaq    -208(%rbp), %rdi        #, tmp84
>         movq    %rbx, %rdx      # fn,
>         rep movsl
>         xorl    %esi, %esi      #
>         leaq    -208(%rbp), %rdi        #, tmp87
>         movq    %r12, %rcx      # data,
>         call    each_symbol_in_section.clone.0  #
> 
> into
> 
>         xorl    %esi, %esi      #
>         subq    $216, %rsp      #,
>         movq    %rdi, %rbx      # fn, fn  
>         movq    $arr.31078, %rdi        #,
>         call    each_symbol_in_section.clone.0  #
> 
> which is not so much about being obviously shorter and simpler because we 
> don't unnecessarily copy that constant array around onto the stack, but 
> also about having a much smaller stack footprint (376 vs 216 bytes - see 
> the update of 'rsp').
> 
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

BTW, applied, thanks!

I've finished the cleanup now, and removed the noinline on load_module;
we're down to 124 bytes of stack for sys_init_module here (32 bit).

The following changes since commit aef4b9aaae1decc775778903922bd0075cce7a88:
  Linus Torvalds (1):
        Merge branch 'next' of git://git.kernel.org/.../benh/powerpc

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Eric Dumazet (1):
      module: module_unload_init() cleanup

Linus Torvalds (6):
      module: Make the 'usage' lists be two-way
      module: move find_module check to end
      module: refactor load_module
      module: refactor load_module part 2
      module: reduce stack usage for each_symbol()
      module: fix bne2 "gave up waiting for init of module libcrc32c"

Rusty Russell (18):
      module: fix kdb's illicit use of struct module_use.
      module: move sysfs exposure to end of load_module
      module: Make module sysfs functions private.
      module: make locking more fine-grained.
      module: verify_export_symbols under the lock
      module: fix bne2 "gave up waiting for init of module libcrc32c"
      module: refactor load_module part 3
      module: refactor load_module part 4
      module: refactor load_module part 5
      module: refactor out section header rewriting
      module: kallsyms functions take struct load_info
      module: layout_and_allocate
      module: sysfs cleanup
      module: pass load_info into other functions
      module: move module args strndup_user to just before use
      module: simplify per-cpu handling a little
      module: group post-relocation functions into post_relocation()
      module: cleanup comments, remove noinline

 include/linux/module.h      |   44 +--
 kernel/debug/kdb/kdb_main.c |   12 +-
 kernel/module.c             | 1361 ++++++++++++++++++++++++-------------------




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-03  5:20                                                     ` Rusty Russell
@ 2010-06-03 16:24                                                       ` Linus Torvalds
  2010-06-04  1:02                                                         ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-03 16:24 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Thu, 3 Jun 2010, Rusty Russell wrote:
> 
> However, you're right that it has potential.  I'll rename module_info to
> load_info if you don't mind tho: contains more semantic punch IMHO.

Umm. One problem is that you will almost certainly eventually want to 
expose that to the architecture "fixup" routines (ie things like 
module_frob_arch_sections(), arch_mod_section_prepend()), and at that 
point "load_info" is a horribly bad structure name, since it would show 
up in <linux/module.h> and thus be exported all over.

At least call it "struct module_load_info". But yes, I do agree that the 
"load" part is important.

> On top of this, I'm right now closing on another ideal of mine: encapsulate
> all the "before we move module" into one function.  That before vs. after
> always made me nervous...

Yeah, that should be trivial, and I agree that it would be good to not 
have "mod" mean two things in the same function. Especially with all the 
"goto failure-case", and some of the failure cases using "mod", it is a 
bit scary for it to point into the (before movement) 'hdr+len' structure, 
and then (after movement) into the relocated module allocations.

I looked at that particularly when doing that whole

	mod = setup_module_info(&info);
	if (IS_ERR(mod)) {
		err = PTR_ERR(mod);
		goto free_hdr;
	}

thing, because that made "mod" have _three_ totally different values 
(error, before, after) when jumping out to the failure paths. 

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-03 16:24                                                       ` Linus Torvalds
@ 2010-06-04  1:02                                                         ` Rusty Russell
  2010-06-04  1:55                                                           ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-04  1:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Fri, 4 Jun 2010 01:54:19 am Linus Torvalds wrote:
> 
> On Thu, 3 Jun 2010, Rusty Russell wrote:
> > 
> > However, you're right that it has potential.  I'll rename module_info to
> > load_info if you don't mind tho: contains more semantic punch IMHO.
> 
> Umm. One problem is that you will almost certainly eventually want to 
> expose that to the architecture "fixup" routines (ie things like 
> module_frob_arch_sections(), arch_mod_section_prepend()), and at that 
> point "load_info" is a horribly bad structure name, since it would show 
> up in <linux/module.h> and thus be exported all over.
> 
> At least call it "struct module_load_info". But yes, I do agree that the 
> "load" part is important.

Looking at the arch code, it has the advantage that it's self-contained.
They've been pleasantly undemanding from the core over the years; I think
archs doing tricky things with elf prefer to parse the object themselves
anyway.  And I'm not sure they want to revisit it, either.

So I don't think we'd win much from changing them.  I'm wrong later, I'll
prepend "module_" to the struct name as an internal change then hit them
all.

> I looked at that particularly when doing that whole
> 
> 	mod = setup_module_info(&info);
> 	if (IS_ERR(mod)) {
> 		err = PTR_ERR(mod);
> 		goto free_hdr;
> 	}
> 
> thing, because that made "mod" have _three_ totally different values 
> (error, before, after) when jumping out to the failure paths. 

Yep, it now is back to sanity.  Let's see if today's linux-next is
happy.

If so, do you want just the fixes or the whole refactoring too, while
it's nice and fresh?

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-04  1:02                                                         ` Rusty Russell
@ 2010-06-04  1:55                                                           ` Linus Torvalds
  2010-06-04  5:20                                                             ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-04  1:55 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Fri, 4 Jun 2010, Rusty Russell wrote:
> > 
> > At least call it "struct module_load_info". But yes, I do agree that the 
> > "load" part is important.
> 
> Looking at the arch code, it has the advantage that it's self-contained.
> They've been pleasantly undemanding from the core over the years; I think
> archs doing tricky things with elf prefer to parse the object themselves
> anyway.  And I'm not sure they want to revisit it, either.
> 
> So I don't think we'd win much from changing them.  I'm wrong later, I'll
> prepend "module_" to the struct name as an internal change then hit them
> all.

Ok. So if we don't expect to ever pass the full load_info struct down to 
the arch code, and we can keep it entirely internal to module.c, then 
"struct load_info" is fine by me.

> If so, do you want just the fixes or the whole refactoring too, while
> it's nice and fresh?

Gaah. "Just the fixes" is definitely the prudent thing to do. At the same 
time, I've now so deeply bought into the whole cleanup thing too, that I 
want to argue that the cleanup might make it easier to handle any locking 
problems if we find them.

But I suspect that is just myself trying to fool/argue my smarter half 
into taking it all.

So you can probably push me either way.

			Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-04  1:55                                                           ` Linus Torvalds
@ 2010-06-04  5:20                                                             ` Rusty Russell
  2010-06-04 22:48                                                               ` Linus Torvalds
  0 siblings, 1 reply; 71+ messages in thread
From: Rusty Russell @ 2010-06-04  5:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Fri, 4 Jun 2010 11:25:00 am Linus Torvalds wrote:
> But I suspect that is just myself trying to fool/argue my smarter half 
> into taking it all.

Similar motivation for even asking.

They can stew in linux-next for another cycle: just found a trivial
!SMP compile issue, and with all the other config options in there
it could use some baking...

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-04  5:20                                                             ` Rusty Russell
@ 2010-06-04 22:48                                                               ` Linus Torvalds
  2010-06-05  1:49                                                                 ` Rusty Russell
  0 siblings, 1 reply; 71+ messages in thread
From: Linus Torvalds @ 2010-06-04 22:48 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers



On Fri, 4 Jun 2010, Rusty Russell wrote:

> On Fri, 4 Jun 2010 11:25:00 am Linus Torvalds wrote:
> > But I suspect that is just myself trying to fool/argue my smarter half 
> > into taking it all.
> 
> Similar motivation for even asking.
> 
> They can stew in linux-next for another cycle: just found a trivial
> !SMP compile issue, and with all the other config options in there
> it could use some baking...

Yeah, considering the ia64 report, I can't lie to myself and say that the 
cleanups are going to be totally safe.

Can you send me a pointer to the safe/bugfix part, and I'll pull it for 
-rc2? I'd like to do that Saturday.

		Linus

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c"
  2010-06-04 22:48                                                               ` Linus Torvalds
@ 2010-06-05  1:49                                                                 ` Rusty Russell
  0 siblings, 0 replies; 71+ messages in thread
From: Rusty Russell @ 2010-06-05  1:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Philips, Andrew Morton, Rafael J. Wysocki, LKML,
	Jon Masters, Tejun Heo, Masami Hiramatsu, Kay Sievers

On Sat, 5 Jun 2010 08:18:13 am Linus Torvalds wrote:
> Can you send me a pointer to the safe/bugfix part, and I'll pull it for 
> -rc2? I'd like to do that Saturday.

Yep, here it is:

The following changes since commit ad8456361fa19068cf49b50a4f98e41b73c08e76:
  Linus Torvalds (1):
        Merge branch 'upstream-linus' of git://git.kernel.org/.../jgarzik/libata-dev

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus.git master

Linus Torvalds (2):
      module: Make the 'usage' lists be two-way
      module: move find_module check to end

Rusty Russell (6):
      module: fix kdb's illicit use of struct module_use.
      module: move sysfs exposure to end of load_module
      module: Make module sysfs functions private.
      module: make locking more fine-grained.
      module: verify_export_symbols under the lock
      module: fix bne2 "gave up waiting for init of module libcrc32c"

 include/linux/module.h      |   44 ++-----
 kernel/debug/kdb/kdb_main.c |   12 +--
 kernel/module.c             |  320 ++++++++++++++++++++++++++++---------------
 3 files changed, 221 insertions(+), 155 deletions(-)

^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2010-06-05  1:50 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-25 21:00 [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
2010-05-25 22:54 ` Rafael J. Wysocki
2010-05-25 23:47   ` Linus Torvalds
2010-05-26  8:00     ` Rusty Russell
2010-05-26 11:57       ` Rusty Russell
2010-05-26 22:56         ` Rafael J. Wysocki
2010-05-26 23:07           ` Linus Torvalds
2010-05-27  5:26           ` Rusty Russell
2010-05-27 18:46             ` Brandon Philips
2010-05-31  9:40               ` Rusty Russell
2010-05-31 12:00                 ` [PATCH 0/2] kernel/module.c locking changes Rusty Russell
2010-05-31 12:01                   ` [PATCH 1/2] module: make locking more fine-grained Rusty Russell
2010-05-31 12:02                     ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
2010-05-31 16:48                       ` Andrew Morton
2010-05-31 18:19                         ` Linus Torvalds
2010-05-31 20:15                           ` Linus Torvalds
2010-05-31 20:16                             ` [PATCH 1/2] Make the module 'usage' lists be two-way Linus Torvalds
2010-05-31 20:17                               ` [PATCH 2/2] module: wait for other modules after dropping the module_mutex Linus Torvalds
2010-06-01  1:37                               ` [PATCH 1/2] Make the module 'usage' lists be two-way Rusty Russell
2010-06-01  3:42                                 ` Rusty Russell
2010-06-01  4:00                                   ` Linus Torvalds
2010-06-01  4:05                                     ` Linus Torvalds
2010-06-01  2:44                               ` Américo Wang
2010-06-01  3:51                                 ` Linus Torvalds
2010-06-01  1:57                             ` [PATCH 2/2] module: fix bne2 "gave up waiting for init of module libcrc32c" Rusty Russell
2010-06-01  3:40                               ` Linus Torvalds
2010-06-01  4:27                                 ` Linus Torvalds
2010-06-01  5:19                                 ` Rusty Russell
2010-06-02  3:15                                   ` Rusty Russell
2010-06-01  1:21                           ` Rusty Russell
2010-06-01  3:24                             ` Linus Torvalds
2010-06-01  5:22                               ` Rusty Russell
2010-06-01 14:58                                 ` Linus Torvalds
2010-06-01 17:53                                   ` Linus Torvalds
2010-06-01 23:24                                     ` Brandon Philips
2010-06-01 23:51                                       ` Linus Torvalds
2010-06-02  2:10                                         ` Brandon Philips
2010-06-02  3:03                                           ` Rusty Russell
2010-06-02  4:35                                           ` Linus Torvalds
2010-06-02  4:44                                             ` Linus Torvalds
2010-06-02  6:35                                               ` Rusty Russell
2010-06-02  7:45                                                 ` Linus Torvalds
2010-06-02  8:12                                                   ` Linus Torvalds
2010-06-02  9:07                                                     ` Rusty Russell
2010-06-02  5:52                                             ` Rusty Russell
2010-06-02  7:21                                               ` Linus Torvalds
2010-06-02 14:06                                                 ` Rusty Russell
2010-06-02 14:50                                                   ` Linus Torvalds
2010-06-03 13:06                                                     ` Rusty Russell
2010-06-02 16:53                                                   ` Brandon Philips
2010-06-02 18:01                                                   ` Linus Torvalds
2010-06-03  5:20                                                     ` Rusty Russell
2010-06-03 16:24                                                       ` Linus Torvalds
2010-06-04  1:02                                                         ` Rusty Russell
2010-06-04  1:55                                                           ` Linus Torvalds
2010-06-04  5:20                                                             ` Rusty Russell
2010-06-04 22:48                                                               ` Linus Torvalds
2010-06-05  1:49                                                                 ` Rusty Russell
2010-06-02  3:09                                   ` Rusty Russell
2010-06-02  4:32                                     ` Linus Torvalds
2010-06-02  4:56                                     ` Linus Torvalds
2010-06-02  5:52                                       ` Rusty Russell
2010-06-02  6:59                                         ` Linus Torvalds
2010-06-01  1:04                         ` Rusty Russell
2010-06-01  5:38                     ` [PATCH 1/2] module: make locking more fine-grained Américo Wang
2010-06-01  5:55                       ` Rusty Russell
2010-05-27 21:57             ` [Regression] Crash in load_module() while freeing args Rafael J. Wysocki
2010-05-31  7:54               ` Rusty Russell
2010-05-31 10:23               ` [PATCH] module: fix reference to mod->percpu after freeing module Rusty Russell
2010-05-31 10:25                 ` Tejun Heo
2010-05-26 15:41       ` [Regression] Crash in load_module() while freeing args Linus Torvalds

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.