linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [3.1.6] Deadlock upon recursive call_usermodehelper_exec().
@ 2011-12-27  7:01 penguin-kernel
  2011-12-27  8:14 ` Tetsuo Handa
  2012-01-04  7:43 ` [PATCH] kmod: Avoid deadlock by recursive kmod call Tetsuo Handa
  0 siblings, 2 replies; 9+ messages in thread
From: penguin-kernel @ 2011-12-27  7:01 UTC (permalink / raw)
  To: gregkh, linux-fsdevel, linux-kernel

I have two problems.



Problem 1 (I don't know what bug is triggering this problem.):

I experienced /sbin/modprobe and /sbin/hotplug being truncated to 0 bytes (and
thereby subsequent execution of /sbin/modprobe results in
"request_module: runaway loop modprobe binfmt-0000") for some reason.

In my environment, /sbin/modprobe and /sbin/hotplug are pathnames stored in
/proc/sys/kernel/modprobe and /proc/sys/kernel/hotplug .
I assume this is a kernel bug because
(a) I can't imagine a broken userspace program that does something like
    ": > /sbin/modprobe" or ": > /sbin/hotplug".
(b) I experienced this problem in both CentOS 6 (for once) and Debian Sarge
    (for twice) environments.



Problem 2 (Main topic on this post):

At first, when abovementioned problem 1 occurred, I didn't notice that not only
/sbin/modprobe but also /sbin/hotplug are damaged. Therefore, I repaired only
/sbin/modprobe by reinstalling module-init-tools package. As a result, since
/sbin/hotplug remained damaged, the system always hanged inside bootup scripts.

I identified that the system hangs when /sbin/hotplug is executed, and
repairing /sbin/hotplug by reinstalling hotplug package solved this hang.

Below is output of "sysrq"-"w" keys which I obtained while I was trying to
identify the location of hang. Characteristic point is that both modprobe and
kworker/u:0 are calling call_usermodehelper_exec() but wait_for_completion()
cannot be completed for some reason.

SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) 
SysRq : Show Blocked State
  task                PC stack   pid father
kworker/u:0     D f6483e94     0     5      2 0x00000000
 f6483dd4 00000046 c102aac5 f6483e94 00000003 00000010 000000d0 f64cdc40
 f6478e00 00000246 c1462940 00000001 00000001 f5dea040 f6b05080 f6b05080
 f6478f7c 00000001 f75aeac0 c1462640 f75aeac0 00000001 f6483dc4 c1086271
Call Trace:
 [<c102aac5>] ? find_busiest_group+0x845/0xaa0
 [<c1086271>] ? get_page_from_freelist+0x121/0x260
 [<c1028b0b>] ? sched_slice+0x5b/0x90
 [<c131c4f5>] schedule+0x55/0x60
 [<c131cbda>] schedule_timeout+0x14a/0x150
 [<c1028997>] ? __enqueue_entity+0x77/0x90
 [<c131c5b2>] wait_for_common+0xb2/0x120
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c10281ab>] ? resched_task+0x5b/0x70
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c102e1d0>] ? wake_up_new_task+0xc0/0xf0
 [<c131c632>] wait_for_completion+0x12/0x20
 [<c1033bfd>] do_fork+0x16d/0x240
 [<c1009368>] kernel_thread+0x88/0xa0
 [<c131c1a1>] ? __schedule+0x2a1/0x5a0
 [<c10460e0>] ? __request_module+0x120/0x120
 [<c10460e0>] ? __request_module+0x120/0x120
 [<c131ed30>] ? common_interrupt+0x30/0x30
 [<c102f226>] ? complete+0x46/0x60
 [<c1046318>] __call_usermodehelper+0x28/0x90
 [<c1047b6c>] process_one_work+0xbc/0x2b0
 [<c103f3ff>] ? mod_timer+0xcf/0x160
 [<c10462f0>] ? wait_for_helper+0xa0/0xa0
 [<c1047e46>] worker_thread+0xa6/0x1f0
 [<c104c6b5>] kthread+0x75/0x80
 [<c1047da0>] ? process_scheduled_works+0x40/0x40
 [<c104c640>] ? kthread_data+0x20/0x20
 [<c131ed36>] kernel_thread_helper+0x6/0xd
modprobe        D f5945d38     0  2193   2170 0x00000000
 f5945dcc 00000082 f5945d3c f5945d38 c10284a2 f6478e00 00000000 f64ad890
 f65feb10 00000400 00000000 00000800 00000000 f5f02c80 f6905080 f6905080
 f65fec8c 00000400 00000000 00000070 00000000 00000001 00000003 f5945da0
Call Trace:
 [<c10284a2>] ? target_load+0x22/0x40
 [<c1029717>] ? select_idle_sibling+0xd7/0x100
 [<c131c4f5>] schedule+0x55/0x60
 [<c131cbda>] schedule_timeout+0x14a/0x150
 [<c102dd2d>] ? ttwu_queue+0x3d/0x70
 [<c102de56>] ? try_to_wake_up+0xf6/0x140
 [<c131c5b2>] wait_for_common+0xb2/0x120
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c1046f99>] ? queue_work_on+0x39/0x40
 [<c131c632>] wait_for_completion+0x12/0x20
 [<c10465e0>] call_usermodehelper_exec+0x90/0xa0
 [<c116ad98>] kobject_uevent_env+0x3d8/0x490
 [<c116a930>] ? kobject_action_type+0x90/0x90
 [<c116ae5a>] kobject_uevent+0xa/0x10
 [<c10613f8>] mod_sysfs_setup+0x88/0xc0
 [<c1062ea9>] load_module+0x1b9/0x270
 [<c1062fb0>] sys_init_module+0x40/0x1c0
 [<c131e07d>] syscall_call+0x7/0xb
kworker/u:0     D f5957cd4     0  2194      5 0x00000000
 f5957d68 00000046 f76eba00 f5957cd4 c10284a2 f64cd560 00000000 c1442480
 f6598e80 00000400 00000000 00000800 00000000 f5fa5200 f6805080 f6805080
 f6598ffc 00000400 00000000 00000070 00000000 00000000 00000002 f5957d3c
Call Trace:
 [<c10284a2>] ? target_load+0x22/0x40
 [<c1029717>] ? select_idle_sibling+0xd7/0x100
 [<c131c4f5>] schedule+0x55/0x60
 [<c131cbda>] schedule_timeout+0x14a/0x150
 [<c102dd2d>] ? ttwu_queue+0x3d/0x70
 [<c102de56>] ? try_to_wake_up+0xf6/0x140
 [<c131c5b2>] wait_for_common+0xb2/0x120
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c102f030>] ? mutex_spin_on_owner+0x40/0x40
 [<c1046f99>] ? queue_work_on+0x39/0x40
 [<c131c632>] wait_for_completion+0x12/0x20
 [<c10465e0>] call_usermodehelper_exec+0x90/0xa0
 [<c1046091>] __request_module+0xd1/0x120
 [<c1096997>] ? page_address+0xa7/0xc0
 [<c10b1c04>] search_binary_handler+0x1a4/0x250
 [<c10b1e46>] do_execve_common+0x196/0x260
 [<c10b1f24>] do_execve+0x14/0x20
 [<c10093c2>] sys_execve+0x42/0x60
 [<c131e883>] ptregs_execve+0x13/0x18
 [<c131e07d>] ? syscall_call+0x7/0xb
 [<c1006472>] ? kernel_execve+0x22/0x40
 [<c1046215>] ? ____call_usermodehelper+0x135/0x150
 [<c10460e0>] ? __request_module+0x120/0x120
 [<c131ed36>] ? kernel_thread_helper+0x6/0xd

It seems to me that

	retval = call_usermodehelper(argv[0], argv, env->envp, UMH_WAIT_EXEC);

in kobject_uevent_env() upon automatic module loading triggers deadlock when
call_usermodehelper_exec() is already in progress from request_module() from
do_execve().

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [3.1.6] Deadlock upon recursive call_usermodehelper_exec().
  2011-12-27  7:01 [3.1.6] Deadlock upon recursive call_usermodehelper_exec() penguin-kernel
@ 2011-12-27  8:14 ` Tetsuo Handa
  2012-01-04  7:43 ` [PATCH] kmod: Avoid deadlock by recursive kmod call Tetsuo Handa
  1 sibling, 0 replies; 9+ messages in thread
From: Tetsuo Handa @ 2011-12-27  8:14 UTC (permalink / raw)
  To: gregkh, linux-fsdevel, linux-kernel

Additional info. I added debug printk()

--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -305,8 +305,13 @@ int kobject_uevent_env(struct kobject *k
 		if (retval)
 			goto exit;
 
+		printk(KERN_INFO "*** Calling call_usermodehelper(\"%s\")\n",
+		       argv[0]);
 		retval = call_usermodehelper(argv[0], argv,
 					     env->envp, UMH_WAIT_EXEC);
+		printk(KERN_INFO
+		       "*** Returned from call_usermodehelper(\"%s\")\n",
+		       argv[0]);
 	}
 
 exit:
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1435,7 +1435,14 @@ int search_binary_handler(struct linux_b
 				break; /* -ENOEXEC */
 			if (try)
 				break; /* -ENOEXEC */
+			printk(KERN_INFO
+			       "*** Calling request_module(\"binfmt-%04x\")\n",
+			       *(unsigned short *)(&bprm->buf[2]));
 			request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
+			printk(KERN_INFO
+			       "*** Returned from "
+			       "request_module(\"binfmt-%04x\")\n",
+			       *(unsigned short *)(&bprm->buf[2]));
 		}
 #else
 		break;

and got below output just before deadlock.

*** Calling call_usermodehelper("/sbin/hotplug")
*** Calling request_module("binfmt-0000")

So, this deadlock occurs when call_usermodehelper(UMH_WAIT_EXEC) is called and
request_module() is also called from search_binary_handler() when processing
call_usermodehelper(UMH_WAIT_EXEC).

This would not hanppen for normal condition, but 100% happens when the
userspace helper program is damaged.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] kmod: Avoid deadlock by recursive kmod call.
  2011-12-27  7:01 [3.1.6] Deadlock upon recursive call_usermodehelper_exec() penguin-kernel
  2011-12-27  8:14 ` Tetsuo Handa
@ 2012-01-04  7:43 ` Tetsuo Handa
  2012-01-04 13:08   ` [PATCH v2] " Tetsuo Handa
  1 sibling, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2012-01-04  7:43 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel

I noticed that the system deadlocks if do_execve() request by kmod() triggered
recursive do_execve() request.

An example operation for triggering this deadlock:

  # : > /tmp/dummy
  # chmod 755 /tmp/dummy
  # echo /tmp/dummy > /proc/sys/kernel/hotplug
  # modprobe whatever

Although this patch works for me, I'm not sure that this is a correct fix.
Also, my analysis in the patch description may be wrong. Please check.
--------------------
[PATCH] kmod: Avoid deadlock by recursive kmod call.

call_usermodehelper(UMH_WAIT_EXEC) creates a CLONE_VFORK'ed thread.
The vforked thread may call call_usermodehelper(UMH_WAIT_PROC) from
search_binary_handler() which in turn calls wait_for_completion().

This sequence results in a deadlock because "khelper cannot start processing
call_usermodehelper(UMH_WAIT_PROC) until do_execve() of vforked thread by
call_usermodehelper(UMH_WAIT_EXEC) completes" but "vforked thread cannot
complete do_execve() until call_usermodehelper(UMH_WAIT_PROC) completes".

We need to make sure that do_execve() request by vforked thread does not
trigger call_usermodehelper(UMH_WAIT_PROC or UMH_WAIT_EXEC).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/kmod.c |    9 +++++++++
 1 files changed, 9 insertions(+)

--- linux-3.1.7.orig/kernel/kmod.c
+++ linux-3.1.7/kernel/kmod.c
@@ -428,6 +428,15 @@ int call_usermodehelper_exec(struct subp
 		retval = -EBUSY;
 		goto out;
 	}
+	/*
+	 * We can't call wait_for_completion() if current thread was created
+	 * with CLONE_VFORK flag, or we will deadlock when current thread calls
+	 * request_module() from search_binary_handler().
+	 */
+	if (wait != UMH_NO_WAIT && current->vfork_done) {
+		retval = -EBUSY;
+		goto out;
+	}
 
 	sub_info->complete = &done;
 	sub_info->wait = wait;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] kmod: Avoid deadlock by recursive kmod call.
  2012-01-04  7:43 ` [PATCH] kmod: Avoid deadlock by recursive kmod call Tetsuo Handa
@ 2012-01-04 13:08   ` Tetsuo Handa
       [not found]     ` <201201062139.JFC73472.SMQFOFOFHLVOtJ@I-love.SAKURA.ne.jp>
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2012-01-04 13:08 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel

Tetsuo Handa wrote:
> I noticed that the system deadlocks if do_execve() request by kmod() triggered
> recursive do_execve() request.
> 
> An example operation for triggering this deadlock:
> 
>   # : > /tmp/dummy
>   # chmod 755 /tmp/dummy
>   # echo /tmp/dummy > /proc/sys/kernel/hotplug
>   # modprobe whatever
> 
> Although this patch works for me, I'm not sure that this is a correct fix.
> Also, my analysis in the patch description may be wrong. Please check.

Well, we should not unconditionally prevent all vfork()ed threads from using
request_module() at search_binary_handler(). We should prevent only kernel
threads created by __call_usermodehelper(). Updated to filter only kmod threads.
--------------------
[PATCH v2] kmod: Avoid deadlock by recursive kmod call.

call_usermodehelper(UMH_WAIT_EXEC or UMH_WAIT_PROC) request depends on an
assumption that do_execve() will not trigger recursive
call_usermodehelper(UMH_WAIT_EXEC or UMH_WAIT_PROC) request, for
"khelper cannot start processing recursive call_usermodehelper() request until
do_execve() of original call_usermodehelper() request completes" but
"do_execve() of original call_usermodehelper() cannot be completed until
recursive call_usermodehelper() request completes".

There are several callers that break this assumption and lead to deadlock.
We need to make sure that do_execve() request by kmod thread does not trigger
recursive call_usermodehelper(UMH_WAIT_PROC or UMH_WAIT_EXEC).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 include/linux/sched.h |    3 +++
 kernel/kmod.c         |   18 +++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

--- linux-3.1.7.orig/include/linux/sched.h
+++ linux-3.1.7/include/linux/sched.h
@@ -1305,6 +1305,9 @@ struct task_struct {
 	unsigned sched_reset_on_fork:1;
 	unsigned sched_contributes_to_load:1;
 
+	/* Prevent recursive kmod request. */
+	unsigned kmod_thread:1;
+
 	pid_t pid;
 	pid_t tgid;
 
--- linux-3.1.7.orig/kernel/kmod.c
+++ linux-3.1.7/kernel/kmod.c
@@ -189,6 +189,13 @@ fail:
 	do_exit(0);
 }
 
+static int call_helper(void *data)
+{
+	/* Do not trigger recursive kmod call. */
+	current->kmod_thread = 1;
+	return ____call_usermodehelper(data);
+}
+
 void call_usermodehelper_freeinfo(struct subprocess_info *info)
 {
 	if (info->cleanup)
@@ -252,7 +259,7 @@ static void __call_usermodehelper(struct
 		pid = kernel_thread(wait_for_helper, sub_info,
 				    CLONE_FS | CLONE_FILES | SIGCHLD);
 	else
-		pid = kernel_thread(____call_usermodehelper, sub_info,
+		pid = kernel_thread(call_helper, sub_info,
 				    CLONE_VFORK | SIGCHLD);
 
 	switch (wait) {
@@ -428,6 +435,15 @@ int call_usermodehelper_exec(struct subp
 		retval = -EBUSY;
 		goto out;
 	}
+	/*
+	 * We can't call wait_for_completion() if current thread was created by
+	 * call_helper(), or we will deadlock when current thread calls
+	 * request_module() from search_binary_handler().
+	 */
+	if (wait != UMH_NO_WAIT && current->kmod_thread) {
+		retval = -EBUSY;
+		goto out;
+	}
 
 	sub_info->complete = &done;
 	sub_info->wait = wait;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] kmod: Avoid deadlock by recursive kmod call.
       [not found]     ` <201201062139.JFC73472.SMQFOFOFHLVOtJ@I-love.SAKURA.ne.jp>
@ 2012-01-19  0:45       ` Andrew Morton
  2012-01-19  2:07         ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2012-01-19  0:45 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: arjan, richard, linux-fsdevel, linux-kernel

On Fri, 6 Jan 2012 21:39:17 +0900
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> Tetsuo Handa wrote:
> > An example operation for triggering this deadlock:
> > 
> >   # : > /tmp/dummy
> >   # chmod 755 /tmp/dummy
> >   # echo /tmp/dummy > /proc/sys/kernel/hotplug
> >   # modprobe whatever
> 
> I wrote two patches. Please review.
> 
> ----- Patch A start -----
> [PATCH v3] kmod: Avoid deadlock by recursive kmod call.
> 
> The system deadlocks when call_usermodehelper(UMH_WAIT_EXEC) request triggered
> call_usermodehelper(UMH_WAIT_PROC) request.
> 
> This is because "khelper thread is waiting at wait_for_completion() in
> do_fork() since the worker thread was created with CLONE_VFORK flag" and
> "the worker thread cannot call complete() because do_execve() is blocked at
> UMH_WAIT_PROC request (e.g. request_module() from search_binary_handler())" and
> "the khelper thread cannot start processing UMH_WAIT_PROC request because
> the khelper thread is waiting at wait_for_completion() in do_fork()".
> 
> In order to avoid deadlock, do not try to call wait_for_completion() in
> call_usermodehelper_exec() if the worker thread was created by khelper thread
> with CLONE_VFORK flag.

I suppose we should fix this, although it's obscure.

The changelog doesn't tell people how the bug might occur, which is
bad.  I suggest adding this text to it:

: This bug was observed when using a corrupted /sbin/hotplug binary.  The
: corrupted binary caused a call to request_module("binfmt-0000"). 
: search_binary_handler() uses UMH_WAIT_EXEC.
: 
: An example operation for triggering this deadlock:
: 
:    # : > /tmp/dummy
:    # chmod 755 /tmp/dummy
:    # echo /tmp/dummy > /proc/sys/kernel/hotplug
:    # modprobe whatever

which might be inaccurate/inadequate.  I didn't try very hard.  Please
provide a complete description and analysis of the bug.

I prefer patch A - all that poking around at stack slots is hacky.

> --- linux-3.2.orig/fs/exec.c
> +++ linux-3.2/fs/exec.c
> @@ -1406,6 +1406,7 @@ int search_binary_handler(struct linux_b
>  					fput(bprm->file);
>  				bprm->file = NULL;
>  				current->did_exec = 1;
> +				current->kmod_thread = 0;
>  				proc_exec_connector(current);
>  				return retval;
>  			}

Special-casing this assignment down in this particular client of kmod
looks bad.  We need to find somewhere else to do this.  Perferably
within the kmod code itself, or possibly over in exec.c or fork.c.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] kmod: Avoid deadlock by recursive kmod call.
  2012-01-19  0:45       ` [PATCH v3] " Andrew Morton
@ 2012-01-19  2:07         ` Tetsuo Handa
  2012-01-24  4:32           ` [PATCH v4] " Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2012-01-19  2:07 UTC (permalink / raw)
  To: akpm; +Cc: arjan, richard, linux-fsdevel, linux-kernel

("[PATCH v3] kmod: Avoid deadlock by recursive kmod call." didn't arrive at ML
 due to mail delivery failure. The V3 patch can be found at
 https://bugzilla.kernel.org/show_bug.cgi?id=42579 )

Andrew Morton wrote:
> I suppose we should fix this, although it's obscure.
> 
> The changelog doesn't tell people how the bug might occur, which is
> bad.  I suggest adding this text to it:
> 
> : This bug was observed when using a corrupted /sbin/hotplug binary.  The
> : corrupted binary caused a call to request_module("binfmt-0000"). 
> : search_binary_handler() uses UMH_WAIT_EXEC.

A corrupted /sbin/hotplug binary is just a simplest example. There are various
hooks called during do_execve() operation (e.g. security_bprm_check(),
audit_bprm(), "struct linux_binfmt"->load_binary()). If one of such hooks
triggers (in the future, but maybe already) UMH_WAIT_EXEC, this deadlock
will happen even if /sbin/hotplug is not corrupted.

> I prefer patch A - all that poking around at stack slots is hacky.
OK.

> > --- linux-3.2.orig/fs/exec.c
> > +++ linux-3.2/fs/exec.c
> > @@ -1406,6 +1406,7 @@ int search_binary_handler(struct linux_b
> >  					fput(bprm->file);
> >  				bprm->file = NULL;
> >  				current->did_exec = 1;
> > +				current->kmod_thread = 0;
> >  				proc_exec_connector(current);
> >  				return retval;
> >  			}
> 
> Special-casing this assignment down in this particular client of kmod
> looks bad.  We need to find somewhere else to do this.  Perferably
> within the kmod code itself, or possibly over in exec.c or fork.c.
> 
Well, kmod code is no longer called if do_execve() succeeded.

In "[PATCH v2] kmod: Avoid deadlock by recursive kmod call"
( https://lkml.org/lkml/2012/1/4/111 ), I forgot to do

	current->kmod_thread = 0;

when do_execve() succeeded. Not clearing this flag might overkill functionality
of /sbin/hotplug and its descendants. If we want to clear this flag, I think we
cannot help doing it in fork.c or exec.c (or do like patch B which does not
need this flag).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4] kmod: Avoid deadlock by recursive kmod call.
  2012-01-19  2:07         ` Tetsuo Handa
@ 2012-01-24  4:32           ` Tetsuo Handa
  2012-01-26  0:40             ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2012-01-24  4:32 UTC (permalink / raw)
  To: akpm, arjan, richard, linux-fsdevel, linux-kernel

Tetsuo Handa wrote:
> Andrew Morton wrote:
> > > --- linux-3.2.orig/fs/exec.c
> > > +++ linux-3.2/fs/exec.c
> > > @@ -1406,6 +1406,7 @@ int search_binary_handler(struct linux_b
> > >  					fput(bprm->file);
> > >  				bprm->file = NULL;
> > >  				current->did_exec = 1;
> > > +				current->kmod_thread = 0;
> > >  				proc_exec_connector(current);
> > >  				return retval;
> > >  			}
> > 
> > Special-casing this assignment down in this particular client of kmod
> > looks bad.  We need to find somewhere else to do this.  Perferably
> > within the kmod code itself, or possibly over in exec.c or fork.c.
> > 
> Well, kmod code is no longer called if do_execve() succeeded.
> 
> In "[PATCH v2] kmod: Avoid deadlock by recursive kmod call"
> ( https://lkml.org/lkml/2012/1/4/111 ), I forgot to do
> 
> 	current->kmod_thread = 0;
> 
> when do_execve() succeeded. Not clearing this flag might overkill functionality
> of /sbin/hotplug and its descendants. If we want to clear this flag, I think we
> cannot help doing it in fork.c or exec.c (or do like patch B which does not
> need this flag).

I found another approach which mixed patch A and patch B.
Since __call_usermodehelper() is exclusively called (am I right?), we don't
need to use per "struct task_struct" flag.
----------------------------------------
[PATCH v4] kmod: Avoid deadlock by recursive kmod call.

The system deadlocks (at least since 2.6.10) when
call_usermodehelper(UMH_WAIT_EXEC) request triggered
call_usermodehelper(UMH_WAIT_PROC) request.

This is because "khelper thread is waiting for the worker thread at
wait_for_completion() in do_fork() since the worker thread was created with
CLONE_VFORK flag" and "the worker thread cannot call complete() because
do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper thread cannot
start processing UMH_WAIT_PROC request because the khelper thread is waiting
for the worker thread at wait_for_completion() in do_fork()".

In order to avoid deadlock, do not try to call wait_for_completion() in
call_usermodehelper_exec() if the worker thread was created by khelper thread
with CLONE_VFORK flag.

The easiest example to observe this deadlock is to use a corrupted
/sbin/hotplug binary (like shown below).

  # : > /tmp/dummy
  # chmod 755 /tmp/dummy
  # echo /tmp/dummy > /proc/sys/kernel/hotplug
  # modprobe whatever

call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a module.
do_execve("/tmp/dummy") triggers a call to request_module("binfmt-0000") from
search_binary_handler() which in turn calls call_usermodehelper(UMH_WAIT_PROC).

There are various hooks called during do_execve() operation (e.g.
security_bprm_check(), audit_bprm(), "struct linux_binfmt"->load_binary()).
If one of such hooks triggers UMH_WAIT_EXEC, this deadlock will happen even if
/sbin/hotplug is not corrupted.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/kmod.c |   25 +++++++++++++++++++++++--
 1 files changed, 23 insertions(+), 2 deletions(-)

--- linux-3.2.1.orig/kernel/kmod.c
+++ linux-3.2.1/kernel/kmod.c
@@ -43,6 +43,7 @@
 extern int max_threads;
 
 static struct workqueue_struct *khelper_wq;
+static const struct task_struct *kmod_thread_locker;
 
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
@@ -189,6 +190,13 @@ fail:
 	do_exit(0);
 }
 
+static int call_helper(void *data)
+{
+	/* Worker thread started blocking khelper thread. */
+	kmod_thread_locker = current;
+	return ____call_usermodehelper(data);
+}
+
 void call_usermodehelper_freeinfo(struct subprocess_info *info)
 {
 	if (info->cleanup)
@@ -251,9 +259,12 @@ static void __call_usermodehelper(struct
 	if (wait == UMH_WAIT_PROC)
 		pid = kernel_thread(wait_for_helper, sub_info,
 				    CLONE_FS | CLONE_FILES | SIGCHLD);
-	else
-		pid = kernel_thread(____call_usermodehelper, sub_info,
+	else {
+		pid = kernel_thread(call_helper, sub_info,
 				    CLONE_VFORK | SIGCHLD);
+		/* Worker thread stopped blocking khelper thread. */
+		kmod_thread_locker = NULL;
+	}
 
 	switch (wait) {
 	case UMH_NO_WAIT:
@@ -428,6 +439,16 @@ int call_usermodehelper_exec(struct subp
 		retval = -EBUSY;
 		goto out;
 	}
+	/*
+	 * Worker thread must not wait for khelper thread at below
+	 * wait_for_completion() if the thread was created with CLONE_VFORK
+	 * flag, for khelper thread is already waiting for the thread at
+	 * wait_for_completion() in do_fork().
+	 */
+	if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {
+		retval = -EBUSY;
+		goto out;
+	}
 
 	sub_info->complete = &done;
 	sub_info->wait = wait;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4] kmod: Avoid deadlock by recursive kmod call.
  2012-01-24  4:32           ` [PATCH v4] " Tetsuo Handa
@ 2012-01-26  0:40             ` Andrew Morton
  2012-01-26  1:32               ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2012-01-26  0:40 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: arjan, richard, linux-fsdevel, linux-kernel

On Tue, 24 Jan 2012 13:32:40 +0900
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:

> Tetsuo Handa wrote:
> > Andrew Morton wrote:
> > > > --- linux-3.2.orig/fs/exec.c
> > > > +++ linux-3.2/fs/exec.c
> > > > @@ -1406,6 +1406,7 @@ int search_binary_handler(struct linux_b
> > > >  					fput(bprm->file);
> > > >  				bprm->file = NULL;
> > > >  				current->did_exec = 1;
> > > > +				current->kmod_thread = 0;
> > > >  				proc_exec_connector(current);
> > > >  				return retval;
> > > >  			}
> > > 
> > > Special-casing this assignment down in this particular client of kmod
> > > looks bad.  We need to find somewhere else to do this.  Perferably
> > > within the kmod code itself, or possibly over in exec.c or fork.c.
> > > 
> > Well, kmod code is no longer called if do_execve() succeeded.
> > 
> > In "[PATCH v2] kmod: Avoid deadlock by recursive kmod call"
> > ( https://lkml.org/lkml/2012/1/4/111 ), I forgot to do
> > 
> > 	current->kmod_thread = 0;
> > 
> > when do_execve() succeeded. Not clearing this flag might overkill functionality
> > of /sbin/hotplug and its descendants. If we want to clear this flag, I think we
> > cannot help doing it in fork.c or exec.c (or do like patch B which does not
> > need this flag).

(This email is difficult to reply to :(.  I think I'll reorganise it for you)

> I found another approach which mixed patch A and patch B.
>
> ----------------------------------------
> [PATCH v4] kmod: Avoid deadlock by recursive kmod call.
> 
> The system deadlocks (at least since 2.6.10) when
> call_usermodehelper(UMH_WAIT_EXEC) request triggered
> call_usermodehelper(UMH_WAIT_PROC) request.
> 
> This is because "khelper thread is waiting for the worker thread at
> wait_for_completion() in do_fork() since the worker thread was created with
> CLONE_VFORK flag" and "the worker thread cannot call complete() because
> do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper thread cannot
> start processing UMH_WAIT_PROC request because the khelper thread is waiting
> for the worker thread at wait_for_completion() in do_fork()".
> 
> In order to avoid deadlock, do not try to call wait_for_completion() in
> call_usermodehelper_exec() if the worker thread was created by khelper thread
> with CLONE_VFORK flag.
> 
> The easiest example to observe this deadlock is to use a corrupted
> /sbin/hotplug binary (like shown below).
> 
>   # : > /tmp/dummy
>   # chmod 755 /tmp/dummy
>   # echo /tmp/dummy > /proc/sys/kernel/hotplug
>   # modprobe whatever
> 
> call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
> kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a module.
> do_execve("/tmp/dummy") triggers a call to request_module("binfmt-0000") from
> search_binary_handler() which in turn calls call_usermodehelper(UMH_WAIT_PROC).
> 
> There are various hooks called during do_execve() operation (e.g.
> security_bprm_check(), audit_bprm(), "struct linux_binfmt"->load_binary()).
> If one of such hooks triggers UMH_WAIT_EXEC, this deadlock will happen even if
> /sbin/hotplug is not corrupted.


> Since __call_usermodehelper() is exclusively called (am I right?), we don't
> need to use per "struct task_struct" flag.

The locking for the new kmod_thread_locker global is quite unobvious. 
>From a quick look I can't say that I obviously agree with the above
claim.

Maybe it relies upon there only ever being a single khelper thread,
system wide?  If so then, err, maybe this is OK.  But I think the
analysis should be fully spelled out in the changelog and described in
the code in some manner which will prevent us from accidentally
breaking this exclusivity as the code evolves.

Does this look truthful and complete?

--- a/kernel/kmod.c~kmod-avoid-deadlock-by-recursive-kmod-call-fix
+++ a/kernel/kmod.c
@@ -44,6 +44,12 @@
 extern int max_threads;
 
 static struct workqueue_struct *khelper_wq;
+
+/*
+ * kmod_thread_locker is used for deadlock avoidance.  There is no explicit
+ * locking to protect this global - it is private to the singleton khelper
+ * thread and should only ever be modified by that thread.
+ */
 static const struct task_struct *kmod_thread_locker;
 
 #define CAP_BSET	(void *)1
_


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4] kmod: Avoid deadlock by recursive kmod call.
  2012-01-26  0:40             ` Andrew Morton
@ 2012-01-26  1:32               ` Tetsuo Handa
  0 siblings, 0 replies; 9+ messages in thread
From: Tetsuo Handa @ 2012-01-26  1:32 UTC (permalink / raw)
  To: akpm; +Cc: arjan, richard, linux-fsdevel, linux-kernel

Andrew Morton wrote:
> > Since __call_usermodehelper() is exclusively called (am I right?), we don't
> > need to use per "struct task_struct" flag.
> 
> The locking for the new kmod_thread_locker global is quite unobvious. 
> From a quick look I can't say that I obviously agree with the above
> claim.

Should we use semaphore in order to utilize lockdep?

> 
> Maybe it relies upon there only ever being a single khelper thread,
> system wide?

Yes. This patch relies on khelper being singlethreaded.

   khelper_wq = create_singlethread_workqueue("khelper");

Below output (taken without this patch) shows that __call_usermodehelper() is
called with khelper lock held.

[   95.333533] INFO: task kworker/u:0:5 blocked for more than 20 seconds.
[   95.342879] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   95.493057] kworker/u:0     D 67af850a  6084     5      2 0x00000000
[   95.570863]  f61a1d1c 00000046 00000000 67af850a 0000011c 0000019a 00000000 f61a1ca8
[   95.585696]  c1070315 f619e120 f62e8560 00000003 00000002 f619e5d0 00000330 00000000
[   95.669557]  b194689f c174b8e0 f23ffd78 00000000 f6b458e0 f6b458e0 f619e120 0000019a
[   95.752002] Call Trace:
[   95.823457]  [<c1070315>] ? validate_chain+0x3d5/0x520
[   95.830678]  [<c1070315>] ? validate_chain+0x3d5/0x520
[   95.906497]  [<c139efa5>] schedule+0x55/0x60
[   95.980630]  [<c139f6ec>] schedule_timeout+0x19c/0x1b0
[   96.059353]  [<c106d833>] ? lock_release_holdtime+0x73/0xb0
[   96.134561]  [<c1071082>] ? mark_held_locks+0x92/0xf0
[   96.141343]  [<c13a17a2>] ? _raw_spin_unlock_irq+0x22/0x30
[   96.217468]  [<c1071220>] ? trace_hardirqs_on_caller+0x90/0x100
[   96.293072]  [<c107129b>] ? trace_hardirqs_on+0xb/0x10
[   96.369510]  [<c139f094>] wait_for_common+0xe4/0x120
[   96.443832]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   96.450638]  [<c1071220>] ? trace_hardirqs_on_caller+0x90/0x100
[   96.526705]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   96.601386]  [<c1037f56>] ? wake_up_new_task+0xe6/0x1c0
[   96.680872]  [<c139f0e2>] wait_for_completion+0x12/0x20
[   96.755649]  [<c103f429>] do_fork+0x169/0x250
[   96.761904]  [<c139efce>] ? wait_for_common+0x1e/0x120
[   96.851797]  [<c100a488>] kernel_thread+0x88/0xa0
[   96.925236]  [<c1054310>] ? __request_module+0x130/0x130
[   96.935826]  [<c1054310>] ? __request_module+0x130/0x130
[   97.013681]  [<c13a2db4>] ? common_interrupt+0x34/0x34
[   97.086563]  [<c1054558>] __call_usermodehelper+0x28/0x90
[   97.152805]  [<c1056099>] process_one_work+0x149/0x3b0
[   97.158095]  [<c1056028>] ? process_one_work+0xd8/0x3b0
[   97.217462]  [<c1074149>] ? __lock_acquired+0x119/0x1c0
[   97.278057]  [<c1054530>] ? wait_for_helper+0xa0/0xa0
[   97.283217]  [<c105635c>] ? worker_thread+0x1c/0x200
[   97.342278]  [<c10563e6>] worker_thread+0xa6/0x200
[   97.347193]  [<c105b8c5>] kthread+0x75/0x80
[   97.405213]  [<c1056340>] ? process_scheduled_works+0x40/0x40
[   97.463717]  [<c105b850>] ? kthread_data+0x20/0x20
[   97.468829]  [<c13a2dba>] kernel_thread_helper+0x6/0xd
[   97.528077] 2 locks held by kworker/u:0/5:
[   97.532442]  #0:  (khelper){.+.+.+}, at: [<c1056028>] process_one_work+0xd8/0x3b0
[   97.593214]  #1:  ((&sub_info->work)){+.+.+.}, at: [<c1056028>] process_one_work+0xd8/0x3b0
[   97.656084] INFO: task modprobe:1158 blocked for more than 20 seconds.
[   97.716441] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   97.778390] modprobe        D 0000011c  6088  1158      1 0x00000000
[   97.838302]  f1c13d68 00000046 67abe0ee 0000011c 0000019a 00000000 34dfa000 c16074e0
[   97.850432]  000002dc f3fd25a0 f62142a0 00000000 67abe8f0 0000011c 0000019a 00000000
[   97.913373]  34dfa000 c174b8e0 f63edd78 00000000 f65458e0 f65458e0 f3fd25a0 0000019a
[   97.976795] Call Trace:
[   97.979423]  [<c1070315>] ? validate_chain+0x3d5/0x520
[   98.038568]  [<c139efa5>] schedule+0x55/0x60
[   98.095457]  [<c139f6ec>] schedule_timeout+0x19c/0x1b0
[   98.100976]  [<c106d833>] ? lock_release_holdtime+0x73/0xb0
[   98.159662]  [<c139f08d>] ? wait_for_common+0xdd/0x120
[   98.164966]  [<c13a17a2>] ? _raw_spin_unlock_irq+0x22/0x30
[   98.223160]  [<c1071220>] ? trace_hardirqs_on_caller+0x90/0x100
[   98.283620]  [<c107129b>] ? trace_hardirqs_on+0xb/0x10
[   98.289086]  [<c139f094>] wait_for_common+0xe4/0x120
[   98.342483]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   98.346653]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   98.394918]  [<c1055389>] ? queue_work_on+0x39/0x40
[   98.399412]  [<c139f0e2>] wait_for_completion+0x12/0x20
[   98.447573]  [<c1054838>] call_usermodehelper_exec+0x88/0x90
[   98.452396]  [<c1070101>] ? validate_chain+0x1c1/0x520
[   98.501567]  [<c139efce>] ? wait_for_common+0x1e/0x120
[   98.506082]  [<c105475f>] ? call_usermodehelper_setup+0x5f/0x90
[   98.554976]  [<c11b91d8>] kobject_uevent_env+0x3d8/0x490
[   98.559546]  [<c11b8d70>] ? kobject_action_type+0x90/0x90
[   98.608237]  [<c11b929a>] kobject_uevent+0xa/0x10
[   98.612311]  [<c107e3c8>] mod_sysfs_setup+0x88/0xc0
[   98.660562]  [<c107ff6d>] load_module+0x1ad/0x240
[   98.664596]  [<c1080051>] sys_init_module+0x41/0x1c0
[   98.715885]  [<c1071220>] ? trace_hardirqs_on_caller+0x90/0x100
[   98.764376]  [<c11c21e4>] ? trace_hardirqs_on_thunk+0xc/0x10
[   98.769215]  [<c13a2049>] syscall_call+0x7/0xb
[   98.817009] no locks held by modprobe/1158.
[   98.820564] INFO: task kworker/u:0:1159 blocked for more than 20 seconds.
[   98.870372] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   98.877038] kworker/u:0     D 0000011c  6644  1159      5 0x00000000
[   98.926711]  f22a1cf4 00000046 c18cf200 0000011c 0000019a 00000000 34dfa000 c16074e0
[   98.970856]  00000000 f38b08e0 c15557e0 00000000 f38b0908 00000001 f38b0d40 f22a1c88
[   98.977575]  c106febb c174b8e0 f63edd78 00000000 f65458e0 f65458e0 f38b08e0 00000001
[   99.022205] Call Trace:
[   99.024099]  [<c106febb>] ? check_prevs_add+0xab/0x100
[   99.065494]  [<c10701e7>] ? validate_chain+0x2a7/0x520
[   99.069244]  [<c139efa5>] schedule+0x55/0x60
[   99.072374]  [<c139f6ec>] schedule_timeout+0x19c/0x1b0
[   99.113619]  [<c106d833>] ? lock_release_holdtime+0x73/0xb0
[   99.117657]  [<c1071082>] ? mark_held_locks+0x92/0xf0
[   99.159953]  [<c13a17a2>] ? _raw_spin_unlock_irq+0x22/0x30
[   99.163931]  [<c1071220>] ? trace_hardirqs_on_caller+0x90/0x100
[   99.168162]  [<c107129b>] ? trace_hardirqs_on+0xb/0x10
[   99.209277]  [<c139f094>] wait_for_common+0xe4/0x120
[   99.212890]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   99.253976]  [<c1038fd0>] ? put_prev_task+0x40/0x40
[   99.257568]  [<c1055389>] ? queue_work_on+0x39/0x40
[   99.261105]  [<c139f0e2>] wait_for_completion+0x12/0x20
[   99.302652]  [<c1054838>] call_usermodehelper_exec+0x88/0x90
[   99.306783]  [<c1070101>] ? validate_chain+0x1c1/0x520
[   99.348505]  [<c139efce>] ? wait_for_common+0x1e/0x120
[   99.353139]  [<c105475f>] ? call_usermodehelper_setup+0x5f/0x90
[   99.395366]  [<c10542c2>] __request_module+0xe2/0x130
[   99.400335]  [<c10dabc8>] ? search_binary_handler+0x158/0x2a0
[   99.442201]  [<c10738b7>] ? __lock_release+0x47/0x70
[   99.445882]  [<c10dabc8>] ? search_binary_handler+0x158/0x2a0
[   99.449998]  [<c11151c0>] ? randomize_stack_top+0x50/0x50
[   99.753876]  [<c11151c0>] ? randomize_stack_top+0x50/0x50
[   99.757523]  [<c10dac64>] search_binary_handler+0x1f4/0x2a0
[   99.794255]  [<c10daaae>] ? search_binary_handler+0x3e/0x2a0
[   99.797829]  [<c10daeab>] do_execve_common+0x19b/0x270
[   99.801079]  [<c10daf94>] do_execve+0x14/0x20
[   99.837998]  [<c100a4e2>] sys_execve+0x42/0x60
[   99.840850]  [<c13a2903>] ptregs_execve+0x13/0x18
[   99.843811]  [<c13a2049>] ? syscall_call+0x7/0xb
[   99.879826]  [<c1007182>] ? kernel_execve+0x22/0x40
[   99.883203]  [<c105444a>] ? ____call_usermodehelper+0x13a/0x160
[   99.887070]  [<c1054310>] ? __request_module+0x130/0x130
[   99.923601]  [<c13a2dba>] ? kernel_thread_helper+0x6/0xd
[   99.926983] 1 lock held by kworker/u:0/1159:
[   99.929732]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<c10da618>] prepare_bprm_creds+0x28/0x70

If khelper becomes multithreaded, this patch will become unnecessary.

>               If so then, err, maybe this is OK.  But I think the
> analysis should be fully spelled out in the changelog and described in
> the code in some manner which will prevent us from accidentally
> breaking this exclusivity as the code evolves.
> 
> Does this look truthful and complete?
> 
Yes, thank you.

Maybe kmod_thread_blocker conveys better than kmod_thread_locker?

> --- a/kernel/kmod.c~kmod-avoid-deadlock-by-recursive-kmod-call-fix
> +++ a/kernel/kmod.c
> @@ -44,6 +44,12 @@
>  extern int max_threads;
>  
>  static struct workqueue_struct *khelper_wq;
> +
> +/*
> + * kmod_thread_locker is used for deadlock avoidance.  There is no explicit
> + * locking to protect this global - it is private to the singleton khelper
> + * thread and should only ever be modified by that thread.
> + */
>  static const struct task_struct *kmod_thread_locker;
>  
>  #define CAP_BSET	(void *)1
> _
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-01-26  1:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-27  7:01 [3.1.6] Deadlock upon recursive call_usermodehelper_exec() penguin-kernel
2011-12-27  8:14 ` Tetsuo Handa
2012-01-04  7:43 ` [PATCH] kmod: Avoid deadlock by recursive kmod call Tetsuo Handa
2012-01-04 13:08   ` [PATCH v2] " Tetsuo Handa
     [not found]     ` <201201062139.JFC73472.SMQFOFOFHLVOtJ@I-love.SAKURA.ne.jp>
2012-01-19  0:45       ` [PATCH v3] " Andrew Morton
2012-01-19  2:07         ` Tetsuo Handa
2012-01-24  4:32           ` [PATCH v4] " Tetsuo Handa
2012-01-26  0:40             ` Andrew Morton
2012-01-26  1:32               ` Tetsuo Handa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).