From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ming Lei <ming.lei@canonical.com>,
	Alex Riesen <raa.lkml@gmail.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Jens Axboe <axboe@kernel.dk>,
	USB list <linux-usb@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Rusty Russell <rusty@rustcorp.com.au>
Subject: [PATCH] module, async: async_synchronize_full() on module init iff async is used
Date: Tue, 15 Jan 2013 18:52:51 -0800
Message-ID: <20130116025251.GM2668@htj.dyndns.org>
In-Reply-To: <CA+55aFxLJ=Yfms9SkPJLjQ-ZASPNZC0JUBsb3OwonJvTCcBzxA@mail.gmail.com>

If the default iosched is built as a module, the kernel may deadlock
while trying to load the iosched module on device probe if the probe
is running from an async job.  This is because
async_synchronize_full() at the end of module init ends up waiting for
the async job which initiated the module loading.

 async A				modprobe

 1. finds a device
 2. registers the block device
 3. request_module(default iosched)
					4. modprobe in userland
					5. load and init module
					6. async_synchronize_full()

Async A waits for modprobe to finish in request_module() and modprobe
waits for async A to finish in async_synchronize_full().

Because there's no easy way to track dependencies once control goes
out to userland, implementing properly nested flushing is difficult.
For now, make module init perform async_synchronize_full() iff module
init has queued async jobs, as suggested by Linus.

This avoids the described deadlock because the iosched module doesn't
use async and thus wouldn't invoke async_synchronize_full().  This is
hacky and incomplete.  It will still deadlock if module loading from
async jobs nests; however, this works around the known problem case
and seems to be the best of bad options.

For more details, please refer to the following thread.

  http://thread.gmane.org/gmane.linux.kernel/1420814

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
It makes me feel dirty but makes the problem go away and I can't think
of anything better, so here is the implementation of the "used async"
workaround.
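
As a rough illustration only (not part of the patch), the flag logic
can be modelled in userspace C.  struct task_model, flush_count, the
model_*() functions and the two example init functions are all
hypothetical stand-ins for current, async_synchronize_full(),
__async_schedule() and do_init_module(); only the PF_USED_ASYNC value
is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit value as the PF_USED_ASYNC flag added by this patch. */
#define PF_USED_ASYNC 0x00004000u

/* Hypothetical stand-in for the kernel's task_struct. */
struct task_model {
	unsigned int flags;
};

/* Counts how often the model would have flushed async work. */
static int flush_count;

/* Models __async_schedule(): queuing a job marks the calling task. */
static void model_async_schedule(struct task_model *task)
{
	task->flags |= PF_USED_ASYNC;
}

/* Models async_synchronize_full(); here it only records the call. */
static void model_async_synchronize_full(void)
{
	flush_count++;
}

/*
 * Models do_init_module(): clear the flag, run the module's init
 * function, then flush only if init queued async work.
 * Returns true if a flush was performed.
 */
static bool model_do_init_module(struct task_model *task,
				 void (*init)(struct task_model *))
{
	task->flags &= ~PF_USED_ASYNC;
	init(task);
	if (task->flags & PF_USED_ASYNC) {
		model_async_synchronize_full();
		return true;
	}
	return false;
}

/* Init like an iosched module's (assumed): queues no async jobs. */
static void iosched_like_init(struct task_model *task)
{
	(void)task;
}

/* Init like a driver that probes asynchronously (assumed). */
static void async_driver_init(struct task_model *task)
{
	model_async_schedule(task);
}

/* Walks through both cases; a stale flag must not cause a flush. */
static void model_demo(void)
{
	struct task_model task = { .flags = PF_USED_ASYNC };

	flush_count = 0;
	assert(!model_do_init_module(&task, iosched_like_init));
	assert(flush_count == 0);
	assert(model_do_init_module(&task, async_driver_init));
	assert(flush_count == 1);
}
```

Calling model_demo() from a test program exercises both paths.  The
model leaves out the actual waiting, so it shows only when the flush
is skipped, not the deadlock itself.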

Thanks.

 include/linux/sched.h |    1 +
 kernel/async.c        |    3 +++
 kernel/module.c       |   27 +++++++++++++++++++++++++--
 3 files changed, 29 insertions(+), 2 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1810,6 +1810,7 @@ extern void thread_group_cputime_adjuste
 #define PF_MEMALLOC	0x00000800	/* Allocating memory */
 #define PF_NPROC_EXCEEDED 0x00001000	/* set_user noticed that RLIMIT_NPROC was exceeded */
 #define PF_USED_MATH	0x00002000	/* if unset the fpu must be initialized before use */
+#define PF_USED_ASYNC	0x00004000	/* used async_schedule*(), used by module init */
 #define PF_NOFREEZE	0x00008000	/* this thread should not be frozen */
 #define PF_FROZEN	0x00010000	/* frozen for system suspend */
 #define PF_FSTRANS	0x00020000	/* inside a filesystem transaction */
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -196,6 +196,9 @@ static async_cookie_t __async_schedule(a
 	atomic_inc(&entry_count);
 	spin_unlock_irqrestore(&async_lock, flags);
 
+	/* mark that this task has queued an async job, used by module init */
+	current->flags |= PF_USED_ASYNC;
+
 	/* schedule for execution */
 	queue_work(system_unbound_wq, &entry->work);
 
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3013,6 +3013,12 @@ static int do_init_module(struct module
 {
 	int ret = 0;
 
+	/*
+	 * We want to find out whether @mod uses async during init.  Clear
+	 * PF_USED_ASYNC.  async_schedule*() will set it.
+	 */
+	current->flags &= ~PF_USED_ASYNC;
+
 	blocking_notifier_call_chain(&module_notify_list,
 			MODULE_STATE_COMING, mod);
 
@@ -3058,8 +3064,25 @@ static int do_init_module(struct module
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_LIVE, mod);
 
-	/* We need to finish all async code before the module init sequence is done */
-	async_synchronize_full();
+	/*
+	 * We need to finish all async code before the module init sequence
+	 * is done.  This has potential to deadlock.  For example, a newly
+	 * detected block device can trigger request_module() of the
+	 * default iosched from async probing task.  Once userland helper
+	 * reaches here, async_synchronize_full() will wait on the async
+	 * task waiting on request_module() and deadlock.
+	 *
+	 * This deadlock is avoided by performing async_synchronize_full()
+	 * iff module init queued any async jobs.  This isn't a full
+	 * solution as it will deadlock the same if module loading from
+	 * async jobs nests more than once; however, due to the various
+	 * constraints, this hack seems to be the best option for now.
+	 * Please refer to the following thread for details.
+	 *
+	 * http://thread.gmane.org/gmane.linux.kernel/1420814
+	 */
+	if (current->flags & PF_USED_ASYNC)
+		async_synchronize_full();
 
 	mutex_lock(&module_mutex);
 	/* Drop initial reference. */
