All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
@ 2015-07-27 16:27 Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

This patchset does a bunch of cleanups and converts khelper to use
system unbound workqueues. The 3 first patches should be uncontroversial.
The last 2 patches are debatable.

Kmod creates kernel threads that perform userspace jobs and we want
those to have a large affinity in order not to contend busy CPUs. This
is (partly) why we use khelper which has a wide affinity that the kernel
threads it create can inherit from. Now khelper is a dedicated workqueue
that has singlethread properties which we aren't interested in.

Hence those two debatable changes:

_ We would like to use generic workqueues. System unbound workqueues are
  a very good candidate but they are not wide affine, only node affine.
  Now probably a node is enough to perform many parallel kmod jobs.

_ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
  handler) to use the workqueue. It means that if the workqueue blocks,
  and no other worker can take pending kmod request, we can be screwed.
  Now if we have 512 threads, this should be enough.

I added Tejun to discuss these things.

Changes since v2, after Oleg reviews:

* Reordered patches such that the uncontroversial patches begin the
  queue and can be applied independantly.

* Tell about the need for workqueue also because we want a privileged
  (root) thread for usermodehelper job.

* Deactivate signal after UMH_WAIT_PROC handler.

* Update comment to handle the fact system unbound workqueues are node
  affine and not wide affine.

---
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	nohz/kmod-v3

HEAD: 470ed6f10191864d28d2c8f4fd8a349ef58f427b

Thanks,
	Frederic
---

Frederic Weisbecker (5):
      kmod: Bunch of internal functions renames
      kmod: Remove unecessary explicit wide CPU affinity setting
      kmod: Add up-to-date explanations on the purpose of each asynchronous levels
      kmod: Use system_unbound_wq instead of khelper
      kmod: Handle UMH_WAIT_PROC from system unbound workqueue


 include/linux/kmod.h |  2 --
 init/main.c          |  1 -
 kernel/kmod.c        | 91 ++++++++++++++++++++++++++++------------------------
 3 files changed, 49 insertions(+), 45 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] kmod: Bunch of internal functions renames
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
@ 2015-07-27 16:27 ` Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 2/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

Underscores on function names aren't much verbose to explain the
purpose of a function. And kmod has interesting such flavours.

Lets rename the following functions:

* __call_usermodehelper -> call_usermodehelper_exec_work
* ____call_usermodehelper -> call_usermodehelper_exec_async
* wait_for_helper -> call_usermodehelper_exec_sync

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 2777f40..4682e91 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -213,7 +213,7 @@ static void umh_complete(struct subprocess_info *sub_info)
 /*
  * This is the task which runs the usermode application
  */
-static int ____call_usermodehelper(void *data)
+static int call_usermodehelper_exec_async(void *data)
 {
 	struct subprocess_info *sub_info = data;
 	struct cred *new;
@@ -258,7 +258,10 @@ static int ____call_usermodehelper(void *data)
 			   (const char __user *const __user *)sub_info->envp);
 out:
 	sub_info->retval = retval;
-	/* wait_for_helper() will call umh_complete if UHM_WAIT_PROC. */
+	/*
+	 * call_usermodehelper_exec_sync() will call umh_complete
+	 * if UHM_WAIT_PROC.
+	 */
 	if (!(sub_info->wait & UMH_WAIT_PROC))
 		umh_complete(sub_info);
 	if (!retval)
@@ -267,14 +270,14 @@ out:
 }
 
 /* Keventd can't block, but this (a child) can. */
-static int wait_for_helper(void *data)
+static int call_usermodehelper_exec_sync(void *data)
 {
 	struct subprocess_info *sub_info = data;
 	pid_t pid;
 
 	/* If SIGCLD is ignored sys_wait4 won't populate the status. */
 	kernel_sigaction(SIGCHLD, SIG_DFL);
-	pid = kernel_thread(____call_usermodehelper, sub_info, SIGCHLD);
+	pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
 	if (pid < 0) {
 		sub_info->retval = pid;
 	} else {
@@ -282,17 +285,18 @@ static int wait_for_helper(void *data)
 		/*
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
-		 * But wait_for_helper() always runs as keventd, and put_user()
-		 * to a kernel address works OK for kernel threads, due to their
-		 * having an mm_segment_t which spans the entire address space.
+		 * But call_usermodehelper_exec_sync() always runs as keventd,
+		 * and put_user() to a kernel address works OK for kernel
+		 * threads, due to their having an mm_segment_t which spans the
+		 * entire address space.
 		 *
 		 * Thus the __user pointer cast is valid here.
 		 */
 		sys_wait4(pid, (int __user *)&ret, 0, NULL);
 
 		/*
-		 * If ret is 0, either ____call_usermodehelper failed and the
-		 * real error code is already in sub_info->retval or
+		 * If ret is 0, either call_usermodehelper_exec_async failed and
+		 * the real error code is already in sub_info->retval or
 		 * sub_info->retval is 0 anyway, so don't mess with it then.
 		 */
 		if (ret)
@@ -304,17 +308,17 @@ static int wait_for_helper(void *data)
 }
 
 /* This is run by khelper thread  */
-static void __call_usermodehelper(struct work_struct *work)
+static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
 		container_of(work, struct subprocess_info, work);
 	pid_t pid;
 
 	if (sub_info->wait & UMH_WAIT_PROC)
-		pid = kernel_thread(wait_for_helper, sub_info,
+		pid = kernel_thread(call_usermodehelper_exec_sync, sub_info,
 				    CLONE_FS | CLONE_FILES | SIGCHLD);
 	else
-		pid = kernel_thread(____call_usermodehelper, sub_info,
+		pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
 				    SIGCHLD);
 
 	if (pid < 0) {
@@ -509,7 +513,7 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
 	if (!sub_info)
 		goto out;
 
-	INIT_WORK(&sub_info->work, __call_usermodehelper);
+	INIT_WORK(&sub_info->work, call_usermodehelper_exec_work);
 	sub_info->path = path;
 	sub_info->argv = argv;
 	sub_info->envp = envp;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] kmod: Remove unecessary explicit wide CPU affinity setting
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
@ 2015-07-27 16:27 ` Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

Khelper is affine to all CPUs. Now since it creates the
call_usermodehelper_exec_[a]sync() kernel threads, those inherit the
wide affinity.

As such explicitly forcing a wide affinity from those kernel threads
is like a no-op.

Just remove it. It's needless and it breaks CPU isolation users who
rely on workqueue affinity tuning.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 4682e91..97be0cf 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -223,9 +223,6 @@ static int call_usermodehelper_exec_async(void *data)
 	flush_signal_handlers(current, 1);
 	spin_unlock_irq(&current->sighand->siglock);
 
-	/* We can run anywhere, unlike our parent keventd(). */
-	set_cpus_allowed_ptr(current, cpu_all_mask);
-
 	/*
 	 * Our parent is keventd, which runs with elevated scheduling priority.
 	 * Avoid propagating that into the userspace child.
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 2/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
@ 2015-07-27 16:27 ` Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 4/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

There seem to be quite some confusions on the comments, likely due to
changes that came after them.

Now since it's very non obvious why we have 3 levels of asynchronous
code to implement usermodehelpers, it's important to comment in detail
the reason of this layout.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 97be0cf..7833041 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -224,8 +224,8 @@ static int call_usermodehelper_exec_async(void *data)
 	spin_unlock_irq(&current->sighand->siglock);
 
 	/*
-	 * Our parent is keventd, which runs with elevated scheduling priority.
-	 * Avoid propagating that into the userspace child.
+	 * Our parent is khelper which runs with elevated scheduling
+	 * priority. Avoid propagating that into the userspace child.
 	 */
 	set_user_nice(current, 0);
 
@@ -266,7 +266,11 @@ out:
 	do_exit(0);
 }
 
-/* Keventd can't block, but this (a child) can. */
+/*
+ * Handles UMH_WAIT_PROC. Our parent khelper can't wait for usermodehelper
+ * completion without blocking every other pending requests. That's why
+ * we use a kernel thread dedicated for that purpose.
+ */
 static int call_usermodehelper_exec_sync(void *data)
 {
 	struct subprocess_info *sub_info = data;
@@ -282,8 +286,8 @@ static int call_usermodehelper_exec_sync(void *data)
 		/*
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
-		 * But call_usermodehelper_exec_sync() always runs as keventd,
-		 * and put_user() to a kernel address works OK for kernel
+		 * But call_usermodehelper_exec_sync() always runs as kernel
+		 * thread and put_user() to a kernel address works OK for kernel
 		 * threads, due to their having an mm_segment_t which spans the
 		 * entire address space.
 		 *
@@ -304,7 +308,19 @@ static int call_usermodehelper_exec_sync(void *data)
 	do_exit(0);
 }
 
-/* This is run by khelper thread  */
+/*
+ * This function doesn't strictly needs to be called asynchronously. But we
+ * need to create the usermodehelper kernel threads from a task that is affine
+ * to all CPUs (or nohz housekeeping ones) such that they inherit a widest
+ * affinity irrespective of call_usermodehelper() callers with possibly reduced
+ * affinity (eg: per-cpu workqueues). We don't want usermodehelper targets to
+ * contend any busy CPU.
+ * Khelper provides such wide affinity.
+ *
+ * Besides, khelper provides the privilege level that caller might not have to
+ * perform the usermodehelper request.
+ *
+ */
 static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
@@ -532,8 +548,8 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
  *        from interrupt context.
  *
  * Runs a user-space application.  The application is started
- * asynchronously if wait is not set, and runs as a child of keventd.
- * (ie. it runs with full root capabilities).
+ * asynchronously if wait is not set, and runs as a child of khelper.
+ * (ie. it runs with full root capabilities and wide affinity).
  */
 int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2015-07-27 16:27 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
@ 2015-07-27 16:27 ` Frederic Weisbecker
  2015-07-27 16:27 ` [PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

We need to launch the usermodehelper kernel threads with the widest
affinity and this is partly why we use khelper. This workqueue has
unbound properties and thus a wide affinity inherited by all its
children.

Now khelper also has special properties that we aren't much interested
in: ordered and singlethread. There is really no need about ordering as
all we do is creating kernel threads. This can be done concurrently.
And singlethread is a useless limitation as well.

The workqueue engine already proposes generic unbound workqueues that
don't share these useless properties and handle well parallel jobs.

The only worrysome specific is their affinity to the node of the current
CPU. It's fine for creating the usermodehelper kernel threads but those
inherit this affinity for longer jobs such as requesting modules.

This patch proposes to use these node affine unbound workqueues assuming
that a node is sufficient to handle several parallel usermodehelper
requests.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/kmod.h |  2 --
 init/main.c          |  1 -
 kernel/kmod.c        | 40 +++++++++++++++++-----------------------
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 0555cc6..fcfd2bf 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -85,8 +85,6 @@ enum umh_disable_depth {
 	UMH_DISABLED,
 };
 
-extern void usermodehelper_init(void);
-
 extern int __usermodehelper_disable(enum umh_disable_depth depth);
 extern void __usermodehelper_set_disable_depth(enum umh_disable_depth depth);
 
diff --git a/init/main.c b/init/main.c
index c5d5626..45551f8 100644
--- a/init/main.c
+++ b/init/main.c
@@ -877,7 +877,6 @@ static void __init do_initcalls(void)
 static void __init do_basic_setup(void)
 {
 	cpuset_init_smp();
-	usermodehelper_init();
 	shmem_init();
 	driver_init();
 	init_irq_proc();
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 7833041..ccf60cc 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -45,8 +45,6 @@
 
 extern int max_threads;
 
-static struct workqueue_struct *khelper_wq;
-
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
 
@@ -224,7 +222,7 @@ static int call_usermodehelper_exec_async(void *data)
 	spin_unlock_irq(&current->sighand->siglock);
 
 	/*
-	 * Our parent is khelper which runs with elevated scheduling
+	 * Our parent (unbound workqueue) runs with elevated scheduling
 	 * priority. Avoid propagating that into the userspace child.
 	 */
 	set_user_nice(current, 0);
@@ -267,9 +265,10 @@ out:
 }
 
 /*
- * Handles UMH_WAIT_PROC. Our parent khelper can't wait for usermodehelper
- * completion without blocking every other pending requests. That's why
- * we use a kernel thread dedicated for that purpose.
+ * Handles UMH_WAIT_PROC. Our parent (unbound workqueue) might not be able to
+ * run enough instances to handle usermodehelper completions without blocking
+ * some other pending requests. That's why we use a kernel thread dedicated for
+ * that purpose.
  */
 static int call_usermodehelper_exec_sync(void *data)
 {
@@ -311,14 +310,15 @@ static int call_usermodehelper_exec_sync(void *data)
 /*
  * This function doesn't strictly needs to be called asynchronously. But we
  * need to create the usermodehelper kernel threads from a task that is affine
- * to all CPUs (or nohz housekeeping ones) such that they inherit a widest
- * affinity irrespective of call_usermodehelper() callers with possibly reduced
- * affinity (eg: per-cpu workqueues). We don't want usermodehelper targets to
- * contend any busy CPU.
- * Khelper provides such wide affinity.
+ * to an optimized set of CPUs (or nohz housekeeping ones) such that they
+ * inherit a widest affinity irrespective of call_usermodehelper() callers with
+ * possibly reduced affinity (eg: per-cpu workqueues). We don't want
+ * usermodehelper targets to contend a busy CPU.
  *
- * Besides, khelper provides the privilege level that caller might not have to
- * perform the usermodehelper request.
+ * Unbound workqueues provide such wide affinity.
+ *
+ * Besides, workqueues provide the privilege level that caller might not have
+ * to perform the usermodehelper request.
  *
  */
 static void call_usermodehelper_exec_work(struct work_struct *work)
@@ -548,8 +548,8 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
  *        from interrupt context.
  *
  * Runs a user-space application.  The application is started
- * asynchronously if wait is not set, and runs as a child of khelper.
- * (ie. it runs with full root capabilities and wide affinity).
+ * asynchronously if wait is not set, and runs as a child of system workqueues.
+ * (ie. it runs with full root capabilities and optimized affinity).
  */
 int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 {
@@ -561,7 +561,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 		return -EINVAL;
 	}
 	helper_lock();
-	if (!khelper_wq || usermodehelper_disabled) {
+	if (usermodehelper_disabled) {
 		retval = -EBUSY;
 		goto out;
 	}
@@ -573,7 +573,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
 	sub_info->wait = wait;
 
-	queue_work(khelper_wq, &sub_info->work);
+	queue_work(system_unbound_wq, &sub_info->work);
 	if (wait == UMH_NO_WAIT)	/* task has freed sub_info */
 		goto unlock;
 
@@ -703,9 +703,3 @@ struct ctl_table usermodehelper_table[] = {
 	},
 	{ }
 };
-
-void __init usermodehelper_init(void)
-{
-	khelper_wq = create_singlethread_workqueue("khelper");
-	BUG_ON(!khelper_wq);
-}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2015-07-27 16:27 ` [PATCH 4/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
@ 2015-07-27 16:27 ` Frederic Weisbecker
  2015-07-27 18:48 ` [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Tejun Heo
  2015-08-05 17:10 ` Oleg Nesterov
  6 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 16:27 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Tejun Heo, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

The UMH_WAIT_PROC handler runs in its own thread in order to make sure
that waiting for the exec kernel thread completion won't block other
usermodehelper queued jobs.

On older workqueue implementations, worklets couldn't sleep without
blocking the rest of the queue. But now the workqueue subsystem handles
that. Khelper still had the older limitation due to its singlethread
properties but we replaced it to system unbound workqueues.

Those are affine to the current node and can block up to some number
of instances.

They are a good candidate to handle UMH_WAIT_PROC assuming that we have
enough system unbound workers to handle lots of parallel usermodehelper
jobs.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 44 ++++++++++++++++++++------------------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index ccf60cc..c1f2550 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -264,15 +264,9 @@ out:
 	do_exit(0);
 }
 
-/*
- * Handles UMH_WAIT_PROC. Our parent (unbound workqueue) might not be able to
- * run enough instances to handle usermodehelper completions without blocking
- * some other pending requests. That's why we use a kernel thread dedicated for
- * that purpose.
- */
-static int call_usermodehelper_exec_sync(void *data)
+/* Handles UMH_WAIT_PROC.  */
+static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
 {
-	struct subprocess_info *sub_info = data;
 	pid_t pid;
 
 	/* If SIGCLD is ignored sys_wait4 won't populate the status. */
@@ -286,9 +280,9 @@ static int call_usermodehelper_exec_sync(void *data)
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
 		 * But call_usermodehelper_exec_sync() always runs as kernel
-		 * thread and put_user() to a kernel address works OK for kernel
-		 * threads, due to their having an mm_segment_t which spans the
-		 * entire address space.
+		 * thread (workqueue) and put_user() to a kernel address works
+		 * OK for kernel threads, due to their having an mm_segment_t
+		 * which spans the entire address space.
 		 *
 		 * Thus the __user pointer cast is valid here.
 		 */
@@ -303,19 +297,21 @@ static int call_usermodehelper_exec_sync(void *data)
 			sub_info->retval = ret;
 	}
 
+	/* Restore default kernel sig handler */
+	kernel_sigaction(SIGCHLD, SIG_IGN);
+
 	umh_complete(sub_info);
-	do_exit(0);
 }
 
 /*
- * This function doesn't strictly needs to be called asynchronously. But we
- * need to create the usermodehelper kernel threads from a task that is affine
+ * We need to create the usermodehelper kernel thread from a task that is affine
  * to an optimized set of CPUs (or nohz housekeeping ones) such that they
  * inherit a widest affinity irrespective of call_usermodehelper() callers with
  * possibly reduced affinity (eg: per-cpu workqueues). We don't want
  * usermodehelper targets to contend a busy CPU.
  *
- * Unbound workqueues provide such wide affinity.
+ * Unbound workqueues provide such wide affinity and allow to block on
+ * UMH_WAIT_PROC requests without blocking pending request (up to some limit).
  *
  * Besides, workqueues provide the privilege level that caller might not have
  * to perform the usermodehelper request.
@@ -325,18 +321,18 @@ static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
 		container_of(work, struct subprocess_info, work);
-	pid_t pid;
 
-	if (sub_info->wait & UMH_WAIT_PROC)
-		pid = kernel_thread(call_usermodehelper_exec_sync, sub_info,
-				    CLONE_FS | CLONE_FILES | SIGCHLD);
-	else
+	if (sub_info->wait & UMH_WAIT_PROC) {
+		call_usermodehelper_exec_sync(sub_info);
+	} else {
+		pid_t pid;
+
 		pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
 				    SIGCHLD);
-
-	if (pid < 0) {
-		sub_info->retval = pid;
-		umh_complete(sub_info);
+		if (pid < 0) {
+			sub_info->retval = pid;
+			umh_complete(sub_info);
+		}
 	}
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2015-07-27 16:27 ` [PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
@ 2015-07-27 18:48 ` Tejun Heo
  2015-07-27 21:05   ` Frederic Weisbecker
  2015-08-05 17:10 ` Oleg Nesterov
  6 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2015-07-27 18:48 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Oleg Nesterov, Christoph Lameter, Rik van Riel, Andrew Morton

On Mon, Jul 27, 2015 at 06:27:15PM +0200, Frederic Weisbecker wrote:
> Hence those two debatable changes:
> 
> _ We would like to use generic workqueues. System unbound workqueues are
>   a very good candidate but they are not wide affine, only node affine.
>   Now probably a node is enough to perform many parallel kmod jobs.

If being node-affine is an issue, kmod can easily create a workqueue
w/o NUMA affinity using apply_workqueue_attrs() with no_numa set to
%true.

> _ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
>   handler) to use the workqueue. It means that if the workqueue blocks,
>   and no other worker can take pending kmod request, we can be screwed.
>   Now if we have 512 threads, this should be enough.

The maximum number of worker can also be raised on the workqueue.
That said, I don't think we want to.

IMHO, system_wq should be fine and if it isn't turning off numa
affinity or raising max worker limit later is pretty trivial.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-07-27 18:48 ` [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Tejun Heo
@ 2015-07-27 21:05   ` Frederic Weisbecker
  2015-07-28 18:01     ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-27 21:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Oleg Nesterov, Christoph Lameter, Rik van Riel, Andrew Morton

On Mon, Jul 27, 2015 at 02:48:40PM -0400, Tejun Heo wrote:
> On Mon, Jul 27, 2015 at 06:27:15PM +0200, Frederic Weisbecker wrote:
> > Hence those two debatable changes:
> > 
> > _ We would like to use generic workqueues. System unbound workqueues are
> >   a very good candidate but they are not wide affine, only node affine.
> >   Now probably a node is enough to perform many parallel kmod jobs.
> 
> If being node-affine is an issue, kmod can easily create a workqueue
> w/o NUMA affinity using apply_workqueue_attrs() with no_numa set to
> %true.

Right, but we would like to get rid of khelper which already plays a
similar role.

> 
> > _ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
> >   handler) to use the workqueue. It means that if the workqueue blocks,
> >   and no other worker can take pending kmod request, we can be screwed.
> >   Now if we have 512 threads, this should be enough.
> 
> The maximum number of worker can also be raised on the workqueue.
> That said, I don't think we want to.
> 
> IMHO, system_wq should be fine and if it isn't turning off numa
> affinity or raising max worker limit later is pretty trivial.

That's what I think too. How many workers system_unbound_wq can handle? If kmod
raises very high numbers of threads in parallel like > 500, I think that would be
a problem on its own anyway.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-07-27 21:05   ` Frederic Weisbecker
@ 2015-07-28 18:01     ` Tejun Heo
  2015-07-28 18:07       ` Frederic Weisbecker
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2015-07-28 18:01 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Oleg Nesterov, Christoph Lameter, Rik van Riel, Andrew Morton

Hello, Frederic.

On Mon, Jul 27, 2015 at 11:05:41PM +0200, Frederic Weisbecker wrote:
> > IMHO, system_wq should be fine and if it isn't turning off numa
> > affinity or raising max worker limit later is pretty trivial.
> 
> That's what I think too. How many workers system_unbound_wq can handle? If kmod

256 by default.

> raises very high numbers of threads in parallel like > 500, I think that would be
> a problem on its own anyway.

I'm having hard time to see how limit of 256 would be a problem.  If
it becomes one, let's deal with it then.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-07-28 18:01     ` Tejun Heo
@ 2015-07-28 18:07       ` Frederic Weisbecker
  0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-07-28 18:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Oleg Nesterov, Christoph Lameter, Rik van Riel, Andrew Morton

On Tue, Jul 28, 2015 at 02:01:14PM -0400, Tejun Heo wrote:
> Hello, Frederic.
> 
> On Mon, Jul 27, 2015 at 11:05:41PM +0200, Frederic Weisbecker wrote:
> > > IMHO, system_wq should be fine and if it isn't turning off numa
> > > affinity or raising max worker limit later is pretty trivial.
> > 
> > That's what I think too. How many workers system_unbound_wq can handle? If kmod
> 
> 256 by default.

Looks quite fine!

> 
> > raises very high numbers of threads in parallel like > 500, I think that would be
> > a problem on its own anyway.
> 
> I'm having hard time to see how limit of 256 would be a problem.  If
> it becomes one, let's deal with it then.

Agreed!

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2015-07-27 18:48 ` [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Tejun Heo
@ 2015-08-05 17:10 ` Oleg Nesterov
  2015-08-05 22:14   ` Frederic Weisbecker
  6 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2015-08-05 17:10 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Tejun Heo, Christoph Lameter, Rik van Riel, Andrew Morton

On 07/27, Frederic Weisbecker wrote:
>
> Hence those two debatable changes:
>
> _ We would like to use generic workqueues. System unbound workqueues are
>   a very good candidate but they are not wide affine, only node affine.
>   Now probably a node is enough to perform many parallel kmod jobs.
>
> _ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
>   handler) to use the workqueue. It means that if the workqueue blocks,
>   and no other worker can take pending kmod request, we can be screwed.
>   Now if we have 512 threads, this should be enough.

I think this series is fine. Feel free to add my reviewed-by.

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3
  2015-08-05 17:10 ` Oleg Nesterov
@ 2015-08-05 22:14   ` Frederic Weisbecker
  0 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2015-08-05 22:14 UTC (permalink / raw)
  To: Oleg Nesterov, Andrew Morton
  Cc: LKML, Tejun Heo, Christoph Lameter, Rik van Riel

On Wed, Aug 05, 2015 at 07:10:57PM +0200, Oleg Nesterov wrote:
> On 07/27, Frederic Weisbecker wrote:
> >
> > Hence those two debatable changes:
> >
> > _ We would like to use generic workqueues. System unbound workqueues are
> >   a very good candidate but they are not wide affine, only node affine.
> >   Now probably a node is enough to perform many parallel kmod jobs.
> >
> > _ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
> >   handler) to use the workqueue. It means that if the workqueue blocks,
> >   and no other worker can take pending kmod request, we can be screwed.
> >   Now if we have 512 threads, this should be enough.
> 
> I think this series is fine. Feel free to add my reviewed-by.

Great!

Andrew, if you're ok with it, can you please apply this set?

Thanks a lot!

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-08-05 22:14 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-27 16:27 [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Frederic Weisbecker
2015-07-27 16:27 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
2015-07-27 16:27 ` [PATCH 2/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
2015-07-27 16:27 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
2015-07-27 16:27 ` [PATCH 4/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
2015-07-27 16:27 ` [PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
2015-07-27 18:48 ` [PATCH 0/5] kmod: Cleanups, simplifications, and make isolation friendly v3 Tejun Heo
2015-07-27 21:05   ` Frederic Weisbecker
2015-07-28 18:01     ` Tejun Heo
2015-07-28 18:07       ` Frederic Weisbecker
2015-08-05 17:10 ` Oleg Nesterov
2015-08-05 22:14   ` Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.