linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] kmod: Simplifications and cleanups v2
@ 2015-07-09 18:07 Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

Changes in this set:

* Cleanup references to kevent (suggested by Andrew)
* Use of generic unbound workqueues (suggested by Oleg)
* Clarify changelogs (suggested by Oleg)
* Remove the UMH_WAIT_PROC kernel thread (RFC)

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	nohz/kmod-v2

HEAD: 42613c237705cd6814f6ff90a2b64bd8f57700b5

Thanks,
	Frederic
---

Frederic Weisbecker (5):
      kmod: Bunch of internal functions renames
      kmod: Use system_unbound_wq instead of khelper
      kmod: Add up-to-date explanations on the purpose of each asynchronous levels
      kmod: Remove unecessary explicit wide CPU affinity setting
      kmod: Handle UMH_WAIT_PROC from system unbound workqueue


 include/linux/kmod.h |  2 --
 init/main.c          |  1 -
 kernel/kmod.c        | 82 ++++++++++++++++++++++++++--------------------------
 3 files changed, 41 insertions(+), 44 deletions(-)


* [PATCH 1/5] kmod: Bunch of internal functions renames
  2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
@ 2015-07-09 18:07 ` Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

Leading underscores in function names do little to explain a function's
purpose, and kmod has some interesting flavours of them.

Let's rename the following functions:

* __call_usermodehelper -> call_usermodehelper_exec_work
* ____call_usermodehelper -> call_usermodehelper_exec_async
* wait_for_helper -> call_usermodehelper_exec_sync

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 2777f40..4682e91 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -213,7 +213,7 @@ static void umh_complete(struct subprocess_info *sub_info)
 /*
  * This is the task which runs the usermode application
  */
-static int ____call_usermodehelper(void *data)
+static int call_usermodehelper_exec_async(void *data)
 {
 	struct subprocess_info *sub_info = data;
 	struct cred *new;
@@ -258,7 +258,10 @@ static int ____call_usermodehelper(void *data)
 			   (const char __user *const __user *)sub_info->envp);
 out:
 	sub_info->retval = retval;
-	/* wait_for_helper() will call umh_complete if UHM_WAIT_PROC. */
+	/*
+	 * call_usermodehelper_exec_sync() will call umh_complete
+	 * if UHM_WAIT_PROC.
+	 */
 	if (!(sub_info->wait & UMH_WAIT_PROC))
 		umh_complete(sub_info);
 	if (!retval)
@@ -267,14 +270,14 @@ out:
 }
 
 /* Keventd can't block, but this (a child) can. */
-static int wait_for_helper(void *data)
+static int call_usermodehelper_exec_sync(void *data)
 {
 	struct subprocess_info *sub_info = data;
 	pid_t pid;
 
 	/* If SIGCLD is ignored sys_wait4 won't populate the status. */
 	kernel_sigaction(SIGCHLD, SIG_DFL);
-	pid = kernel_thread(____call_usermodehelper, sub_info, SIGCHLD);
+	pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
 	if (pid < 0) {
 		sub_info->retval = pid;
 	} else {
@@ -282,17 +285,18 @@ static int wait_for_helper(void *data)
 		/*
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
-		 * But wait_for_helper() always runs as keventd, and put_user()
-		 * to a kernel address works OK for kernel threads, due to their
-		 * having an mm_segment_t which spans the entire address space.
+		 * But call_usermodehelper_exec_sync() always runs as keventd,
+		 * and put_user() to a kernel address works OK for kernel
+		 * threads, due to their having an mm_segment_t which spans the
+		 * entire address space.
 		 *
 		 * Thus the __user pointer cast is valid here.
 		 */
 		sys_wait4(pid, (int __user *)&ret, 0, NULL);
 
 		/*
-		 * If ret is 0, either ____call_usermodehelper failed and the
-		 * real error code is already in sub_info->retval or
+		 * If ret is 0, either call_usermodehelper_exec_async failed and
+		 * the real error code is already in sub_info->retval or
 		 * sub_info->retval is 0 anyway, so don't mess with it then.
 		 */
 		if (ret)
@@ -304,17 +308,17 @@ static int wait_for_helper(void *data)
 }
 
 /* This is run by khelper thread  */
-static void __call_usermodehelper(struct work_struct *work)
+static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
 		container_of(work, struct subprocess_info, work);
 	pid_t pid;
 
 	if (sub_info->wait & UMH_WAIT_PROC)
-		pid = kernel_thread(wait_for_helper, sub_info,
+		pid = kernel_thread(call_usermodehelper_exec_sync, sub_info,
 				    CLONE_FS | CLONE_FILES | SIGCHLD);
 	else
-		pid = kernel_thread(____call_usermodehelper, sub_info,
+		pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
 				    SIGCHLD);
 
 	if (pid < 0) {
@@ -509,7 +513,7 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
 	if (!sub_info)
 		goto out;
 
-	INIT_WORK(&sub_info->work, __call_usermodehelper);
+	INIT_WORK(&sub_info->work, call_usermodehelper_exec_work);
 	sub_info->path = path;
 	sub_info->argv = argv;
 	sub_info->envp = envp;
-- 
2.1.4



* [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
@ 2015-07-09 18:07 ` Frederic Weisbecker
  2015-07-09 22:44   ` Oleg Nesterov
  2015-07-09 18:07 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

We need to launch the usermodehelper kernel threads with the widest
affinity, and this is what khelper is for. This workqueue is unbound,
so all its children inherit a wide affinity.

However, khelper also has properties that we have no interest in: it is
ordered and single-threaded. Ordering isn't needed since all we do is
create kernel threads, which can be done concurrently. And being
single-threaded is a useless limitation as well.

The workqueue subsystem already provides generic unbound workqueues that
don't have these restrictions and that handle parallel jobs well.

Let's just use them.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/kmod.h |  2 --
 init/main.c          |  1 -
 kernel/kmod.c        | 12 ++----------
 3 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 0555cc6..fcfd2bf 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -85,8 +85,6 @@ enum umh_disable_depth {
 	UMH_DISABLED,
 };
 
-extern void usermodehelper_init(void);
-
 extern int __usermodehelper_disable(enum umh_disable_depth depth);
 extern void __usermodehelper_set_disable_depth(enum umh_disable_depth depth);
 
diff --git a/init/main.c b/init/main.c
index c5d5626..45551f8 100644
--- a/init/main.c
+++ b/init/main.c
@@ -877,7 +877,6 @@ static void __init do_initcalls(void)
 static void __init do_basic_setup(void)
 {
 	cpuset_init_smp();
-	usermodehelper_init();
 	shmem_init();
 	driver_init();
 	init_irq_proc();
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 4682e91..d8cc116ab 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -45,8 +45,6 @@
 
 extern int max_threads;
 
-static struct workqueue_struct *khelper_wq;
-
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
 
@@ -548,7 +546,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 		return -EINVAL;
 	}
 	helper_lock();
-	if (!khelper_wq || usermodehelper_disabled) {
+	if (usermodehelper_disabled) {
 		retval = -EBUSY;
 		goto out;
 	}
@@ -560,7 +558,7 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 	sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
 	sub_info->wait = wait;
 
-	queue_work(khelper_wq, &sub_info->work);
+	queue_work(system_unbound_wq, &sub_info->work);
 	if (wait == UMH_NO_WAIT)	/* task has freed sub_info */
 		goto unlock;
 
@@ -690,9 +688,3 @@ struct ctl_table usermodehelper_table[] = {
 	},
 	{ }
 };
-
-void __init usermodehelper_init(void)
-{
-	khelper_wq = create_singlethread_workqueue("khelper");
-	BUG_ON(!khelper_wq);
-}
-- 
2.1.4



* [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels
  2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
@ 2015-07-09 18:07 ` Frederic Weisbecker
  2015-07-09 18:07 ` [PATCH 4/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
  2015-07-09 18:07 ` [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
  4 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

The comments appear quite confused, likely because the code changed
after they were written.

Since it is far from obvious why we need 3 levels of asynchronous code
to implement usermodehelpers, it's important to document the reasons
for this layout in detail.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index d8cc116ab..9ffb24c 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -221,12 +221,12 @@ static int call_usermodehelper_exec_async(void *data)
 	flush_signal_handlers(current, 1);
 	spin_unlock_irq(&current->sighand->siglock);
 
-	/* We can run anywhere, unlike our parent keventd(). */
+	/* We can run anywhere, unlike our parent (unbound workqueue). */
 	set_cpus_allowed_ptr(current, cpu_all_mask);
 
 	/*
-	 * Our parent is keventd, which runs with elevated scheduling priority.
-	 * Avoid propagating that into the userspace child.
+	 * Our parent is the unbound workqueue, which runs with elevated
+	 * scheduling priority. Avoid propagating that into the userspace child.
 	 */
 	set_user_nice(current, 0);
 
@@ -267,7 +267,7 @@ out:
 	do_exit(0);
 }
 
-/* Keventd can't block, but this (a child) can. */
+/* Handles UMH_WAIT_PROC.  */
 static int call_usermodehelper_exec_sync(void *data)
 {
 	struct subprocess_info *sub_info = data;
@@ -283,8 +283,8 @@ static int call_usermodehelper_exec_sync(void *data)
 		/*
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
-		 * But call_usermodehelper_exec_sync() always runs as keventd,
-		 * and put_user() to a kernel address works OK for kernel
+		 * But call_usermodehelper_exec_sync() always runs as kernel
+		 * thread and put_user() to a kernel address works OK for kernel
 		 * threads, due to their having an mm_segment_t which spans the
 		 * entire address space.
 		 *
@@ -305,7 +305,16 @@ static int call_usermodehelper_exec_sync(void *data)
 	do_exit(0);
 }
 
-/* This is run by khelper thread  */
+/*
+ * This function doesn't strictly need to be called asynchronously. But we
+ * need to create the usermodehelper kernel threads from a task that is affine
+ * to all CPUs (or nohz housekeeping ones) such that they inherit a widest
+ * affinity irrespective of call_usermodehelper() callers with possibly reduced
+ * affinity (eg: per-cpu workqueues). We don't want usermodehelper targets to
+ * contend any busy CPU.
+ *
+ * Unbound workqueues provide such wide affinity.
+ */
 static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
@@ -533,7 +542,7 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
  *        from interrupt context.
  *
  * Runs a user-space application.  The application is started
- * asynchronously if wait is not set, and runs as a child of keventd.
+ * asynchronously if wait is not set, and runs as a child of unbound workqueues.
  * (ie. it runs with full root capabilities).
  */
 int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
-- 
2.1.4



* [PATCH 4/5] kmod: Remove unecessary explicit wide CPU affinity setting
  2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2015-07-09 18:07 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
@ 2015-07-09 18:07 ` Frederic Weisbecker
  2015-07-09 18:07 ` [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
  4 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

The call_usermodehelper_exec_[a]sync() kernel threads are created by
unbound workqueues precisely because we want them to be affine to all
CPUs, irrespective of any call_usermodehelper() caller with possibly
reduced CPU affinity. So explicitly forcing an all-CPUs affinity is
useless.

Not only is it useless, it also disturbs isolated CPUs in nohz full
configurations, where users restrict the unbound workqueues' low-level
cpumask to a reduced set in order to run non-critical work on dedicated
housekeeping CPUs. This reduced affinity is naturally inherited by the
usermodehelper kernel threads, but the explicit call to
set_cpus_allowed_ptr() breaks that.

So just remove it.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 9ffb24c..d190178 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -221,9 +221,6 @@ static int call_usermodehelper_exec_async(void *data)
 	flush_signal_handlers(current, 1);
 	spin_unlock_irq(&current->sighand->siglock);
 
-	/* We can run anywhere, unlike our parent (unbound workqueue). */
-	set_cpus_allowed_ptr(current, cpu_all_mask);
-
 	/*
 	 * Our parent is the unbound workqueue, which runs with elevated
 	 * scheduling priority. Avoid propagating that into the userspace child.
-- 
2.1.4
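
The inheritance this changelog relies on can be observed from user space.
What follows is only a sketch, not kernel code: child_inherits_affinity() is
a hypothetical helper that pins the calling process to one CPU with
sched_setaffinity(), forks, and checks that the child starts with the same
mask without ever setting it. It assumes Linux with glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_setaffinity() and the CPU_* macros */
#endif
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Pin the caller to a single CPU, fork, and let the child report whether
 * it started with that same one-CPU mask.
 *
 * Returns 1 if the mask was inherited, 0 if it was not, -1 on setup
 * failure. The caller's original mask is restored before returning.
 */
int child_inherits_affinity(void)
{
	cpu_set_t saved, pinned, seen;
	int cpu, status, inherited = -1;
	pid_t pid;

	if (sched_getaffinity(0, sizeof(saved), &saved) != 0)
		return -1;

	/* Pick the first CPU we are currently allowed to run on. */
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &saved))
			break;
	if (cpu == CPU_SETSIZE)
		return -1;

	CPU_ZERO(&pinned);
	CPU_SET(cpu, &pinned);
	if (sched_setaffinity(0, sizeof(pinned), &pinned) != 0)
		return -1;

	pid = fork();
	if (pid == 0) {
		/* Child: the mask is never set explicitly on this side. */
		if (sched_getaffinity(0, sizeof(seen), &seen) != 0)
			_exit(2);
		_exit(CPU_EQUAL(&seen, &pinned) ? 0 : 1);
	}
	if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
		if (WEXITSTATUS(status) == 0)
			inherited = 1;
		else if (WEXITSTATUS(status) == 1)
			inherited = 0;
	}

	/* Undo the pinning so the caller is left as we found it. */
	sched_setaffinity(0, sizeof(saved), &saved);
	return inherited;
}
```

The usermodehelper kernel threads would pick up the worker's cpumask in the
same way once nothing overrides it, which is why an explicit reset to
cpu_all_mask defeats a deliberately reduced workqueue cpumask.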



* [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue
  2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2015-07-09 18:07 ` [PATCH 4/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
@ 2015-07-09 18:07 ` Frederic Weisbecker
  2015-07-09 22:51   ` Oleg Nesterov
  4 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-09 18:07 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Oleg Nesterov, Christoph Lameter,
	Rik van Riel, Andrew Morton

The UMH_WAIT_PROC handler runs in its own thread for reasons that are now
obsolete. Because khelper was implemented as a single-threaded ordered
workqueue, we couldn't launch the exec kernel thread and then wait for
its completion without blocking the other queued usermodehelper jobs.

But we have now replaced khelper with generic system unbound workqueues,
which can handle concurrent blocking jobs.

So let's run it from the workqueue.

CHECK: I'm just worried about the signal handler that gets tweaked, and
about the call to sys_wait4(), which might fiddle with task internals.
The system workqueue must keep working without surprises for other work
items.

Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/kmod.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index d190178..a8bf872 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -265,9 +265,8 @@ out:
 }
 
 /* Handles UMH_WAIT_PROC.  */
-static int call_usermodehelper_exec_sync(void *data)
+static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
 {
-	struct subprocess_info *sub_info = data;
 	pid_t pid;
 
 	/* If SIGCLD is ignored sys_wait4 won't populate the status. */
@@ -281,9 +280,9 @@ static int call_usermodehelper_exec_sync(void *data)
 		 * Normally it is bogus to call wait4() from in-kernel because
 		 * wait4() wants to write the exit code to a userspace address.
 		 * But call_usermodehelper_exec_sync() always runs as kernel
-		 * thread and put_user() to a kernel address works OK for kernel
-		 * threads, due to their having an mm_segment_t which spans the
-		 * entire address space.
+		 * thread (workqueue) and put_user() to a kernel address works
+		 * OK for kernel threads, due to their having an mm_segment_t
+		 * which spans the entire address space.
 		 *
 		 * Thus the __user pointer cast is valid here.
 		 */
@@ -299,7 +298,6 @@ static int call_usermodehelper_exec_sync(void *data)
 	}
 
 	umh_complete(sub_info);
-	do_exit(0);
 }
 
 /*
@@ -316,18 +314,18 @@ static void call_usermodehelper_exec_work(struct work_struct *work)
 {
 	struct subprocess_info *sub_info =
 		container_of(work, struct subprocess_info, work);
-	pid_t pid;
 
-	if (sub_info->wait & UMH_WAIT_PROC)
-		pid = kernel_thread(call_usermodehelper_exec_sync, sub_info,
-				    CLONE_FS | CLONE_FILES | SIGCHLD);
-	else
+	if (sub_info->wait & UMH_WAIT_PROC) {
+		call_usermodehelper_exec_sync(sub_info);
+	} else {
+		pid_t pid;
+
 		pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
 				    SIGCHLD);
-
-	if (pid < 0) {
-		sub_info->retval = pid;
-		umh_complete(sub_info);
+		if (pid < 0) {
+			sub_info->retval = pid;
+			umh_complete(sub_info);
+		}
 	}
 }
 
-- 
2.1.4
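
The control flow this patch ends up with can be approximated in user space.
This is a rough sketch under stated assumptions, not the kernel
implementation: struct helper_job, spawn_helper(), exec_sync() and
exec_work() are hypothetical stand-ins for subprocess_info, kernel_thread(),
call_usermodehelper_exec_sync() and call_usermodehelper_exec_work();
fork()/waitpid() replace kernel_thread()/sys_wait4(); and the UMH_* values
are assumed to match include/linux/kmod.h of that era.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait modes; values assumed to match include/linux/kmod.h. */
#define UMH_NO_WAIT	0
#define UMH_WAIT_EXEC	1
#define UMH_WAIT_PROC	2

/* Hypothetical stand-in for struct subprocess_info. */
struct helper_job {
	int wait;	/* one of the UMH_* modes above */
	int exit_code;	/* what the fake helper exits with */
	int retval;	/* raw wait status, or -errno */
	int completed;	/* set once the job has been completed */
	pid_t child;	/* pid of the spawned helper, if any */
};

static void job_complete(struct helper_job *job)
{
	job->completed = 1;
}

/* Fork a child that plays the helper; returns its pid or -errno. */
static pid_t spawn_helper(struct helper_job *job)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(job->exit_code);
	if (pid < 0)
		return -errno;
	job->child = pid;
	return pid;
}

/* Synchronous path: the worker itself blocks until the helper exits. */
static void exec_sync(struct helper_job *job)
{
	pid_t pid = spawn_helper(job);

	if (pid < 0) {
		job->retval = pid;
	} else {
		int status = 0;
		pid_t got;

		do {
			got = waitpid(pid, &status, 0);
		} while (got < 0 && errno == EINTR);

		/* Only overwrite retval when the helper reported something. */
		if (got == pid && status)
			job->retval = status;
	}
	job_complete(job);
}

/* Work function: wait inline for UMH_WAIT_PROC, otherwise just spawn. */
void exec_work(struct helper_job *job)
{
	if (job->wait & UMH_WAIT_PROC) {
		exec_sync(job);
	} else {
		pid_t pid = spawn_helper(job);

		if (pid < 0) {
			job->retval = pid;
			job_complete(job);
		}
		/* On success the kernel's child completes the job itself. */
	}
}
```

The point of the sketch is that the synchronous path occupies the worker for
the helper's whole lifetime, which is only acceptable on a workqueue that
tolerates concurrent blocking work items.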



* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-09 18:07 ` [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
@ 2015-07-09 22:44   ` Oleg Nesterov
  2015-07-10 13:47     ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2015-07-09 22:44 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Christoph Lameter, Rik van Riel, Andrew Morton

On 07/09, Frederic Weisbecker wrote:
>
> We need to launch the usermodehelper kernel threads with the widest
> affinity and this is why we have khelper for. This workqueue has unbound
> properties and thus a wide affinity inherited by all its children.
>
> Now khelper also has special properties that we aren't much interested
> in: ordered and singlethread. There is really no need about ordering as
> all we do is creating kernel threads. This can be done concurrently.
> And singlethread is a useless limitation as well.
>
> The workqueue engine already proposes generic unbound workqueues that
> don't share these useless properties and handle well parallel jobs.
>
> Lets just use them.
>
> Suggested-by: Oleg Nesterov <oleg@redhat.com>

Well yes, but it seems that you missed another part of my email ;)

If we just change usermodehelper to use system_unbound_wq then we
probably should keep set_cpus_allowed_ptr() removed by 4/5.

Note that system_unbound_wq has ->no_numa == F, so its worker threads
are NUMA bound. Perhaps this is not that bad, I do not know. But at
least this means that 4/5 needs more documentation/justification.

But as for this particular patch I obviously like it, khelper_wq
must die imo ;)

Oleg.



* Re: [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue
  2015-07-09 18:07 ` [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
@ 2015-07-09 22:51   ` Oleg Nesterov
  2015-07-10 13:57     ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2015-07-09 22:51 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Christoph Lameter, Rik van Riel, Andrew Morton

On 07/09, Frederic Weisbecker wrote:
>
> The UMH_WAIT_PROC handler runs in its own thread for obsolete reasons.
> We couldn't launch and then wait for the exec kernel thread completion
> without blocking other usermodehelper queued jobs since khelper was
> implemented as a single-threaded ordered workqueue.
>
> But now we replaced khelper with generic system unbound workqueues which
> can handle concurrent blocking jobs.
>
> So lets run it from the workqueue.

Probably this is fine, but I am a bit worried...

WQ_MAX_ACTIVE == 512, this should be enough "in practice". But nothing
protects us from creative driver(s) which spawn 512 long-lived user
space tasks...

Note also that userspace can ptrace these tasks and "block" sys_wait4()
forever.

I am worried ;)

> CHECK: I'm just worried about the signal handler that gets tweaked
> and also the call to sys_wait() that might fiddle with internals. The
> system workqueue must continue to work without surprise for other
> works.

Yes. This means that this patch is wrong without disallow_signal()
at the end.

Oleg.



* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-09 22:44   ` Oleg Nesterov
@ 2015-07-10 13:47     ` Frederic Weisbecker
  2015-07-10 14:20       ` Christoph Lameter
  0 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-10 13:47 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: LKML, Christoph Lameter, Rik van Riel, Andrew Morton

On Fri, Jul 10, 2015 at 12:44:06AM +0200, Oleg Nesterov wrote:
> On 07/09, Frederic Weisbecker wrote:
> >
> > We need to launch the usermodehelper kernel threads with the widest
> > affinity and this is why we have khelper for. This workqueue has unbound
> > properties and thus a wide affinity inherited by all its children.
> >
> > Now khelper also has special properties that we aren't much interested
> > in: ordered and singlethread. There is really no need about ordering as
> > all we do is creating kernel threads. This can be done concurrently.
> > And singlethread is a useless limitation as well.
> >
> > The workqueue engine already proposes generic unbound workqueues that
> > don't share these useless properties and handle well parallel jobs.
> >
> > Lets just use them.
> >
> > Suggested-by: Oleg Nesterov <oleg@redhat.com>
> 
> Well yes, but it seems that you missed another part of my email ;)
> 
> If we just change usermodehelper to use system_unbound_wq then we
> probably should keep set_cpus_allowed_ptr() removed by 4/5.
> 
> Note that system_unbound_wq has ->no_numa == F, so its worker threads
> are NUMA bound. Perhaps this is not that bad, I do not know. But at
> least this means that 4/5 needs more documentation/justification.

Duh! I really thought its worker threads had a wide affinity.
I didn't notice while testing because my box is not NUMA, so I
saw a global affinity.

Now perhaps it is a good thing in the end. At least in nohz full it
doesn't change anything as we affine that workqueue too. But we must
be sure that a single NUMA node is enough to handle typical loads of
usermodehelper.

If nobody can tell, I suppose all we can do is stay conservative and
create a global no_numa version of system_unbound_wq...

> 
> But as for this particular patch I obviously like it, khelper_wq
> must die imo ;)

Sure :-)


* Re: [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue
  2015-07-09 22:51   ` Oleg Nesterov
@ 2015-07-10 13:57     ` Frederic Weisbecker
  0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-10 13:57 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: LKML, Christoph Lameter, Rik van Riel, Andrew Morton

On Fri, Jul 10, 2015 at 12:51:17AM +0200, Oleg Nesterov wrote:
> On 07/09, Frederic Weisbecker wrote:
> >
> > The UMH_WAIT_PROC handler runs in its own thread for obsolete reasons.
> > We couldn't launch and then wait for the exec kernel thread completion
> > without blocking other usermodehelper queued jobs since khelper was
> > implemented as a single-threaded ordered workqueue.
> >
> > But now we replaced khelper with generic system unbound workqueues which
> > can handle concurrent blocking jobs.
> >
> > So lets run it from the workqueue.
> 
> Probably this is fine, but I am a bit worried...
> 
> WQ_MAX_ACTIVE == 512, this should be enough "in practice". But nothing
> protects us from creative driver(s) which spawns 512 long-living user
> space tasks...
> 
> Note also that userspace can ptrace these tasks and "block" sys_wait4()
> forever.
> 
> I am worried ;)

I am too. And it depends a lot on which workqueue we rely on. If we can't
rely on the existing ones, we'll have to create a new one with a high value
for max_active, and thus potentially a lot of tasks created just for that
usermodehelper thing...

Then again if we can't know, all we can do is stay conservative.
At least we can update the comments to tell about those doubts.

> 
> > CHECK: I'm just worried about the signal handler that gets tweaked
> > and also the call to sys_wait() that might fiddle with internals. The
> > system workqueue must continue to work without surprise for other
> > works.
> 
> Yes. This means that this patch is wrong without disallow_signal()
> at the end.

Yeah. At least that's fixable :-)


* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 13:47     ` Frederic Weisbecker
@ 2015-07-10 14:20       ` Christoph Lameter
  2015-07-10 17:12         ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2015-07-10 14:20 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, 10 Jul 2015, Frederic Weisbecker wrote:

> Now perhaps it is a good thing in the end. At least in nohz full it
> doesn't change anything as we affine that workqueue too. But we must
> be sure that a single NUMA node is enough to handle typical loads of
> usermodehelper.

This is configurable right? So if you screw up you are responsible.


* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 14:20       ` Christoph Lameter
@ 2015-07-10 17:12         ` Frederic Weisbecker
  2015-07-10 17:52           ` Christoph Lameter
  0 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-10 17:12 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, Jul 10, 2015 at 09:20:46AM -0500, Christoph Lameter wrote:
> On Fri, 10 Jul 2015, Frederic Weisbecker wrote:
> 
> > Now perhaps it is a good thing in the end. At least in nohz full it
> > doesn't change anything as we affine that workqueue too. But we must
> > be sure that a single NUMA node is enough to handle typical loads of
> > usermodehelper.
> 
> This is configurable right? So if you screw up you are responsible.

No, it's not really configurable. The work items are scheduled on tasks that
are node-affine, and you can't change that for system_unbound_wq. Only WQ_SYSFS
workqueues can have their no_numa property overridden, and even then only
after boot, whereas most of the usermodehelper load happens at boot.


* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 17:12         ` Frederic Weisbecker
@ 2015-07-10 17:52           ` Christoph Lameter
  2015-07-10 18:10             ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2015-07-10 17:52 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, 10 Jul 2015, Frederic Weisbecker wrote:

> No it's not much configurable. The works are scheduled on tasks that are
> node affine and you can't change that for system_unbound_wq. Only WQ_SYSFS
> workqueues can be overridden on their no_numa property but even there that's
> after the boot and most of the usermodehelper load goes on boot.

Well then let's have at least one thread per NUMA node so that NUMA
affinity works?




* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 17:52           ` Christoph Lameter
@ 2015-07-10 18:10             ` Frederic Weisbecker
  2015-07-10 19:05               ` Christoph Lameter
  0 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-10 18:10 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, Jul 10, 2015 at 12:52:39PM -0500, Christoph Lameter wrote:
> On Fri, 10 Jul 2015, Frederic Weisbecker wrote:
> 
> > No it's not much configurable. The works are scheduled on tasks that are
> > node affine and you can't change that for system_unbound_wq. Only WQ_SYSFS
> > workqueues can be overridden on their no_numa property but even there that's
> > after the boot and most of the usermodehelper load goes on boot.
> 
> Well then lets have at least one thread per NUMA node so that NUMA
> affinity works?

That's already the case. There is at least one thread per node for the
workqueue cpumask.

Note that nohz full is perfectly fine with that. The issue I'm worried about
is the case where drivers spawn hundreds of jobs and they all run on the same
node because the kernel threads inherit the workqueue affinity, instead of
the global affinity that khelper had.



* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 18:10             ` Frederic Weisbecker
@ 2015-07-10 19:05               ` Christoph Lameter
  2015-07-14 14:04                 ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2015-07-10 19:05 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, 10 Jul 2015, Frederic Weisbecker wrote:

> Note that nohz full is perfectly fine with that. The issue I'm worried about
> is the case where drivers spawn hundreds of jobs and they all run on the same
> node because the kernel threads inherit the workqueue affinity, instead of
> the global affinity that khelper had.

Well if this is working as intended here then the kernel threads will only
run on a specific CPU. As far as we can tell, the number of kernel threads
spawned is rather low and also the performance requirements on those
threads are low.



* Re: [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper
  2015-07-10 19:05               ` Christoph Lameter
@ 2015-07-14 14:04                 ` Frederic Weisbecker
  0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2015-07-14 14:04 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Oleg Nesterov, LKML, Rik van Riel, Andrew Morton

On Fri, Jul 10, 2015 at 02:05:56PM -0500, Christoph Lameter wrote:
> On Fri, 10 Jul 2015, Frederic Weisbecker wrote:
> 
> > Note that nohz full is perfectly fine with that. The issue I'm worried about
> > is the case where drivers spawn hundreds of jobs and they all run on the same
> > node because the kernel threads inherit the workqueue affinity, instead of
> > the global affinity that khelper had.
> 
> Well if this is working as intended here then the kernel threads will only
> run on a specific CPU. As far as we can tell, the number of kernel threads
> spawned is rather low

Quite high actually. I count 578 calls on my machine. Most of them are launched
by the crypto subsystem trying to load modules. And it takes more than one
second to complete all of these requests...

> and also the performance requirements on those
> threads are low.

I think it is sensitive given the possibly high number of instances launched.
For now, at least, the crypto subsystem hasn't optimized that at all, because
all these instances are serialized. Basically on my machine, all of them run
on CPU 0.

Now I'm worried about other configs that may launch loads of parallel
usermodehelper threads. That said, I tend to think that if such a thing hasn't
been seen as a problem on small SMP systems, why would it be an issue if we
affine them to a NUMA node that is usually at least 4 CPUs wide? Or is it
possible to see lower numbers of CPUs in a NUMA node?


Thread overview: 16+ messages
2015-07-09 18:07 [PATCH 0/5] kmod: Simplifications and cleanups v2 Frederic Weisbecker
2015-07-09 18:07 ` [PATCH 1/5] kmod: Bunch of internal functions renames Frederic Weisbecker
2015-07-09 18:07 ` [PATCH 2/5] kmod: Use system_unbound_wq instead of khelper Frederic Weisbecker
2015-07-09 22:44   ` Oleg Nesterov
2015-07-10 13:47     ` Frederic Weisbecker
2015-07-10 14:20       ` Christoph Lameter
2015-07-10 17:12         ` Frederic Weisbecker
2015-07-10 17:52           ` Christoph Lameter
2015-07-10 18:10             ` Frederic Weisbecker
2015-07-10 19:05               ` Christoph Lameter
2015-07-14 14:04                 ` Frederic Weisbecker
2015-07-09 18:07 ` [PATCH 3/5] kmod: Add up-to-date explanations on the purpose of each asynchronous levels Frederic Weisbecker
2015-07-09 18:07 ` [PATCH 4/5] kmod: Remove unecessary explicit wide CPU affinity setting Frederic Weisbecker
2015-07-09 18:07 ` [RFC PATCH 5/5] kmod: Handle UMH_WAIT_PROC from system unbound workqueue Frederic Weisbecker
2015-07-09 22:51   ` Oleg Nesterov
2015-07-10 13:57     ` Frederic Weisbecker
