All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <efault@gmx.de>
To: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <htejun@gmail.com>, Li Zefan <lizf@cn.fujitsu.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Paul Menage <paul@paulmenage.org>
Subject: Re: [patch] cpusets: allow PF_THREAD_BOUND kworkers to escape from a cpuset
Date: Fri, 23 Sep 2011 16:33:12 +0200	[thread overview]
Message-ID: <1316788392.6544.33.camel@marge.simson.net> (raw)
In-Reply-To: <alpine.DEB.2.00.1109230619410.666@chino.kir.corp.google.com>

On Fri, 2011-09-23 at 06:27 -0700, David Rientjes wrote:
> On Fri, 23 Sep 2011, Mike Galbraith wrote:
> 
> > cpusets: allow PF_THREAD_BOUND kworkers to escape from a cpuset
> > 
> > kworkers can be born in a cpuset, leaving them adrift on an unsinkable ship.
> > Allow them to be moved to the root cpuset so the cpuset can be destroyed
> > 
> > Signed-off-by: Mike Galbraith <efault@gmx.de>
> > Acked-by: Li Zefan <lizf@cn.fujitsu.com>
> 
> Did Li ack this version?

Oops, no.
 
> > diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> > index 10131fd..3769f9e 100644
> > --- a/kernel/cpuset.c
> > +++ b/kernel/cpuset.c
> > @@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> >  	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
> >  	 * be changed.
> >  	 */
> > -	if (tsk->flags & PF_THREAD_BOUND)
> > +	if (tsk->flags & PF_THREAD_BOUND && cont != cont->top_cgroup)
> >  		return -EINVAL;
> >  
> >  	return 0;
> > @@ -1426,9 +1426,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
> >  	/*
> >  	 * can_attach beforehand should guarantee that this doesn't fail.
> >  	 * TODO: have a better way to handle failure here
> > +	 * 
> > +	 * Special case: bound kthreads born in a cpuset may be moved to
> > +	 * the top level cpuset without attempting to diddle their mask.
> >  	 */
> > -	err = set_cpus_allowed_ptr(tsk, cpus_attach);
> > -	WARN_ON_ONCE(err);
> > +	if (!(tsk->flags & PF_THREAD_BOUND && cont == cont->top_cgroup)) {
> > +		err = set_cpus_allowed_ptr(tsk, cpus_attach);
> > +		WARN_ON_ONCE(err);
> > +	}
> >  
> >  	cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
> >  	cpuset_update_task_spread_flag(cs, tsk);
> 
> This doesn't make any sense, the user can now change the cpumask of the 
> kworker with cpusets but not with sched_setaffinity().

marge:~/tmp # ps l `cat /cpusets/system/tasks`|grep kworker
1     0  6466     2  20   0      0     0 schedu S    ?          0:00 [kworker/2:4]

Ok, there's a kworker which was born in 'system' cpuset, cpus 0-2.

marge:~/tmp # taskset -p 6466
pid 6466's current affinity mask: 4
marge:~/tmp # cset shield --reset
cset: --> deactivating/reseting shielding
cset: moving 7 tasks from "/user" user set to root set...
[==================================================]%
cset: moving 243 tasks from "/system" system set to root set...
[==================================================]%
cset: deleting "/user" and "/system" sets
cset: done
marge:~/tmp # taskset -p 6466
pid 6466's current affinity mask: 4

Worker still alive, affinity intact, 'system' cpuset destroyed, no
gripage in dmesg.  All seems fine.

> PF_THREAD_BOUND is set specifically so threads cannot move from the cpu 
> that they are bound to, that's why the cpuset code and sched_setaffinity() 
> reject such a configuration.

Yeah.

>   So the problem isn't in the cpuset code or 
> scheduler at all, you would need to deal with this in the kworker code by 
> either not setting PF_THREAD_BOUND (which, according to the comment, Tejun 
> thinks is pretty important) or manage the worker threads in a way such 
> that the new cpumask (all cpus, since it's in the root cpuset) actually 
> make sense for that kworker.  The cpuset code won't care if the kworker 
> has a cpumask of all online cpus.

I don't see why it's a kworker code problem.  The kworker code couldn't
care less about cpusets.  But I don't care who fixes it or how.

If cpusets doesn't want to let PF_THREAD_BOUND threads out, how about
cpusets not letting userland scripts attach kthreadd instead?

cpusets: disallow moving kthreadd into a cpuset.

If kthreadd is moved into a cpuset, it's child may then acquire
PF_THREAD_BOUND and become trapped, making it's cpuset immortal.

Signed-off-by: Mike Galbraith <efault@gmx.de>

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 10131fd..0d9cd04 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -59,6 +59,7 @@
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
 #include <linux/cgroup.h>
+#include <linux/kthread.h>
 
 /*
  * Workqueue for cpuset related tasks.
@@ -1382,9 +1383,10 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
 	 * set of allowed nodes is unnecessary.  Thus, cpusets are not
 	 * applicable for such threads.  This prevents checking for success of
 	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
-	 * be changed.
+	 * be changed.  We also disallow attaching kthreadd, to prevent it's
+	 * child from becoming trapped should it then acquire PF_THREAD_BOUND.
 	 */
-	if (tsk->flags & PF_THREAD_BOUND)
+	if (tsk->flags & PF_THREAD_BOUND || tsk == kthreadd_task)
 		return -EINVAL;
 
 	return 0;



  reply	other threads:[~2011-09-23 14:33 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-23  6:21 [patch] cpusets: allow PF_THREAD_BOUND kworkers to escape from a cpuset Mike Galbraith
2011-09-23  7:00 ` Li Zefan
2011-09-23  7:19   ` Mike Galbraith
2011-09-23  9:12     ` David Rientjes
2011-09-23  9:42       ` Mike Galbraith
2011-09-23 10:53         ` Mike Galbraith
2011-09-23 13:27           ` David Rientjes
2011-09-23 14:33             ` Mike Galbraith [this message]
2011-09-23 18:16               ` Mike Galbraith
2011-09-23 20:20               ` David Rientjes
2011-09-24  3:21                 ` Tejun Heo
2011-10-10  5:34               ` Mike Galbraith
2011-10-10  8:03                 ` [patch] cpusets, cgroups: disallow attaching kthreadd Mike Galbraith
2011-10-10 16:43                   ` Tejun Heo
2011-10-11  2:31                     ` Mike Galbraith
2011-10-11 14:08                     ` Peter Zijlstra
2011-10-11 16:57                       ` Mike Galbraith
2011-10-12  1:22                         ` David Rientjes
2011-10-12  1:45                           ` Mike Galbraith
2011-10-12  1:20                       ` David Rientjes
2011-10-18  8:10                   ` patch] " Mike Galbraith
2011-10-18  8:16                     ` David Rientjes
2011-10-18  8:42                     ` Peter Zijlstra
2011-10-18  8:47                       ` Mike Galbraith
2011-10-18  9:06                         ` Peter Zijlstra
2011-10-18  9:23                           ` Mike Galbraith
2011-10-18 10:11                             ` Mike Galbraith
2011-10-18 20:38                               ` David Rientjes
2011-10-19  4:00                                 ` Mike Galbraith
2011-10-19  4:53                                   ` Paul Menage
2011-10-19  4:56                                     ` Paul Menage
2011-10-19  5:28                                       ` Mike Galbraith
2011-10-19  7:50                                 ` Peter Zijlstra
2011-10-19 19:47                                   ` David Rientjes
2011-10-20 10:32                                     ` Peter Zijlstra
2011-10-20 21:24                                       ` David Rientjes
2011-10-19  4:57                     ` Paul Menage
2011-10-19  5:24                       ` [patch-final] " Mike Galbraith
2011-10-19  7:54                         ` Peter Zijlstra
2011-10-19  8:00                           ` Paul Menage
2011-10-19  8:21                           ` Mike Galbraith
2011-10-26 20:27                         ` David Rientjes
2011-11-10 20:51                           ` David Rientjes
2011-12-06 20:13                             ` David Rientjes
2011-12-06 22:47                               ` Andrew Morton
2011-12-06 23:53                                 ` David Rientjes
2011-12-07  0:05                                   ` Andrew Morton
2011-12-07  3:18                                   ` [resubmit] " Mike Galbraith
2011-12-07  4:36                                     ` Mike Galbraith
2011-12-07 22:03                                       ` David Rientjes
2011-12-14 20:16                                         ` Andrew Morton
2011-12-15  2:50                                           ` David Rientjes
2012-01-06 22:06                                       ` Andrew Morton
2012-01-07  6:34                                         ` Mike Galbraith
2012-01-07  7:59                                           ` Andrew Morton
2011-12-07 22:01                                     ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1316788392.6544.33.camel@marge.simson.net \
    --to=efault@gmx.de \
    --cc=htejun@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=paul@paulmenage.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.