All of lore.kernel.org
 help / color / mirror / Atom feed
* cgroup attach task - slogging cpu
@ 2013-10-04  5:25 anjana vk
       [not found] ` <CALPf4Tz+Gf_Q7wKKBufCc1mtV1qVPVrOW0S1qhHxfOv6pJa2Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: anjana vk @ 2013-10-04  5:25 UTC (permalink / raw)
  To: oleg-H+wXaHxf7aLQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

Hi Oleg, All

I saw an issue of CPU slogging posted in
https://lkml.org/lkml/2013/8/28/283, and require your valuable
suggestion on a very similar issue I am facing.

We are facing the issue of cpu slogging in the function
    cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
bool threadgroup)
in the do-while loop which iterates over the threadgroup.

 rcu_read_lock();
 do {
 struct task_and_cgroup ent;

 /* @tsk either already exited or can't exit until the end */
 if (tsk->flags & PF_EXITING)
 continue;

 /* as per above, nr_threads may decrease, but not increase. */
 BUG_ON(i >= group_size);
 ent.task = tsk;
 ent.cgrp = task_cgroup_from_root(tsk, root);
 /* nothing to do if this task is already in the cgroup */
 if (ent.cgrp == cgrp)
 continue;
 /*
 * saying GFP_ATOMIC has no effect here because we did prealloc
 * earlier, but it's good form to communicate our expectations.
 */
 retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 BUG_ON(retval != 0);
 i++;

 if (!threadgroup)
 break;
 } while_each_thread(leader, tsk);

Problem Observed:
In this loop, in case of a single thread, and argument “bool
threadgroup” as “false” and
if(ent.cgrp == cgrp) is true
we will continue to the next thread instead of breaking out of the loop.

Possible Solution and Doubt:
When a break condition was added as shown in the patch attached, the
cpu sluggishness was not occurring.
Can you please provide your suggestions, if this is the right way to
fix the above mentioned issue.
Also, if a fix is already in for this, can you please guide me to that.

Thanks and Regards
Anjana

[-- Attachment #2: cgroup_singe_thread_attach.patch --]
[-- Type: application/octet-stream, Size: 737 bytes --]

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1f53387..cae8416 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 		ent.task = tsk;
 		ent.cgrp = task_cgroup_from_root(tsk, root);
 		/* nothing to do if this task is already in the cgroup */
-		if (ent.cgrp == cgrp)
+		if (ent.cgrp == cgrp && !threadgroup)
+			break;
+		else if(ent.cgrp == cgrp)
 			continue;
 		/*
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
@@ -2199,7 +2201,6 @@ retry_find_task:
 	ret = cgroup_attach_task(cgrp, tsk, threadgroup);
 
 	threadgroup_unlock(tsk);
-
 	put_task_struct(tsk);
 out_unlock_cgroup:
 	mutex_unlock(&cgroup_mutex);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found] ` <CALPf4Tz+Gf_Q7wKKBufCc1mtV1qVPVrOW0S1qhHxfOv6pJa2Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-10-04 13:02   ` Oleg Nesterov
       [not found]     ` <20131004130207.GA9338-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-04 13:02 UTC (permalink / raw)
  To: anjana vk; +Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA

Hello Anjana,

On 10/04, anjana vk wrote:
>
> I saw an issue of CPU slogging posted in
> https://lkml.org/lkml/2013/8/28/283, and require your valuable
> suggestion on a very similar issue I am facing.

Not sure I understand, but just in case: yes the lockless
while_each_thread() is buggy and should be fixed (it should actually
die eventually). And a lot of current users of while_each_thread()
are themselves buggy and need the fixes no matter what we do with
while_each_thread.

Oh. and just in case... I am (slowly) working on this, but didn't
finish it yet, sorry.

But I do not really understand cgroup_attach_task(), and I am not
sure you observed the same problem. Add Tejun.

> We are facing the issue of cpu slogging in the function
>     cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
> bool threadgroup)
> in the do-while loop which iterates over the threadgroup.
>
>  rcu_read_lock();
>  do {
>  struct task_and_cgroup ent;
>
>  /* @tsk either already exited or can't exit until the end */
>  if (tsk->flags & PF_EXITING)
>  continue;
>
>  /* as per above, nr_threads may decrease, but not increase. */
>  BUG_ON(i >= group_size);
>  ent.task = tsk;
>  ent.cgrp = task_cgroup_from_root(tsk, root);
>  /* nothing to do if this task is already in the cgroup */
>  if (ent.cgrp == cgrp)
>  continue;
>  /*
>  * saying GFP_ATOMIC has no effect here because we did prealloc
>  * earlier, but it's good form to communicate our expectations.
>  */
>  retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
>  BUG_ON(retval != 0);
>  i++;
>
>  if (!threadgroup)
>  break;
>  } while_each_thread(leader, tsk);
>
> Problem Observed:
> In this loop, in case of a single thread, and argument “bool
> threadgroup” as “false” and
> if(ent.cgrp == cgrp) is true
> we will continue to the next thread instead of breaking out of the loop.

But in this case the loop should be terminated by while_each_thread's
check?

Again, again, while_each_thread() is wrong, and we need

	--- x/kernel/cgroup.c
	+++ x/kernel/cgroup.c
	@@ -2034,7 +2034,7 @@ static int cgroup_attach_task(struct cgr
		 * take an rcu_read_lock.
		 */
		rcu_read_lock();
	-	do {
	+	for_each_thread(leader->signal, task) {
			struct task_and_cgroup ent;
	 
			/* @tsk either already exited or can't exit until the end */
	@@ -2058,7 +2058,7 @@ static int cgroup_attach_task(struct cgr
	 
			if (!threadgroup)
				break;
	-	} while_each_thread(leader, tsk);
	+	}
		rcu_read_unlock();
		/* remember the number of threads in the array for later. */
		group_size = i;

after we add for_each_thread/etc. But I am not sure we need another
"threadgroup" check.

> Possible Solution and Doubt:
> When a break condition was added as shown in the patch attached, the
> cpu sluggishness was not occurring.
> Can you please provide your suggestions, if this is the right way to
> fix the above mentioned issue.
> Also, if a fix is already in for this, can you please guide me to that.
>
> Thanks and Regards
> Anjana
>
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1f53387..cae8416 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  		ent.task = tsk;
>  		ent.cgrp = task_cgroup_from_root(tsk, root);
>  		/* nothing to do if this task is already in the cgroup */
> -		if (ent.cgrp == cgrp)
> +		if (ent.cgrp == cgrp && !threadgroup)
> +			break;
> +		else if(ent.cgrp == cgrp)
>  			continue;
>  		/*
>  		 * saying GFP_ATOMIC has no effect here because we did prealloc
> @@ -2199,7 +2201,6 @@ retry_find_task:
>  	ret = cgroup_attach_task(cgrp, tsk, threadgroup);
>  
>  	threadgroup_unlock(tsk);
> -
>  	put_task_struct(tsk);
>  out_unlock_cgroup:
>  	mutex_unlock(&cgroup_mutex);

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]     ` <20131004130207.GA9338-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-10-07 18:45       ` Tejun Heo
       [not found]         ` <20131007184507.GD27396-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Tejun Heo @ 2013-10-07 18:45 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA

Hey,

On Fri, Oct 04, 2013 at 03:02:07PM +0200, Oleg Nesterov wrote:
> Not sure I understand, but just in case: yes the lockless
> while_each_thread() is buggy and should be fixed (it should actually
> die eventually). And a lot of current users of while_each_thread()
> are themselves buggy and need the fixes no matter what we do with
> while_each_thread.
> 
> Oh. and just in case... I am (slowly) working on this, but didn't
> finish it yet, sorry.

Is there anything I can do to make the hang go away in the meantime?

> But I do not really understand cgroup_attach_task(), and I am not
> sure you observed the same problem. Add Tejun.

Sounds like the same problem to me.

> 	--- x/kernel/cgroup.c
> 	+++ x/kernel/cgroup.c
> 	@@ -2034,7 +2034,7 @@ static int cgroup_attach_task(struct cgr
> 		 * take an rcu_read_lock.
> 		 */
> 		rcu_read_lock();
> 	-	do {
> 	+	for_each_thread(leader->signal, task) {
> 			struct task_and_cgroup ent;
> 	 
> 			/* @tsk either already exited or can't exit until the end */
> 	@@ -2058,7 +2058,7 @@ static int cgroup_attach_task(struct cgr
> 	 
> 			if (!threadgroup)
> 				break;
> 	-	} while_each_thread(leader, tsk);
> 	+	}
> 		rcu_read_unlock();
> 		/* remember the number of threads in the array for later. */
> 		group_size = i;
> 
> after we add for_each_thread/etc. But I am not sure we need another
> "threadgroup" check.

...
> > diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > index 1f53387..cae8416 100644
> > --- a/kernel/cgroup.c
> > +++ b/kernel/cgroup.c
> > @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
> >  		ent.task = tsk;
> >  		ent.cgrp = task_cgroup_from_root(tsk, root);
> >  		/* nothing to do if this task is already in the cgroup */
> > -		if (ent.cgrp == cgrp)
> > +		if (ent.cgrp == cgrp && !threadgroup)
> > +			break;
> > +		else if(ent.cgrp == cgrp)
> >  			continue;

I have no idea know how this would make any difference.  Hmm...

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]         ` <20131007184507.GD27396-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2013-10-08 14:58           ` Oleg Nesterov
       [not found]             ` <20131008145833.GA15600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-08 14:58 UTC (permalink / raw)
  To: Tejun Heo; +Cc: anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA

On 10/07, Tejun Heo wrote:
>
> Hey,
>
> On Fri, Oct 04, 2013 at 03:02:07PM +0200, Oleg Nesterov wrote:
> > Not sure I understand, but just in case: yes the lockless
> > while_each_thread() is buggy and should be fixed (it should actually
> > die eventually). And a lot of current users of while_each_thread()
> > are themselves buggy and need the fixes no matter what we do with
> > while_each_thread.
> >
> > Oh. and just in case... I am (slowly) working on this, but didn't
> > finish it yet, sorry.
>
> Is there anything I can do to make the hang go away in the meantime?

well, probably tasklist, or pid_alive() check... note that this code
rcu/while_each_thread() logic looks wrong even if while_each_thread()
was fine. Will try to makes a patch today.

But! Anjana! I am so sorry!! I simply can't understand how I managed
to misunderstand (and confuse!) you.

Tejun, could you please look at the patch from Anjana again?


> > > --- a/kernel/cgroup.c
> > > +++ b/kernel/cgroup.c
> > > @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
> > >  		ent.task = tsk;
> > >  		ent.cgrp = task_cgroup_from_root(tsk, root);
> > >  		/* nothing to do if this task is already in the cgroup */
> > > -		if (ent.cgrp == cgrp)
> > > +		if (ent.cgrp == cgrp && !threadgroup)
> > > +			break;
> > > +		else if(ent.cgrp == cgrp)
> > >  			continue;
>
> I have no idea know how this would make any difference.  Hmm...

I guess the patch needs a cleanup, but this code looks indeed wrong
even if we forget about rcu/while_each_thread? Really, we should not
"continue" if threadgroup == T, in this case we miss the last

		if (!threadgroup)
			break;

check in the main loop. So Anjana was right (sorry again!), and we
should probably do

		ent.cgrp = task_cgroup_from_root(...);
		if (ent.cgrp != cgrp) {
			retval = flex_array_put(...);
			...
		}

		if (!threadgroup)
			break;

Or I am wrong again?

Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]             ` <20131008145833.GA15600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-10-09  5:35               ` Li Zefan
       [not found]                 ` <5254EB2A.7090803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Li Zefan @ 2013-10-09  5:35 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA

>>>> --- a/kernel/cgroup.c
>>>> +++ b/kernel/cgroup.c
>>>> @@ -2002,7 +2002,9 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>>>>  		ent.task = tsk;
>>>>  		ent.cgrp = task_cgroup_from_root(tsk, root);
>>>>  		/* nothing to do if this task is already in the cgroup */
>>>> -		if (ent.cgrp == cgrp)
>>>> +		if (ent.cgrp == cgrp && !threadgroup)
>>>> +			break;
>>>> +		else if(ent.cgrp == cgrp)
>>>>  			continue;
>>
>> I have no idea know how this would make any difference.  Hmm...
> 
> I guess the patch needs a cleanup, but this code looks indeed wrong
> even if we forget about rcu/while_each_thread? Really, we should not
> "continue" if threadgroup == T, in this case we miss the last
> 
> 		if (!threadgroup)
> 			break;
> 

Yes, this is wrong.

lxc34:/mnt/tmp # echo 5207 > tasks
lxc34:/mnt/tmp # cat tasks
5207
lxc34:/mnt/tmp # echo 5207 > tasks
lxc34:/mnt/tmp # cat tasks
5207
5215
lxc34:/mnt/tmp # echo 5207 > tasks
lxc34:/mnt/tmp # cat tasks
5207
5215
5216

I wanted to attach 5207 to the cgroup, and in this case it should be
a no-op because the thread was already in this cgroup, but the result
was another thread got attached to it.

Anjana, could you revise the patch and send it out with proper changelog
and Signed-off-by? And please add "Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+"

The cpu slogging issue is another bug that we're working on.

> check in the main loop. So Anjana was right (sorry again!), and we
> should probably do
> 
> 		ent.cgrp = task_cgroup_from_root(...);
> 		if (ent.cgrp != cgrp) {
> 			retval = flex_array_put(...);
> 			...
> 		}
> 
> 		if (!threadgroup)
> 			break;
> 

Or

do {
	...
	if (ent.cgrp == cgrp)
		goto next;
	...
next:
	if (!threadgroup)
		break;
} while (...);

> Or I am wrong again?
> 

No, you are not! :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]                 ` <5254EB2A.7090803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-10-09 13:30                   ` Oleg Nesterov
       [not found]                     ` <20131009133047.GA12414-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-09 13:30 UTC (permalink / raw)
  To: Li Zefan; +Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA

On 10/09, Li Zefan wrote:
>
> Anjana, could you revise the patch and send it out with proper changelog
> and Signed-off-by? And please add "Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+"

Yes, Anjana, please!

> > check in the main loop. So Anjana was right (sorry again!), and we
> > should probably do
> > 
> > 		ent.cgrp = task_cgroup_from_root(...);
> > 		if (ent.cgrp != cgrp) {
> > 			retval = flex_array_put(...);
> > 			...
> > 		}
> > 
> > 		if (!threadgroup)
> > 			break;
> > 
> 
> Or
> 
> do {
> 	...
> 	if (ent.cgrp == cgrp)
> 		goto next;

Or this, agreed.

> > Or I am wrong again?
>
> No, you are not! :)

Thanks ;)

Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]                     ` <20131009133047.GA12414-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-10-09 14:05                       ` Oleg Nesterov
  2013-10-09 16:54                           ` Oleg Nesterov
       [not found]                         ` <CAChhN7RerxpSadqyosUeSKFg+qcOpO4d-maEKBZ0rvOQGvN27g@mail.gmail.com>
  0 siblings, 2 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-09 14:05 UTC (permalink / raw)
  To: Li Zefan; +Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA

On 10/09, Oleg Nesterov wrote:
>
> On 10/09, Li Zefan wrote:
> >
> > Anjana, could you revise the patch and send it out with proper changelog
> > and Signed-off-by? And please add "Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+"
>
> Yes, Anjana, please!

Please note also that the PF_EXITING check has the same problem, it also
needs "goto next".

> > > check in the main loop. So Anjana was right (sorry again!), and we
> > > should probably do
> > > 
> > > 		ent.cgrp = task_cgroup_from_root(...);
> > > 		if (ent.cgrp != cgrp) {
> > > 			retval = flex_array_put(...);
> > > 			...
> > > 		}
> > > 
> > > 		if (!threadgroup)
> > > 			break;
> > > 
> > 
> > Or
> > 
> > do {
> > 	...
> > 	if (ent.cgrp == cgrp)
> > 		goto next;
> 
> Or this, agreed.
> 
> > > Or I am wrong again?
> >
> > No, you are not! :)
> 
> Thanks ;)
> 
> Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-09 16:54                           ` Oleg Nesterov
  0 siblings, 0 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-09 16:54 UTC (permalink / raw)
  To: Li Zefan; +Cc: Tejun Heo, anjana vk, cgroups, linux-kernel, eunki_kim

And I am starting to think that this change should also fix the
while_each_thread() problems in this particular case.

In generak the code like

	rcu_read_lock();
	task = find_get_task(...);
	rcu_read_unlock();

	rcu_read_lock();
	t = task;
	do {
		...
	} while_each_thread (task, t);
	rcu_read_unlock();

is wrong even if while_each_thread() was correct (and we have a lot
of examples of this pattern). A GP can pass before the 2nd rcu-lock,
and we simply can't trust ->thread_group.next.

But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
be called with threadgroup == T when a) tsk is ->group_leader and b)
we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
"tsk" can't be removed from ->thread_group list before other threads.

If next_thread() sees thread_group.next != leader, we know that the
that .next thread didn't do __unhash_process() yet, and since we
know that in this case "leader" didn't do this too we are safe.

In short: __unhash_process(leader) (in this) case can never change
->thread_group.next of another thread, because leader->thread_group
should be already list_empty().

On 10/09, Oleg Nesterov wrote:
>
> On 10/09, Oleg Nesterov wrote:
> >
> > On 10/09, Li Zefan wrote:
> > >
> > > Anjana, could you revise the patch and send it out with proper changelog
> > > and Signed-off-by? And please add "Cc: <stable@vger.kernel.org> # 3.9+"
> >
> > Yes, Anjana, please!
>
> Please note also that the PF_EXITING check has the same problem, it also
> needs "goto next".
>
> > > > check in the main loop. So Anjana was right (sorry again!), and we
> > > > should probably do
> > > >
> > > > 		ent.cgrp = task_cgroup_from_root(...);
> > > > 		if (ent.cgrp != cgrp) {
> > > > 			retval = flex_array_put(...);
> > > > 			...
> > > > 		}
> > > >
> > > > 		if (!threadgroup)
> > > > 			break;
> > > >
> > >
> > > Or
> > >
> > > do {
> > > 	...
> > > 	if (ent.cgrp == cgrp)
> > > 		goto next;
> >
> > Or this, agreed.
> >
> > > > Or I am wrong again?
> > >
> > > No, you are not! :)
> >
> > Thanks ;)
> >
> > Oleg.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-09 16:54                           ` Oleg Nesterov
  0 siblings, 0 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-09 16:54 UTC (permalink / raw)
  To: Li Zefan
  Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

And I am starting to think that this change should also fix the
while_each_thread() problems in this particular case.

In generak the code like

	rcu_read_lock();
	task = find_get_task(...);
	rcu_read_unlock();

	rcu_read_lock();
	t = task;
	do {
		...
	} while_each_thread (task, t);
	rcu_read_unlock();

is wrong even if while_each_thread() was correct (and we have a lot
of examples of this pattern). A GP can pass before the 2nd rcu-lock,
and we simply can't trust ->thread_group.next.

But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
be called with threadgroup == T when a) tsk is ->group_leader and b)
we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
"tsk" can't be removed from ->thread_group list before other threads.

If next_thread() sees thread_group.next != leader, we know that the
that .next thread didn't do __unhash_process() yet, and since we
know that in this case "leader" didn't do this too we are safe.

In short: __unhash_process(leader) (in this) case can never change
->thread_group.next of another thread, because leader->thread_group
should be already list_empty().

On 10/09, Oleg Nesterov wrote:
>
> On 10/09, Oleg Nesterov wrote:
> >
> > On 10/09, Li Zefan wrote:
> > >
> > > Anjana, could you revise the patch and send it out with proper changelog
> > > and Signed-off-by? And please add "Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+"
> >
> > Yes, Anjana, please!
>
> Please note also that the PF_EXITING check has the same problem, it also
> needs "goto next".
>
> > > > check in the main loop. So Anjana was right (sorry again!), and we
> > > > should probably do
> > > >
> > > > 		ent.cgrp = task_cgroup_from_root(...);
> > > > 		if (ent.cgrp != cgrp) {
> > > > 			retval = flex_array_put(...);
> > > > 			...
> > > > 		}
> > > >
> > > > 		if (!threadgroup)
> > > > 			break;
> > > >
> > >
> > > Or
> > >
> > > do {
> > > 	...
> > > 	if (ent.cgrp == cgrp)
> > > 		goto next;
> >
> > Or this, agreed.
> >
> > > > Or I am wrong again?
> > >
> > > No, you are not! :)
> >
> > Thanks ;)
> >
> > Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]                           ` <CAChhN7RerxpSadqyosUeSKFg+qcOpO4d-maEKBZ0rvOQGvN27g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-10-10  4:22                             ` anjana vk
  2013-10-10 11:11                             ` Oleg Nesterov
  1 sibling, 0 replies; 23+ messages in thread
From: anjana vk @ 2013-10-10  4:22 UTC (permalink / raw)
  To: Anjana V Kumar
  Cc: Oleg Nesterov, Li Zefan, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, stable-u79uwXL29TY76Z2rM5mHXA

+cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 10/10/13, Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi All,
>
> Please find the attached patch making the changes for the case where thread
> is already in the required cgroup.
>
> Can you please let me know if this is the right way to fix this issue, if
> any modifications are required.
>
> Thanks and Regards
> Anjana
>
>
> On Wed, Oct 9, 2013 at 7:35 PM, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>
>> On 10/09, Oleg Nesterov wrote:
>> >
>> > On 10/09, Li Zefan wrote:
>> > >
>> > > Anjana, could you revise the patch and send it out with proper
>> changelog
>> > > and Signed-off-by? And please add "Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> #
>> 3.9+"
>> >
>> > Yes, Anjana, please!
>>
>> Please note also that the PF_EXITING check has the same problem, it also
>> needs "goto next".
>>
>> > > > check in the main loop. So Anjana was right (sorry again!), and we
>> > > > should probably do
>> > > >
>> > > >           ent.cgrp = task_cgroup_from_root(...);
>> > > >           if (ent.cgrp != cgrp) {
>> > > >                   retval = flex_array_put(...);
>> > > >                   ...
>> > > >           }
>> > > >
>> > > >           if (!threadgroup)
>> > > >                   break;
>> > > >
>> > >
>> > > Or
>> > >
>> > > do {
>> > >     ...
>> > >     if (ent.cgrp == cgrp)
>> > >             goto next;
>> >
>> > Or this, agreed.
>> >
>> > > > Or I am wrong again?
>> > >
>> > > No, you are not! :)
>> >
>> > Thanks ;)
>> >
>> > Oleg.
>>
>>
>
>
> --
> Anjana
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup attach task - slogging cpu
       [not found]                           ` <CAChhN7RerxpSadqyosUeSKFg+qcOpO4d-maEKBZ0rvOQGvN27g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-10-10  4:22                             ` cgroup attach task - slogging cpu anjana vk
@ 2013-10-10 11:11                             ` Oleg Nesterov
  1 sibling, 0 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-10 11:11 UTC (permalink / raw)
  To: Anjana V Kumar
  Cc: Li Zefan, Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	stable-u79uwXL29TY76Z2rM5mHXA

Hi Anjana,

> On 10/10, Anjana V Kumar wrote:
> >
> >
> Problem:
> Issue when attaching a single thread to a cgroup if the thread was alredy in the
> cgroup. The check if the thread is already in cgroup in the above case,
> continues to the next thread instead of exciting.

Yes. Thanks.

> @@ -2047,7 +2047,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  		ent.cgrp = task_cgroup_from_root(tsk, root);
>  		/* nothing to do if this task is already in the cgroup */
>  		if (ent.cgrp == cgrp)
> -			continue;
> +			goto next_thread;
>  		/*
>  		 * saying GFP_ATOMIC has no effect here because we did prealloc
>  		 * earlier, but it's good form to communicate our expectations.
> @@ -2055,7 +2055,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
>  		BUG_ON(retval != 0);
>  		i++;
> -
> +next_thread:

Yes, but you forgot to fix another "continue" after "ent.cgrp == cgrp" ;)

Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-11 13:15                             ` Li Zefan
  0 siblings, 0 replies; 23+ messages in thread
From: Li Zefan @ 2013-10-11 13:15 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Tejun Heo, anjana vk, cgroups, linux-kernel, eunki_kim

On 2013/10/10 0:54, Oleg Nesterov wrote:
> And I am starting to think that this change should also fix the
> while_each_thread() problems in this particular case.
> 
> In generak the code like
> 
> 	rcu_read_lock();
> 	task = find_get_task(...);
> 	rcu_read_unlock();
> 
> 	rcu_read_lock();
> 	t = task;
> 	do {
> 		...
> 	} while_each_thread (task, t);
> 	rcu_read_unlock();
> 
> is wrong even if while_each_thread() was correct (and we have a lot
> of examples of this pattern). A GP can pass before the 2nd rcu-lock,
> and we simply can't trust ->thread_group.next.
> 
> But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
> be called with threadgroup == T when a) tsk is ->group_leader and b)
> we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
> "tsk" can't be removed from ->thread_group list before other threads.
> 
> If next_thread() sees thread_group.next != leader, we know that the
> that .next thread didn't do __unhash_process() yet, and since we
> know that in this case "leader" didn't do this too we are safe.
> 
> In short: __unhash_process(leader) (in this) case can never change
> ->thread_group.next of another thread, because leader->thread_group
> should be already list_empty().
> 

If threadgroup == false, and if the tsk is existing or is already in
the targeted cgroup, we won't break the loop due to the bug but do
this:

  while_each_thread(task, t)

If @task isn't the leader, we might got stuck in the loop?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-11 13:15                             ` Li Zefan
  0 siblings, 0 replies; 23+ messages in thread
From: Li Zefan @ 2013-10-11 13:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

On 2013/10/10 0:54, Oleg Nesterov wrote:
> And I am starting to think that this change should also fix the
> while_each_thread() problems in this particular case.
> 
> In generak the code like
> 
> 	rcu_read_lock();
> 	task = find_get_task(...);
> 	rcu_read_unlock();
> 
> 	rcu_read_lock();
> 	t = task;
> 	do {
> 		...
> 	} while_each_thread (task, t);
> 	rcu_read_unlock();
> 
> is wrong even if while_each_thread() was correct (and we have a lot
> of examples of this pattern). A GP can pass before the 2nd rcu-lock,
> and we simply can't trust ->thread_group.next.
> 
> But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
> be called with threadgroup == T when a) tsk is ->group_leader and b)
> we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
> "tsk" can't be removed from ->thread_group list before other threads.
> 
> If next_thread() sees thread_group.next != leader, we know that the
> that .next thread didn't do __unhash_process() yet, and since we
> know that in this case "leader" didn't do this too we are safe.
> 
> In short: __unhash_process(leader) (in this) case can never change
> ->thread_group.next of another thread, because leader->thread_group
> should be already list_empty().
> 

If threadgroup == false, and if the tsk is existing or is already in
the targeted cgroup, we won't break the loop due to the bug but do
this:

  while_each_thread(task, t)

If @task isn't the leader, we might got stuck in the loop?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-11 16:00                               ` Oleg Nesterov
  0 siblings, 0 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-11 16:00 UTC (permalink / raw)
  To: Li Zefan; +Cc: Tejun Heo, anjana vk, cgroups, linux-kernel, eunki_kim

On 10/11, Li Zefan wrote:
>
> On 2013/10/10 0:54, Oleg Nesterov wrote:
>
> > And I am starting to think that this change should also fix the
> > while_each_thread() problems in this particular case.

Please see below,

> > In generak the code like
> >
> > 	rcu_read_lock();
> > 	task = find_get_task(...);
> > 	rcu_read_unlock();
> >
> > 	rcu_read_lock();
> > 	t = task;
> > 	do {
> > 		...
> > 	} while_each_thread (task, t);
> > 	rcu_read_unlock();
> >
> > is wrong even if while_each_thread() was correct (and we have a lot
> > of examples of this pattern). A GP can pass before the 2nd rcu-lock,
> > and we simply can't trust ->thread_group.next.
> >
> > But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
> > be called with threadgroup == T when a) tsk is ->group_leader and b)
> > we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
> > "tsk" can't be removed from ->thread_group list before other threads.
> >
> > If next_thread() sees thread_group.next != leader, we know that the
> > that .next thread didn't do __unhash_process() yet, and since we
> > know that in this case "leader" didn't do this too we are safe.
> >
> > In short: __unhash_process(leader) (in this) case can never change
> > ->thread_group.next of another thread, because leader->thread_group
> > should be already list_empty().
> >
>
> If threadgroup == false, and if the tsk is existing or is already in
> the targeted cgroup, we won't break the loop due to the bug but do
> this:
>
>   while_each_thread(task, t)
>
> If @task isn't the leader, we might got stuck in the loop?

Yes, yes, sure. We need to fix the wrong "continue" logic, hopefully

I tried to say (see above) that after we do this while_each_thread()
should be fine in this particular case.

Oleg.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-11 16:00                               ` Oleg Nesterov
  0 siblings, 0 replies; 23+ messages in thread
From: Oleg Nesterov @ 2013-10-11 16:00 UTC (permalink / raw)
  To: Li Zefan
  Cc: Tejun Heo, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

On 10/11, Li Zefan wrote:
>
> On 2013/10/10 0:54, Oleg Nesterov wrote:
>
> > And I am starting to think that this change should also fix the
> > while_each_thread() problems in this particular case.

Please see below,

> > In generak the code like
> >
> > 	rcu_read_lock();
> > 	task = find_get_task(...);
> > 	rcu_read_unlock();
> >
> > 	rcu_read_lock();
> > 	t = task;
> > 	do {
> > 		...
> > 	} while_each_thread (task, t);
> > 	rcu_read_unlock();
> >
> > is wrong even if while_each_thread() was correct (and we have a lot
> > of examples of this pattern). A GP can pass before the 2nd rcu-lock,
> > and we simply can't trust ->thread_group.next.
> >
> > But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
> > be called with threadgroup == T when a) tsk is ->group_leader and b)
> > we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
> > "tsk" can't be removed from ->thread_group list before other threads.
> >
> > If next_thread() sees thread_group.next != leader, we know that the
> > that .next thread didn't do __unhash_process() yet, and since we
> > know that in this case "leader" didn't do this too we are safe.
> >
> > In short: __unhash_process(leader) (in this) case can never change
> > ->thread_group.next of another thread, because leader->thread_group
> > should be already list_empty().
> >
>
> If threadgroup == false, and if the tsk is existing or is already in
> the targeted cgroup, we won't break the loop due to the bug but do
> this:
>
>   while_each_thread(task, t)
>
> If @task isn't the leader, we might got stuck in the loop?

Yes, yes, sure. We need to fix the wrong "continue" logic, hopefully

I tried to say (see above) that after we do this while_each_thread()
should be fine in this particular case.

Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] cgroup: fix to break the while loop in cgroup_attach_task() correctly
@ 2013-10-12  2:59                                 ` Li Zefan
  0 siblings, 0 replies; 23+ messages in thread
From: Li Zefan @ 2013-10-12  2:59 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Oleg Nesterov, anjana vk, cgroups, linux-kernel, eunki_kim

From: Anjana V Kumar <anjanavk12@gmail.com>

Both Anjana and Eunki reported a stall in the while_each_thread loop
in cgroup_attach_task().

It's because, when we attach a single thread to a cgroup, if the cgroup
is exiting or is already in that cgroup, we won't break the loop.

If the task is already in the cgroup, the bug can lead to another thread
being attached to the cgroup unexpectedly:

  # echo 5207 > tasks
  # cat tasks
  5207
  # echo 5207 > tasks
  # cat tasks
  5207
  5215

What's worse, if the task to be attached isn't the leader of the thread
group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
analysis.

This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")

Cc: <stable@vger.kernel.org> # 3.9+
Reported-by: Eunki Kim <eunki_kim@samsung.com>
Reported-by: Anjana V Kumar <anjanavk12@gmail.com>
Signed-off-by: Anjana V Kumar <anjanavk12@gmail.com>
[ lizf: - fixed the first continue, pointed out by Oleg,
        - rewrote changelog. ]
Signed-off-by: Li Zefan <lizefan@huawei.com>
---
 kernel/cgroup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a5629f1..3db1d2e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2002,7 +2002,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 
 		/* @tsk either already exited or can't exit until the end */
 		if (tsk->flags & PF_EXITING)
-			continue;
+			goto next;
 
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
@@ -2010,7 +2010,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 		ent.cgrp = task_cgroup_from_root(tsk, root);
 		/* nothing to do if this task is already in the cgroup */
 		if (ent.cgrp == cgrp)
-			continue;
+			goto next;
 		/*
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
@@ -2018,7 +2018,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
-
+next:
 		if (!threadgroup)
 			break;
 	} while_each_thread(leader, tsk);
-- 
1.8.0.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH] cgroup: fix to break the while loop in cgroup_attach_task() correctly
@ 2013-10-12  2:59                                 ` Li Zefan
  0 siblings, 0 replies; 23+ messages in thread
From: Li Zefan @ 2013-10-12  2:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

From: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Both Anjana and Eunki reported a stall in the while_each_thread loop
in cgroup_attach_task().

It's because, when we attach a single thread to a cgroup, if the cgroup
is exiting or is already in that cgroup, we won't break the loop.

If the task is already in the cgroup, the bug can lead to another thread
being attached to the cgroup unexpectedly:

  # echo 5207 > tasks
  # cat tasks
  5207
  # echo 5207 > tasks
  # cat tasks
  5207
  5215

What's worse, if the task to be attached isn't the leader of the thread
group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
analysis.

This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")

Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+
Reported-by: Eunki Kim <eunki_kim-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
Reported-by: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
[ lizf: - fixed the first continue, pointed out by Oleg,
        - rewrote changelog. ]
Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 kernel/cgroup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a5629f1..3db1d2e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2002,7 +2002,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 
 		/* @tsk either already exited or can't exit until the end */
 		if (tsk->flags & PF_EXITING)
-			continue;
+			goto next;
 
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
@@ -2010,7 +2010,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 		ent.cgrp = task_cgroup_from_root(tsk, root);
 		/* nothing to do if this task is already in the cgroup */
 		if (ent.cgrp == cgrp)
-			continue;
+			goto next;
 		/*
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
@@ -2018,7 +2018,7 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
-
+next:
 		if (!threadgroup)
 			break;
 	} while_each_thread(leader, tsk);
-- 
1.8.0.2

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] cgroup: fix to break the while loop in cgroup_attach_task() correctly
@ 2013-10-13 20:08                                   ` Tejun Heo
  0 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2013-10-13 20:08 UTC (permalink / raw)
  To: Li Zefan; +Cc: Oleg Nesterov, anjana vk, cgroups, linux-kernel, eunki_kim

On Sat, Oct 12, 2013 at 10:59:17AM +0800, Li Zefan wrote:
> From: Anjana V Kumar <anjanavk12@gmail.com>
> 
> Both Anjana and Eunki reported a stall in the while_each_thread loop
> in cgroup_attach_task().
> 
> It's because, when we attach a single thread to a cgroup, if the cgroup
> is exiting or is already in that cgroup, we won't break the loop.
> 
> If the task is already in the cgroup, the bug can lead to another thread
> being attached to the cgroup unexpectedly:
> 
>   # echo 5207 > tasks
>   # cat tasks
>   5207
>   # echo 5207 > tasks
>   # cat tasks
>   5207
>   5215
> 
> What's worse, if the task to be attached isn't the leader of the thread
> group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
> analysis.
> 
> This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
> ("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")
> 
> Cc: <stable@vger.kernel.org> # 3.9+
> Reported-by: Eunki Kim <eunki_kim@samsung.com>
> Reported-by: Anjana V Kumar <anjanavk12@gmail.com>
> Signed-off-by: Anjana V Kumar <anjanavk12@gmail.com>
> [ lizf: - fixed the first continue, pointed out by Oleg,
>         - rewrote changelog. ]
> Signed-off-by: Li Zefan <lizefan@huawei.com>

Applied to cgroup/for-3.12-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cgroup: fix to break the while loop in cgroup_attach_task() correctly
@ 2013-10-13 20:08                                   ` Tejun Heo
  0 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2013-10-13 20:08 UTC (permalink / raw)
  To: Li Zefan
  Cc: Oleg Nesterov, anjana vk, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

On Sat, Oct 12, 2013 at 10:59:17AM +0800, Li Zefan wrote:
> From: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> 
> Both Anjana and Eunki reported a stall in the while_each_thread loop
> in cgroup_attach_task().
> 
> It's because, when we attach a single thread to a cgroup, if the cgroup
> is exiting or is already in that cgroup, we won't break the loop.
> 
> If the task is already in the cgroup, the bug can lead to another thread
> being attached to the cgroup unexpectedly:
> 
>   # echo 5207 > tasks
>   # cat tasks
>   5207
>   # echo 5207 > tasks
>   # cat tasks
>   5207
>   5215
> 
> What's worse, if the task to be attached isn't the leader of the thread
> group, we might never exit the loop, hence cpu stall. Thanks for Oleg's
> analysis.
> 
> This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
> ("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()")
> 
> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 3.9+
> Reported-by: Eunki Kim <eunki_kim-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
> Reported-by: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> [ lizf: - fixed the first continue, pointed out by Oleg,
>         - rewrote changelog. ]
> Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

Applied to cgroup/for-3.12-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-15  5:04                                 ` anjana vk
  0 siblings, 0 replies; 23+ messages in thread
From: anjana vk @ 2013-10-15  5:04 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Li Zefan, Tejun Heo, cgroups, linux-kernel, eunki_kim

Hi All,

Thankyou for your suggestions.
I have made the modifications as suggested.

Please find the patch below.

>From fd85f093f912a160c0896ea2784dfbdd64f142ca Mon Sep 17 00:00:00 2001
From: Anjana V Kumar <anjanavk12@gmail.com>
Date: Wed, 9 Oct 2013 16:49:22 +0530
Subject: [PATCH] Single thread break missing in cgroup_attach_task

Problem:
Issue when attaching a single thread to a cgroup if the thread was alredy in the
cgroup. The check if the thread is already in cgroup in the above case,
continues to the next thread instead of exciting.

Solution:
Add a check if it is a single thread or a threadgroup and
take decision accordingly. If it is a single thread break out of the do-while
loop, and if it is a threadgroup goto the next thread.

Change-Id: If6bbdef12ca5992823ac2a7bc15eaeb3dddb461a
Signed-off-by: Anjana V Kumar <anjanavk12@gmail.com>
---
 kernel/cgroup.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2418b6e..13eafdc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2039,7 +2039,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,

 		/* @tsk either already exited or can't exit until the end */
 		if (tsk->flags & PF_EXITING)
-			continue;
+			goto next_thread;

 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
@@ -2047,7 +2047,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,
 		ent.cgrp = task_cgroup_from_root(tsk, root);
 		/* nothing to do if this task is already in the cgroup */
 		if (ent.cgrp == cgrp)
-			continue;
+			goto next_thread;
 		/*
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
@@ -2055,7 +2055,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,
 		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
-
+next_thread:
 		if (!threadgroup)
 			break;
 	} while_each_thread(leader, tsk);
-- 
1.7.6



On 10/11/13, Oleg Nesterov <oleg@redhat.com> wrote:
> On 10/11, Li Zefan wrote:
>>
>> On 2013/10/10 0:54, Oleg Nesterov wrote:
>>
>> > And I am starting to think that this change should also fix the
>> > while_each_thread() problems in this particular case.
>
> Please see below,
>
>> > In generak the code like
>> >
>> > 	rcu_read_lock();
>> > 	task = find_get_task(...);
>> > 	rcu_read_unlock();
>> >
>> > 	rcu_read_lock();
>> > 	t = task;
>> > 	do {
>> > 		...
>> > 	} while_each_thread (task, t);
>> > 	rcu_read_unlock();
>> >
>> > is wrong even if while_each_thread() was correct (and we have a lot
>> > of examples of this pattern). A GP can pass before the 2nd rcu-lock,
>> > and we simply can't trust ->thread_group.next.
>> >
>> > But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
>> > be called with threadgroup == T when a) tsk is ->group_leader and b)
>> > we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
>> > "tsk" can't be removed from ->thread_group list before other threads.
>> >
>> > If next_thread() sees thread_group.next != leader, we know that the
>> > that .next thread didn't do __unhash_process() yet, and since we
>> > know that in this case "leader" didn't do this too we are safe.
>> >
>> > In short: __unhash_process(leader) (in this) case can never change
>> > ->thread_group.next of another thread, because leader->thread_group
>> > should be already list_empty().
>> >
>>
>> If threadgroup == false, and if the tsk is existing or is already in
>> the targeted cgroup, we won't break the loop due to the bug but do
>> this:
>>
>>   while_each_thread(task, t)
>>
>> If @task isn't the leader, we might got stuck in the loop?
>
> Yes, yes, sure. We need to fix the wrong "continue" logic, hopefully
>
> I tried to say (see above) that after we do this while_each_thread()
> should be fine in this particular case.
>
> Oleg.
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-15  5:04                                 ` anjana vk
  0 siblings, 0 replies; 23+ messages in thread
From: anjana vk @ 2013-10-15  5:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Li Zefan, Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

Hi All,

Thankyou for your suggestions.
I have made the modifications as suggested.

Please find the patch below.

From fd85f093f912a160c0896ea2784dfbdd64f142ca Mon Sep 17 00:00:00 2001
From: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, 9 Oct 2013 16:49:22 +0530
Subject: [PATCH] Single thread break missing in cgroup_attach_task

Problem:
Issue when attaching a single thread to a cgroup if the thread was alredy in the
cgroup. The check if the thread is already in cgroup in the above case,
continues to the next thread instead of exciting.

Solution:
Add a check if it is a single thread or a threadgroup and
take decision accordingly. If it is a single thread break out of the do-while
loop, and if it is a threadgroup goto the next thread.

Change-Id: If6bbdef12ca5992823ac2a7bc15eaeb3dddb461a
Signed-off-by: Anjana V Kumar <anjanavk12-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 kernel/cgroup.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2418b6e..13eafdc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2039,7 +2039,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,

 		/* @tsk either already exited or can't exit until the end */
 		if (tsk->flags & PF_EXITING)
-			continue;
+			goto next_thread;

 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
@@ -2047,7 +2047,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,
 		ent.cgrp = task_cgroup_from_root(tsk, root);
 		/* nothing to do if this task is already in the cgroup */
 		if (ent.cgrp == cgrp)
-			continue;
+			goto next_thread;
 		/*
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
@@ -2055,7 +2055,7 @@ static int cgroup_attach_task(struct cgroup
*cgrp, struct task_struct *tsk,
 		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
-
+next_thread:
 		if (!threadgroup)
 			break;
 	} while_each_thread(leader, tsk);
-- 
1.7.6



On 10/11/13, Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 10/11, Li Zefan wrote:
>>
>> On 2013/10/10 0:54, Oleg Nesterov wrote:
>>
>> > And I am starting to think that this change should also fix the
>> > while_each_thread() problems in this particular case.
>
> Please see below,
>
>> > In generak the code like
>> >
>> > 	rcu_read_lock();
>> > 	task = find_get_task(...);
>> > 	rcu_read_unlock();
>> >
>> > 	rcu_read_lock();
>> > 	t = task;
>> > 	do {
>> > 		...
>> > 	} while_each_thread (task, t);
>> > 	rcu_read_unlock();
>> >
>> > is wrong even if while_each_thread() was correct (and we have a lot
>> > of examples of this pattern). A GP can pass before the 2nd rcu-lock,
>> > and we simply can't trust ->thread_group.next.
>> >
>> > But I didn't notice that cgroup_attach_task(tsk, threadgroup) can only
>> > be called with threadgroup == T when a) tsk is ->group_leader and b)
>> > we hold threadgroup_lock() which blocks de_thread(). IOW, in this case
>> > "tsk" can't be removed from ->thread_group list before other threads.
>> >
>> > If next_thread() sees thread_group.next != leader, we know that the
>> > that .next thread didn't do __unhash_process() yet, and since we
>> > know that in this case "leader" didn't do this too we are safe.
>> >
>> > In short: __unhash_process(leader) (in this) case can never change
>> > ->thread_group.next of another thread, because leader->thread_group
>> > should be already list_empty().
>> >
>>
>> If threadgroup == false, and if the tsk is existing or is already in
>> the targeted cgroup, we won't break the loop due to the bug but do
>> this:
>>
>>   while_each_thread(task, t)
>>
>> If @task isn't the leader, we might got stuck in the loop?
>
> Yes, yes, sure. We need to fix the wrong "continue" logic, hopefully
>
> I tried to say (see above) that after we do this while_each_thread()
> should be fine in this particular case.
>
> Oleg.
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-15 13:34                                   ` Tejun Heo
  0 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2013-10-15 13:34 UTC (permalink / raw)
  To: anjana vk; +Cc: Oleg Nesterov, Li Zefan, cgroups, linux-kernel, eunki_kim

On Tue, Oct 15, 2013 at 10:34:04AM +0530, anjana vk wrote:
> Hi All,
> 
> Thankyou for your suggestions.
> I have made the modifications as suggested.

I already applied the version Li posted (as you the author, of course).

Thanks a lot!

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu)
@ 2013-10-15 13:34                                   ` Tejun Heo
  0 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2013-10-15 13:34 UTC (permalink / raw)
  To: anjana vk
  Cc: Oleg Nesterov, Li Zefan, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	eunki_kim-Sze3O3UU22JBDgjK7y7TUQ

On Tue, Oct 15, 2013 at 10:34:04AM +0530, anjana vk wrote:
> Hi All,
> 
> Thankyou for your suggestions.
> I have made the modifications as suggested.

I already applied the version Li posted (as you the author, of course).

Thanks a lot!

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2013-10-15 13:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-04  5:25 cgroup attach task - slogging cpu anjana vk
     [not found] ` <CALPf4Tz+Gf_Q7wKKBufCc1mtV1qVPVrOW0S1qhHxfOv6pJa2Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-04 13:02   ` Oleg Nesterov
     [not found]     ` <20131004130207.GA9338-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-10-07 18:45       ` Tejun Heo
     [not found]         ` <20131007184507.GD27396-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-10-08 14:58           ` Oleg Nesterov
     [not found]             ` <20131008145833.GA15600-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-10-09  5:35               ` Li Zefan
     [not found]                 ` <5254EB2A.7090803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-10-09 13:30                   ` Oleg Nesterov
     [not found]                     ` <20131009133047.GA12414-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-10-09 14:05                       ` Oleg Nesterov
2013-10-09 16:54                         ` cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu) Oleg Nesterov
2013-10-09 16:54                           ` Oleg Nesterov
2013-10-11 13:15                           ` Li Zefan
2013-10-11 13:15                             ` Li Zefan
2013-10-11 16:00                             ` Oleg Nesterov
2013-10-11 16:00                               ` Oleg Nesterov
2013-10-12  2:59                               ` [PATCH] cgroup: fix to break the while loop in cgroup_attach_task() correctly Li Zefan
2013-10-12  2:59                                 ` Li Zefan
2013-10-13 20:08                                 ` Tejun Heo
2013-10-13 20:08                                   ` Tejun Heo
2013-10-15  5:04                               ` cgroup_attach_task && while_each_thread (Was: cgroup attach task - slogging cpu) anjana vk
2013-10-15  5:04                                 ` anjana vk
2013-10-15 13:34                                 ` Tejun Heo
2013-10-15 13:34                                   ` Tejun Heo
     [not found]                         ` <CAChhN7RerxpSadqyosUeSKFg+qcOpO4d-maEKBZ0rvOQGvN27g@mail.gmail.com>
     [not found]                           ` <CAChhN7RerxpSadqyosUeSKFg+qcOpO4d-maEKBZ0rvOQGvN27g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-10  4:22                             ` cgroup attach task - slogging cpu anjana vk
2013-10-10 11:11                             ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.