* 2.6.32.y stable kernel regression with taskset
@ 2010-09-15 18:47 Yinghai Lu
  2010-09-16  3:45 ` Mike Galbraith
  0 siblings, 1 reply; 5+ messages in thread
From: Yinghai Lu @ 2010-09-15 18:47 UTC
  To: Greg KH, Ingo Molnar, Peter Zijlstra; +Cc: linux-kernel, John Wright

I found a problem with the cpuscaling test.

Under 2.6.32.21 with the userspace governor, the load test takes nearly
the same time at min freq as at max freq, around 16 seconds.

Under 2.6.18-194:
min freq load test time is ~40 seconds
max freq load test time is ~17 seconds

The test is:
1. set the governor for one cpu to userspace
2. set the freq to min for that cpu
3. use taskset to put the load test only on that cpu, and measure the time the load test takes.

That means taskset did not put the load test on the cpu we wanted. The other cpus still have the ondemand governor, so the load test finishes much faster.
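
For anyone reproducing this, here is a rough userspace sketch of step 3. It is not the actual cpuscaling test, whose source is not shown in this thread. It pins the process to one cpu the way taskset does, runs a fixed load, and reports the elapsed time together with the cpu the process last ran on. The function names, the loop size and the choice of target cpu are assumptions made for illustration, and setting the governor and frequency (steps 1 and 2) is left out.

```python
import os
import time


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU, like `taskset -c <cpu>`."""
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


def last_run_cpu():
    """Return the CPU this process last ran on (field 39 of /proc/self/stat)."""
    with open("/proc/self/stat") as f:
        stat = f.read()
    # The comm field may contain spaces or parentheses, so split after the last ')'.
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[36])  # fields[0] is field 3 (state), so field 39 is index 36


def timed_load(iterations=2_000_000):
    """Run a fixed CPU-bound loop and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: use the lowest-numbered CPU we are allowed to run on.
    target = min(os.sched_getaffinity(0))
    pin_to_cpu(target)
    elapsed = timed_load()
    print(f"target cpu {target}, last ran on cpu {last_run_cpu()}, "
          f"load took {elapsed:.2f}s")
```

If the reported cpu differs from the target, or the time at min freq matches the time at max freq, the load is probably not running where it was pinned.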

git bisect report:

c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
Author: John Wright <john.wright@hp.com>
Date:   Tue Apr 13 16:55:37 2010 -0600

    sched: Fix a race between ttwu() and migrate_task()
    
    Based on commit e2912009fb7b715728311b0d8fe327a1432b3f79 upstream, but
    done differently as this issue is not present in .33 or .34 kernels due
    to rework in this area.
    
    If a task is in the TASK_WAKING state, then try_to_wake_up() is working
    on it, and it will place it on the correct cpu.
    
    This commit ensures that neither migrate_task() nor __migrate_task()
    calls set_task_cpu(p) while p is in the TASK_WAKING state.  Otherwise,
    there could be two concurrent calls to set_task_cpu(p), resulting in
    the task's cfs_rq being inconsistent with its cpu.
    
    Signed-off-by: John Wright <john.wright@hp.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

:040000 040000 a9d18950d8edddb761c0266706f671f0e9a006fe 2c3a7d7d5e616ecc276b4e93f4b6e5162a9382c8 M	kernel

diff --git a/kernel/sched.c b/kernel/sched.c
index 2591562..3261c19 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2116,12 +2116,10 @@ migrate_task(struct task_struct *p, int dest_cpu, struct
 
 	/*
 	 * If the task is not on a runqueue (and not running), then
-	 * it is sufficient to simply update the task's cpu field.
+	 * the next wake-up will properly place the task.
 	 */
-	if (!p->se.on_rq && !task_running(rq, p)) {
-		set_task_cpu(p, dest_cpu);
+	if (!p->se.on_rq && !task_running(rq, p))
 		return 0;
-	}
 
 	init_completion(&req->done);
 	req->task = p;
@@ -7167,6 +7165,9 @@ static int __migrate_task(struct task_struct *p, int src_c
 	/* Already moved. */
 	if (task_cpu(p) != src_cpu)
 		goto done;
+	/* Waking up, don't get in the way of try_to_wake_up(). */
+	if (p->state == TASK_WAKING)
+		goto fail;
 	/* Affinity changed (again). */
 	if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed))
 		goto fail;


After reverting it, the cpuscaling test passes.

BTW, current mainline is OK.

Thanks

Yinghai Lu


* Re: 2.6.32.y stable kernel regression with taskset
  2010-09-15 18:47 2.6.32.y stable kernel regression with taskset Yinghai Lu
@ 2010-09-16  3:45 ` Mike Galbraith
  2010-09-16 14:42   ` Greg KH
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Galbraith @ 2010-09-16  3:45 UTC
  To: Yinghai Lu
  Cc: Greg KH, Ingo Molnar, Peter Zijlstra, linux-kernel, John Wright

On Wed, 2010-09-15 at 11:47 -0700, Yinghai Lu wrote:
> git bisect report:
> 
> c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
> commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
> Author: John Wright <john.wright@hp.com>
> Date:   Tue Apr 13 16:55:37 2010 -0600
> 
>     sched: Fix a race between ttwu() and migrate_task()

Known issue.  There's a sched series for 32-stable in the pipeline.

	-Mike



* Re: 2.6.32.y stable kernel regression with taskset
  2010-09-16  3:45 ` Mike Galbraith
@ 2010-09-16 14:42   ` Greg KH
  2010-09-16 19:35     ` Yinghai Lu
  0 siblings, 1 reply; 5+ messages in thread
From: Greg KH @ 2010-09-16 14:42 UTC
  To: Mike Galbraith
  Cc: Yinghai Lu, Ingo Molnar, Peter Zijlstra, linux-kernel, John Wright

On Thu, Sep 16, 2010 at 05:45:12AM +0200, Mike Galbraith wrote:
> On Wed, 2010-09-15 at 11:47 -0700, Yinghai Lu wrote:
> > git bisect report:
> > 
> > c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
> > commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
> > Author: John Wright <john.wright@hp.com>
> > Date:   Tue Apr 13 16:55:37 2010 -0600
> > 
> >     sched: Fix a race between ttwu() and migrate_task()
> 
> Known issue.  There's a sched series for 32-stable in the pipeline.

Yes, sorry, I'm working my way through to them, hope to have them
finished and applied soon.

thanks,

greg k-h


* Re: 2.6.32.y stable kernel regression with taskset
  2010-09-16 14:42   ` Greg KH
@ 2010-09-16 19:35     ` Yinghai Lu
  2010-09-17  4:46       ` Mike Galbraith
  0 siblings, 1 reply; 5+ messages in thread
From: Yinghai Lu @ 2010-09-16 19:35 UTC
  To: Greg KH
  Cc: Mike Galbraith, Ingo Molnar, Peter Zijlstra, linux-kernel, John Wright

On 09/16/2010 07:42 AM, Greg KH wrote:
> On Thu, Sep 16, 2010 at 05:45:12AM +0200, Mike Galbraith wrote:
>> On Wed, 2010-09-15 at 11:47 -0700, Yinghai Lu wrote:
>>> git bisect report:
>>>
>>> c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
>>> commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
>>> Author: John Wright <john.wright@hp.com>
>>> Date:   Tue Apr 13 16:55:37 2010 -0600
>>>
>>>     sched: Fix a race between ttwu() and migrate_task()
>>
>> Known issue.  There's a sched series for 32-stable in the pipeline.
> 
> Yes, sorry, I'm working my way through to them, hope to have them
> finished and applied soon.
> 

why not just revert that patch?

I wonder whether SLES11SP1 or RHEL 6 have this patch.

Yinghai



* Re: 2.6.32.y stable kernel regression with taskset
  2010-09-16 19:35     ` Yinghai Lu
@ 2010-09-17  4:46       ` Mike Galbraith
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Galbraith @ 2010-09-17  4:46 UTC
  To: Yinghai Lu
  Cc: Greg KH, Ingo Molnar, Peter Zijlstra, linux-kernel, John Wright

On Thu, 2010-09-16 at 12:35 -0700, Yinghai Lu wrote:
> On 09/16/2010 07:42 AM, Greg KH wrote:
> > On Thu, Sep 16, 2010 at 05:45:12AM +0200, Mike Galbraith wrote:

> >>> git bisect report:
> >>>
> >>> c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
> >>> commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
> >>> Author: John Wright <john.wright@hp.com>
> >>> Date:   Tue Apr 13 16:55:37 2010 -0600
> >>>
> >>>     sched: Fix a race between ttwu() and migrate_task()
> >>
> >> Known issue.  There's a sched series for 32-stable in the pipeline.
> > 
> > Yes, sorry, I'm working my way through to them, hope to have them
> > finished and applied soon.
> > 
> 
> why not just revert that patch?

Because it exists.  If a revert would fix the problem, neither the commit
in question nor the upstream commit it mentions would exist.

	-Mike

