linux-kernel.vger.kernel.org archive mirror
* Regression with sched yield - 2.6.25-rc2-mm1
@ 2008-02-18 12:17 Balbir Singh
From: Balbir Singh @ 2008-02-18 12:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Dhaval Giani, Srivatsa Vaddagiri, Andrew Morton,
	Zhang, Yanmin

Hi,

I was looking at the 45% regression reported by Yanmin when, while running the
test, I ran into

1:mon> t
[c0000000e7677da0] c000000000067de0 .sys_sched_yield+0x6c/0xbc
[c0000000e7677e30] c000000000008748 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 00000400001d09e4
SP (4000664cb10) is in userspace
1:mon> r
R00 = 0000000000000001   R16 = 0000000000000000
R01 = c0000000e7677d20   R17 = 0000000000000000
R02 = c000000000949490   R18 = 00000000100c5290
R03 = 0000000000000008   R19 = 0000000000000000
R04 = 0000000010036370   R20 = 0000000010040bc0
R05 = 0000000000000000   R21 = 0000000000000000
R06 = 0000000000000001   R22 = 0000000000000002
R07 = 000000000000002d   R23 = 0000000000000008
R08 = 0000000000000000   R24 = 00000000100387c0
R09 = 0000000000000000   R25 = 0000000010038c20
R10 = c0000000e5c12520   R26 = 00000000100363e8
R11 = c0000000e5c12558   R27 = 0000000010038c50
R12 = 800000000000f032   R28 = 0000000000360000
R13 = c00000000083c300   R29 = c000000000b43b80
R14 = 0000040000ee6e0d   R30 = c0000000008ba608
R15 = 0000000000000000   R31 = c000000000b42680
pc  = c000000000068e50 .yield_task_fair+0x94/0xc4
lr  = c000000000067de0 .sys_sched_yield+0x6c/0xbc
msr = 8000000000009032   cr  = 24204482
ctr = c000000000068dbc   xer = 0000000020000010   trap =  300
dar = 0000000000000050   dsisr = 40000000
1:mon> e
cpu 0x1: Vector: 300 (Data Access) at [c0000000e7677aa0]
    pc: c000000000068e50: .yield_task_fair+0x94/0xc4
    lr: c000000000067de0: .sys_sched_yield+0x6c/0xbc
    sp: c0000000e7677d20
   msr: 8000000000009032
   dar: 50
 dsisr: 40000000
  current = 0xc0000000e5c12520
  paca    = 0xc00000000083c300
    pid   = 569, comm = java
1:mon> di %pc
c000000000068e50  e9280050      ld      r9,80(r8)
c000000000068e54  e80b0050      ld      r0,80(r11)
c000000000068e58  7fa90040      cmpld   cr7,r9,r0
c000000000068e5c  419c000c      blt     cr7,c000000000068e68    # .yield_task_fair+0xac/0xc4
c000000000068e60  38090001      addi    r0,r9,1
c000000000068e64  f80b0050      std     r0,80(r11)
c000000000068e68  38210080      addi    r1,r1,128
c000000000068e6c  e8010010      ld      r0,16(r1)
c000000000068e70  ebc1fff0      ld      r30,-16(r1)
c000000000068e74  ebe1fff8      ld      r31,-8(r1)
c000000000068e78  7c0803a6      mtlr    r0
c000000000068e7c  4e800020      blr
c000000000068e80  7c0802a6      mflr    r0
c000000000068e84  fba1ffe8      std     r29,-24(r1)
c000000000068e88  fbe1fff8      std     r31,-8(r1)
c000000000068e8c  f8010010      std     r0,16(r1)

Matching assembly and symbols, the code turned out to be around

        /*
         * Find the rightmost entry in the rbtree:
         */
        rightmost = __pick_last_entity(&rq->cfs);
        /*
         * Already in the rightmost position?
         */
        if (unlikely(rightmost->vruntime < se->vruntime))
                return;

It looked like rightmost was set to NULL. I am going to try to find some time
tomorrow to debug it further.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL



* Re: Regression with sched yield - 2.6.25-rc2-mm1
From: Balbir Singh @ 2008-02-18 14:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dhaval Giani, Srivatsa Vaddagiri, Andrew Morton,
	Zhang, Yanmin, linux kernel mailing list

Peter Zijlstra wrote:
> On Mon, 2008-02-18 at 17:47 +0530, Balbir Singh wrote:
>> Hi,
>>
>> I was looking at the 45% regression reported by Yanmin when, while running the
>> test, I ran into
>>
>> 1:mon> t
>> [c0000000e7677da0] c000000000067de0 .sys_sched_yield+0x6c/0xbc
>> [c0000000e7677e30] c000000000008748 syscall_exit+0x0/0x40
>> --- Exception: c01 (System Call) at 00000400001d09e4
>> SP (4000664cb10) is in userspace
>> 1:mon> r
>> R00 = 0000000000000001   R16 = 0000000000000000
>> R01 = c0000000e7677d20   R17 = 0000000000000000
>> R02 = c000000000949490   R18 = 00000000100c5290
>> R03 = 0000000000000008   R19 = 0000000000000000
>> R04 = 0000000010036370   R20 = 0000000010040bc0
>> R05 = 0000000000000000   R21 = 0000000000000000
>> R06 = 0000000000000001   R22 = 0000000000000002
>> R07 = 000000000000002d   R23 = 0000000000000008
>> R08 = 0000000000000000   R24 = 00000000100387c0
>> R09 = 0000000000000000   R25 = 0000000010038c20
>> R10 = c0000000e5c12520   R26 = 00000000100363e8
>> R11 = c0000000e5c12558   R27 = 0000000010038c50
>> R12 = 800000000000f032   R28 = 0000000000360000
>> R13 = c00000000083c300   R29 = c000000000b43b80
>> R14 = 0000040000ee6e0d   R30 = c0000000008ba608
>> R15 = 0000000000000000   R31 = c000000000b42680
>> pc  = c000000000068e50 .yield_task_fair+0x94/0xc4
>> lr  = c000000000067de0 .sys_sched_yield+0x6c/0xbc
>> msr = 8000000000009032   cr  = 24204482
>> ctr = c000000000068dbc   xer = 0000000020000010   trap =  300
>> dar = 0000000000000050   dsisr = 40000000
>> 1:mon> e
>> cpu 0x1: Vector: 300 (Data Access) at [c0000000e7677aa0]
>>     pc: c000000000068e50: .yield_task_fair+0x94/0xc4
>>     lr: c000000000067de0: .sys_sched_yield+0x6c/0xbc
>>     sp: c0000000e7677d20
>>    msr: 8000000000009032
>>    dar: 50
>>  dsisr: 40000000
>>   current = 0xc0000000e5c12520
>>   paca    = 0xc00000000083c300
>>     pid   = 569, comm = java
>> 1:mon> di %pc
>> c000000000068e50  e9280050      ld      r9,80(r8)
>> c000000000068e54  e80b0050      ld      r0,80(r11)
>> c000000000068e58  7fa90040      cmpld   cr7,r9,r0
>> c000000000068e5c  419c000c      blt     cr7,c000000000068e68    # .yield_task_fair+0xac/0xc4
>> c000000000068e60  38090001      addi    r0,r9,1
>> c000000000068e64  f80b0050      std     r0,80(r11)
>> c000000000068e68  38210080      addi    r1,r1,128
>> c000000000068e6c  e8010010      ld      r0,16(r1)
>> c000000000068e70  ebc1fff0      ld      r30,-16(r1)
>> c000000000068e74  ebe1fff8      ld      r31,-8(r1)
>> c000000000068e78  7c0803a6      mtlr    r0
>> c000000000068e7c  4e800020      blr
>> c000000000068e80  7c0802a6      mflr    r0
>> c000000000068e84  fba1ffe8      std     r29,-24(r1)
>> c000000000068e88  fbe1fff8      std     r31,-8(r1)
>> c000000000068e8c  f8010010      std     r0,16(r1)
>>
>> Matching assembly and symbols, the code turned out to be around
>>
>>         /*
>>          * Find the rightmost entry in the rbtree:
>>          */
>>         rightmost = __pick_last_entity(&rq->cfs);
>>         /*
>>          * Already in the rightmost position?
>>          */
>>         if (unlikely(rightmost->vruntime < se->vruntime))
>>                 return;
>>
>> It looked like rightmost was set to NULL. I am going to try to find some time
>> tomorrow to debug it further.
> 
> 
> Humm, the check that should have avoided that is:
> 
>         /*
>          * Are we the only task in the tree?
>          */
>         if (unlikely(rq->load.weight == curr->se.load.weight))
>                 return;
> 
> 
> But I guess that overlooks rt tasks, they also increase the load.
> So I guess something like this ought to fix it..
> 

Peter,

I don't remember any real time tasks running on the system, so I would be
surprised if that is indeed the case. Having said that, rightmost was indeed
NULL, so I need to figure out why. The other question is why a real
time task would be found by yield_task_fair(). I think we need to investigate more.

> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index b9ade89..83eb30c 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -998,7 +998,7 @@ static void yield_task_fair(struct rq *rq)
>  	/*
>  	 * Already in the rightmost position?
>  	 */
> -	if (unlikely(rightmost->vruntime < se->vruntime))
> +	if (unlikely(!rightmost || rightmost->vruntime < se->vruntime))
>  		return;
> 
>  	/*
> 
> 
> 
> 


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* Re: Regression with sched yield - 2.6.25-rc2-mm1
From: Peter Zijlstra @ 2008-02-18 15:18 UTC (permalink / raw)
  To: balbir
  Cc: Ingo Molnar, Dhaval Giani, Srivatsa Vaddagiri, Andrew Morton,
	Zhang, Yanmin, linux kernel mailing list


On Mon, 2008-02-18 at 20:18 +0530, Balbir Singh wrote:

> > Humm, the check that should have avoided that is:
> > 
> >         /*
> >          * Are we the only task in the tree?
> >          */
> >         if (unlikely(rq->load.weight == curr->se.load.weight))
> >                 return;
> > 
> > 
> > But I guess that overlooks rt tasks, they also increase the load.
> > So I guess something like this ought to fix it..
> > 
> 
> Peter,
> 
> I don't remember any real time tasks running on the system, so I would be
> surprised if that is indeed the case.

Various kthreads have rt prio. Notably the load_balance_monitor().

>  Having said that, rightmost was indeed
> NULL, so I need to figure out why. The other question is why a real
> time task would be found by yield_task_fair().

Because an RT task contributes weight and would make the test above fail,
because rq->load would be larger than expected.

> > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > index b9ade89..83eb30c 100644
> > --- a/kernel/sched_fair.c
> > +++ b/kernel/sched_fair.c
> > @@ -998,7 +998,7 @@ static void yield_task_fair(struct rq *rq)
> >  	/*
> >  	 * Already in the rightmost position?
> >  	 */
> > -	if (unlikely(rightmost->vruntime < se->vruntime))
> > +	if (unlikely(!rightmost || rightmost->vruntime < se->vruntime))
> >  		return;
> > 
> >  	/*
> > 



* Re: Regression with sched yield - 2.6.25-rc2-mm1
From: Balbir Singh @ 2008-02-18 15:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dhaval Giani, Srivatsa Vaddagiri, Andrew Morton,
	Zhang, Yanmin, linux kernel mailing list

Peter Zijlstra wrote:
> On Mon, 2008-02-18 at 20:18 +0530, Balbir Singh wrote:
> 
>>> Humm, the check that should have avoided that is:
>>>
>>>         /*
>>>          * Are we the only task in the tree?
>>>          */
>>>         if (unlikely(rq->load.weight == curr->se.load.weight))
>>>                 return;
>>>
>>>
>>> But I guess that overlooks rt tasks, they also increase the load.
>>> So I guess something like this ought to fix it..
>>>
>> Peter,
>>
>> I don't remember any real time tasks running on the system, so I would be
>> surprised if that is indeed the case.
> 
> Various kthreads have rt prio. Notably the load_balance_monitor().
> 

OK, but does it belong to the cfs_rq?

>>  Having said that, rightmost was indeed
>> NULL, so I need to figure out why. The other question is why a real
>> time task would be found by yield_task_fair().
> 
> Because an RT task contributes weight and would make the test above fail,
> because rq->load would be larger than expected.
> 

I thought we were searching an RBtree for the fair group scheduler. If what you
say is indeed true, shouldn't we check if the task is an rt task in
yield_task_fair() instead of the !rightmost check?

>>> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
>>> index b9ade89..83eb30c 100644
>>> --- a/kernel/sched_fair.c
>>> +++ b/kernel/sched_fair.c
>>> @@ -998,7 +998,7 @@ static void yield_task_fair(struct rq *rq)
>>>  	/*
>>>  	 * Already in the rightmost position?
>>>  	 */
>>> -	if (unlikely(rightmost->vruntime < se->vruntime))
>>> +	if (unlikely(!rightmost || rightmost->vruntime < se->vruntime))
>>>  		return;
>>>
>>>  	/*
>>>
> 


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL


* Re: Regression with sched yield - 2.6.25-rc2-mm1
From: Peter Zijlstra @ 2008-02-18 15:35 UTC (permalink / raw)
  To: balbir
  Cc: Ingo Molnar, Dhaval Giani, Srivatsa Vaddagiri, Andrew Morton,
	Zhang, Yanmin, linux kernel mailing list


On Mon, 2008-02-18 at 20:49 +0530, Balbir Singh wrote:
> Peter Zijlstra wrote:
> > On Mon, 2008-02-18 at 20:18 +0530, Balbir Singh wrote:
> > 
> >>> Humm, the check that should have avoided that is:
> >>>
> >>>         /*
> >>>          * Are we the only task in the tree?
> >>>          */
> >>>         if (unlikely(rq->load.weight == curr->se.load.weight))
> >>>                 return;
> >>>

> OK, but does it belong to the cfs_rq?

I'm not looking at the cfs_rq, but at rq. Looking at cfs_rq isn't
correct because it might be a group with only a single task even though
there might be more tasks on this cpu.

Now it turns out that looking at the rq isn't correct either. At the time I
think I assumed that a runnable RT task would have preempted us, but that is
of course not valid under all preemption models, and it is racy even with
PREEMPT=y.

> >>  Having said that, rightmost was indeed
> >> NULL, so I need to figure out why. The other question is why a real
> >> time task would be found by yield_task_fair().
> > 
> > Because an RT task contributes weight and would make the test above fail,
> > because rq->load would be larger than expected.
> > 
> 
> I thought we were searching an RBtree for the fair group scheduler. If what you
> say is indeed true, shouldn't we check if the task is an rt task in
> yield_task_fair() instead of the !rightmost check?

We're not actually finding an RT task. Just the presence of a runnable RT
task on this CPU skews the weights and fools my test.

__pick_last_entity() returns NULL because there just isn't anybody else in
the CFS rq (note that current isn't in the tree).


