linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* smp 'nice' bias support breaks scheduler behavior
@ 2006-01-26 10:52 Siddha, Suresh B
  2006-01-26 12:25 ` Con Kolivas
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-01-26 10:52 UTC (permalink / raw)
  To: kernel; +Cc: mingo, nickpiggin, linux-kernel

Con,

> [PATCH] sched: implement nice support across physical cpus on SMP

I don't see imbalance calculations in find_busiest_group() take
prio_bias into account. This will result in wrong imbalance value and
will cause issues.

For example on a DP system with HT, if there are three runnable processes
(simple infinite loop with same nice value), this patch is resulting in bouncing
of these 3 processes from one processor to another...Lets assume
if the 3 processes are scheduled as 2 in package-0 and 1 in package1..
Now when the busy processor on package-1 does load balance and as
imbalance doesn't take "prio_bias" into account, this will kick active
load balance on package-0.. And this is continuing for ever, resulting
in bouncing from one processor to another.

Even when the system is completely loaded and if there is an imbalance,
this patch causes wrong imabalance counts and cause unoptimized
movements.

Do you want to look into this and post a patch for 2.6.16?

thanks,
suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-26 10:52 smp 'nice' bias support breaks scheduler behavior Siddha, Suresh B
@ 2006-01-26 12:25 ` Con Kolivas
  2006-01-26 23:36   ` Peter Williams
  0 siblings, 1 reply; 9+ messages in thread
From: Con Kolivas @ 2006-01-26 12:25 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: mingo, nickpiggin, linux-kernel

On Thursday 26 January 2006 21:52, Siddha, Suresh B wrote:
> Con,
>
> > [PATCH] sched: implement nice support across physical cpus on SMP
>
> I don't see imbalance calculations in find_busiest_group() take
> prio_bias into account. This will result in wrong imbalance value and
> will cause issues.


in 2.6.16-rc1:

find_busiest_group(....

	load = __target_load(i, load_idx, idle);
else
	load = __source_load(i, load_idx, idle);

where __target_load and __source_load is where we take into account prio_bias.

I'm not sure which code you're looking at, but Peter Williams is working on 
rewriting the smp nice balancing code in -mm at the moment so that is quite 
different from current linus tree.

Con

>
> For example on a DP system with HT, if there are three runnable processes
> (simple infinite loop with same nice value), this patch is resulting in
> bouncing of these 3 processes from one processor to another...Lets assume
> if the 3 processes are scheduled as 2 in package-0 and 1 in package1..
> Now when the busy processor on package-1 does load balance and as
> imbalance doesn't take "prio_bias" into account, this will kick active
> load balance on package-0.. And this is continuing for ever, resulting
> in bouncing from one processor to another.
>
> Even when the system is completely loaded and if there is an imbalance,
> this patch causes wrong imabalance counts and cause unoptimized
> movements.
>
> Do you want to look into this and post a patch for 2.6.16?
>
> thanks,
> suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-26 12:25 ` Con Kolivas
@ 2006-01-26 23:36   ` Peter Williams
  2006-01-26 23:56     ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Williams @ 2006-01-26 23:36 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Con Kolivas, mingo, nickpiggin, linux-kernel

Con Kolivas wrote:
> On Thursday 26 January 2006 21:52, Siddha, Suresh B wrote:
> 
>>Con,
>>
>>
>>>[PATCH] sched: implement nice support across physical cpus on SMP
>>
>>I don't see imbalance calculations in find_busiest_group() take
>>prio_bias into account. This will result in wrong imbalance value and
>>will cause issues.
> 
> 
> 
> in 2.6.16-rc1:
> 
> find_busiest_group(....
> 
> 	load = __target_load(i, load_idx, idle);
> else
> 	load = __source_load(i, load_idx, idle);
> 
> where __target_load and __source_load is where we take into account prio_bias.
> 
> I'm not sure which code you're looking at, but Peter Williams is working on 
> rewriting the smp nice balancing code in -mm at the moment so that is quite 
> different from current linus tree.
> 

Yes, indeed.  And it would be very helpful if people interested in this 
topic (and that have test suites designed to test whether "niceness" is 
being well balanced across CPUs) could test it.  This is especially the 
case for larger systems as I do not have ready access for testing on them.

> 
> 
>>For example on a DP system with HT, if there are three runnable processes
>>(simple infinite loop with same nice value), this patch is resulting in
>>bouncing of these 3 processes from one processor to another...Lets assume
>>if the 3 processes are scheduled as 2 in package-0 and 1 in package1..
>>Now when the busy processor on package-1 does load balance and as
>>imbalance doesn't take "prio_bias" into account, this will kick active
>>load balance on package-0.. And this is continuing for ever, resulting
>>in bouncing from one processor to another.
>>
>>Even when the system is completely loaded and if there is an imbalance,
>>this patch causes wrong imabalance counts and cause unoptimized
>>movements.
>>
>>Do you want to look into this and post a patch for 2.6.16?

Thanks,
Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-26 23:36   ` Peter Williams
@ 2006-01-26 23:56     ` Siddha, Suresh B
  2006-01-27  1:29       ` Con Kolivas
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-01-26 23:56 UTC (permalink / raw)
  To: Peter Williams
  Cc: Siddha, Suresh B, Con Kolivas, mingo, nickpiggin, linux-kernel

On Fri, Jan 27, 2006 at 10:36:36AM +1100, Peter Williams wrote:
> Con Kolivas wrote:
> > On Thursday 26 January 2006 21:52, Siddha, Suresh B wrote:
> >>>[PATCH] sched: implement nice support across physical cpus on SMP
> >>
> >>I don't see imbalance calculations in find_busiest_group() take
> >>prio_bias into account. This will result in wrong imbalance value and
> >>will cause issues.
> > in 2.6.16-rc1:
> > 
> > find_busiest_group(....
> > 
> > 	load = __target_load(i, load_idx, idle);
> > else
> > 	load = __source_load(i, load_idx, idle);
> > 
> > where __target_load and __source_load is where we take into account prio_bias.

We take that into consideration only while calculating the loads.. But we
don't scale it down while calculating imbalance, resulting in the problem
I mentioned.

> > 
> > I'm not sure which code you're looking at, but Peter Williams is working on 
> > rewriting the smp nice balancing code in -mm at the moment so that is quite 
> > different from current linus tree.
> > 

Peters changes in -mm fix this issue. Will this be pushed to Linus tree
before 2.6.16 comes out?

> 
> Yes, indeed.  And it would be very helpful if people interested in this 
> topic (and that have test suites designed to test whether "niceness" is 
> being well balanced across CPUs) could test it.  This is especially the 
> case for larger systems as I do not have ready access for testing on them.

I don't have any test suites for testing "niceness". But I can def check
more to make sure that it doesn't cause any regression :)

thanks,
suresh

> 
> > 
> > 
> >>For example on a DP system with HT, if there are three runnable processes
> >>(simple infinite loop with same nice value), this patch is resulting in
> >>bouncing of these 3 processes from one processor to another...Lets assume
> >>if the 3 processes are scheduled as 2 in package-0 and 1 in package1..
> >>Now when the busy processor on package-1 does load balance and as
> >>imbalance doesn't take "prio_bias" into account, this will kick active
> >>load balance on package-0.. And this is continuing for ever, resulting
> >>in bouncing from one processor to another.
> >>
> >>Even when the system is completely loaded and if there is an imbalance,
> >>this patch causes wrong imabalance counts and cause unoptimized
> >>movements.
> >>
> >>Do you want to look into this and post a patch for 2.6.16?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-26 23:56     ` Siddha, Suresh B
@ 2006-01-27  1:29       ` Con Kolivas
  2006-01-27  1:34         ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Con Kolivas @ 2006-01-27  1:29 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Peter Williams, mingo, nickpiggin, linux-kernel

On Friday 27 January 2006 10:56, Siddha, Suresh B wrote:
> > Con Kolivas wrote:
> > > I'm not sure which code you're looking at, but Peter Williams is
> > > working on rewriting the smp nice balancing code in -mm at the moment
> > > so that is quite different from current linus tree.
>
> Peters changes in -mm fix this issue. Will this be pushed to Linus tree
> before 2.6.16 comes out?

We planned to push it for 2.6.16 but issues showed up, which Peter has since 
addressed - but this means the code has not had enough testing to go into 
2.6.16 - so it is likely to be in 2.6.17.

Cheers,
Con

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-27  1:29       ` Con Kolivas
@ 2006-01-27  1:34         ` Siddha, Suresh B
  2006-01-27  1:54           ` Con Kolivas
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-01-27  1:34 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Siddha, Suresh B, Peter Williams, mingo, nickpiggin, linux-kernel

On Fri, Jan 27, 2006 at 12:29:02PM +1100, Con Kolivas wrote:
> We planned to push it for 2.6.16 but issues showed up, which Peter has since 
> addressed - but this means the code has not had enough testing to go into 
> 2.6.16 - so it is likely to be in 2.6.17.

Then for 2.6.16, I would like to see smp nice handling patch backed out.

Do you agree?

thanks,
suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-27  1:34         ` Siddha, Suresh B
@ 2006-01-27  1:54           ` Con Kolivas
  2006-01-27  2:11             ` Siddha, Suresh B
  0 siblings, 1 reply; 9+ messages in thread
From: Con Kolivas @ 2006-01-27  1:54 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Peter Williams, mingo, nickpiggin, linux-kernel

On Friday 27 January 2006 12:34, Siddha, Suresh B wrote:
> On Fri, Jan 27, 2006 at 12:29:02PM +1100, Con Kolivas wrote:
> > We planned to push it for 2.6.16 but issues showed up, which Peter has
> > since addressed - but this means the code has not had enough testing to
> > go into 2.6.16 - so it is likely to be in 2.6.17.
>
> Then for 2.6.16, I would like to see smp nice handling patch backed out.
>
> Do you agree?

It's not my decision to keep Peter's patch out of mainline. If you can make a 
strong enough case for it then Linus will merge it up even though it's after 
rc1. Otherwise I'll let Ingo decide on whether to pull the current 
implementation or not - you're saying that with the one thing you described 
that misbehaves that it is doing more harm than fixing smp nice handling.

Con

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-27  1:54           ` Con Kolivas
@ 2006-01-27  2:11             ` Siddha, Suresh B
  2006-01-27  2:58               ` Con Kolivas
  0 siblings, 1 reply; 9+ messages in thread
From: Siddha, Suresh B @ 2006-01-27  2:11 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Siddha, Suresh B, Peter Williams, mingo, nickpiggin, linux-kernel

On Fri, Jan 27, 2006 at 12:54:53PM +1100, Con Kolivas wrote:
> It's not my decision to keep Peter's patch out of mainline. If you can make a 
> strong enough case for it then Linus will merge it up even though it's after 
> rc1. 

I don't want to push Peters patch to 2.6.16, as I haven't tested much.

> Otherwise I'll let Ingo decide on whether to pull the current 
> implementation or not - you're saying that with the one thing you described 
> that misbehaves that it is doing more harm than fixing smp nice handling.

Are we sure that it really fixes smp nice handling? Its not just one 
scenario(bouncing processes on a lightly loaded system), I am talking about. 
Imbalance calculations will be wrong even on a completely loaded system.. 
Are you sure that there are no perf regressions with your patch..

Sorry for commenting on this patch so late.. I was on a very long vacation.
I think it is safe to back that out for 2.6.16 and do more work and get it
in 2.6.17.

thanks,
suresh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: smp 'nice' bias support breaks scheduler behavior
  2006-01-27  2:11             ` Siddha, Suresh B
@ 2006-01-27  2:58               ` Con Kolivas
  0 siblings, 0 replies; 9+ messages in thread
From: Con Kolivas @ 2006-01-27  2:58 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Peter Williams, mingo, nickpiggin, linux-kernel

On Friday 27 January 2006 13:11, Siddha, Suresh B wrote:
> On Fri, Jan 27, 2006 at 12:54:53PM +1100, Con Kolivas wrote:
> > It's not my decision to keep Peter's patch out of mainline. If you can
> > make a strong enough case for it then Linus will merge it up even though
> > it's after rc1.
>
> I don't want to push Peters patch to 2.6.16, as I haven't tested much.
>
> > Otherwise I'll let Ingo decide on whether to pull the current
> > implementation or not - you're saying that with the one thing you
> > described that misbehaves that it is doing more harm than fixing smp nice
> > handling.
>
> Are we sure that it really fixes smp nice handling? Its not just one
> scenario(bouncing processes on a lightly loaded system), I am talking
> about. Imbalance calculations will be wrong even on a completely loaded
> system.. Are you sure that there are no perf regressions with your patch..

It was extensively tested for more than 3 months in the -mm tree. Early on 
there were accounting bugs in the code which I corrected and we saw no 
performance regression after that across a wide range of benchmarks and 
hardware configurations at the time thanks to M Bligh. (see test.kernel.org) 
Some were done on the osdl (STP) test bench showing no regression as well but 
the osdl infrastructure became pretty much unworkable not long after.

> Sorry for commenting on this patch so late.. I was on a very long vacation.
> I think it is safe to back that out for 2.6.16 and do more work and get it
> in 2.6.17.

Well I have no emotional investment in the code, I just want to do what's 
right. In the absence of measurable throughput regressions and improvement in 
smp nice handling I don't believe we should back it out.

Cheers,
Con

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-01-27  2:59 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-26 10:52 smp 'nice' bias support breaks scheduler behavior Siddha, Suresh B
2006-01-26 12:25 ` Con Kolivas
2006-01-26 23:36   ` Peter Williams
2006-01-26 23:56     ` Siddha, Suresh B
2006-01-27  1:29       ` Con Kolivas
2006-01-27  1:34         ` Siddha, Suresh B
2006-01-27  1:54           ` Con Kolivas
2006-01-27  2:11             ` Siddha, Suresh B
2006-01-27  2:58               ` Con Kolivas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).