* Nick's scheduler v17
@ 2003-10-24 18:10 Nick Piggin
2003-10-24 20:17 ` cliff white
2003-10-24 21:49 ` Andrew Theurer
0 siblings, 2 replies; 7+ messages in thread
From: Nick Piggin @ 2003-10-24 18:10 UTC (permalink / raw)
To: linux-kernel
Hi,
http://www.kerneltrap.org/~npiggin/v17/
Still working on SMP and NUMA. Some (maybe) interesting things I put in are:
- Sequential CPU balancing, so you don't get a big storm of balances
every 1/4s.
- Balancing now tries to err more on the side of caution; I have to start
analysing it more thoroughly though.
- Attacked the NUMA balancing code. There should now be fewer bus-locked ops /
cache pingpongs in some fastpaths. Volanomark likes it; more realistic loads
won't improve so much: http://www.kerneltrap.org/~npiggin/v17/volano.png
This improvement is NUMA only.
I haven't had time to reproduce Cliff's serious reaim performance dropoffs,
so they're probably still there. I couldn't reproduce Martin's kernbench
dropoff, but the 16-way I'm using only has 512K cache, which might not show it up.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Nick's scheduler v17
2003-10-24 18:10 Nick's scheduler v17 Nick Piggin
@ 2003-10-24 20:17 ` cliff white
2003-10-24 21:49 ` Andrew Theurer
1 sibling, 0 replies; 7+ messages in thread
From: cliff white @ 2003-10-24 20:17 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel
On Sat, 25 Oct 2003 04:10:24 +1000
Nick Piggin <piggin@cyberone.com.au> wrote:
> Hi,
> http://www.kerneltrap.org/~npiggin/v17/
>
> Still working on SMP and NUMA. Some (maybe) interesting things I put in are
> - Sequential CPU balancing so you don't get a big storm of balances
> every 1/4s.
> - Balancing is trying to err more on the side of caution, I have to start
> analysing it more thoroughly though.
> - Attacked the NUMA balancing code. There should now be less buslocked ops /
> cache pingpongs in some fastpaths. Volanomark likes it, more realistic
> loads
> won't improve so much http://www.kerneltrap.org/~npiggin/v17/volano.png
> This improvement is NUMA only.
>
> I haven't had time to reproduce Cliff's serious reaim performance
> dropoffs so
> they're probably still there. I couldn't reproduce Martin's kernbench
> dropoff,
> but the 16-way I'm using only has 512K cache which might not show it up.
>
Okay, I put both v17 patches in the OSDL Patch Lifecycle Manager
( http://www.osdl.org/plm-cgi/plm/ ):
sched-v17 is PLM #2251
sched-nopolicy-v17 is PLM #2252
Results asap.
cliffw
--
The church is near, but the road is icy.
The bar is far, but i will walk carefully. - Russian proverb
* Re: Nick's scheduler v17
2003-10-24 18:10 Nick's scheduler v17 Nick Piggin
2003-10-24 20:17 ` cliff white
@ 2003-10-24 21:49 ` Andrew Theurer
2003-10-25 1:12 ` Nick Piggin
1 sibling, 1 reply; 7+ messages in thread
From: Andrew Theurer @ 2003-10-24 21:49 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel
On Friday 24 October 2003 13:10, Nick Piggin wrote:
> Hi,
> http://www.kerneltrap.org/~npiggin/v17/
>
> Still working on SMP and NUMA. Some (maybe) interesting things I put in are
> - Sequential CPU balancing so you don't get a big storm of balances
> every 1/4s.
> - Balancing is trying to err more on the side of caution, I have to start
> analysing it more thoroughly though.
+
+ *imbalance /= 2;
+ *imbalance = (*imbalance + FPT - 1) / FPT;
I think I see what is going on here, but would something like this work out
better?
*imbalance = min(this_load - load_avg, load_avg - max_load)
That way you take just enough that either busiest_queue's or this_rq's
length ends up at the load_avg. I suppose you could take even less, but
IMO the /=2 is what I really don't like. Perhaps:
*imbalance = min(this_load - load_avg, load_avg - max_load);
*imbalance = (*imbalance + FPT - 1) / FPT;
This should work well for intranode balances; internode balances may need a
little optimization, since the load_avg does not really represent the
load avg of the two nodes in question, just one cpu from one of them and all
the cpus from another.
* Re: Nick's scheduler v17
2003-10-24 21:49 ` Andrew Theurer
@ 2003-10-25 1:12 ` Nick Piggin
2003-10-26 6:43 ` Nick Piggin
0 siblings, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2003-10-25 1:12 UTC (permalink / raw)
To: Andrew Theurer; +Cc: linux-kernel
Andrew Theurer wrote:
>On Friday 24 October 2003 13:10, Nick Piggin wrote:
>
>>Hi,
>>http://www.kerneltrap.org/~npiggin/v17/
>>
>>Still working on SMP and NUMA. Some (maybe) interesting things I put in are
>>- Sequential CPU balancing so you don't get a big storm of balances
>>every 1/4s.
>>- Balancing is trying to err more on the side of caution, I have to start
>> analysing it more thoroughly though.
>>
>
>+
>+ *imbalance /= 2;
>+ *imbalance = (*imbalance + FPT - 1) / FPT;
>
>I think I see what is going on here, but would something like this work out
>better?
>
Yeah, sorry it's not well commented. It's still changing quite quickly.
>
> *imbalance = min(this_load - load_avg, load_avg - max_load)
>
>That way you take just enough to either have busiest_queue or this_rq's length
>be the load_avg. I suppose you could take even less, but IMO, the /=2 is
>what I really don't like. Perhaps:
>
That is _exactly_ what I had before! That's probably the way to go. Thanks
for having a look at it.
>
>
>*imbalance = min(this_load - load_avg, load_avg - max_load);
>*imbalance = (*imbalance + FPT - 1) / FPT;
>
>This should work well for intranode balances, internode balances may need a
>little optimization, since the load_avg really does not really represent the
>load avg of the two nodes in question, just one cpu from one of them and all
>the cpus from another.
>
Yeah that does need a bit of rethinking.
* Re: Nick's scheduler v17
2003-10-25 1:12 ` Nick Piggin
@ 2003-10-26 6:43 ` Nick Piggin
2003-10-27 17:02 ` Andrew Theurer
0 siblings, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2003-10-26 6:43 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Theurer, linux-kernel
Nick Piggin wrote:
>
>
> Andrew Theurer wrote:
>
>> On Friday 24 October 2003 13:10, Nick Piggin wrote:
>>
>>> Hi,
>>> http://www.kerneltrap.org/~npiggin/v17/
>>>
>>> Still working on SMP and NUMA. Some (maybe) interesting things I put
>>> in are
>>> - Sequential CPU balancing so you don't get a big storm of balances
>>> every 1/4s.
>>> - Balancing is trying to err more on the side of caution, I have to
>>> start
>>> analysing it more thoroughly though.
>>>
>>
>> +
>> + *imbalance /= 2;
>> + *imbalance = (*imbalance + FPT - 1) / FPT;
>>
>> I think I see what is going on here, but would something like this
>> work out better?
>>
>
> Yeah, sorry its not well commented. Its still changing quite quickly.
>
>>
>> *imbalance = min(this_load - load_avg, load_avg - max_load)
>>
>> That way you take just enough to either have busiest_queue or
>> this_rq's length be the load_avg. I suppose you could take even
>> less, but IMO, the /=2 is what I really don't like. Perhaps:
>>
>
> That is _exactly_ what I had before! Thats probably the way to go. Thanks
> for having a look at it.
>
>>
>>
>> *imbalance = min(this_load - load_avg, load_avg - max_load);
>> *imbalance = (*imbalance + FPT - 1) / FPT;
>>
>> This should work well for intranode balances, internode balances may
>> need a little optimization, since the load_avg really does not really
>> represent the load avg of the two nodes in question, just one cpu
>> from one of them and all the cpus from another.
>>
Oh, actually, after my patch, load_avg represents the load average of _all_
the nodes. Have a look at find_busiest_node. Which jogs my memory of why
it's not always a good idea to do your *imbalance min(...) thing (I actually
saw this happening):
5 CPUs, 4 processes running on one cpu. load_avg would be 0.8 for all cpus,
and balancing doesn't happen. I have to think about this a bit more...
* Re: Nick's scheduler v17
2003-10-26 6:43 ` Nick Piggin
@ 2003-10-27 17:02 ` Andrew Theurer
2003-10-27 23:13 ` Nick Piggin
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Theurer @ 2003-10-27 17:02 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel
> >> *imbalance = min(this_load - load_avg, load_avg - max_load)
> >>
> >> That way you take just enough to either have busiest_queue or
> >> this_rq's length be the load_avg. I suppose you could take even
> >> less, but IMO, the /=2 is what I really don't like. Perhaps:
> >
> > That is _exactly_ what I had before! Thats probably the way to go. Thanks
> > for having a look at it.
> >
> >> *imbalance = min(this_load - load_avg, load_avg - max_load);
> >> *imbalance = (*imbalance + FPT - 1) / FPT;
> >>
> >> This should work well for intranode balances, internode balances may
> >> need a little optimization, since the load_avg really does not really
> >> represent the load avg of the two nodes in question, just one cpu
> >> from one of them and all the cpus from another.
>
> Oh, actually, after my path, load_avg represents the load average of _all_
> the nodes. Have a look at find_busiest_node. Which jogs my memory of why
> its not always a good idea to do your *imbalance min(...) thing (I actually
> saw this happening).
Oops, I meant avg_load, which you calculate in find_busiest_queue on the fly.
> 5 CPUs, 4 processes running on one cpu. load_avg would be 0.8 for all cpus.
> balancing doesn't happen. I have to think about this a bit more...
Actually, if we use avg_load, I guess it would be 0, since this is an unsigned
long. Maybe avg_load needs to have a min value of 1. Then if we apply:
*imbalance = min(max_load - avg_load, avg_load - this_load)
           = min(4 - 1, 1 - 0)
the imbalance looks a lot better. The only concern would be an idle cpu
stealing from another, leaving the other cpu idle. I guess a check could be
put there.
* Re: Nick's scheduler v17
2003-10-27 17:02 ` Andrew Theurer
@ 2003-10-27 23:13 ` Nick Piggin
0 siblings, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2003-10-27 23:13 UTC (permalink / raw)
To: Andrew Theurer; +Cc: linux-kernel
Andrew Theurer wrote:
>>>> *imbalance = min(this_load - load_avg, load_avg - max_load)
>>>>
>>>>That way you take just enough to either have busiest_queue or
>>>>this_rq's length be the load_avg. I suppose you could take even
>>>>less, but IMO, the /=2 is what I really don't like. Perhaps:
>>>>
>>>That is _exactly_ what I had before! Thats probably the way to go. Thanks
>>>for having a look at it.
>>>
>>>
>>>>*imbalance = min(this_load - load_avg, load_avg - max_load);
>>>>*imbalance = (*imbalance + FPT - 1) / FPT;
>>>>
>>>>This should work well for intranode balances, internode balances may
>>>>need a little optimization, since the load_avg really does not really
>>>>represent the load avg of the two nodes in question, just one cpu
>>>>from one of them and all the cpus from another.
>>>>
>>Oh, actually, after my path, load_avg represents the load average of _all_
>>the nodes. Have a look at find_busiest_node. Which jogs my memory of why
>>its not always a good idea to do your *imbalance min(...) thing (I actually
>>saw this happening).
>>
>
>Oops, I meant avg_load, which you calculate in find_busiest_queue on the fly.
>
OK
>
>>5 CPUs, 4 processes running on one cpu. load_avg would be 0.8 for all cpus.
>>balancing doesn't happen. I have to think about this a bit more...
>>
>
>Actually, if we use avg_load, I guess it would be 0, since this is an unsigned
>long. Maybe avg_load needs to have a min value of 1. Then if we apply:
>
Well, it's got a fixed-point scaling factor.
>
>*imbalance = min(max_load - avg_load, avg_load - this_load)
> min(4 - 1, 1 - 0)
>
>
I think you want:
	*imbalance = min(max_load - avg_load, avg_load - this_load);
	if (*imbalance < 1*FPT &&
	    (max_load - this_load) > 1*FPT)
		*imbalance = 1*FPT;
So if there is a total imbalance of more than 1 task, at least one
will be moved.
>
>And imbalance looks a lot better. Only concern would be an idle cpu stealing
>from another, leaving the other cpu idle. I guess a check could be put
>there.
>
>
pull_task won't pull a running task, so you get some protection there.