From: Nick Piggin <piggin@cyberone.com.au>
To: habanero@us.ibm.com
Cc: Bill Davidsen <davidsen@tmr.com>,
"Martin J. Bligh" <mbligh@aracnet.com>,
Erich Focht <efocht@hpce.nec.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
LSE <lse-tech@lists.sourceforge.net>, Andi Kleen <ak@muc.de>,
torvalds@osdl.org, mingo@elte.hu
Subject: Re: [Lse-tech] Re: [patch] scheduler fix for 1cpu/node case
Date: Sat, 23 Aug 2003 10:29:21 +1000
Message-ID: <3F46B561.7060706@cyberone.com.au>
In-Reply-To: <200308221912.38184.habanero@us.ibm.com>
Andrew Theurer wrote:
>On Friday 22 August 2003 17:56, Nick Piggin wrote:
>
>>Andrew Theurer wrote:
>>
>>>On Wednesday 13 August 2003 15:49, Bill Davidsen wrote:
>>>
>>>>On Mon, 28 Jul 2003, Andrew Theurer wrote:
>>>>
>>>>>Personally, I'd like to see all systems use NUMA sched, non NUMA systems
>>>>>being a single node (no policy difference from non-numa sched), allowing
>>>>>us to remove all NUMA ifdefs. I think the code would be much more
>>>>>readable.
>>>>>
>>>>That sounds like a great idea, but I'm not sure it could be realized
>>>>short of a major rewrite. Look how hard Ingo and Con are working just to
>>>>get a single node doing a good job with interactive and throughput
>>>>tradeoffs.
>>>>
>>>Actually it's not too bad. Attached is a patch to do it. It also does
>>>multi-level node support and makes all the load balance routines
>>>runqueue-centric instead of cpu-centric, so adding something like shared
>>>runqueues (for HT) should be really easy. Hmm, other things: inter-node
>>>balance intervals are now arch specific (AMD is "1"). The default
>>>busy/idle balance timers of 200/1 are not arch specific, but I'm thinking
>>>they should be. And for non-numa, the scheduling policy is the same as
>>>it was with vanilla O(1).
>>>
>>I'm not saying you're wrong, but do you have some numbers where this
>>helps? i.e. two architectures that need very different balance numbers.
>>And what is the reason for making AMD's balance interval 1?
>>
>
>AMD is 1 because there's no need to balance within a node, so I want the
>inter-node balance frequency to be as often as it was with just O(1). This
>interval would not work well with other NUMA boxes, so that's the main reason
>to have arch specific intervals.
>
OK, I misread the patch. IIRC AMD has 1 CPU per node? If so, why doesn't
this simply prevent balancing within a node?
> And, as a general guideline, boxes with
>different local-remote latency ratios will probably benefit from different
>inter-node balance intervals. I don't know what these ratios are, but I'd
>like the kernel to have the ability to change for one arch and not affect
>another.
>
I fully appreciate there are huge differences... I am curious whether
you can see much improvement in practice.
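
To make the interval discussion concrete, here is a rough user-space
sketch of how an arch-specific inter-node interval could gate balancing.
It is not taken from the patch. The struct, the field names and both
functions are hypothetical. Only the 200/1 busy/idle defaults and the AMD
interval of 1 come from the thread, and treating the interval as a
multiplier on the local balance period is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-arch tunables; not the patch's actual data structure. */
struct balance_params {
    unsigned int busy_ticks;    /* local rebalance period when the CPU is busy */
    unsigned int idle_ticks;    /* local rebalance period when the CPU is idle */
    unsigned int node_interval; /* inter-node pass every Nth local period */
};

/* True when a local (intra-node) balance pass is due on this tick. */
bool local_balance_due(const struct balance_params *p,
                       unsigned long tick, bool idle)
{
    unsigned int period = idle ? p->idle_ticks : p->busy_ticks;

    assert(period > 0);
    return tick % period == 0;
}

/*
 * True when an inter-node balance pass is due on this tick.
 * With node_interval == 1 this fires exactly as often as the local pass,
 * which matches the stated goal for the 1-CPU-per-node case.
 */
bool node_balance_due(const struct balance_params *p,
                      unsigned long tick, bool idle)
{
    unsigned int period = (idle ? p->idle_ticks : p->busy_ticks)
                          * p->node_interval;

    assert(period > 0);
    return tick % period == 0;
}
```

With { 200, 1, 1 } the inter-node pass runs on every local balance tick.
A box with a worse local-remote latency ratio could pick a larger
node_interval without touching any other arch.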
>
>>Also, things like nr_running_inc are supposed to be very fast. I am
>>a bit worried to see a loop and CPU shared atomics in there.
>>
>
>That has concerned me, too. So far I haven't been able to see a measurable
>difference either way (within noise level), but it's possible. The other
>alternative is to sum up node load at sched_best_cpu and find_busiest_node.
>
Hmm... get someone to try the scheduler benchmarks on a 32 way box ;)
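
For anyone following along, the two alternatives being weighed look
roughly like the sketch below. This is user-space C with made-up sizes
and names (NR_CPUS, CPUS_PER_NODE, the _eager/_lazy suffixes and this
cpu_to_node stand-in are all assumptions for illustration), it only
models a single node level, and it is not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS       8   /* illustrative sizes, not from the patch */
#define CPUS_PER_NODE 4
#define NR_NODES      (NR_CPUS / CPUS_PER_NODE)

struct runqueue {
    int nr_running;       /* per-CPU, only written by its own CPU */
};

struct runqueue runqueues[NR_CPUS];
atomic_int node_nr_running[NR_NODES];  /* shared by every CPU in a node */

/* Stand-in topology helper for this sketch. */
int cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu / CPUS_PER_NODE;
}

/*
 * Eager variant: the hot path also bumps a node-wide atomic counter,
 * so every enqueue touches a cache line shared across CPUs.
 */
void nr_running_inc_eager(int cpu)
{
    runqueues[cpu].nr_running++;
    atomic_fetch_add(&node_nr_running[cpu_to_node(cpu)], 1);
}

/* Lazy variant: the hot path stays CPU-local. */
void nr_running_inc_lazy(int cpu)
{
    runqueues[cpu].nr_running++;
}

/*
 * Lazy variant, balance side: sum the per-CPU counts only when a
 * balancing decision actually needs the node load.
 */
int node_load_lazy(int node)
{
    int cpu, sum = 0;

    assert(node >= 0 && node < NR_NODES);
    for (cpu = node * CPUS_PER_NODE; cpu < (node + 1) * CPUS_PER_NODE; cpu++)
        sum += runqueues[cpu].nr_running;
    return sum;
}
```

The eager form makes the balancer's read cheap and the enqueue path more
expensive. The lazy form does the reverse. Which one wins on a big box is
exactly what a benchmark would have to show.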
>
>>node_2_node is an odd sounding conversion too ;)
>>
>
>I just went off the topology already there, so I left it.
>
>
>>BTW. you should be CC'ing Ingo if you have any intention of scheduler
>>stuff getting into 2.6.
>>
>
>OK, thanks!
>
>
Good luck with it. Definitely some good ideas.
Thread overview: 29+ messages
2003-07-28 19:16 [patch] scheduler fix for 1cpu/node case Erich Focht
2003-07-28 19:55 ` Martin J. Bligh
2003-07-28 20:18 ` Erich Focht
2003-07-28 20:37 ` Martin J. Bligh
2003-07-29 2:24 ` Andrew Theurer
2003-07-29 10:08 ` Erich Focht
2003-07-29 13:33 ` [Lse-tech] " Andrew Theurer
2003-07-30 15:23 ` Erich Focht
2003-07-30 15:44 ` Andrew Theurer
2003-07-29 14:27 ` Martin J. Bligh
2003-08-13 20:49 ` Bill Davidsen
2003-08-22 15:46 ` [Lse-tech] " Andrew Theurer
2003-08-22 22:56 ` Nick Piggin
2003-08-23 0:12 ` Andrew Theurer
2003-08-23 0:29 ` Nick Piggin [this message]
2003-08-23 0:47 ` William Lee Irwin III
2003-08-23 8:48 ` Nick Piggin
2003-08-23 14:32 ` Andrew Theurer
2003-08-23 1:31 ` Martin J. Bligh
2003-07-29 10:08 ` Erich Focht
2003-07-29 14:41 ` Andi Kleen
2003-07-31 15:05 ` Martin J. Bligh
2003-07-31 21:45 ` Erich Focht
2003-08-01 0:26 ` Martin J. Bligh
2003-08-01 16:30 ` [Lse-tech] " Erich Focht
2003-07-29 14:06 Mala Anand
2003-07-29 14:29 ` Martin J. Bligh
2003-07-29 16:04 Mala Anand
2003-07-30 16:34 Luck, Tony