linux-kernel.vger.kernel.org archive mirror
From: Nick Piggin <piggin@cyberone.com.au>
To: habanero@us.ibm.com
Cc: Bill Davidsen <davidsen@tmr.com>,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	Erich Focht <efocht@hpce.nec.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	LSE <lse-tech@lists.sourceforge.net>, Andi Kleen <ak@muc.de>,
	torvalds@osdl.org, mingo@elte.hu
Subject: Re: [Lse-tech] Re: [patch] scheduler fix for 1cpu/node case
Date: Sat, 23 Aug 2003 10:29:21 +1000
Message-ID: <3F46B561.7060706@cyberone.com.au>
In-Reply-To: <200308221912.38184.habanero@us.ibm.com>



Andrew Theurer wrote:

>On Friday 22 August 2003 17:56, Nick Piggin wrote:
>
>>Andrew Theurer wrote:
>>
>>>On Wednesday 13 August 2003 15:49, Bill Davidsen wrote:
>>>
>>>>On Mon, 28 Jul 2003, Andrew Theurer wrote:
>>>>
>>>>>Personally, I'd like to see all systems use NUMA sched, non NUMA systems
>>>>>being a single node (no policy difference from non-numa sched), allowing
>>>>>us to remove all NUMA ifdefs.  I think the code would be much more
>>>>>readable.
>>>>>
>>>>That sounds like a great idea, but I'm not sure it could be realized
>>>>short of a major rewrite. Look how hard Ingo and Con are working just to
>>>>get a single node doing a good job with interactive and throughput
>>>>tradeoffs.
>>>>
>>>Actually it's not too bad.  Attached is a patch to do it.  It also does
>>>multi-level node support and makes all the load balance routines
>>>runqueue-centric instead of cpu-centric, so adding something like shared
>>>runqueues (for HT) should be really easy.  Hmm, other things: inter-node
>>>balance intervals are now arch specific (AMD is "1").  The default
>>>busy/idle balance timers of 200/1 are not arch specific, but I'm thinking
>>>they should be.  And for non-numa, the scheduling policy is the same as
>>>it was with vanilla O(1).
>>>
>>I'm not saying you're wrong, but do you have some numbers where this
>>helps? ie. two architectures that need very different balance numbers.
>>And what is the reason for making AMD's balance interval 1?
>>
>
>AMD is 1 because there's no need to balance within a node, so I want the 
>inter-node balance frequency to be as often as it was with just O(1).  This 
>interval would not work well with other NUMA boxes, so that's the main reason 
>to have arch specific intervals.
>

OK, I misread the patch. IIRC AMD has 1 CPU per node? If so, why doesn't
this simply prevent balancing within a node?

>  And, as a general guideline, boxes with 
>different local-remote latency ratios will probably benefit from different 
>inter-node balance intervals.  I don't know what these ratios are, but I'd 
>like the kernel to have the ability to change for one arch and not affect 
>another.
>

I fully appreciate there are huge differences... I am curious whether
you can see much improvement in practice.

>
>>Also, things like nr_running_inc are supposed to be very fast. I am
>>a bit worried to see a loop and CPU shared atomics in there.
>>
>
>That has concerned me, too.  So far I haven't been able to see a measurable 
>difference either way (within noise level), but it's possible.  The other 
>alternative is to sum up node load at sched_best_cpu and find_busiest_node.
>

Hmm... get someone to try the scheduler benchmarks on a 32 way box ;)
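
As an editorial illustration of the trade-off discussed above (not code
from the patch): a minimal user-space C sketch, assuming a flat topology
with a fixed number of CPUs per node. It contrasts updating a shared
per-node counter inside the increment path with summing per-CPU counts
only when the balancer asks for a node's load. All names and sizes here
(rq_sketch, nr_running_inc_eager, nr_running_inc_lazy, node_load_lazy,
NR_CPUS_SKETCH, CPUS_PER_NODE_SKETCH) are hypothetical and only loosely
echo the names used in this thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical sizes for the sketch; a real box would differ. */
#define NR_CPUS_SKETCH        4
#define CPUS_PER_NODE_SKETCH  2
#define NR_NODES_SKETCH       (NR_CPUS_SKETCH / CPUS_PER_NODE_SKETCH)

/* Per-CPU runqueue stand-in: only the running-task count is modelled. */
struct rq_sketch {
	unsigned long nr_running;
};

static struct rq_sketch rq_sketch[NR_CPUS_SKETCH];

/* Per-node counter shared by every CPU in the node (eager variant only). */
static atomic_ulong node_nr_running_sketch[NR_NODES_SKETCH];

/* Assumed mapping: consecutive CPUs belong to the same node. */
static int cpu_to_node_sketch(int cpu)
{
	return cpu / CPUS_PER_NODE_SKETCH;
}

/*
 * Eager variant: every increment also touches a node-wide atomic,
 * so the hot path writes to memory shared with other CPUs.
 */
static void nr_running_inc_eager(int cpu)
{
	rq_sketch[cpu].nr_running++;
	atomic_fetch_add(&node_nr_running_sketch[cpu_to_node_sketch(cpu)], 1);
}

/* Lazy variant: the hot path only touches the local runqueue. */
static void nr_running_inc_lazy(int cpu)
{
	rq_sketch[cpu].nr_running++;
}

/*
 * Lazy variant, balancer side: sum the node's per-CPU counts on demand.
 * No locking is modelled, so the result is only a snapshot.
 */
static unsigned long node_load_lazy(int node)
{
	unsigned long sum = 0;
	int first = node * CPUS_PER_NODE_SKETCH;

	assert(node >= 0 && node < NR_NODES_SKETCH);
	for (int cpu = first; cpu < first + CPUS_PER_NODE_SKETCH; cpu++)
		sum += rq_sketch[cpu].nr_running;
	return sum;
}
```

The eager variant makes the node load a single read but puts a shared
atomic into a path that is meant to be very fast; the lazy variant keeps
the increment local and moves the cost to a loop in the balance path.
Which one wins on a large box is something only benchmarks could show.
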

>
>>node_2_node is an odd sounding conversion too ;)
>>
>
>I just went off the topology already there, so I left it.
>
>
>>BTW. you should be CC'ing Ingo if you have any intention of scheduler
>>stuff getting into 2.6.
>>
>
>OK, thanks!
>
>

Good luck with it. Definitely some good ideas.



