From: Mel Gorman <mgorman@techsingularity.net>
To: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
Cc: Mel Gorman <mgorman@suse.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"juri.lelli@redhat.com" <juri.lelli@redhat.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
	"bsegall@google.com" <bsegall@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Phil Auld <pauld@redhat.com>, Hillf Danton <hdanton@sina.com>,
	Ingo Molnar <mingo@kernel.org>, Linuxarm <linuxarm@huawei.com>,
	"Liguozhu (Kenneth)" <liguozhu@hisilicon.com>
Subject: Re: [RFC] sched/numa: don't move tasks to idle numa nodes while src node has very light load?
Date: Mon, 7 Sep 2020 13:42:16 +0100
Message-ID: <20200907124216.GE3179@techsingularity.net>
In-Reply-To: <e7b7462375de4175a83ece3b60bab899@hisilicon.com>

On Mon, Sep 07, 2020 at 12:00:10PM +0000, Song Bao Hua (Barry Song) wrote:
> Hi All,
> Suppose we have a NUMA system with 4 nodes, 24 CPUs in each node, and all 96 CPUs are idle.
> We then start a process with 4 threads on this totally idle system.
> Any one of the four NUMA nodes has enough capacity to run the 4 threads and would still have 20 idle CPUs afterwards.
> But right now the existing CFS load-balancing code will spread the 4 threads across multiple nodes.
> This results in two negative side effects:
> 1. more NUMA nodes are woken up when they could be saving power at the lowest frequency or in a halt state
> 2. cache-coherency overhead between NUMA nodes
> 
> A proof-of-concept patch I made to "fix" this issue to some extent is as follows:
> 

This is similar in concept to an earlier patch that made a comparable
change, except in adjust_numa_imbalance(). It worked well for light loads
like simple communicating pairs but fell apart for some HPC workloads
when memory bandwidth requirements increased. Ultimately it was dropped
until the NUMA/CPU load balancing was reconciled, so it may be worth
revisiting. At the time, it became really problematic once one node was
roughly 25% CPU-utilised on a 2-socket machine with hyper-threading
enabled. The patch may still work out, but it would need wider testing.
Within mmtests, running the D-class NAS workloads on a 2-socket machine
while varying the number of parallel tasks/processes should be enough to
determine whether the patch is free of side effects on one machine. It
gets problematic across different machine sizes because the point at
which memory bandwidth saturates varies: group_weight/4 might be fine as
a cut-off on one machine and a problem on a larger machine with more
cores -- I hit that particular problem when a 2-socket machine with 48
logical CPUs was fine but a different machine with 80 logical CPUs
regressed.
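
To make the cut-off concrete, below is a minimal hypothetical sketch
(not the dropped patch, and not current mainline code; the helper name
and parameters are assumptions), loosely following the shape of
adjust_numa_imbalance() in kernel/sched/fair.c:

/*
 * Hypothetical sketch: tolerate a NUMA imbalance while the busier node
 * is running fewer tasks than a quarter of its CPUs, i.e. keep a light
 * load on one node instead of spreading it.  The group_weight/4 divisor
 * is exactly the machine-dependent threshold discussed above.
 */
static long numa_imbalance_cutoff(long imbalance, unsigned int nr_running,
				  unsigned int group_weight)
{
	/* Below ~25% of the node's CPUs busy, keep tasks local. */
	if (nr_running < group_weight / 4)
		return 0;

	/* Otherwise report the imbalance and let the balancer act. */
	return imbalance;
}

Whether /4 (or any other divisor) is safe depends on where memory
bandwidth saturates on a particular machine, which is why broad testing
across machine sizes matters.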

I'm not saying the patch is wrong, just that patches in this area in
general (from everyone, not just you) need fairly broad testing.

-- 
Mel Gorman
SUSE Labs
