linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Morten Rasmussen <morten.rasmussen@arm.com>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"riel@redhat.com" <riel@redhat.com>,
	"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"srikar@linux.vnet.ibm.com" <srikar@linux.vnet.ibm.com>,
	"pjt@google.com" <pjt@google.com>,
	"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"efault@gmx.de" <efault@gmx.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
	"svaidy@linux.vnet.ibm.com" <svaidy@linux.vnet.ibm.com>,
	"tim.c.chen@linux.intel.com" <tim.c.chen@linux.intel.com>,
	"jason.low2@hp.com" <jason.low2@hp.com>
Subject: Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs
Date: Mon, 30 Mar 2015 12:30:26 +0100	[thread overview]
Message-ID: <20150330113026.GS18994@e105550-lin.cambridge.arm.com> (raw)
In-Reply-To: <5518FA9B.6040508@linux.vnet.ibm.com>

On Mon, Mar 30, 2015 at 08:26:19AM +0100, Preeti U Murthy wrote:
> Hi Morten,
> 
> On 03/27/2015 11:26 PM, Morten Rasmussen wrote:
> > 
> > I agree that the current behaviour is undesirable and should be fixed,
> > but IMHO waking up all idle cpus can not be justified. It is only one
> > additional cpu though with your patch so it isn't quite that bad.
> > 
> > I agree that it is hard to predict how many additional cpus you need,
> > but I don't think you necessarily need that information as long as you
> > start by filling up the cpu that was kicked to do the
> > nohz_idle_balance() first.
> > 
> > You would also solve your problem if you removed the ability for the cpu
> > to bail out after balancing itself and force it to finish the job. It
> > would mean harming tasks that where pulled to the balancing cpu as they
> > would have to wait being scheduling until the nohz_idle_balance() has
> > completed. It could be a price worth paying.
> 
> But how would this prevent waking up idle CPUs ? You still end up waking
> up all idle CPUs, wouldn't you?

That depends on the scenario. In your example from the changelog you
would. You have enough work for all the nohz-idle cpus and you will keep
iterating through all of them and pull work on their behalf and hence
wake them up. But in a scenario where you don't have enough work for all
nohz-idle cpus you are guaranteed that the balancer cpu has taken its
share and doesn't go back to sleep immediately after finish
nohz_idle_balance(). So all cpus woken up will have a task to run
including the balancer cpu. 

> 
> > 
> > An alternative could be to let the balancing cpu balance itself first
> > and bail out as it currently does, but let it kick the next nohz_idle
> > cpu to continue the job if it thinks there is more work to be done. So
> > you would get a chain of kicks that would stop when there nothing
> > more to do be done. It isn't quite as fast as your solution as it would
> 
> I am afraid there is more to this. If a given CPU is unable to pull
> tasks, it could mean that it is an unworthy destination CPU. But it does
> not mean that the other idle CPUs are unworthy of balancing too.
> 
> So if the ILB CPU stops waking up idle CPUs when it has nothing to pull,
> we will end up hurting load balancing. Take for example the scenario
> described in the changelog. The idle CPUs within a numa node may find
> load balanced within themselves and hence refrain from pulling any load.
> If these ILB CPUs stop nohz idle load balancing at this point, the load
> will never get spread across nodes.
> 
> If on the other hand, if we keep kicking idle CPUs to carry on idle load
> balancing, the wakeup scenario will be no better than it is with this patch.

By more work to be done I didn't mean stopping when the balancer cpu
gives up, I meant stopping the kick chain when there nothing more to be
balanced/pulled (or reasonably close). For example use something like
the nohz_kick_needed() checks on the source cpu/group and stopping if
all cpus only have one runnable task. At least try to stop waking an
extra cpu when there is clearly no point in doing so.

> > require an IPI plus wakeup for each cpu to continue the work. But it
> > should be much faster than the current code I think.
> > 
> > IMHO it makes more sense to stay with the current scheme of ensuring
> > that the kicked cpu is actually used before waking up more cpus and
> > instead improve how additional cpus are kicked if they are needed.
> 
> It looks more sensible to do this in parallel. The scenario on POWER is
> that tasks don't spread out across nodes until 10s of fork. This is
> unforgivable and we cannot afford the code to be the way it is today.

You propose having multiple balancing cpus running in parallel?

I fully agree that the nohz-balancing behaviour should be improved. It
could use a few changes to improve energy-awareness as well. IMHO,
taking a double hit every time we need to wake up an additional cpu is
inefficient and goes against all the effort put into reducing wake-ups
is essential for saving energy.

One thing that would help reducing energy consumption and that vendors
carry out-of-tree is improving find_new_ilb() to pick the cheapest cpu
to be kicked for nohz-idle balancing. However, this improvement is
pointless if we are going to wake up an additional cpu to receive the
task(s) that needs to be balanced.

I think a solution where we at least try to avoid waking up additional
cpus and vastly improves your 10s latency is possible.

Morten

  reply	other threads:[~2015-03-30 11:29 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-26 13:02 [PATCH V2] sched: Improve load balancing in the presence of idle CPUs Preeti U Murthy
2015-03-26 17:03 ` Jason Low
2015-03-27  2:12 ` Wanpeng Li
2015-03-27  4:33   ` Preeti U Murthy
2015-03-27  4:23     ` Wanpeng Li
2015-03-27  5:01     ` Jason Low
2015-03-27  5:07   ` Jason Low
2015-03-27  5:39     ` Srikar Dronamraju
2015-03-27  7:00       ` Wanpeng Li
2015-03-27  6:43     ` Wanpeng Li
2015-03-27 16:23     ` Preeti U Murthy
2015-03-27 11:43 ` [tip:sched/core] " tip-bot for Preeti U Murthy
2015-03-27 13:03 ` [PATCH V2] " Srikar Dronamraju
2015-03-27 14:38 ` Morten Rasmussen
2015-03-27 16:46   ` Preeti U Murthy
2015-03-27 17:56     ` Morten Rasmussen
2015-03-30  7:26       ` Preeti U Murthy
2015-03-30 11:30         ` Morten Rasmussen [this message]
2015-03-30 11:06       ` Peter Zijlstra
2015-03-30 12:03         ` Morten Rasmussen
2015-03-30 12:24           ` Peter Zijlstra
2015-03-30 12:54             ` Peter Zijlstra
2015-03-30 13:29             ` Vincent Guittot
2015-03-30 14:01               ` Peter Zijlstra
2015-03-30 15:27               ` Morten Rasmussen
2015-03-31  8:58           ` Preeti U Murthy
2015-03-31 17:30             ` Jason Low
2015-04-01  6:28               ` Preeti U Murthy
2015-04-01 13:03                 ` Morten Rasmussen
2015-04-02  0:55                   ` Jason Low
2015-04-02  3:22                   ` Preeti U Murthy
2015-03-30 13:45 ` Vincent Guittot
2015-03-31  8:06   ` Preeti U Murthy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150330113026.GS18994@e105550-lin.cambridge.arm.com \
    --to=morten.rasmussen@arm.com \
    --cc=benh@kernel.crashing.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=efault@gmx.de \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jason.low2@hp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=svaidy@linux.vnet.ibm.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).