From: Vincent Guittot <vincent.guittot@linaro.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Phil Auld <pauld@redhat.com>, Parth Shah <parth@linux.ibm.com>,
	Valentin Schneider <valentin.schneider@arm.com>
Subject: Re: [PATCH 0/4] remove runnable_load_avg and improve group_classify
Date: Wed, 12 Feb 2020 09:16:53 +0100
Message-ID: <CAKfTPtA4h4FQoAEDjVeT4nWJCs5Lk5=9w4VnKq2wgpgJui7Y8w@mail.gmail.com>
In-Reply-To: <20200211210439.GS3420@suse.de>

On Tue, 11 Feb 2020 at 22:04, Mel Gorman <mgorman@suse.de> wrote:
>
> On Tue, Feb 11, 2020 at 06:46:47PM +0100, Vincent Guittot wrote:
> > NUMA load balancing is the last remaining piece of code that uses the
> > runnable_load_avg of PELT to balance tasks between nodes. The normal
> > load_balance has replaced it with a better description of the current state
> > of the group of CPUs. The same policy can be applied to NUMA
> > balancing.
> >
> > Once unused, runnable_load_avg can be replaced by a simpler runnable_avg
> > signal that tracks the waiting time of tasks on the rq. Currently, the state
> > of a group of CPUs is defined by the number of running tasks and the
> > level of utilization of the rq. But the utilization can be temporarily low
> > after the migration of a task while the rq is still overloaded with
> > tasks. In such a case, where tasks were competing for the rq, the
> > runnable_avg will stay high after the migration.
> >
> > Some hackbench results:
> >
> > - small arm64 dual quad cores system
> > hackbench -l (2560/#grp) -g #grp
> >
> > grp    tip/sched/core         +patchset              improvement
> > 1       1,327(+/-10,06 %)     1,247(+/-5,45 %)       5,97 %
> > 4       1,250(+/- 2,55 %)     1,207(+/-2,12 %)       3,42 %
> > 8       1,189(+/- 1,47 %)     1,179(+/-1,93 %)       0,90 %
> > 16      1,221(+/- 3,25 %)     1,219(+/-2,44 %)       0,16 %
> >
> > - large arm64 2 nodes / 224 cores system
> > hackbench -l (256000/#grp) -g #grp
> >
> > grp    tip/sched/core         +patchset              improvement
> > 1      14,197(+/- 2,73 %)     13,917(+/- 2,19 %)     1,98 %
> > 4       6,817(+/- 1,27 %)      6,523(+/-11,96 %)     4,31 %
> > 16      2,930(+/- 1,07 %)      2,911(+/- 1,08 %)     0,66 %
> > 32      2,735(+/- 1,71 %)      2,725(+/- 1,53 %)     0,37 %
> > 64      2,702(+/- 0,32 %)      2,717(+/- 1,07 %)    -0,53 %
> > 128     3,533(+/-14,66 %)     3,123(+/-12,47 %)     11,59 %
> > 256     3,918(+/-19,93 %)     3,390(+/- 5,93 %)     13,47 %
> >
>
> I haven't reviewed this yet because by coincidence I'm finalising a
> series that tries to reconcile the load balancer with the NUMA balancer

That's interesting!
This series has been pending for a while and I have finally been able
to send it for review.

> and it has been very tricky to get right.  One aspect though is that

I have been quite conservative in the policy, as my main goal was not
to change the whole NUMA policy but mainly to remove the last user of
runnable_load_avg, and I don't expect much change in behavior.

> hackbench is generally not long-running enough to detect any performance
> regressions in NUMA balancing. At least I've never observed it to be a
> good evaluation for NUMA balancing.
>
> > Without the patchset, there is a significant number of times when a CPU has
> > spare capacity with more than 1 running task. Although this is a valid
> > case, this is not a state that should often happen when 160 tasks are
> > competing on 8 cores, as in this test. The patchset fixes the situation
> > by taking into account the runnable_avg, which stays high after the
> > migration of a task to another CPU.
> >
>
> FWIW, during the rewrite, I ended up moving away from runnable_load to
> get the load balancer and NUMA balancer to use the same metrics.
>
> --
> Mel Gorman
> SUSE Labs
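
To illustrate the point in the quoted cover letter (utilization can look
low right after a migration while runnable_avg stays high), here is a toy
model in Python. It is only a sketch under stated assumptions and not
kernel code: the names pelt_update() and simulate() are hypothetical,
signals are normalized to [0, 1] instead of the kernel's fixed-point
scale, tasks are scheduled strictly round-robin, and the rq signal is
taken to be the plain sum of the signals of the tasks still attached.
The only property borrowed from PELT is the geometric decay with a
half-life of 32 periods.

```python
# Toy model only: NOT kernel code. All names here are hypothetical.
# Assumption: PELT-like geometric decay with a half-life of 32 periods.
DECAY = 0.5 ** (1 / 32)


def pelt_update(avg, active):
    """Advance a normalized [0, 1] average by one period."""
    return avg * DECAY + (1.0 - DECAY) * (1.0 if active else 0.0)


def simulate(n_tasks, periods):
    """n_tasks always-runnable tasks share one CPU, strict round-robin."""
    util = [0.0] * n_tasks      # time spent actually running
    runnable = [0.0] * n_tasks  # time spent running or waiting on the rq
    for t in range(periods):
        running = t % n_tasks
        for i in range(n_tasks):
            util[i] = pelt_update(util[i], i == running)
            runnable[i] = pelt_update(runnable[i], True)
    return util, runnable


# Three tasks compete for one CPU until the averages have converged.
util, runnable = simulate(3, 960)

# Migrate task 0 away: in this toy model the rq keeps only the
# contributions of the two remaining tasks.
rq_util_after = sum(util[1:])
rq_runnable_after = sum(runnable[1:])

print(f"rq util after migration:     {rq_util_after:.2f}")
print(f"rq runnable after migration: {rq_runnable_after:.2f}")
```

In this model the rq utilization ends up below 1.0 after the migration,
which on its own suggests spare capacity, while the summed runnable
signal stays close to 2.0, i.e. two tasks are still competing for a
single CPU.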


Thread overview: 32+ messages
2020-02-11 17:46 [PATCH 0/4] remove runnable_load_avg and improve group_classify Vincent Guittot
2020-02-11 17:46 ` [PATCH 1/4] sched/fair: reorder enqueue/dequeue_task_fair path Vincent Guittot
2020-02-12 13:20   ` Mel Gorman
2020-02-12 14:47     ` Vincent Guittot
2020-02-12 16:11       ` Mel Gorman
2020-02-11 17:46 ` [RFC 2/4] sched/numa: replace runnable_load_avg by load_avg Vincent Guittot
2020-02-12 13:37   ` Mel Gorman
2020-02-12 15:03     ` Vincent Guittot
2020-02-12 16:04       ` Mel Gorman
2020-02-12 19:49     ` Mel Gorman
2020-02-12 21:29       ` Mel Gorman
2020-02-13  8:05       ` Vincent Guittot
2020-02-13  9:24         ` Mel Gorman
     [not found]         ` <20200213131658.9600-1-hdanton@sina.com>
2020-02-13 13:46           ` Mel Gorman
2020-02-13 15:00             ` Phil Auld
2020-02-13 15:14               ` Mel Gorman
2020-02-13 16:11                 ` Vincent Guittot
2020-02-13 16:34                   ` Mel Gorman
2020-02-13 16:38                     ` Vincent Guittot
2020-02-13 17:02                       ` Mel Gorman
2020-02-13 17:15                         ` Vincent Guittot
2020-02-11 17:46 ` [RFC 3/4] sched/fair: replace runnable load average by runnable average Vincent Guittot
2020-02-12 14:30   ` Mel Gorman
2020-02-14  7:42     ` Vincent Guittot
2020-02-13 17:36   ` Peter Zijlstra
2020-02-14  7:43     ` Vincent Guittot
2020-02-11 17:46 ` [RFC 4/4] sched/fair: Take into runnable_avg to classify group Vincent Guittot
2020-02-13 18:32   ` Valentin Schneider
2020-02-13 18:37     ` Valentin Schneider
2020-02-14  7:48       ` Vincent Guittot
2020-02-11 21:04 ` [PATCH 0/4] remove runnable_load_avg and improve group_classify Mel Gorman
2020-02-12  8:16   ` Vincent Guittot [this message]
