From: Jirka Hladky <jhladky@redhat.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Phil Auld <pauld@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Douglas Shakshober <dshaks@redhat.com>,
	Waiman Long <longman@redhat.com>, Joe Mario <jmario@redhat.com>,
	Bill Gray <bgray@redhat.com>
Subject: Re: [PATCH 00/13] Reconcile NUMA balancing decisions with the load balancer v6
Date: Mon, 18 May 2020 16:52:52 +0200
Message-ID: <CAE4VaGDBAquxbBjuzzyaT1WPR95wiaiHsrEPs-eOP2W+r=fQFg@mail.gmail.com>
In-Reply-To: <20200508034741.13036-1-hdanton@sina.com>

Hi Hillf,

thanks a lot for your patch!

Compared to vanilla 5.7-rc4, we observe the following:
  * Single-tenant jobs improve by up to 15% for SPECjbb2005 and by
up to 100% for NAS in low thread mode. In other words, the patch fixes
all the problems we have reported in this thread.
  * Multi-tenant jobs degrade by up to 30% for SPECjbb2005.

So while the patch fixes the problems with single-tenant jobs and with
performance at low system load, it breaks multi-tenant jobs.

We have also compared it against a kernel with adjust_numa_imbalance
disabled [1]. Both kernels perform at the same level for single-tenant
jobs, but the proposed patch is bad for multi-tenant mode. The
kernel with adjust_numa_imbalance disabled is the clear winner here.

We would be very interested in what others think about disabling the
adjust_numa_imbalance function. The patch is below. It would be great
to collect performance results for different scenarios to make sure
the results are objective.

Thanks a lot!
Jirka

[1] Patch to test kernel with adjust_numa_imbalance disabled:
===============================================
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 02f323b..8c43d29 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8907,14 +8907,6 @@ static inline long adjust_numa_imbalance(int imbalance, int src_nr_running)
{
       unsigned int imbalance_min;

-       /*
-        * Allow a small imbalance based on a simple pair of communicating
-        * tasks that remain local when the source domain is almost idle.
-        */
-       imbalance_min = 2;
-       if (src_nr_running <= imbalance_min)
-               return 0;
-
       return imbalance;
}
===============================================
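
To make the effect of patch [1] easier to see, here is a minimal
userspace sketch of the function before and after the change. It is only
a model, not kernel code: the names adjust_numa_imbalance_vanilla,
adjust_numa_imbalance_disabled and print_comparison are made up for this
illustration, and imbalance_min is a plain int here (the kernel declares
it unsigned int).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Model of the vanilla function, based on the lines removed by patch [1]:
 * any imbalance is ignored while the source domain runs at most
 * imbalance_min tasks.
 */
static long adjust_numa_imbalance_vanilla(int imbalance, int src_nr_running)
{
	const int imbalance_min = 2;	/* hardcoded threshold from the patch context */

	if (src_nr_running <= imbalance_min)
		return 0;

	return imbalance;
}

/* Model of the function with patch [1] applied: the imbalance is passed through. */
static long adjust_numa_imbalance_disabled(int imbalance, int src_nr_running)
{
	(void)src_nr_running;	/* no longer consulted */
	return imbalance;
}

/* Hypothetical helper: print both variants for a few source task counts. */
static void print_comparison(int imbalance)
{
	int src_nr_running;

	/* The two variants can only differ when there is an imbalance to report. */
	assert(imbalance >= 0);

	printf("src_nr_running  vanilla  disabled\n");
	for (src_nr_running = 1; src_nr_running <= 4; src_nr_running++)
		printf("%14d  %7ld  %8ld\n", src_nr_running,
		       adjust_numa_imbalance_vanilla(imbalance, src_nr_running),
		       adjust_numa_imbalance_disabled(imbalance, src_nr_running));
}
```

Calling print_comparison() with a non-zero imbalance shows that the two
variants differ only for src_nr_running of 1 or 2, which is the low-load
case where the vanilla function reports no imbalance.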





On Fri, May 8, 2020 at 5:47 AM Hillf Danton <hdanton@sina.com> wrote:
>
>
> On Thu, 7 May 2020 13:49:34 Phil Auld wrote:
> >
> > On Thu, May 07, 2020 at 06:29:44PM +0200 Jirka Hladky wrote:
> > > Hi Mel,
> > >
> > > we are not targeting just OMP applications. We see the performance
> > > degradation also for other workloads, like SPECjbb2005 and
> > > SPECjvm2008. Even worse, it also affects a higher number of threads.
> > > For example, comparing 5.7.0-0.rc2 against 5.6 kernel, on 4 NUMA
> > > server with 2x AMD 7351 CPU, we see performance degradation 22% for 32
> > > threads (the system has 64 CPUs in total). We observe this degradation
> > > only when we run a single SPECjbb binary. When running 4 SPECjbb
> > > binaries in parallel, there is no change in performance between 5.6
> > > and 5.7.
> > >
> > > That's why we are asking for the kernel tunable, which we would add to
> > > the tuned profile. We don't expect users to change this frequently but
> > > rather to set the performance profile once based on the purpose of the
> > > server.
> > >
> > > If you could prepare a patch for us, we would be more than happy to
> > > test it extensively. Based on the results, we can then evaluate if
> > > it's the way to go. Thoughts?
> > >
> >
> > I'm happy to spin up a patch once I'm sure what exactly the tuning would
> > affect. At an initial glance I'm thinking it would be the imbalance_min
> > which is currently hardcoded to 2. But there may be something else...
>
> hrm... try to restore the old behavior by skipping task migrate in favor
> of the local node if there is no imbalance.
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1928,18 +1928,16 @@ static void task_numa_find_cpu(struct ta
>                 src_running = env->src_stats.nr_running - 1;
>                 dst_running = env->dst_stats.nr_running + 1;
>                 imbalance = max(0, dst_running - src_running);
> -               imbalance = adjust_numa_imbalance(imbalance, src_running);
> +               imbalance = adjust_numa_imbalance(imbalance, src_running +1);
>
> -               /* Use idle CPU if there is no imbalance */
> +               /* No task migrate without imbalance */
>                 if (!imbalance) {
> -                       maymove = true;
> -                       if (env->dst_stats.idle_cpu >= 0) {
> -                               env->dst_cpu = env->dst_stats.idle_cpu;
> -                               task_numa_assign(env, NULL, 0);
> -                               return;
> -                       }
> +                       env->best_cpu = -1;
> +                       return;
>                 }
> -       } else {
> +       }
> +
> +       do {
>                 long src_load, dst_load, load;
>                 /*
>                  * If the improvement from just moving env->p direction is better
> @@ -1949,7 +1947,7 @@ static void task_numa_find_cpu(struct ta
>                 dst_load = env->dst_stats.load + load;
>                 src_load = env->src_stats.load - load;
>                 maymove = !load_too_imbalanced(src_load, dst_load, env);
> -       }
> +       } while (0);
>
>         for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
>                 /* Skip this CPU if the source task cannot migrate */
>
>
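
For readers following the quoted diff, the imbalance arithmetic in
task_numa_find_cpu can be modelled in userspace as below. This is a
sketch under assumptions: the function names and the use_src_plus_one
flag are invented for illustration, only the four quoted lines are
modelled, and the rest of the function (idle CPU selection, load
comparison) is left out.

```c
#include <stdio.h>

/* Threshold check as shown in the removed lines of patch [1]. */
static long adjust_numa_imbalance(int imbalance, int src_nr_running)
{
	const int imbalance_min = 2;	/* plain int in this model */

	if (src_nr_running <= imbalance_min)
		return 0;

	return imbalance;
}

/*
 * Imbalance that would exist after moving one task from the source node
 * to the destination node, following the quoted hunk.
 *
 * use_src_plus_one == 0: vanilla call, passes src_running.
 * use_src_plus_one != 0: call from the quoted patch, passes src_running + 1.
 */
static long imbalance_after_move(int src_nr_running, int dst_nr_running,
				 int use_src_plus_one)
{
	int src_running = src_nr_running - 1;	/* task leaves the source */
	int dst_running = dst_nr_running + 1;	/* and lands on the destination */
	int imbalance = dst_running - src_running;

	if (imbalance < 0)
		imbalance = 0;			/* max(0, dst_running - src_running) */

	if (use_src_plus_one)
		return adjust_numa_imbalance(imbalance, src_running + 1);

	return adjust_numa_imbalance(imbalance, src_running);
}

/* Hypothetical helper: show both variants for one pair of task counts. */
static void print_move(int src_nr_running, int dst_nr_running)
{
	printf("src=%d dst=%d vanilla=%ld patched=%ld\n",
	       src_nr_running, dst_nr_running,
	       imbalance_after_move(src_nr_running, dst_nr_running, 0),
	       imbalance_after_move(src_nr_running, dst_nr_running, 1));
}
```

In this model the two calls disagree only when the source node runs
exactly three tasks before the move: src_running is then 2, which the
vanilla call still treats as "almost idle", while the patched call
passes 3 and keeps the computed imbalance.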


--
-Jirka

