linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Phil Auld <pauld@redhat.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <Morten.Rasmussen@arm.com>,
	Hillf Danton <hdanton@sina.com>, Parth Shah <parth@linux.ibm.com>,
	Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH v4 04/11] sched/fair: rework load_balance
Date: Fri, 8 Nov 2019 18:37:30 +0000	[thread overview]
Message-ID: <20191108183730.GU3016@techsingularity.net> (raw)
In-Reply-To: <20191108163501.GA26528@linaro.org>

On Fri, Nov 08, 2019 at 05:35:01PM +0100, Vincent Guittot wrote:
> > Fair enough, netperf hits the corner case where it does not work but
> > that is also true without your series.
> 
> I run mmtest/netperf test on my setup. It's a mix of small positive or
> negative differences (see below)
> 
> <SNIP>
>
> netperf-tcp
>                                5.3-rc2                5.3-rc2
>                                    tip               +rwk+fix
> Hmean     64         871.30 (   0.00%)      860.90 *  -1.19%*
> Hmean     128       1689.39 (   0.00%)     1679.31 *  -0.60%*
> Hmean     256       3199.59 (   0.00%)     3241.98 *   1.32%*
> Hmean     1024      9390.47 (   0.00%)     9268.47 *  -1.30%*
> Hmean     2048     13373.95 (   0.00%)    13395.61 *   0.16%*
> Hmean     3312     16701.30 (   0.00%)    17165.96 *   2.78%*
> Hmean     4096     15831.03 (   0.00%)    15544.66 *  -1.81%*
> Hmean     8192     19720.01 (   0.00%)    20188.60 *   2.38%*
> Hmean     16384    23925.90 (   0.00%)    23914.50 *  -0.05%*
> Stddev    64           7.38 (   0.00%)        4.23 (  42.67%)
> Stddev    128         11.62 (   0.00%)       10.13 (  12.85%)
> Stddev    256         34.33 (   0.00%)        7.94 (  76.88%)
> Stddev    1024        35.61 (   0.00%)      116.34 (-226.66%)
> Stddev    2048       285.30 (   0.00%)       80.50 (  71.78%)
> Stddev    3312       304.74 (   0.00%)      449.08 ( -47.36%)
> Stddev    4096       668.11 (   0.00%)      569.30 (  14.79%)
> Stddev    8192       733.23 (   0.00%)      944.38 ( -28.80%)
> Stddev    16384      553.03 (   0.00%)      299.44 (  45.86%)
> 
>                      5.3-rc2     5.3-rc2
>                          tip    +rwk+fix
> Duration User         138.05      140.95
> Duration System      1210.60     1208.45
> Duration Elapsed     1352.86     1352.90
> 

This roughly matches what I've seen. The interesting part to me for
netperf is the next section of the report that reports the locality of
numa hints. With netperf on a 2-socket machine, it's generally around
50% as the client/server are pulled apart. Because netperf is not
heavily memory bound, it doesn't have much impact on the overall
performance but it's good at catching the cross-node migrations.

> > 
> > > I agree that additional patches are probably needed to improve load
> > > balance at NUMA level and I expect that this rework will make it
> > > simpler to add.
> > > I just wanted to get the output of some real use cases before defining
> > > more numa level specific conditions. Some want to spread on there numa
> > > nodes but other want to keep everything together. The preferred node
> > > and fbq_classify_group was the only sensible metrics to me when he
> > > wrote this patchset but changes can be added if they make sense.
> > > 
> > 
> > That's fair. While it was possible to address the case before your
> > series, it was a hatchet job. If the changelog simply notes that some
> > special casing may still be required for SD_NUMA but it's outside the
> > scope of the series, then I'd be happy. At least there is a good chance
> > then if there is follow-up work that it won't be interpreted as an
> > attempt to reintroduce hacky heuristics.
> >
> 
> Would the additional comment make sense for you about work to be done
> for SD_NUMA ?
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0ad4b21..7e4cb65 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6960,11 +6960,34 @@ enum fbq_type { regular, remote, all };
>   * group. see update_sd_pick_busiest().
>   */
>  enum group_type {
> +	/*
> +	 * The group has spare capacity that can be used to process more work.
> +	 */
>  	group_has_spare = 0,
> +	/*
> +	 * The group is fully used and the tasks don't compete for more CPU
> +	 * cycles. Nevetheless, some tasks might wait before running.
> +	 */
>  	group_fully_busy,
> +	/*
> +	 * One task doesn't fit with CPU's capacity and must be migrated on a
> +	 * more powerful CPU.
> +	 */
>  	group_misfit_task,
> +	/*
> +	 * One local CPU with higher capacity is available and task should be
> +	 * migrated on it instead on current CPU.
> +	 */
>  	group_asym_packing,
> +	/*
> +	 * The tasks affinity prevents the scheduler to balance the load across
> +	 * the system.
> +	 */
>  	group_imbalanced,
> +	/*
> +	 * The CPU is overloaded and can't provide expected CPU cycles to all
> +	 * tasks.
> +	 */
>  	group_overloaded
>  };

Looks good.

>  
> @@ -8563,7 +8586,11 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>  
>  	/*
>  	 * Try to use spare capacity of local group without overloading it or
> -	 * emptying busiest
> +	 * emptying busiest.
> +	 * XXX Spreading tasks across numa nodes is not always the best policy
> +	 * and special cares should be taken for SD_NUMA domain level before
> +	 * spreading the tasks. For now, load_balance() fully relies on
> +	 * NUMA_BALANCING and fbq_classify_group/rq to overide the decision.
>  	 */
>  	if (local->group_type == group_has_spare) {
>  		if (busiest->group_type > group_fully_busy) {

Perfect. Any patch in that are can then update the comment and it
should be semi-obvious to the next reviewer that it's expected.

Thanks Vincent.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2019-11-08 18:37 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-18 13:26 [PATCH v4 00/10] sched/fair: rework the CFS load balance Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 01/11] sched/fair: clean up asym packing Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Clean " tip-bot2 for Vincent Guittot
2019-10-30 14:51   ` [PATCH v4 01/11] sched/fair: clean " Mel Gorman
2019-10-30 16:03     ` Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 02/11] sched/fair: rename sum_nr_running to sum_h_nr_running Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Rename sg_lb_stats::sum_nr_running " tip-bot2 for Vincent Guittot
2019-10-30 14:53   ` [PATCH v4 02/11] sched/fair: rename sum_nr_running " Mel Gorman
2019-10-18 13:26 ` [PATCH v4 03/11] sched/fair: remove meaningless imbalance calculation Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Remove " tip-bot2 for Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 04/11] sched/fair: rework load_balance Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Rework load_balance() tip-bot2 for Vincent Guittot
2019-10-30 15:45   ` [PATCH v4 04/11] sched/fair: rework load_balance Mel Gorman
2019-10-30 16:16     ` Valentin Schneider
2019-10-31  9:09     ` Vincent Guittot
2019-10-31 10:15       ` Mel Gorman
2019-10-31 11:13         ` Vincent Guittot
2019-10-31 11:40           ` Mel Gorman
2019-11-08 16:35             ` Vincent Guittot
2019-11-08 18:37               ` Mel Gorman [this message]
2019-11-12 10:58                 ` Vincent Guittot
2019-11-12 15:06                   ` Mel Gorman
2019-11-12 15:40                     ` Vincent Guittot
2019-11-12 17:45                       ` Mel Gorman
2019-11-18 13:50     ` Ingo Molnar
2019-11-18 13:57       ` Vincent Guittot
2019-11-18 14:51       ` Mel Gorman
2019-10-18 13:26 ` [PATCH v4 05/11] sched/fair: use rq->nr_running when balancing load Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Use " tip-bot2 for Vincent Guittot
2019-10-30 15:54   ` [PATCH v4 05/11] sched/fair: use " Mel Gorman
2019-10-18 13:26 ` [PATCH v4 06/11] sched/fair: use load instead of runnable load in load_balance Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Use load instead of runnable load in load_balance() tip-bot2 for Vincent Guittot
2019-10-30 15:58   ` [PATCH v4 06/11] sched/fair: use load instead of runnable load in load_balance Mel Gorman
2019-10-18 13:26 ` [PATCH v4 07/11] sched/fair: evenly spread tasks when not overloaded Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Spread out tasks evenly " tip-bot2 for Vincent Guittot
2019-10-30 16:03   ` [PATCH v4 07/11] sched/fair: evenly spread tasks " Mel Gorman
2019-10-18 13:26 ` [PATCH v4 08/11] sched/fair: use utilization to select misfit task Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Use " tip-bot2 for Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 09/11] sched/fair: use load instead of runnable load in wakeup path Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Use " tip-bot2 for Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 10/11] sched/fair: optimize find_idlest_group Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Optimize find_idlest_group() tip-bot2 for Vincent Guittot
2019-10-18 13:26 ` [PATCH v4 11/11] sched/fair: rework find_idlest_group Vincent Guittot
2019-10-21  9:12   ` [tip: sched/core] sched/fair: Rework find_idlest_group() tip-bot2 for Vincent Guittot
2019-10-22 16:46   ` [PATCH] sched/fair: fix rework of find_idlest_group() Vincent Guittot
2019-10-23  7:50     ` Chen, Rong A
2019-10-30 16:07     ` Mel Gorman
2019-11-18 17:42     ` [tip: sched/core] sched/fair: Fix " tip-bot2 for Vincent Guittot
2019-11-22 14:37     ` [PATCH] sched/fair: fix " Valentin Schneider
2019-11-25  9:16       ` Vincent Guittot
2019-11-25 11:03         ` Valentin Schneider
2019-11-20 11:58   ` [PATCH v4 11/11] sched/fair: rework find_idlest_group Qais Yousef
2019-11-20 13:21     ` Vincent Guittot
2019-11-20 16:53       ` Vincent Guittot
2019-11-20 17:34         ` Qais Yousef
2019-11-20 17:43           ` Vincent Guittot
2019-11-20 18:10             ` Qais Yousef
2019-11-20 18:20               ` Vincent Guittot
2019-11-20 18:27                 ` Qais Yousef
2019-11-20 19:28                   ` Vincent Guittot
2019-11-20 19:55                     ` Qais Yousef
2019-11-21 14:58                       ` Qais Yousef
2019-11-22 14:34   ` Valentin Schneider
2019-11-25  9:59     ` Vincent Guittot
2019-11-25 11:13       ` Valentin Schneider
2019-10-21  7:50 ` [PATCH v4 00/10] sched/fair: rework the CFS load balance Ingo Molnar
2019-10-21  8:44   ` Vincent Guittot
2019-10-21 12:56     ` Phil Auld
2019-10-24 12:38     ` Phil Auld
2019-10-24 13:46       ` Phil Auld
2019-10-24 14:59         ` Vincent Guittot
2019-10-25 13:33           ` Phil Auld
2019-10-28 13:03             ` Vincent Guittot
2019-10-30 14:39               ` Phil Auld
2019-10-30 16:24                 ` Dietmar Eggemann
2019-10-30 16:35                   ` Valentin Schneider
2019-10-30 17:19                     ` Phil Auld
2019-10-30 17:25                       ` Valentin Schneider
2019-10-30 17:29                         ` Phil Auld
2019-10-30 17:28                       ` Vincent Guittot
2019-10-30 17:44                         ` Phil Auld
2019-10-30 17:25                 ` Vincent Guittot
2019-10-31 13:57                   ` Phil Auld
2019-10-31 16:41                     ` Vincent Guittot
2019-10-30 16:24   ` Mel Gorman
2019-10-30 16:35     ` Vincent Guittot
2019-11-18 13:15     ` Ingo Molnar
2019-11-25 12:48 ` Valentin Schneider
2020-01-03 16:39   ` Valentin Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191108183730.GU3016@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=Morten.Rasmussen@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=hdanton@sina.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=parth@linux.ibm.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=quentin.perret@arm.com \
    --cc=riel@surriel.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).