From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	Barry Song <song.bao.hua@hisilicon.com>,
	Mike Galbraith <efault@gmx.de>,
	Gautham Shenoy <gautham.shenoy@amd.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCs
Date: Fri, 4 Feb 2022 12:36:54 +0530
Message-ID: <20220204070654.GF618915@linux.vnet.ibm.com>
In-Reply-To: <20220203144652.12540-3-mgorman@techsingularity.net>

* Mel Gorman <mgorman@techsingularity.net> [2022-02-03 14:46:52]:

> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index d201a7052a29..e6cd55951304 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2242,6 +2242,59 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>  		}
>  	}
>
> +	/*
> +	 * Calculate an allowed NUMA imbalance such that LLCs do not get
> +	 * imbalanced.
> +	 */

We seem to be adding this hunk before the sched domains may get degenerated.
I wonder whether we really want to do it before degeneration.

Let's say we have 3 sched domains and we calculate sd->imb_numa_nr for
all 3 of them, and then the middle sched domain gets degenerated.
Would the sd->imb_numa_nr values still be relevant?

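
To make the concern concrete, here is a minimal userspace sketch, not kernel
code. struct sd_level, llc_boundary_imb() and drop_level() are hypothetical
names, the flag value is illustrative, and the span weights are made up. It
only reproduces the LLC-boundary branch of the quoted hunk and assumes that
degeneration simply removes one level from the chain.

```c
#include <assert.h>
#include <stddef.h>

#define SD_SHARE_PKG_RESOURCES 0x1	/* illustrative value, not the kernel's */

/* Hypothetical, pared-down stand-in for struct sched_domain. */
struct sd_level {
	unsigned int span_weight;
	unsigned int flags;
};

/*
 * Return the imbalance the quoted hunk would compute at the LLC boundary:
 * the first level without SD_SHARE_PKG_RESOURCES whose child has it.
 * levels[0] is the lowest domain; returns 0 if no boundary exists.
 */
unsigned int llc_boundary_imb(const struct sd_level *levels, size_t n)
{
	assert(levels && n > 0);

	for (size_t k = 1; k < n; k++) {
		const struct sd_level *sd = &levels[k];
		const struct sd_level *child = &levels[k - 1];

		if (!(sd->flags & SD_SHARE_PKG_RESOURCES) &&
		    (child->flags & SD_SHARE_PKG_RESOURCES)) {
			unsigned int nr_llcs = sd->span_weight / child->span_weight;

			return nr_llcs == 1 ? sd->span_weight >> 2 : nr_llcs;
		}
	}
	return 0;
}

/* Copy levels[] into out[] without the entry at index drop; returns the new count. */
size_t drop_level(const struct sd_level *levels, size_t n, size_t drop,
		  struct sd_level *out)
{
	size_t m = 0;

	for (size_t k = 0; k < n; k++)
		if (k != drop)
			out[m++] = levels[k];
	return m;
}
```

Take a made-up chain of SMT(2) -> MC(16) -> DIE(16) -> NUMA(32), with only SMT
and MC sharing package resources. The boundary is found at DIE and the result
is 16 >> 2 = 4. If the DIE level is dropped first, the boundary moves to NUMA
and the result is 32 / 16 = 2. In this toy model the value therefore depends
on whether it is computed before or after the middle level goes away.
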
> +	for_each_cpu(i, cpu_map) {
> +		unsigned int imb = 0;
> +		unsigned int imb_span = 1;
> +
> +		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
> +			struct sched_domain *child = sd->child;
> +
> +			if (!(sd->flags & SD_SHARE_PKG_RESOURCES) && child &&
> +			    (child->flags & SD_SHARE_PKG_RESOURCES)) {
> +				struct sched_domain *top, *top_p;
> +				unsigned int nr_llcs;
> +
> +				/*
> +				 * For a single LLC per node, allow an
> +				 * imbalance up to 25% of the node. This is an
> +				 * arbitrary cutoff based on SMT-2 to balance
> +				 * between memory bandwidth and avoiding
> +				 * premature sharing of HT resources and SMT-4
> +				 * or SMT-8 *may* benefit from a different
> +				 * cutoff.
> +				 *
> +				 * For multiple LLCs, allow an imbalance
> +				 * until multiple tasks would share an LLC
> +				 * on one node while LLCs on another node
> +				 * remain idle.
> +				 */
> +				nr_llcs = sd->span_weight / child->span_weight;
> +				if (nr_llcs == 1)
> +					imb = sd->span_weight >> 2;
> +				else
> +					imb = nr_llcs;
> +				sd->imb_numa_nr = imb;
> +
> +				/* Set span based on the first NUMA domain. */
> +				top = sd;
> +				top_p = top->parent;
> +				while (top_p && !(top_p->flags & SD_NUMA)) {
> +					top = top->parent;
> +					top_p = top->parent;
> +				}
> +				imb_span = top_p ? top_p->span_weight : sd->span_weight;

I am getting confused by imb_span.

Let's say we have a topology of SMT -> MC -> DIE -> NUMA -> NUMA, with the
SMT and MC domains having the SD_SHARE_PKG_RESOURCES flag set.
We come here only for the DIE domain.

The imb_span set here is used for both of the subsequent sched domains,
which will most likely be NUMA domains. Right?

> +			} else {
> +				int factor = max(1U, (sd->span_weight / imb_span));
> +
> +				sd->imb_numa_nr = imb * factor;

For SMT (or any sched domain below the LLC), factor would be
sd->span_weight, but imb and hence sd->imb_numa_nr would be 0.
For NUMA (or any sched domain just above DIE), factor would be 1 and
sd->imb_numa_nr would be nr_llcs.
For subsequent sched domains, sd->imb_numa_nr would be some multiple of
nr_llcs. Right?

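
The walk-through above can be checked with a small userspace model of the
quoted loop. This is only a sketch: struct sd_model and calc_imb_numa_nr()
are hypothetical names, the flag values are illustrative rather than the
kernel's, and the array order stands in for the child/parent pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; the kernel's definitions differ. */
#define SD_SHARE_PKG_RESOURCES 0x1
#define SD_NUMA                0x2

/* Hypothetical, pared-down stand-in for struct sched_domain. */
struct sd_model {
	unsigned int span_weight;
	unsigned int flags;
	unsigned int imb_numa_nr;
};

/*
 * Mirror the quoted loop for one CPU. doms[0] is the lowest domain, so
 * doms[k - 1] plays the role of sd->child and doms[k + 1] of sd->parent.
 */
void calc_imb_numa_nr(struct sd_model *doms, size_t n)
{
	unsigned int imb = 0;
	unsigned int imb_span = 1;

	assert(doms && n > 0);

	for (size_t k = 0; k < n; k++) {
		struct sd_model *sd = &doms[k];
		struct sd_model *child = k ? &doms[k - 1] : NULL;

		if (!(sd->flags & SD_SHARE_PKG_RESOURCES) && child &&
		    (child->flags & SD_SHARE_PKG_RESOURCES)) {
			unsigned int nr_llcs = sd->span_weight / child->span_weight;
			size_t top = k;

			imb = (nr_llcs == 1) ? sd->span_weight >> 2 : nr_llcs;
			sd->imb_numa_nr = imb;

			/* Walk up until the parent is the first NUMA domain. */
			while (top + 1 < n && !(doms[top + 1].flags & SD_NUMA))
				top++;
			imb_span = (top + 1 < n) ? doms[top + 1].span_weight
						 : sd->span_weight;
		} else {
			unsigned int factor = sd->span_weight / imb_span;

			if (factor < 1)
				factor = 1;
			sd->imb_numa_nr = imb * factor;
		}
	}
}
```

With made-up weights SMT=2, MC=8, DIE=32, NUMA=64, NUMA=128, where SMT and MC
carry SD_SHARE_PKG_RESOURCES and the two top levels carry SD_NUMA, the model
leaves imb_numa_nr at 0 for SMT and MC, 4 for DIE and the first NUMA domain
(factor 1), and 8 for the second NUMA domain (factor 2).
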
> +			}
> +		}
> +	}
> +
>  	/* Calculate CPU capacity for physical packages and nodes */
>  	for (i = nr_cpumask_bits-1; i >= 0; i--) {
>  		if (!cpumask_test_cpu(i, cpu_map))
> --
> 2.31.1
>
--
Thanks and Regards
Srikar Dronamraju