From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
LKML <linux-kernel@vger.kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Anton Blanchard <anton@ozlabs.org>,
"Oliver O'Halloran" <oohall@gmail.com>,
Nathan Lynch <nathanl@linux.ibm.com>,
Michael Neuling <mikey@neuling.org>,
Gautham R Shenoy <ego@linux.vnet.ibm.com>,
Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Jordan Niethe <jniethe5@gmail.com>,
Morten Rasmussen <morten.rasmussen@arm.com>
Subject: Re: [PATCH v4 09/10] Powerpc/smp: Create coregroup domain
Date: Mon, 3 Aug 2020 11:31:02 +0530 [thread overview]
Message-ID: <20200803060102.GD24375@linux.vnet.ibm.com> (raw)
In-Reply-To: <jhjlfj0ijeg.mognet@arm.com>
> > Also on current P9 itself, two neighbouring core-pairs form a quad.
> > Cache latency within a quad is better than latency to a distant core-pair,
> > and cache latency within a core-pair is way better than latency within a quad.
> > So if we have only 4 threads running on a DIE, all of them accessing the same
> > cache lines, then we could probably benefit if all the tasks were to run
> > within the quad, aka MC/Coregroup.
> >
>
> Did you test this? WRT load balance we do try to balance "load" over the
> different domain spans, so if you represent quads as their own MC domain,
> you would AFAICT end up spreading tasks over the quads (rather than packing
> them) when balancing at e.g. DIE level. The desired behaviour might be
> hackable with some more ASYM_PACKING, but I'm not sure I should be
> suggesting that :-)
>
Agree, load balance will try to spread the load across the quads. In my hack,
I was explicitly marking QUAD domains as !SD_PREFER_SIBLING and relaxing a few
load-spreading rules when SD_PREFER_SIBLING was not set (see the sketch
below). And this was on a slightly older kernel (without Vincent's recent
load balance overhaul).
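The flag-clearing half of that hack looked roughly like the sketch below
(reconstructed from memory, not the posted series; the "QUAD" level name is
hypothetical, and sd->name is only available with CONFIG_SCHED_DEBUG):

static void clear_prefer_sibling_on_quad(int cpu)
{
	struct sched_domain *sd;

	rcu_read_lock();
	for_each_domain(cpu, sd) {
		/*
		 * SD_PREFER_SIBLING on a child domain makes the parent
		 * spread tasks across the children; clearing it on the
		 * QUAD domain stops the DIE-level balance from spreading
		 * tasks across quads.
		 */
		if (!strcmp(sd->name, "QUAD"))
			sd->flags &= ~SD_PREFER_SIBLING;
	}
	rcu_read_unlock();
}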
> > I have found some latency-sensitive benchmarks that benefit from grouping
> > at the quad level (using kernel hacks, not backed by firmware changes).
> > Gautham also found similar results in his experiments, but he only used
> > binding within the stock kernel.
> >
>
> IIUC you reflect this "fabric quirk" (i.e. coregroups) using this DT
> binding thing.
>
> That's also where things get interesting (for me) because I experienced
> something similar on another arm64 platform (ThunderX1). This was more
> about cache bandwidth than cache latency, but IMO it's in the same bag of
> fabric quirks. I blabbered a bit about this at last LPC [1], but kind of
> gave up on it given the TX1 was the only (arm64) platform where I could get
> both significant and reproducible results.
>
> Now, if you folks are seeing this on completely different hardware and have
> "real" workloads that truly benefit from this kind of domain partitioning,
> this might be another incentive to try and sort of generalize this. That's
> outside the scope of your series, but your findings give me some hope!
>
> I think what I had in mind back then was that if enough folks cared about
> it, we might get some bits added to the ACPI spec; something along the
> lines of proximity domains for the caches described in PPTT, IOW a cache
> distance matrix. I don't really know what it'll take to get there, but I
> figured I'd dump this in case someone's listening :-)
>
Very interesting.
> > I am not setting SD_SHARE_PKG_RESOURCES in the MC/Coregroup sd_flags, as
> > the MC domain need not be the LLC domain on Power.
>
> From what I understood your MC domain does seem to map to LLC; but in any
> case, shouldn't you set that flag at least for BIGCORE (i.e. L2)? AIUI with
> your changes your sd_llc is gonna be SMT, and that's not going to be a very
> big mask. IMO you do want to correctly reflect your LLC situation via this
> flag to make cpus_share_cache() work properly.
I detect whether the LLC is shared at the BIGCORE level; if it is, I
dynamically rename that domain to CACHE and enable SD_SHARE_PKG_RESOURCES
on it, roughly as in the sketch below.
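Modulo details, the fixup is along these lines (an illustrative sketch;
helpers such as shared_cache_mask and the BIGCORE_IDX index follow the
series but should be treated as assumptions here, and assigning .name
needs CONFIG_SCHED_DEBUG):

static int powerpc_shared_cache_flags(void)
{
	return SD_SHARE_PKG_RESOURCES;
}

static void fixup_topology(void)
{
	if (shared_caches) {
		pr_info("Using shared cache scheduler topology\n");
		/* Widen BIGCORE to the span of CPUs sharing the LLC ... */
		powerpc_topology[BIGCORE_IDX].mask = shared_cache_mask;
		/* ... flag it so cpus_share_cache() sees the real LLC ... */
		powerpc_topology[BIGCORE_IDX].sd_flags = powerpc_shared_cache_flags;
#ifdef CONFIG_SCHED_DEBUG
		/* ... and rename it to CACHE in sched domain debug output. */
		powerpc_topology[BIGCORE_IDX].name = "CACHE";
#endif
	}
}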
>
> [1]: https://linuxplumbersconf.org/event/4/contributions/484/
Thanks for the pointer.
--
Thanks and Regards
Srikar Dronamraju