From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Anton Blanchard <anton@ozlabs.org>,
	"Oliver O'Halloran" <oohall@gmail.com>,
	Nathan Lynch <nathanl@linux.ibm.com>,
	Michael Neuling <mikey@neuling.org>,
	Gautham R Shenoy <ego@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Jordan Niethe <jniethe5@gmail.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>
Subject: Re: [PATCH v4 09/10] Powerpc/smp: Create coregroup domain
Date: Mon, 3 Aug 2020 11:31:02 +0530	[thread overview]
Message-ID: <20200803060102.GD24375@linux.vnet.ibm.com> (raw)
In-Reply-To: <jhjlfj0ijeg.mognet@arm.com>

> > Also in the current P9 itself, two neighbouring core-pairs form a quad.
> > Cache latency within a quad is better than the latency to a distant core-pair,
> > and cache latency within a core-pair is far better than latency within a quad.
> > So if we have only 4 threads running on a DIE, all of them accessing the same
> > cache lines, then we could probably benefit if all the tasks were to run
> > within the quad, aka MC/Coregroup.
> >
> 
> Did you test this? WRT load balance we do try to balance "load" over the
> different domain spans, so if you represent quads as their own MC domain,
> you would AFAICT end up spreading tasks over the quads (rather than packing
> them) when balancing at e.g. DIE level. The desired behaviour might be
> hackable with some more ASYM_PACKING, but I'm not sure I should be
> suggesting that :-)
> 

Agreed, load balancing will try to spread the load across the quads. In my
hack, I was explicitly marking QUAD domains as !SD_PREFER_SIBLING and relaxing
a few load-spreading rules when SD_PREFER_SIBLING was not set. This was on a
slightly older kernel (without Vincent's recent load balance overhaul).
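
Roughly, the hack boiled down to something like this (a simplified sketch
only: matching the domain by name needs CONFIG_SCHED_DEBUG, and the usual
RCU read lock is taken around the domain walk):

/*
 * Simplified sketch of the hack: after the sched domains are built, clear
 * SD_PREFER_SIBLING on the quad/coregroup (MC) level so that DIE-level
 * balancing stops insisting on spreading tasks across the quads.
 */
static void quad_clear_prefer_sibling(void)
{
        struct sched_domain *sd;
        int cpu;

        rcu_read_lock();
        for_each_online_cpu(cpu) {
                for_each_domain(cpu, sd) {
                        if (!strcmp(sd->name, "MC"))
                                sd->flags &= ~SD_PREFER_SIBLING;
                }
        }
        rcu_read_unlock();
}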

> > I have found some latency-sensitive benchmarks that benefit from grouping
> > at the quad level (using kernel hacks, not backed by firmware changes).
> > Gautham also found similar results in his experiments, but he only used
> > binding within the stock kernel.
> >
> 
> IIUC you reflect this "fabric quirk" (i.e. coregroups) using this DT
> binding thing.
> 
> That's also where things get interesting (for me) because I experienced
> something similar on another arm64 platform (ThunderX1). This was more
> about cache bandwidth than cache latency, but IMO it's in the same bag of
> fabric quirks. I blabbered a bit about this at last LPC [1], but kind of
> gave up on it given the TX1 was the only (arm64) platform where I could get
> both significant and reproducible results.
> 
> Now, if you folks are seeing this on completely different hardware and have
> "real" workloads that truly benefit from this kind of domain partitioning,
> this might be another incentive to try and sort of generalize this. That's
> outside the scope of your series, but your findings give me some hope!
> 
> I think what I had in mind back then was that if enough folks cared about
> it, we might get some bits added to the ACPI spec; something along the
> lines of proximity domains for the caches described in PPTT, IOW a cache
> distance matrix. I don't really know what it'll take to get there, but I
> figured I'd dump this in case someone's listening :-)
> 

Very interesting.

> > I am not setting SD_SHARE_PKG_RESOURCES in the MC/Coregroup sd_flags, since
> > the MC domain need not be the LLC domain on Power.
> 
> From what I understood your MC domain does seem to map to LLC; but in any
> case, shouldn't you set that flag at least for BIGCORE (i.e. L2)? AIUI with
> your changes your sd_llc is gonna be SMT, and that's not going to be a very
> big mask. IMO you do want to correctly reflect your LLC situation via this
> flag to make cpus_share_cache() work properly.

I detect whether the LLC is shared at the BIGCORE level; if it is, I
dynamically rename that domain to CACHE and enable SD_SHARE_PKG_RESOURCES
on it.
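
The fixup is roughly along these lines (simplified sketch; bigcore_idx and
shared_cache_mask stand in for the actual identifiers used in the series):

static int powerpc_shared_cache_flags(void)
{
        return SD_SHARE_PKG_RESOURCES;
}

static void fixup_topology(void)
{
        if (shared_caches) {
                pr_info("Using shared cache scheduler topology\n");
                /* Widen the BIGCORE level to the shared-cache span ... */
                powerpc_topology[bigcore_idx].mask = shared_cache_mask;
                /* ... mark it as sharing the LLC so cpus_share_cache() works ... */
                powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
#ifdef CONFIG_SCHED_DEBUG
                /* ... and rename it so it shows up as CACHE rather than BIGCORE. */
                powerpc_topology[bigcore_idx].name = "CACHE";
#endif
        }
}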

> 
> [1]: https://linuxplumbersconf.org/event/4/contributions/484/

Thanks for the pointer.

-- 
Thanks and Regards
Srikar Dronamraju
