From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>,
	Oliver OHalloran <oliveroh@au1.ibm.com>,
	Michael Neuling <mikey@linux.ibm.com>,
	Michael Ellerman <michaele@au1.ibm.com>,
	Anton Blanchard <anton@au1.ibm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Nick Piggin <npiggin@au1.ibm.com>
Subject: Re: [PATCH 02/11] powerpc/smp: Merge Power9 topology with Power topology
Date: Mon, 20 Jul 2020 13:40:52 +0530	[thread overview]
Message-ID: <20200720081052.GF21103@linux.vnet.ibm.com> (raw)
In-Reply-To: <20200717054436.GB25851@in.ibm.com>

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-17 11:14:36]:

> Hi Srikar,
> 
> On Tue, Jul 14, 2020 at 10:06:15AM +0530, Srikar Dronamraju wrote:
> > A new sched_domain_topology_level was added just for Power9. However, the
> > same can be achieved by merging powerpc_topology with power9_topology,
> > which makes the code simpler, especially when adding a new sched
> > domain.
> > 
> > Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> > Cc: Michael Ellerman <michaele@au1.ibm.com>
> > Cc: Nick Piggin <npiggin@au1.ibm.com>
> > Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> > Cc: Nathan Lynch <nathanl@linux.ibm.com>
> > Cc: Michael Neuling <mikey@linux.ibm.com>
> > Cc: Anton Blanchard <anton@au1.ibm.com>
> > Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> > Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> > Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/kernel/smp.c | 33 ++++++++++-----------------------
> >  1 file changed, 10 insertions(+), 23 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 680c0edcc59d..069ea4b21c6d 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1315,7 +1315,7 @@ int setup_profiling_timer(unsigned int multiplier)
> >  }
> > 
> >  #ifdef CONFIG_SCHED_SMT
> > -/* cpumask of CPUs with asymetric SMT dependancy */
> > +/* cpumask of CPUs with asymmetric SMT dependency */
> >  static int powerpc_smt_flags(void)
> >  {
> >  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> > @@ -1328,14 +1328,6 @@ static int powerpc_smt_flags(void)
> >  }
> >  #endif
> > 
> > -static struct sched_domain_topology_level powerpc_topology[] = {
> > -#ifdef CONFIG_SCHED_SMT
> > -	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> > -#endif
> > -	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
> > -	{ NULL, },
> > -};
> > -
> >  /*
> >   * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
> >   * This topology makes it *much* cheaper to migrate tasks between adjacent cores
> > @@ -1353,7 +1345,13 @@ static int powerpc_shared_cache_flags(void)
> >   */
> >  static const struct cpumask *shared_cache_mask(int cpu)
> >  {
> > -	return cpu_l2_cache_mask(cpu);
> > +	if (shared_caches)
> > +		return cpu_l2_cache_mask(cpu);
> > +
> > +	if (has_big_cores)
> > +		return cpu_smallcore_mask(cpu);
> > +
> > +	return cpu_smt_mask(cpu);
> >  }
> > 
> >  #ifdef CONFIG_SCHED_SMT
> > @@ -1363,7 +1361,7 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
> >  }
> >  #endif
> > 
> > -static struct sched_domain_topology_level power9_topology[] = {
> > +static struct sched_domain_topology_level powerpc_topology[] = {
> 
> 
> >  #ifdef CONFIG_SCHED_SMT
> >  	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> >  #endif
> > @@ -1388,21 +1386,10 @@ void __init smp_cpus_done(unsigned int max_cpus)
> >  #ifdef CONFIG_SCHED_SMT
> >  	if (has_big_cores) {
> >  		pr_info("Big cores detected but using small core scheduling\n");
> > -		power9_topology[0].mask = smallcore_smt_mask;
> >  		powerpc_topology[0].mask = smallcore_smt_mask;
> >  	}
> >  #endif
> > -	/*
> > -	 * If any CPU detects that it's sharing a cache with another CPU then
> > -	 * use the deeper topology that is aware of this sharing.
> > -	 */
> > -	if (shared_caches) {
> > -		pr_info("Using shared cache scheduler topology\n");
> > -		set_sched_topology(power9_topology);
> > -	} else {
> > -		pr_info("Using standard scheduler topology\n");
> > -		set_sched_topology(powerpc_topology);
> 
> 
> Ok, so we will go with the three-level topology by default (SMT,
> CACHE, DIE) and will rely on the sched-domain creation code to
> degenerate the CACHE domain in case SMT and CACHE have the same set of
> CPUs (e.g., POWER8).
> 

Right.
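
For reference, here is a minimal sketch of the merged table after this
patch, reconstructed from the hunks above. The CACHE level is always
present; on systems like POWER8, shared_cache_mask() falls back to the
SMT mask, so CACHE covers the same CPUs as SMT and the scheduler's
degeneration pass removes it.

/*
 * Sketch reconstructed from the diff above: one topology table for all
 * platforms.  When the CACHE mask equals the SMT mask, the CACHE
 * domain degenerates at sched-domain build time.
 */
static struct sched_domain_topology_level powerpc_topology[] = {
#ifdef CONFIG_SCHED_SMT
	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
#endif
	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};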

> From a cleanup perspective this is better, since we won't have to
> worry about defining multiple topology structures, but from a
> performance point of view, wouldn't we now pay an extra penalty of
> degenerating the CACHE domains on POWER8 kind of systems, each time
> when a CPU comes online ?
> 

So we either end up adding a topology definition for each of the new
topologies we support, or we take the extra penalty.

But going ahead:

> Do we know how bad it is ? If the degeneration takes a few extra
> microseconds, that should be ok I suppose.
> 

It certainly will add to the penalty, though I haven't captured
per-degeneration statistics. However, I ran an experiment timing
ppc64_cpu --smt=8 followed by ppc64_cpu --smt=1 in a loop of 100
iterations.

On a POWER8 system with 256 CPUs across 8 nodes:

Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  8
Core(s) per socket:  4
Socket(s):           8
NUMA node(s):        8
Model:               2.1 (pvr 004b 0201)
Model name:          POWER8 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:           64K
L1i cache:           32K
L2 cache:            512K
L3 cache:            8192K
NUMA node0 CPU(s):   0-31
NUMA node1 CPU(s):   32-63
NUMA node2 CPU(s):   64-95
NUMA node3 CPU(s):   96-127
NUMA node4 CPU(s):   128-159
NUMA node5 CPU(s):   160-191
NUMA node6 CPU(s):   192-223
NUMA node7 CPU(s):   224-255

ppc64_cpu --smt=1
    N           Min           Max        Median           Avg        Stddev
x 100         38.17         53.78         46.81       46.6766     2.8421603

x 100         41.34         58.24         48.35       47.9649     3.6866087

ppc64_cpu --smt=8
    N           Min           Max        Median           Avg        Stddev
x 100         57.43         75.88         60.61       61.0246      2.418685

x 100         58.21         79.24         62.59       63.3326     3.4094558

But once we clean up, we could add ways to fix up the topologies so
that we avoid this overhead.
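
As a hypothetical sketch of such a fixup (the function name and
CACHE_LEVEL_IDX are assumptions, not something in this series): on
machines without shared caches, shared_cache_mask() mirrors the SMT
mask, so the CACHE level would always degenerate. We could drop it
from the table once at boot instead of paying the cost on every CPU
online.

/*
 * Hypothetical, not part of this series.  CACHE_LEVEL_IDX is an
 * assumed constant: 1 with CONFIG_SCHED_SMT, 0 without.
 */
static void __init fixup_topology(void)
{
	int i;

	if (shared_caches)
		return;

	/* Shift the remaining levels (and the terminator) up by one. */
	for (i = CACHE_LEVEL_IDX; powerpc_topology[i].mask; i++)
		powerpc_topology[i] = powerpc_topology[i + 1];
}

Something like this could run from smp_cpus_done(), say, just before
the set_sched_topology(powerpc_topology) call.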

-- 
Thanks and Regards
Srikar Dronamraju
