* [PATCH v2 00/10] Coregroup support on Powerpc
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Changelog v1 -> v2:
v1: https://lore.kernel.org/linuxppc-dev/20200714043624.5648-1-srikar@linux.vnet.ibm.com/t/#u

powerpc/smp: Merge Power9 topology with Power topology
	Replaced a reference to cpu_smt_mask with per_cpu(cpu_sibling_map, cpu)
	since cpu_smt_mask is only defined under CONFIG_SCHED_SMT

powerpc/smp: Enable small core scheduling sooner
	Restored the previous info msg (Jordan)
	Moved big core topology fixup to fixup_topology (Gautham)

powerpc/smp: Dont assume l2-cache to be superset of sibling
	Set cpumask after verifying l2-cache. (Gautham)

powerpc/smp: Generalize 2nd sched domain
	Moved shared_cache topology fixup to fixup_topology (Gautham)

Powerpc/numa: Detect support for coregroup
	Explained Coregroup in commit msg (Michael Ellerman)

Powerpc/smp: Create coregroup domain
	Moved coregroup topology fixup to fixup_topology (Gautham)

powerpc/smp: Implement cpu_to_coregroup_id
	Move coregroup_enabled before getting associativity (Gautham)

powerpc/smp: Provide an ability to disable coregroup
	Patch dropped (Michael Ellerman)

This series cleans up the existing powerpc topologies and adds
coregroup support on powerpc. A coregroup is a group (a subset) of the
cores of a DIE that share a resource.
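
For reference, the topology array this series ends up with looks
roughly like the sketch below. This is illustrative only, not a hunk
from the series; cpu_coregroup_mask here is a stand-in name for the
coregroup mask helper. The new MC level is what shows up in the
"After patchset" domain names further down.

static struct sched_domain_topology_level powerpc_topology[] = {
#ifdef CONFIG_SCHED_SMT
	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
#endif
	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
	{ cpu_coregroup_mask, SD_INIT_NAME(MC) },	/* coregroup level */
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};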

Patch 7 of this series, "Powerpc/numa: Detect support for coregroup",
depends on
https://lore.kernel.org/linuxppc-dev/20200707140644.7241-1-srikar@linux.vnet.ibm.com/t/#u
However, it should be easy to rebase the series without that patch.

This patch series is based on top of current powerpc/next tree + the
above patch.

On Power 8 Systems
------------------
$ tail /proc/cpuinfo
processor	: 255
cpu		: POWER8 (architected), altivec supported
clock		: 3724.000000MHz
revision	: 2.1 (pvr 004b 0201)

timebase	: 512000000
platform	: pSeries
model		: IBM,8408-E8E
machine		: CHRP IBM,8408-E8E
MMU		: Hash

Before the patchset
-------------------
$ cat /proc/sys/kernel/sched_domain/cpu0/domain*/name
SMT
DIE
NUMA
NUMA
$ head /proc/schedstat
version 15
timestamp 4295534931
cpu0 0 0 0 0 0 0 41389823338 17682779896 14117
domain0 00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,00000000,00000000,00000000,00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 00000000,00000000,00000000,00000000,00000000,00000000,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain3 ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 27087859050 152273672 10396
domain0 00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,00000000,00000000,00000000,00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

After the patchset
------------------
$ cat /proc/sys/kernel/sched_domain/cpu0/domain*/name
SMT
DIE
NUMA
NUMA
$ head /proc/schedstat
version 15
timestamp 4295534931
cpu0 0 0 0 0 0 0 41389823338 17682779896 14117
domain0 00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,00000000,00000000,00000000,00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 00000000,00000000,00000000,00000000,00000000,00000000,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain3 ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 27087859050 152273672 10396
domain0 00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,00000000,00000000,00000000,00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

On Power 9 (with device-tree enablement to expose coregroups; hunks
for mimicking a coregroup were posted at
https://lore.kernel.org/linuxppc-dev/20200714043624.5648-1-srikar@linux.vnet.ibm.com/t/#m2cb09bb11c7a93257d6123d1d27edb8212f8af21)
-----------------------------------------------------------
$ tail /proc/cpuinfo
processor	: 127
cpu		: POWER9 (architected), altivec supported
clock		: 3000.000000MHz
revision	: 2.2 (pvr 004e 0202)

timebase	: 512000000
platform	: pSeries
model		: IBM,9008-22L
machine		: CHRP IBM,9008-22L
MMU		: Hash

Before patchset
---------------
$ cat /proc/sys/kernel/sched_domain/cpu0/domain*/name
SMT
CACHE
DIE
NUMA

$ head /proc/schedstat
version 15
timestamp 4318242208
cpu0 0 0 0 0 0 0 28077107004 4773387362 78205
domain0 00000000,00000000,00000000,00000055 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 00000000,00000000,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain3 ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 24177439200 413887604 75393
domain0 00000000,00000000,00000000,000000aa 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

After patchset
--------------
$ cat /proc/sys/kernel/sched_domain/cpu0/domain*/name
SMT
CACHE
MC
DIE
NUMA

$ head /proc/schedstat
version 15
timestamp 4318242208
cpu0 0 0 0 0 0 0 28077107004 4773387362 78205
domain0 00000000,00000000,00000000,00000055 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000000,000000ff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 00000000,00000000,00000000,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain3 00000000,00000000,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain4 ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 24177439200 413887604 75393
domain0 00000000,00000000,00000000,000000aa 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>

Srikar Dronamraju (10):
  powerpc/smp: Cache node for reuse
  powerpc/smp: Merge Power9 topology with Power topology
  powerpc/smp: Move powerpc_topology above
  powerpc/smp: Enable small core scheduling sooner
  powerpc/smp: Dont assume l2-cache to be superset of sibling
  powerpc/smp: Generalize 2nd sched domain
  Powerpc/numa: Detect support for coregroup
  powerpc/smp: Allocate cpumask only after searching thread group
  Powerpc/smp: Create coregroup domain
  powerpc/smp: Implement cpu_to_coregroup_id

 arch/powerpc/include/asm/smp.h      |   1 +
 arch/powerpc/include/asm/topology.h |  10 ++
 arch/powerpc/kernel/smp.c           | 255 +++++++++++++++++-----------
 arch/powerpc/mm/numa.c              |  59 +++++--
 4 files changed, 213 insertions(+), 112 deletions(-)

-- 
2.17.1


* [PATCH v2 01/10] powerpc/smp: Cache node for reuse
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

cpu_to_node is an inline function with access to a per-cpu variable,
so each call is cheap. However, when it is used repeatedly, it is
cleaner to cache the result in a local variable.

Also fix a build error seen with configs where
CONFIG_NEED_MULTIPLE_NODES is not set:
"error: numa_cpu_lookup_table undeclared"

No functional change.
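
The pattern, condensed from the hunk below (a sketch; only one of the
three mask allocations is shown):

	int node = cpu_to_node(cpu);	/* look up the node once */

	zalloc_cpumask_var_node(&per_cpu(cpu_sibling_map, cpu),
				GFP_KERNEL, node);	/* reuse the cached node */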

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 73199470c265..680c0edcc59d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -843,7 +843,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 
 	DBG("smp_prepare_cpus\n");
 
-	/* 
+	/*
 	 * setup_cpu may need to be called on the boot cpu. We havent
 	 * spun any cpus up but lets be paranoid.
 	 */
@@ -854,20 +854,24 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	cpu_callin_map[boot_cpuid] = 1;
 
 	for_each_possible_cpu(cpu) {
+		int node = cpu_to_node(cpu);
+
 		zalloc_cpumask_var_node(&per_cpu(cpu_sibling_map, cpu),
-					GFP_KERNEL, cpu_to_node(cpu));
+					GFP_KERNEL, node);
 		zalloc_cpumask_var_node(&per_cpu(cpu_l2_cache_map, cpu),
-					GFP_KERNEL, cpu_to_node(cpu));
+					GFP_KERNEL, node);
 		zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
-					GFP_KERNEL, cpu_to_node(cpu));
+					GFP_KERNEL, node);
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 		/*
 		 * numa_node_id() works after this.
 		 */
 		if (cpu_present(cpu)) {
-			set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
-			set_cpu_numa_mem(cpu,
-				local_memory_node(numa_cpu_lookup_table[cpu]));
+			node = numa_cpu_lookup_table[cpu];
+			set_cpu_numa_node(cpu, node);
+			set_cpu_numa_mem(cpu, local_memory_node(node));
 		}
+#endif
 	}
 
 	/* Init the cpumasks so the boot CPU is related to itself */
-- 
2.17.1


* [PATCH v2 02/10] powerpc/smp: Merge Power9 topology with Power topology
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

A new sched_domain_topology_level, power9_topology, was added just for
Power9. However, the same can be achieved by merging power9_topology
into powerpc_topology, which makes the code simpler, especially when
adding a new sched domain.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Merge Power9 topology with Power topology
	Replaced a reference to cpu_smt_mask with per_cpu(cpu_sibling_map, cpu)
	since cpu_smt_mask is only defined under CONFIG_SCHED_SMT

 arch/powerpc/kernel/smp.c | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 680c0edcc59d..0e0b118d9b6e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1315,7 +1315,7 @@ int setup_profiling_timer(unsigned int multiplier)
 }
 
 #ifdef CONFIG_SCHED_SMT
-/* cpumask of CPUs with asymetric SMT dependancy */
+/* cpumask of CPUs with asymmetric SMT dependency */
 static int powerpc_smt_flags(void)
 {
 	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
@@ -1328,14 +1328,6 @@ static int powerpc_smt_flags(void)
 }
 #endif
 
-static struct sched_domain_topology_level powerpc_topology[] = {
-#ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
-#endif
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
-	{ NULL, },
-};
-
 /*
  * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
  * This topology makes it *much* cheaper to migrate tasks between adjacent cores
@@ -1353,7 +1345,13 @@ static int powerpc_shared_cache_flags(void)
  */
 static const struct cpumask *shared_cache_mask(int cpu)
 {
-	return cpu_l2_cache_mask(cpu);
+	if (shared_caches)
+		return cpu_l2_cache_mask(cpu);
+
+	if (has_big_cores)
+		return cpu_smallcore_mask(cpu);
+
+	return per_cpu(cpu_sibling_map, cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
@@ -1363,7 +1361,7 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
 }
 #endif
 
-static struct sched_domain_topology_level power9_topology[] = {
+static struct sched_domain_topology_level powerpc_topology[] = {
 #ifdef CONFIG_SCHED_SMT
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
@@ -1388,21 +1386,10 @@ void __init smp_cpus_done(unsigned int max_cpus)
 #ifdef CONFIG_SCHED_SMT
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
-		power9_topology[0].mask = smallcore_smt_mask;
 		powerpc_topology[0].mask = smallcore_smt_mask;
 	}
 #endif
-	/*
-	 * If any CPU detects that it's sharing a cache with another CPU then
-	 * use the deeper topology that is aware of this sharing.
-	 */
-	if (shared_caches) {
-		pr_info("Using shared cache scheduler topology\n");
-		set_sched_topology(power9_topology);
-	} else {
-		pr_info("Using standard scheduler topology\n");
-		set_sched_topology(powerpc_topology);
-	}
+	set_sched_topology(powerpc_topology);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-- 
2.17.1


* [PATCH v2 03/10] powerpc/smp: Move powerpc_topology above
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Move the powerpc_topology definition, along with the helper functions
it references, earlier in the file. This lets later code in the file
use them without forward declarations.

No other functional changes.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 116 +++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 0e0b118d9b6e..1ce95da00cb6 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -818,6 +818,64 @@ static int init_cpu_l1_cache_map(int cpu)
 	return err;
 }
 
+static bool shared_caches;
+
+#ifdef CONFIG_SCHED_SMT
+/* cpumask of CPUs with asymmetric SMT dependency */
+static int powerpc_smt_flags(void)
+{
+	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
+
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
+		flags |= SD_ASYM_PACKING;
+	}
+	return flags;
+}
+#endif
+
+/*
+ * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
+ * This topology makes it *much* cheaper to migrate tasks between adjacent cores
+ * since the migrated task remains cache hot. We want to take advantage of this
+ * at the scheduler level so an extra topology level is required.
+ */
+static int powerpc_shared_cache_flags(void)
+{
+	return SD_SHARE_PKG_RESOURCES;
+}
+
+/*
+ * We can't just pass cpu_l2_cache_mask() directly because
+ * returns a non-const pointer and the compiler barfs on that.
+ */
+static const struct cpumask *shared_cache_mask(int cpu)
+{
+	if (shared_caches)
+		return cpu_l2_cache_mask(cpu);
+
+	if (has_big_cores)
+		return cpu_smallcore_mask(cpu);
+
+	return per_cpu(cpu_sibling_map, cpu);
+}
+
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+	return cpu_smallcore_mask(cpu);
+}
+#endif
+
+static struct sched_domain_topology_level powerpc_topology[] = {
+#ifdef CONFIG_SCHED_SMT
+	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
+#endif
+	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
+	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
+	{ NULL, },
+};
+
 static int init_big_cores(void)
 {
 	int cpu;
@@ -1249,8 +1307,6 @@ static void add_cpu_to_masks(int cpu)
 			set_cpus_related(cpu, i, cpu_core_mask);
 }
 
-static bool shared_caches;
-
 /* Activate a secondary processor. */
 void start_secondary(void *unused)
 {
@@ -1314,62 +1370,6 @@ int setup_profiling_timer(unsigned int multiplier)
 	return 0;
 }
 
-#ifdef CONFIG_SCHED_SMT
-/* cpumask of CPUs with asymmetric SMT dependency */
-static int powerpc_smt_flags(void)
-{
-	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
-
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
-		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
-		flags |= SD_ASYM_PACKING;
-	}
-	return flags;
-}
-#endif
-
-/*
- * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
- * This topology makes it *much* cheaper to migrate tasks between adjacent cores
- * since the migrated task remains cache hot. We want to take advantage of this
- * at the scheduler level so an extra topology level is required.
- */
-static int powerpc_shared_cache_flags(void)
-{
-	return SD_SHARE_PKG_RESOURCES;
-}
-
-/*
- * We can't just pass cpu_l2_cache_mask() directly because
- * returns a non-const pointer and the compiler barfs on that.
- */
-static const struct cpumask *shared_cache_mask(int cpu)
-{
-	if (shared_caches)
-		return cpu_l2_cache_mask(cpu);
-
-	if (has_big_cores)
-		return cpu_smallcore_mask(cpu);
-
-	return per_cpu(cpu_sibling_map, cpu);
-}
-
-#ifdef CONFIG_SCHED_SMT
-static const struct cpumask *smallcore_smt_mask(int cpu)
-{
-	return cpu_smallcore_mask(cpu);
-}
-#endif
-
-static struct sched_domain_topology_level powerpc_topology[] = {
-#ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
-#endif
-	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
-	{ NULL, },
-};
-
 void __init smp_cpus_done(unsigned int max_cpus)
 {
 	/*
-- 
2.17.1


* [PATCH v2 04/10] powerpc/smp: Enable small core scheduling sooner
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Enable small core scheduling as soon as we detect that we are on a
system that supports thread groups. Doing so avoids a redundant check.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Enable small core scheduling sooner
	Restored the previous info msg (Jordan)
	Moved big core topology fixup to fixup_topology (Gautham)

 arch/powerpc/kernel/smp.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 1ce95da00cb6..72f16dc0cb26 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1370,6 +1370,16 @@ int setup_profiling_timer(unsigned int multiplier)
 	return 0;
 }
 
+static void fixup_topology(void)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (has_big_cores) {
+		pr_info("Big cores detected but using small core scheduling\n");
+		powerpc_topology[0].mask = smallcore_smt_mask;
+	}
+#endif
+}
+
 void __init smp_cpus_done(unsigned int max_cpus)
 {
 	/*
@@ -1383,12 +1393,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 
 	dump_numa_cpu_topology();
 
-#ifdef CONFIG_SCHED_SMT
-	if (has_big_cores) {
-		pr_info("Big cores detected but using small core scheduling\n");
-		powerpc_topology[0].mask = smallcore_smt_mask;
-	}
-#endif
+	fixup_topology();
 	set_sched_topology(powerpc_topology);
 }
 
-- 
2.17.1


* [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

The current code assumes that the cpumask of CPUs sharing an L2 cache
is always a superset of cpu_sibling_mask.

Let's drop that assumption: cpu_l2_cache_mask is a superset of
cpu_sibling_mask if and only if shared_caches is set.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Dont assume l2-cache to be superset of sibling
	Set cpumask after verifying l2-cache. (Gautham)

 arch/powerpc/kernel/smp.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 72f16dc0cb26..57468877499a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
 	if (!l2_cache)
 		return false;
 
+	cpumask_set_cpu(cpu, mask_fn(cpu));
 	for_each_cpu(i, cpu_online_mask) {
 		/*
 		 * when updating the marks the current CPU has not been marked
@@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
 	 * add it to it's own thread sibling mask.
 	 */
 	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
 
 	for (i = first_thread; i < first_thread + threads_per_core; i++)
 		if (cpu_online(i))
 			set_cpus_related(i, cpu, cpu_sibling_mask);
 
 	add_cpu_to_smallcore_masks(cpu);
-	/*
-	 * Copy the thread sibling mask into the cache sibling mask
-	 * and mark any CPUs that share an L2 with this CPU.
-	 */
-	for_each_cpu(i, cpu_sibling_mask(cpu))
-		set_cpus_related(cpu, i, cpu_l2_cache_mask);
 	update_mask_by_l2(cpu, cpu_l2_cache_mask);
 
-	/*
-	 * Copy the cache sibling mask into core sibling mask and mark
-	 * any CPUs on the same chip as this CPU.
-	 */
-	for_each_cpu(i, cpu_l2_cache_mask(cpu))
-		set_cpus_related(cpu, i, cpu_core_mask);
+	if (pkg_id == -1) {
+		struct cpumask *(*mask)(int) = cpu_sibling_mask;
+
+		/*
+		 * Copy the sibling mask into core sibling mask and
+		 * mark any CPUs on the same chip as this CPU.
+		 */
+		if (shared_caches)
+			mask = cpu_l2_cache_mask;
+
+		for_each_cpu(i, mask(cpu))
+			set_cpus_related(cpu, i, cpu_core_mask);
 
-	if (pkg_id == -1)
 		return;
+	}
 
 	for_each_cpu(i, cpu_online_mask)
 		if (get_physical_package_id(i) == pkg_id)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
@ 2020-07-21 11:38   ` Srikar Dronamraju
  0 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Srikar Dronamraju, Peter Zijlstra,
	Jordan Niethe, Anton Blanchard, LKML, Valentin Schneider,
	Nick Piggin, linuxppc-dev, Ingo Molnar

Current code assumes that cpumask of cpus sharing a l2-cache mask will
always be a superset of cpu_sibling_mask.

Lets stop that assumption. cpu_l2_cache_mask is a superset of
cpu_sibling_mask if and only if shared_caches is set.

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Dont assume l2-cache to be superset of sibling
	Set cpumask after verifying l2-cache. (Gautham)

 arch/powerpc/kernel/smp.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 72f16dc0cb26..57468877499a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
 	if (!l2_cache)
 		return false;
 
+	cpumask_set_cpu(cpu, mask_fn(cpu));
 	for_each_cpu(i, cpu_online_mask) {
 		/*
 		 * when updating the marks the current CPU has not been marked
@@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
 	 * add it to it's own thread sibling mask.
 	 */
 	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
+	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
 
 	for (i = first_thread; i < first_thread + threads_per_core; i++)
 		if (cpu_online(i))
 			set_cpus_related(i, cpu, cpu_sibling_mask);
 
 	add_cpu_to_smallcore_masks(cpu);
-	/*
-	 * Copy the thread sibling mask into the cache sibling mask
-	 * and mark any CPUs that share an L2 with this CPU.
-	 */
-	for_each_cpu(i, cpu_sibling_mask(cpu))
-		set_cpus_related(cpu, i, cpu_l2_cache_mask);
 	update_mask_by_l2(cpu, cpu_l2_cache_mask);
 
-	/*
-	 * Copy the cache sibling mask into core sibling mask and mark
-	 * any CPUs on the same chip as this CPU.
-	 */
-	for_each_cpu(i, cpu_l2_cache_mask(cpu))
-		set_cpus_related(cpu, i, cpu_core_mask);
+	if (pkg_id == -1) {
+		struct cpumask *(*mask)(int) = cpu_sibling_mask;
+
+		/*
+		 * Copy the sibling mask into core sibling mask and
+		 * mark any CPUs on the same chip as this CPU.
+		 */
+		if (shared_caches)
+			mask = cpu_l2_cache_mask;
+
+		for_each_cpu(i, mask(cpu))
+			set_cpus_related(cpu, i, cpu_core_mask);
 
-	if (pkg_id == -1)
 		return;
+	}
 
 	for_each_cpu(i, cpu_online_mask)
 		if (get_physical_package_id(i) == pkg_id)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Currently "CACHE" domain happens to be the 2nd sched domain as per
powerpc_topology. This domain will collapse if cpumask of l2-cache is
same as SMT domain. However we could generalize this domain such that it
could mean either be a "CACHE" domain or a "BIGCORE" domain.

While setting up the "CACHE" domain, check if shared_cache is already
set.
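
The mechanism, condensed from the hunk below: the generic second level
starts out as "BIGCORE", and fixup_topology() rewires it in place at
boot once we know whether caches are shared.

static void fixup_topology(void)
{
	if (shared_caches) {
		pr_info("Using shared cache scheduler topology\n");
		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
#ifdef CONFIG_SCHED_DEBUG
		powerpc_topology[bigcore_idx].name = "CACHE";
#endif
		powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
	}
}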

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Generalize 2nd sched domain
	Moved shared_cache topology fixup to fixup_topology (Gautham)

 arch/powerpc/kernel/smp.c | 49 ++++++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 57468877499a..933ebdf97432 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -85,6 +85,14 @@ EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 EXPORT_SYMBOL_GPL(has_big_cores);
 
+enum {
+#ifdef CONFIG_SCHED_SMT
+	smt_idx,
+#endif
+	bigcore_idx,
+	die_idx,
+};
+
 #define MAX_THREAD_LIST_SIZE	8
 #define THREAD_GROUP_SHARE_L1   1
 struct thread_groups {
@@ -851,13 +859,7 @@ static int powerpc_shared_cache_flags(void)
  */
 static const struct cpumask *shared_cache_mask(int cpu)
 {
-	if (shared_caches)
-		return cpu_l2_cache_mask(cpu);
-
-	if (has_big_cores)
-		return cpu_smallcore_mask(cpu);
-
-	return per_cpu(cpu_sibling_map, cpu);
+	return per_cpu(cpu_l2_cache_map, cpu);
 }
 
 #ifdef CONFIG_SCHED_SMT
@@ -867,11 +869,16 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
 }
 #endif
 
+static const struct cpumask *cpu_bigcore_mask(int cpu)
+{
+	return per_cpu(cpu_sibling_map, cpu);
+}
+
 static struct sched_domain_topology_level powerpc_topology[] = {
 #ifdef CONFIG_SCHED_SMT
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
-	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
+	{ cpu_bigcore_mask, SD_INIT_NAME(BIGCORE) },
 	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };
@@ -1313,7 +1320,6 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
 	unsigned int cpu = smp_processor_id();
-	struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
 	mmgrab(&init_mm);
 	current->active_mm = &init_mm;
@@ -1339,14 +1345,20 @@ void start_secondary(void *unused)
 	/* Update topology CPU masks */
 	add_cpu_to_masks(cpu);
 
-	if (has_big_cores)
-		sibling_mask = cpu_smallcore_mask;
 	/*
 	 * Check for any shared caches. Note that this must be done on a
 	 * per-core basis because one core in the pair might be disabled.
 	 */
-	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
-		shared_caches = true;
+	if (!shared_caches) {
+		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
+		struct cpumask *mask = cpu_l2_cache_mask(cpu);
+
+		if (has_big_cores)
+			sibling_mask = cpu_smallcore_mask;
+
+		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
+			shared_caches = true;
+	}
 
 	set_numa_node(numa_cpu_lookup_table[cpu]);
 	set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
@@ -1374,10 +1386,19 @@ int setup_profiling_timer(unsigned int multiplier)
 
 static void fixup_topology(void)
 {
+	if (shared_caches) {
+		pr_info("Using shared cache scheduler topology\n");
+		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
+#ifdef CONFIG_SCHED_DEBUG
+		powerpc_topology[bigcore_idx].name = "CACHE";
+#endif
+		powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
+	}
+
 #ifdef CONFIG_SCHED_SMT
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
-		powerpc_topology[0].mask = smallcore_smt_mask;
+		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
 	}
 #endif
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 07/10] Powerpc/numa: Detect support for coregroup
  2020-07-21 11:38 ` Srikar Dronamraju
@ 2020-07-21 11:38   ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Add support for grouping cores based on the device-tree classification.
- The last domain in the associativity domains always refers to the
core.
- If the primary reference domain happens to be the penultimate domain
in the associativity-domains device-tree property, then there are no
coregroups. However if it is not the penultimate domain, then there are
coregroups. There can be more than one coregroup. For now we are only
interested in the last, i.e. the smallest, coregroup.
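
For illustration, with hypothetical values: suppose min_common_depth is
2 and the device-tree property reads

	ibm,current-associativity-domains = <5 2 4 8 32 128>;

Then of_read_number(&domains[2], 1) = 4 possible nodes. The property
has 6 cells, so prop_length / sizeof(int) = 6, which is greater than
min_common_depth + 2 = 4, hence coregroup_enabled gets set. Had the
primary reference domain been the penultimate domain (say
min_common_depth = 4), 6 > 6 would be false and no coregroups would be
detected.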

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
Powerpc/numa: Detect support for coregroup
	Explained Coregroup in commit msg (Michael Ellerman)

 arch/powerpc/include/asm/smp.h |  1 +
 arch/powerpc/kernel/smp.c      |  1 +
 arch/powerpc/mm/numa.c         | 34 +++++++++++++++++++++-------------
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 49a25e2400f2..5bdc17a7049f 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -28,6 +28,7 @@
 extern int boot_cpuid;
 extern int spinning_secondaries;
 extern u32 *cpu_to_phys_id;
+extern bool coregroup_enabled;
 
 extern void cpu_die(void);
 extern int cpu_to_chip_id(int cpu);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 933ebdf97432..320e36a0ec0b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,6 +74,7 @@ static DEFINE_PER_CPU(int, cpu_state) = { 0 };
 
 struct task_struct *secondary_current;
 bool has_big_cores;
+bool coregroup_enabled;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index bc5b2e8112c8..3248160c0327 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -886,7 +886,9 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
 static void __init find_possible_nodes(void)
 {
 	struct device_node *rtas;
-	u32 numnodes, i;
+	const __be32 *domains;
+	int prop_length, max_nodes;
+	u32 i;
 
 	if (!numa_enabled)
 		return;
@@ -895,25 +897,31 @@ static void __init find_possible_nodes(void)
 	if (!rtas)
 		return;
 
-	if (of_property_read_u32_index(rtas, "ibm,current-associativity-domains",
-				min_common_depth, &numnodes)) {
-		/*
-		 * ibm,current-associativity-domains is a fairly recent
-		 * property. If it doesn't exist, then fallback on
-		 * ibm,max-associativity-domains. Current denotes what the
-		 * platform can support compared to max which denotes what the
-		 * Hypervisor can support.
-		 */
-		if (of_property_read_u32_index(rtas, "ibm,max-associativity-domains",
-				min_common_depth, &numnodes))
+	/*
+	 * ibm,current-associativity-domains is a fairly recent property. If
+	 * it doesn't exist, then fallback on ibm,max-associativity-domains.
+	 * Current denotes what the platform can support compared to max
+	 * which denotes what the Hypervisor can support.
+	 */
+	domains = of_get_property(rtas, "ibm,current-associativity-domains",
+					&prop_length);
+	if (!domains) {
+		domains = of_get_property(rtas, "ibm,max-associativity-domains",
+					&prop_length);
+		if (!domains)
 			goto out;
 	}
 
-	for (i = 0; i < numnodes; i++) {
+	max_nodes = of_read_number(&domains[min_common_depth], 1);
+	for (i = 0; i < max_nodes; i++) {
 		if (!node_possible(i))
 			node_set(i, node_possible_map);
 	}
 
+	prop_length /= sizeof(int);
+	if (prop_length > min_common_depth + 2)
+		coregroup_enabled = 1;
+
 out:
 	of_node_put(rtas);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 08/10] powerpc/smp: Allocate cpumask only after searching thread group
  2020-07-21 11:38 ` Srikar Dronamraju
@ 2020-07-21 11:38   ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

If allocated earlier and the search fails, then the cpumask needs to be
freed. However, cpu_l1_cache_map can instead be allocated after we
search the thread group.
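
A minimal sketch of the resulting pattern (simplified from the actual
function in the diff below):

	cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
	if (unlikely(cpu_group_start == -1))
		goto out;	/* nothing allocated yet, nothing to free */

	zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
				GFP_KERNEL, cpu_to_node(cpu));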

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 320e36a0ec0b..97b762a1944a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -797,10 +797,6 @@ static int init_cpu_l1_cache_map(int cpu)
 	if (err)
 		goto out;
 
-	zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
-				GFP_KERNEL,
-				cpu_to_node(cpu));
-
 	cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
 
 	if (unlikely(cpu_group_start == -1)) {
@@ -809,6 +805,9 @@ static int init_cpu_l1_cache_map(int cpu)
 		goto out;
 	}
 
+	zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
+				GFP_KERNEL, cpu_to_node(cpu));
+
 	for (i = first_thread; i < first_thread + threads_per_core; i++) {
 		int i_group_start = get_cpu_thread_group_start(i, &tg);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 09/10] Powerpc/smp: Create coregroup domain
  2020-07-21 11:38 ` Srikar Dronamraju
@ 2020-07-21 11:38   ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Add per-cpu coregroup maps and masks to create the coregroup domain.
If a coregroup doesn't exist, the coregroup domain will be degenerated
in favour of the SMT/CACHE domain.
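
As a rough sketch (names as in the patch), the topology levels now
stack as:

	SMT -> BIGCORE (or CACHE) -> MC -> DIE

When coregroups are absent, fixup_topology() points the MC level at
cpu_bigcore_mask, so MC spans the same CPUs as its child level and the
scheduler degenerates it.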

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
Powerpc/smp: Create coregroup domain
	Moved coregroup topology fixup to fixup_topology (Gautham)

 arch/powerpc/include/asm/topology.h | 10 ++++++++
 arch/powerpc/kernel/smp.c           | 38 +++++++++++++++++++++++++++++
 arch/powerpc/mm/numa.c              |  5 ++++
 3 files changed, 53 insertions(+)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index f0b6300e7dd3..6609174918ab 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -88,12 +88,22 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 
 #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
 extern int find_and_online_cpu_nid(int cpu);
+extern int cpu_to_coregroup_id(int cpu);
 #else
 static inline int find_and_online_cpu_nid(int cpu)
 {
 	return 0;
 }
 
+static inline int cpu_to_coregroup_id(int cpu)
+{
+#ifdef CONFIG_SMP
+	return cpu_to_core_id(cpu);
+#else
+	return 0;
+#endif
+}
+
 #endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
 
 #include <asm-generic/topology.h>
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 97b762a1944a..a7e1366b7fd3 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -80,6 +80,7 @@ DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_coregroup_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
@@ -91,6 +92,7 @@ enum {
 	smt_idx,
 #endif
 	bigcore_idx,
+	mc_idx,
 	die_idx,
 };
 
@@ -869,6 +871,21 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
 }
 #endif
 
+static struct cpumask *cpu_coregroup_mask(int cpu)
+{
+	return per_cpu(cpu_coregroup_map, cpu);
+}
+
+static bool has_coregroup_support(void)
+{
+	return coregroup_enabled;
+}
+
+static const struct cpumask *cpu_mc_mask(int cpu)
+{
+	return cpu_coregroup_mask(cpu);
+}
+
 static const struct cpumask *cpu_bigcore_mask(int cpu)
 {
 	return per_cpu(cpu_sibling_map, cpu);
@@ -879,6 +896,7 @@ static struct sched_domain_topology_level powerpc_topology[] = {
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
 	{ cpu_bigcore_mask, SD_INIT_NAME(BIGCORE) },
+	{ cpu_mc_mask, SD_INIT_NAME(MC) },
 	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };
@@ -927,6 +945,10 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 					GFP_KERNEL, node);
 		zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
 					GFP_KERNEL, node);
+		if (has_coregroup_support())
+			zalloc_cpumask_var_node(&per_cpu(cpu_coregroup_map, cpu),
+						GFP_KERNEL, node);
+
 #ifdef CONFIG_NEED_MULTIPLE_NODES
 		/*
 		 * numa_node_id() works after this.
@@ -944,6 +966,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	cpumask_set_cpu(boot_cpuid, cpu_l2_cache_mask(boot_cpuid));
 	cpumask_set_cpu(boot_cpuid, cpu_core_mask(boot_cpuid));
 
+	if (has_coregroup_support())
+		cpumask_set_cpu(boot_cpuid, cpu_coregroup_mask(boot_cpuid));
+
 	init_big_cores();
 	if (has_big_cores) {
 		cpumask_set_cpu(boot_cpuid,
@@ -1235,6 +1260,8 @@ static void remove_cpu_from_masks(int cpu)
 		set_cpus_unrelated(cpu, i, cpu_sibling_mask);
 		if (has_big_cores)
 			set_cpus_unrelated(cpu, i, cpu_smallcore_mask);
+		if (has_coregroup_support())
+			set_cpus_unrelated(cpu, i, cpu_coregroup_mask);
 	}
 }
 #endif
@@ -1295,6 +1322,14 @@ static void add_cpu_to_masks(int cpu)
 	add_cpu_to_smallcore_masks(cpu);
 	update_mask_by_l2(cpu, cpu_l2_cache_mask);
 
+	if (has_coregroup_support()) {
+		cpumask_set_cpu(cpu, cpu_coregroup_mask(cpu));
+		for_each_cpu(i, cpu_online_mask) {
+			if (cpu_to_coregroup_id(cpu) == cpu_to_coregroup_id(i))
+				set_cpus_related(cpu, i, cpu_coregroup_mask);
+		}
+	}
+
 	if (pkg_id == -1) {
 		struct cpumask *(*mask)(int) = cpu_sibling_mask;
 
@@ -1386,6 +1421,9 @@ int setup_profiling_timer(unsigned int multiplier)
 
 static void fixup_topology(void)
 {
+	if (!has_coregroup_support())
+		powerpc_topology[mc_idx].mask = cpu_bigcore_mask;
+
 	if (shared_caches) {
 		pr_info("Using shared cache scheduler topology\n");
 		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 3248160c0327..ef8aa580da21 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1216,6 +1216,11 @@ int find_and_online_cpu_nid(int cpu)
 	return new_nid;
 }
 
+int cpu_to_coregroup_id(int cpu)
+{
+	return cpu_to_core_id(cpu);
+}
+
 static int topology_update_init(void)
 {
 	topology_inited = 1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 10/10] powerpc/smp: Implement cpu_to_coregroup_id
  2020-07-21 11:38 ` Srikar Dronamraju
@ 2020-07-21 11:38   ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-21 11:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Look up the coregroup id from the associativity array.

If unable to detect the coregroup id, fall back on the core id. This
way the sched_domain degenerates and an extra sched domain is not
created.

Ideally this function should have been implemented in
arch/powerpc/kernel/smp.c. However, implementing it in mm/numa.c means
we don't need to find the primary domain again.

If the device-tree mentions more than one coregroup, then the kernel
implements only the last, i.e. the smallest, coregroup, which currently
corresponds to the penultimate domain in the device-tree.
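
A worked example with hypothetical values: if min_common_depth is 2 and
vphn_get_associativity() fills in

	associativity = <5 1 2 7 42 120>

then index = 5 (the first cell holds the number of entries), 5 is
greater than min_common_depth + 1 = 3, and the function returns
of_read_number(&associativity[4], 1) = 42, the penultimate domain, as
the coregroup id.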

Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Nick Piggin <npiggin@au1.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Michael Neuling <mikey@linux.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Cc: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1 -> v2:
powerpc/smp: Implement cpu_to_coregroup_id
	Move coregroup_enabled before getting associativity (Gautham)

 arch/powerpc/mm/numa.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index ef8aa580da21..ae57b68beaee 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1218,6 +1218,26 @@ int find_and_online_cpu_nid(int cpu)
 
 int cpu_to_coregroup_id(int cpu)
 {
+	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
+	int index;
+
+	if (cpu < 0 || cpu > nr_cpu_ids)
+		return -1;
+
+	if (!coregroup_enabled)
+		goto out;
+
+	if (!firmware_has_feature(FW_FEATURE_VPHN))
+		goto out;
+
+	if (vphn_get_associativity(cpu, associativity))
+		goto out;
+
+	index = of_read_number(associativity, 1);
+	if (index > min_common_depth + 1)
+		return of_read_number(&associativity[index - 1], 1);
+
+out:
 	return cpu_to_core_id(cpu);
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 02/10] powerpc/smp: Merge Power9 topology with Power topology
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  5:48     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  5:48 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

On Tue, Jul 21, 2020 at 05:08:06PM +0530, Srikar Dronamraju wrote:
> A new sched_domain_topology_level was added just for Power9. However
> the same can be achieved by merging powerpc_topology with
> power9_topology, making the code simpler, especially when adding a new
> sched domain.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Merge Power9 topology with Power topology
> 	Replaced a reference to cpu_smt_mask with per_cpu(cpu_sibling_map, cpu)
> 	since cpu_smt_mask is only defined under CONFIG_SCHED_SMT
> 
>  arch/powerpc/kernel/smp.c | 33 ++++++++++-----------------------
>  1 file changed, 10 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 680c0edcc59d..0e0b118d9b6e 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1315,7 +1315,7 @@ int setup_profiling_timer(unsigned int multiplier)
>  }
> 
>  #ifdef CONFIG_SCHED_SMT
> -/* cpumask of CPUs with asymetric SMT dependancy */
> +/* cpumask of CPUs with asymmetric SMT dependency */
>  static int powerpc_smt_flags(void)
>  {
>  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> @@ -1328,14 +1328,6 @@ static int powerpc_smt_flags(void)
>  }
>  #endif
> 
> -static struct sched_domain_topology_level powerpc_topology[] = {
> -#ifdef CONFIG_SCHED_SMT
> -	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> -#endif
> -	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
> -	{ NULL, },
> -};
> -
>  /*
>   * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
>   * This topology makes it *much* cheaper to migrate tasks between adjacent cores
> @@ -1353,7 +1345,13 @@ static int powerpc_shared_cache_flags(void)
>   */
>  static const struct cpumask *shared_cache_mask(int cpu)
>  {
> -	return cpu_l2_cache_mask(cpu);
> +	if (shared_caches)
> +		return cpu_l2_cache_mask(cpu);
> +
> +	if (has_big_cores)
> +		return cpu_smallcore_mask(cpu);
> +
> +	return per_cpu(cpu_sibling_map, cpu);
>  }


It might be helpful to enumerate the consequences of this change:

With this patch, on POWER7 and POWER8

   SMT and CACHE domains' cpumasks will both be
   per_cpu(cpu_sibling_map, cpu).

   On POWER7 SMT level flags has the following
   (SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING)

   On POWER8 SMT level flags has the following
   (SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES).

   On both POWER7 and POWER8, CACHE level flags only has
   SD_SHARE_PKG_RESOURCES

   Thus, on both POWER7 and POWER8, since the SMT and CACHE cpumasks
   are the same and since CACHE has no additional flags which SMT does
   not, the parent domain CACHE will be degenerated.

   Hence we will have SMT --> DIE --> NUMA, as before the patch. So
   the patch introduces no behavioural change; the only change is an
   additional degeneration of the CACHE domain.

On POWER9 : Baremetal.
   SMT level cpumask = per_cpu(cpu_sibling_map, cpu)

   Since the caches are shared by a pair of cores,
   CACHE level cpumask = cpu_l2_cache_mask(cpu)

   Thus, we will have SMT --> CACHE --> DIE --> NUMA as before.  No
   behavioural change.

On POWER9 : LPAR
   SMT level cpumask = cpu_smallcore_mask(cpu).

   Since the caches are shared,
   CACHE level cpumask = cpu_l2_cache_mask(cpu).

   Thus, we will have SMT --> CACHE --> DIE --> NUMA as before.  Again
   no change in behaviour.
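
For reference, the degeneration relied upon above is decided by
sd_parent_degenerate() in kernel/sched/topology.c. Very roughly, as a
simplified sketch rather than the exact kernel check:

	/*
	 * sketch: the parent domain collapses when it spans the same
	 * CPUs as its child and adds no behaviour of its own
	 */
	degenerate = cpumask_equal(sched_domain_span(parent),
				   sched_domain_span(child)) &&
		     parent_adds_nothing(parent, child); /* hypothetical helper */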

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 04/10] powerpc/smp: Enable small core scheduling sooner
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  5:59     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  5:59 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Hello Srikar,

On Tue, Jul 21, 2020 at 05:08:08PM +0530, Srikar Dronamraju wrote:
> Enable small core scheduling as soon as we detect that we are in a
> system that supports thread group. Doing so would avoid a redundant
> check.

The patch looks good to me. However, I think the commit message still
reflects the v1 code, where we were moving the functionality from
smp_cpus_done() to init_big_cores().

In this version we are moving it to a helper function that collates all
fixups to the topology.

> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Enable small core scheduling sooner
> 	Restored the previous info msg (Jordan)
> 	Moved big core topology fixup to fixup_topology (Gautham)
> 
>  arch/powerpc/kernel/smp.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 1ce95da00cb6..72f16dc0cb26 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1370,6 +1370,16 @@ int setup_profiling_timer(unsigned int multiplier)
>  	return 0;
>  }
> 
> +static void fixup_topology(void)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +	if (has_big_cores) {
> +		pr_info("Big cores detected but using small core scheduling\n");
> +		powerpc_topology[0].mask = smallcore_smt_mask;
> +	}
> +#endif
> +}
> +
>  void __init smp_cpus_done(unsigned int max_cpus)
>  {
>  	/*
> @@ -1383,12 +1393,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
> 
>  	dump_numa_cpu_topology();
> 
> -#ifdef CONFIG_SCHED_SMT
> -	if (has_big_cores) {
> -		pr_info("Big cores detected but using small core scheduling\n");
> -		powerpc_topology[0].mask = smallcore_smt_mask;
> -	}
> -#endif
> +	fixup_topology();
>  	set_sched_topology(powerpc_topology);
>  }
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  6:21     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  6:21 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Hi Srikar,

On Tue, Jul 21, 2020 at 05:08:09PM +0530, Srikar Dronamraju wrote:
> Current code assumes that the cpumask of cpus sharing an l2-cache will
> always be a superset of cpu_sibling_mask.
> 
> Let's stop that assumption. cpu_l2_cache_mask is a superset of
> cpu_sibling_mask if and only if shared_caches is set.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Dont assume l2-cache to be superset of sibling
> 	Set cpumask after verifying l2-cache. (Gautham)
> 
>  arch/powerpc/kernel/smp.c | 28 +++++++++++++++-------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 72f16dc0cb26..57468877499a 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
>  	if (!l2_cache)
>  		return false;
> 
> +	cpumask_set_cpu(cpu, mask_fn(cpu));


Ok, we need to do this because "cpu" is not yet set in the
cpu_online_mask. Prior to your patch the "cpu" was getting set in
cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
the patch.


>  	for_each_cpu(i, cpu_online_mask) {
>  		/*
>  		 * when updating the marks the current CPU has not been marked
> @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
>  	 * add it to it's own thread sibling mask.
>  	 */
>  	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> +	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
> 
>  	for (i = first_thread; i < first_thread + threads_per_core; i++)
>  		if (cpu_online(i))
>  			set_cpus_related(i, cpu, cpu_sibling_mask);
> 
>  	add_cpu_to_smallcore_masks(cpu);
> -	/*
> -	 * Copy the thread sibling mask into the cache sibling mask
> -	 * and mark any CPUs that share an L2 with this CPU.
> -	 */
> -	for_each_cpu(i, cpu_sibling_mask(cpu))
> -		set_cpus_related(cpu, i, cpu_l2_cache_mask);
>  	update_mask_by_l2(cpu, cpu_l2_cache_mask);
> 
> -	/*
> -	 * Copy the cache sibling mask into core sibling mask and mark
> -	 * any CPUs on the same chip as this CPU.
> -	 */
> -	for_each_cpu(i, cpu_l2_cache_mask(cpu))
> -		set_cpus_related(cpu, i, cpu_core_mask);
> +	if (pkg_id == -1) {

I suppose this "if" condition is an optimization, since if pkg_id != -1,
we anyway set these CPUs in the cpu_core_mask below.

However...

> +		struct cpumask *(*mask)(int) = cpu_sibling_mask;
> +
> +		/*
> +		 * Copy the sibling mask into core sibling mask and
> +		 * mark any CPUs on the same chip as this CPU.
> +		 */
> +		if (shared_caches)
> +			mask = cpu_l2_cache_mask;
> +
> +		for_each_cpu(i, mask(cpu))
> +			set_cpus_related(cpu, i, cpu_core_mask);
> 
> -	if (pkg_id == -1)
>  		return;
> +	}


... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
"cpu" in the cpu_core_mask(cpu) in the for-loop below ?


> 
>  	for_each_cpu(i, cpu_online_mask)
>  		if (get_physical_package_id(i) == pkg_id)


Before this patch it was unconditionally getting set in
cpu_core_mask(cpu) because of the fact that it was set in
cpu_l2_cache_mask(cpu) and we were unconditionally setting all the
CPUs in cpu_l2_cache_mask(cpu) in cpu_core_mask(cpu).

What am I missing ?

> -- 
> 2.17.1
>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
@ 2020-07-22  6:21     ` Gautham R Shenoy
  0 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  6:21 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Michael Ellerman, Peter Zijlstra, Jordan Niethe,
	Anton Blanchard, LKML, Ingo Molnar, Nick Piggin, linuxppc-dev,
	Valentin Schneider

Hi Srikar,

On Tue, Jul 21, 2020 at 05:08:09PM +0530, Srikar Dronamraju wrote:
> Current code assumes that the cpumask of cpus sharing an l2-cache will
> always be a superset of cpu_sibling_mask.
> 
> Let's drop that assumption. cpu_l2_cache_mask is a superset of
> cpu_sibling_mask if and only if shared_caches is set.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Dont assume l2-cache to be superset of sibling
> 	Set cpumask after verifying l2-cache. (Gautham)
> 
>  arch/powerpc/kernel/smp.c | 28 +++++++++++++++-------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 72f16dc0cb26..57468877499a 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
>  	if (!l2_cache)
>  		return false;
> 
> +	cpumask_set_cpu(cpu, mask_fn(cpu));


Ok, we need to do this because "cpu" is not yet set in the
cpu_online_mask. Prior to your patch the "cpu" was getting set in
cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
the patch.


>  	for_each_cpu(i, cpu_online_mask) {
>  		/*
>  		 * when updating the marks the current CPU has not been marked
> @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
>  	 * add it to it's own thread sibling mask.
>  	 */
>  	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> +	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
> 
>  	for (i = first_thread; i < first_thread + threads_per_core; i++)
>  		if (cpu_online(i))
>  			set_cpus_related(i, cpu, cpu_sibling_mask);
> 
>  	add_cpu_to_smallcore_masks(cpu);
> -	/*
> -	 * Copy the thread sibling mask into the cache sibling mask
> -	 * and mark any CPUs that share an L2 with this CPU.
> -	 */
> -	for_each_cpu(i, cpu_sibling_mask(cpu))
> -		set_cpus_related(cpu, i, cpu_l2_cache_mask);
>  	update_mask_by_l2(cpu, cpu_l2_cache_mask);
> 
> -	/*
> -	 * Copy the cache sibling mask into core sibling mask and mark
> -	 * any CPUs on the same chip as this CPU.
> -	 */
> -	for_each_cpu(i, cpu_l2_cache_mask(cpu))
> -		set_cpus_related(cpu, i, cpu_core_mask);
> +	if (pkg_id == -1) {

I suppose this "if" condition is an optimization, since if pkg_id != -1,
we anyway set these CPUs in the cpu_core_mask below.

However...

> +		struct cpumask *(*mask)(int) = cpu_sibling_mask;
> +
> +		/*
> +		 * Copy the sibling mask into core sibling mask and
> +		 * mark any CPUs on the same chip as this CPU.
> +		 */
> +		if (shared_caches)
> +			mask = cpu_l2_cache_mask;
> +
> +		for_each_cpu(i, mask(cpu))
> +			set_cpus_related(cpu, i, cpu_core_mask);
> 
> -	if (pkg_id == -1)
>  		return;
> +	}


... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
"cpu" in the cpu_core_mask(cpu) in the for-loop below ?


> 
>  	for_each_cpu(i, cpu_online_mask)
>  		if (get_physical_package_id(i) == pkg_id)


Before this patch it was unconditionally getting set in
cpu_core_mask(cpu) because of the fact that it was set in
cpu_l2_cache_mask(cpu) and we were unconditionally setting all the
CPUs in cpu_l2_cache_mask(cpu) in cpu_core_mask(cpu).

What am I missing ?

> -- 
> 2.17.1
>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  6:56     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  6:56 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Hello Srikar,

On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> Currently "CACHE" domain happens to be the 2nd sched domain as per
> powerpc_topology. This domain will collapse if cpumask of l2-cache is
> same as SMT domain. However we could generalize this domain such that it
> could be either a "CACHE" domain or a "BIGCORE" domain.
> 
> While setting up the "CACHE" domain, check if shared_cache is already
> set.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Generalize 2nd sched domain
> 	Moved shared_cache topology fixup to fixup_topology (Gautham)
>

Just one comment below.

>  arch/powerpc/kernel/smp.c | 49 ++++++++++++++++++++++++++++-----------
>  1 file changed, 35 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 57468877499a..933ebdf97432 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -85,6 +85,14 @@ EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
>  EXPORT_PER_CPU_SYMBOL(cpu_core_map);
>  EXPORT_SYMBOL_GPL(has_big_cores);
> 
> +enum {
> +#ifdef CONFIG_SCHED_SMT
> +	smt_idx,
> +#endif
> +	bigcore_idx,
> +	die_idx,
> +};
> +


[..snip..]

> @@ -1339,14 +1345,20 @@ void start_secondary(void *unused)
>  	/* Update topology CPU masks */
>  	add_cpu_to_masks(cpu);
> 
> -	if (has_big_cores)
> -		sibling_mask = cpu_smallcore_mask;
>  	/*
>  	 * Check for any shared caches. Note that this must be done on a
>  	 * per-core basis because one core in the pair might be disabled.
>  	 */
> -	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
> -		shared_caches = true;
> +	if (!shared_caches) {
> +		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> +		struct cpumask *mask = cpu_l2_cache_mask(cpu);
> +
> +		if (has_big_cores)
> +			sibling_mask = cpu_smallcore_mask;
> +
> +		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
> +			shared_caches = true;

At the risk of repeating my comment to the v1 version of the patch, we
have shared caches only if l2_cache_mask(cpu) is a strict superset of
sibling_mask(cpu).

"cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu))" does not
capture this.

Could we please use

      if (!cpumask_equal(sibling_mask(cpu), mask) &&
      	  cpumask_subset(sibling_mask(cpu), mask)) {
      }

?


> +	}
> 
>  	set_numa_node(numa_cpu_lookup_table[cpu]);
>  	set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
> @@ -1374,10 +1386,19 @@ int setup_profiling_timer(unsigned int multiplier)
> 
>  static void fixup_topology(void)
>  {
> +	if (shared_caches) {
> +		pr_info("Using shared cache scheduler topology\n");
> +		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
> +#ifdef CONFIG_SCHED_DEBUG
> +		powerpc_topology[bigcore_idx].name = "CACHE";
> +#endif
> +		powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
> +	}
> +
>  #ifdef CONFIG_SCHED_SMT
>  	if (has_big_cores) {
>  		pr_info("Big cores detected but using small core scheduling\n");
> -		powerpc_topology[0].mask = smallcore_smt_mask;
> +		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
>  	}
>  #endif


Otherwise the patch looks good to me.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
@ 2020-07-22  6:56     ` Gautham R Shenoy
  0 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  6:56 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Michael Ellerman, Peter Zijlstra, Jordan Niethe,
	Anton Blanchard, LKML, Ingo Molnar, Nick Piggin, linuxppc-dev,
	Valentin Schneider

Hello Srikar,

On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> Currently "CACHE" domain happens to be the 2nd sched domain as per
> powerpc_topology. This domain will collapse if cpumask of l2-cache is
> same as SMT domain. However we could generalize this domain such that it
> could be either a "CACHE" domain or a "BIGCORE" domain.
> 
> While setting up the "CACHE" domain, check if shared_cache is already
> set.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1 -> v2:
> powerpc/smp: Generalize 2nd sched domain
> 	Moved shared_cache topology fixup to fixup_topology (Gautham)
>

Just one comment below.

>  arch/powerpc/kernel/smp.c | 49 ++++++++++++++++++++++++++++-----------
>  1 file changed, 35 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 57468877499a..933ebdf97432 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -85,6 +85,14 @@ EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
>  EXPORT_PER_CPU_SYMBOL(cpu_core_map);
>  EXPORT_SYMBOL_GPL(has_big_cores);
> 
> +enum {
> +#ifdef CONFIG_SCHED_SMT
> +	smt_idx,
> +#endif
> +	bigcore_idx,
> +	die_idx,
> +};
> +


[..snip..]

> @@ -1339,14 +1345,20 @@ void start_secondary(void *unused)
>  	/* Update topology CPU masks */
>  	add_cpu_to_masks(cpu);
> 
> -	if (has_big_cores)
> -		sibling_mask = cpu_smallcore_mask;
>  	/*
>  	 * Check for any shared caches. Note that this must be done on a
>  	 * per-core basis because one core in the pair might be disabled.
>  	 */
> -	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
> -		shared_caches = true;
> +	if (!shared_caches) {
> +		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> +		struct cpumask *mask = cpu_l2_cache_mask(cpu);
> +
> +		if (has_big_cores)
> +			sibling_mask = cpu_smallcore_mask;
> +
> +		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
> +			shared_caches = true;

At the risk of repeating my comment to the v1 version of the patch, we
have shared caches only if l2_cache_mask(cpu) is a strict superset of
sibling_mask(cpu).

"cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu))" does not
capture this.

Could we please use

      if (!cpumask_equal(sibling_mask(cpu), mask) &&
      	  cpumask_subset(sibling_mask(cpu), mask)) {
      }

?


> +	}
> 
>  	set_numa_node(numa_cpu_lookup_table[cpu]);
>  	set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
> @@ -1374,10 +1386,19 @@ int setup_profiling_timer(unsigned int multiplier)
> 
>  static void fixup_topology(void)
>  {
> +	if (shared_caches) {
> +		pr_info("Using shared cache scheduler topology\n");
> +		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
> +#ifdef CONFIG_SCHED_DEBUG
> +		powerpc_topology[bigcore_idx].name = "CACHE";
> +#endif
> +		powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
> +	}
> +
>  #ifdef CONFIG_SCHED_SMT
>  	if (has_big_cores) {
>  		pr_info("Big cores detected but using small core scheduling\n");
> -		powerpc_topology[0].mask = smallcore_smt_mask;
> +		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
>  	}
>  #endif


Otherwise the patch looks good to me.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
  2020-07-22  6:21     ` Gautham R Shenoy
@ 2020-07-22  6:57       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  6:57 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Vaidyanathan Srinivasan, Jordan Niethe

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 11:51:14]:

> Hi Srikar,
> 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 72f16dc0cb26..57468877499a 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
> >  	if (!l2_cache)
> >  		return false;
> > 
> > +	cpumask_set_cpu(cpu, mask_fn(cpu));
> 
> 
> Ok, we need to do this because "cpu" is not yet set in the
> cpu_online_mask. Prior to your patch the "cpu" was getting set in
> cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
> the patch.
> 

Right.

> 
> >  	for_each_cpu(i, cpu_online_mask) {
> >  		/*
> >  		 * when updating the marks the current CPU has not been marked
> > @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
> >  	 * add it to it's own thread sibling mask.
> >  	 */
> >  	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> > +	cpumask_set_cpu(cpu, cpu_core_mask(cpu));

Note: Above, we are explicitly setting the cpu_core_mask.

> > 
> >  	for (i = first_thread; i < first_thread + threads_per_core; i++)
> >  		if (cpu_online(i))
> >  			set_cpus_related(i, cpu, cpu_sibling_mask);
> > 
> >  	add_cpu_to_smallcore_masks(cpu);
> > -	/*
> > -	 * Copy the thread sibling mask into the cache sibling mask
> > -	 * and mark any CPUs that share an L2 with this CPU.
> > -	 */
> > -	for_each_cpu(i, cpu_sibling_mask(cpu))
> > -		set_cpus_related(cpu, i, cpu_l2_cache_mask);
> >  	update_mask_by_l2(cpu, cpu_l2_cache_mask);
> > 
> > -	/*
> > -	 * Copy the cache sibling mask into core sibling mask and mark
> > -	 * any CPUs on the same chip as this CPU.
> > -	 */
> > -	for_each_cpu(i, cpu_l2_cache_mask(cpu))
> > -		set_cpus_related(cpu, i, cpu_core_mask);
> > +	if (pkg_id == -1) {
> 
> I suppose this "if" condition is an optimization, since if pkg_id != -1,
> we anyway set these CPUs in the cpu_core_mask below.
> 
> However...

This is not just an optimization.
The removed hunk would only work if cpu_l2_cache_mask is bigger than
cpu_sibling_mask (this was the previous assumption that we want to break).
If cpu_sibling_mask is bigger than cpu_l2_cache_mask and pkg_id is -1,
then setting only cpu_l2_cache_mask in cpu_core_mask will result in a
broken topology.
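
For a concrete (hypothetical) example of that breakage: suppose threads
0-7 form a core (cpu_sibling_mask) but only threads 0-3 share the
l2-cache, and pkg_id is -1:

	/*
	 * old code: cpu_core_mask(0) = cpu_l2_cache_mask(0) = {0..3},
	 *           while cpu_sibling_mask(0) = {0..7}, so the SMT level
	 *           would be wider than the core level: a broken topology.
	 * new code: seeds cpu_core_mask from the wider mask -- the sibling
	 *           mask, unless shared_caches is set.
	 */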

> 
> > +		struct cpumask *(*mask)(int) = cpu_sibling_mask;
> > +
> > +		/*
> > +		 * Copy the sibling mask into core sibling mask and
> > +		 * mark any CPUs on the same chip as this CPU.
> > +		 */
> > +		if (shared_caches)
> > +			mask = cpu_l2_cache_mask;
> > +
> > +		for_each_cpu(i, mask(cpu))
> > +			set_cpus_related(cpu, i, cpu_core_mask);
> > 
> > -	if (pkg_id == -1)
> >  		return;
> > +	}
> 
> 
> ... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
> "cpu" in the cpu_core_mask(cpu) in the for-loop below ?
> 
> 

As noted above, we are setting it explicitly before, so we don't miss
the cpu; hence this is no different from before.
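
An abridged sketch of the relevant ordering in add_cpu_to_masks() (names
as in the patch; intermediate code elided, and the loop body inferred
from the surrounding discussion):

	static void add_cpu_to_masks(int cpu)
	{
		...
		cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
		cpumask_set_cpu(cpu, cpu_core_mask(cpu));	/* self set up front */
		...
		/*
		 * This loop walks cpu_online_mask, which does not yet
		 * contain "cpu" -- but self was already handled above.
		 */
		for_each_cpu(i, cpu_online_mask)
			if (get_physical_package_id(i) == pkg_id)
				set_cpus_related(cpu, i, cpu_core_mask);
	}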

> --
> Thanks and Regards
> gautham.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
@ 2020-07-22  6:57       ` Srikar Dronamraju
  0 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  6:57 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Nathan Lynch, Oliver OHalloran, Michael Neuling,
	Michael Ellerman, Peter Zijlstra, Jordan Niethe, Anton Blanchard,
	LKML, Ingo Molnar, Nick Piggin, linuxppc-dev, Valentin Schneider

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 11:51:14]:

> Hi Srikar,
> 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 72f16dc0cb26..57468877499a 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
> >  	if (!l2_cache)
> >  		return false;
> > 
> > +	cpumask_set_cpu(cpu, mask_fn(cpu));
> 
> 
> Ok, we need to do this because "cpu" is not yet set in the
> cpu_online_mask. Prior to your patch the "cpu" was getting set in
> cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
> the patch.
> 

Right.

> 
> >  	for_each_cpu(i, cpu_online_mask) {
> >  		/*
> >  		 * when updating the marks the current CPU has not been marked
> > @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
> >  	 * add it to it's own thread sibling mask.
> >  	 */
> >  	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> > +	cpumask_set_cpu(cpu, cpu_core_mask(cpu));

Note: Above, we are explicitly setting the cpu_core_mask.

> > 
> >  	for (i = first_thread; i < first_thread + threads_per_core; i++)
> >  		if (cpu_online(i))
> >  			set_cpus_related(i, cpu, cpu_sibling_mask);
> > 
> >  	add_cpu_to_smallcore_masks(cpu);
> > -	/*
> > -	 * Copy the thread sibling mask into the cache sibling mask
> > -	 * and mark any CPUs that share an L2 with this CPU.
> > -	 */
> > -	for_each_cpu(i, cpu_sibling_mask(cpu))
> > -		set_cpus_related(cpu, i, cpu_l2_cache_mask);
> >  	update_mask_by_l2(cpu, cpu_l2_cache_mask);
> > 
> > -	/*
> > -	 * Copy the cache sibling mask into core sibling mask and mark
> > -	 * any CPUs on the same chip as this CPU.
> > -	 */
> > -	for_each_cpu(i, cpu_l2_cache_mask(cpu))
> > -		set_cpus_related(cpu, i, cpu_core_mask);
> > +	if (pkg_id == -1) {
> 
> I suppose this "if" condition is an optimization, since if pkg_id != -1,
> we anyway set these CPUs in the cpu_core_mask below.
> 
> However...

This is not just an optimization.
The removed hunk would only work if cpu_l2_cache_mask is bigger than
cpu_sibling_mask (this was the previous assumption that we want to break).
If cpu_sibling_mask is bigger than cpu_l2_cache_mask and pkg_id is -1,
then setting only cpu_l2_cache_mask in cpu_core_mask will result in a
broken topology.

> 
> > +		struct cpumask *(*mask)(int) = cpu_sibling_mask;
> > +
> > +		/*
> > +		 * Copy the sibling mask into core sibling mask and
> > +		 * mark any CPUs on the same chip as this CPU.
> > +		 */
> > +		if (shared_caches)
> > +			mask = cpu_l2_cache_mask;
> > +
> > +		for_each_cpu(i, mask(cpu))
> > +			set_cpus_related(cpu, i, cpu_core_mask);
> > 
> > -	if (pkg_id == -1)
> >  		return;
> > +	}
> 
> 
> ... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
> "cpu" in the cpu_core_mask(cpu) in the for-loop below ?
> 
> 

As noted above, we are setting it explicitly before, so we don't miss
the cpu; hence this is no different from before.

> --
> Thanks and Regards
> gautham.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 04/10] powerpc/smp: Enable small core scheduling sooner
  2020-07-22  5:59     ` Gautham R Shenoy
@ 2020-07-22  6:59       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  6:59 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Vaidyanathan Srinivasan, Jordan Niethe

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 11:29:25]:

> Hello Srikar,
> 
> On Tue, Jul 21, 2020 at 05:08:08PM +0530, Srikar Dronamraju wrote:
> > Enable small core scheduling as soon as we detect that we are in a
> > system that supports thread groups. Doing so would avoid a redundant
> > check.
> 
> The patch looks good to me. However, I think the commit message still
> reflects the v1 code, where we were moving the functionality from
> smp_cpus_done() to init_big_cores().
> 
> In this version we are moving it to a helper function that collates all
> fixups to the topology.
> 

Right.
Thanks for pointing it out, will correct this.


-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 04/10] powerpc/smp: Enable small core scheduling sooner
@ 2020-07-22  6:59       ` Srikar Dronamraju
  0 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  6:59 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Nathan Lynch, Oliver OHalloran, Michael Neuling,
	Michael Ellerman, Peter Zijlstra, Jordan Niethe, Anton Blanchard,
	LKML, Ingo Molnar, Nick Piggin, linuxppc-dev, Valentin Schneider

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 11:29:25]:

> Hello Srikar,
> 
> On Tue, Jul 21, 2020 at 05:08:08PM +0530, Srikar Dronamraju wrote:
> > Enable small core scheduling as soon as we detect that we are in a
> > system that supports thread groups. Doing so would avoid a redundant
> > check.
> 
> The patch looks good to me. However, I think the commit message still
> reflects the v1 code, where we were moving the functionality from
> smp_cpus_done() to init_big_cores().
> 
> In this version we are moving it to a helper function that collates all
> fixups to the topology.
> 

Right.
Thanks for pointing it out, will correct this.


-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 09/10] Powerpc/smp: Create coregroup domain
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  7:04     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  7:04 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Hi Srikar,

On Tue, Jul 21, 2020 at 05:08:13PM +0530, Srikar Dronamraju wrote:
> Add percpu coregroup maps and masks to create the coregroup domain.
> If a coregroup doesn't exist, the coregroup domain will degenerate
> in favour of the SMT/CACHE domain.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

A query below.

> ---
> Changelog v1 -> v2:
> Powerpc/smp: Create coregroup domain
> 	Moved coregroup topology fixup to fixup_topology (Gautham)
> 
>  arch/powerpc/include/asm/topology.h | 10 ++++++++
>  arch/powerpc/kernel/smp.c           | 38 +++++++++++++++++++++++++++++
>  arch/powerpc/mm/numa.c              |  5 ++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index f0b6300e7dd3..6609174918ab 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h

[..snip..]

> @@ -91,6 +92,7 @@ enum {
>  	smt_idx,
>  #endif
>  	bigcore_idx,
> +	mc_idx,
>  	die_idx,
>  };
> 


[..snip..]

> @@ -879,6 +896,7 @@ static struct sched_domain_topology_level powerpc_topology[] = {
>  	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
>  #endif
>  	{ cpu_bigcore_mask, SD_INIT_NAME(BIGCORE) },
> +	{ cpu_mc_mask, SD_INIT_NAME(MC) },
>  	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
>  	{ NULL, },
>  };


[..snip..]

> @@ -1386,6 +1421,9 @@ int setup_profiling_timer(unsigned int multiplier)
> 
>  static void fixup_topology(void)
>  {
> +	if (!has_coregroup_support())
> +		powerpc_topology[mc_idx].mask = cpu_bigcore_mask;
> +

Shouldn't we move this condition after doing the fixup for shared
caches ? Because if we have shared_caches, but not core_group, then we
want the coregroup domain to degenerate correctly.


>  	if (shared_caches) {
>  		pr_info("Using shared cache scheduler topology\n");
>  		powerpc_topology[bigcore_idx].mask = shared_cache_mask;


--
Thanks and regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 09/10] Powerpc/smp: Create coregroup domain
@ 2020-07-22  7:04     ` Gautham R Shenoy
  0 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  7:04 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Michael Ellerman, Peter Zijlstra, Jordan Niethe,
	Anton Blanchard, LKML, Ingo Molnar, Nick Piggin, linuxppc-dev,
	Valentin Schneider

Hi Srikar,

On Tue, Jul 21, 2020 at 05:08:13PM +0530, Srikar Dronamraju wrote:
> Add percpu coregroup maps and masks to create the coregroup domain.
> If a coregroup doesn't exist, the coregroup domain will degenerate
> in favour of the SMT/CACHE domain.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

A query below.

> ---
> Changelog v1 -> v2:
> Powerpc/smp: Create coregroup domain
> 	Moved coregroup topology fixup to fixup_topology (Gautham)
> 
>  arch/powerpc/include/asm/topology.h | 10 ++++++++
>  arch/powerpc/kernel/smp.c           | 38 +++++++++++++++++++++++++++++
>  arch/powerpc/mm/numa.c              |  5 ++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index f0b6300e7dd3..6609174918ab 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h

[..snip..]

> @@ -91,6 +92,7 @@ enum {
>  	smt_idx,
>  #endif
>  	bigcore_idx,
> +	mc_idx,
>  	die_idx,
>  };
> 


[..snip..]

> @@ -879,6 +896,7 @@ static struct sched_domain_topology_level powerpc_topology[] = {
>  	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
>  #endif
>  	{ cpu_bigcore_mask, SD_INIT_NAME(BIGCORE) },
> +	{ cpu_mc_mask, SD_INIT_NAME(MC) },
>  	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
>  	{ NULL, },
>  };


[..snip..]

> @@ -1386,6 +1421,9 @@ int setup_profiling_timer(unsigned int multiplier)
> 
>  static void fixup_topology(void)
>  {
> +	if (!has_coregroup_support())
> +		powerpc_topology[mc_idx].mask = cpu_bigcore_mask;
> +

Shouldn't we move this condition after doing the fixup for shared
caches ? Because if we have shared_caches, but not core_group, then we
want the coregroup domain to degenerate correctly.


>  	if (shared_caches) {
>  		pr_info("Using shared cache scheduler topology\n");
>  		powerpc_topology[bigcore_idx].mask = shared_cache_mask;


--
Thanks and regards
gautham.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 10/10] powerpc/smp: Implement cpu_to_coregroup_id
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  7:06     ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  7:06 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

On Tue, Jul 21, 2020 at 05:08:14PM +0530, Srikar Dronamraju wrote:
> Look up the coregroup id from the associativity array.
> 
> If unable to detect the coregroup id, fall back to the core id.
> This way, we ensure the sched_domain degenerates and an extra sched
> domain is not created.
> 
> Ideally this function should have been implemented in
> arch/powerpc/kernel/smp.c. However, if it's implemented in mm/numa.c, we
> don't need to find the primary domain again.
> 
> If the device-tree mentions more than one coregroup, then the kernel
> implements only the last or the smallest coregroup, which currently
> corresponds to the penultimate domain in the device-tree.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

Looks good to me.

Reviewed-by : Gautham R. Shenoy <ego@linux.vnet.ibm.com>


> ---
> Changelog v1 -> v2:
> powerpc/smp: Implement cpu_to_coregroup_id
> 	Move coregroup_enabled before getting associativity (Gautham)
> 
>  arch/powerpc/mm/numa.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index ef8aa580da21..ae57b68beaee 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1218,6 +1218,26 @@ int find_and_online_cpu_nid(int cpu)
> 
>  int cpu_to_coregroup_id(int cpu)
>  {
> +	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> +	int index;
> +
> +	if (cpu < 0 || cpu > nr_cpu_ids)
> +		return -1;
> +
> +	if (!coregroup_enabled)
> +		goto out;
> +
> +	if (!firmware_has_feature(FW_FEATURE_VPHN))
> +		goto out;
> +
> +	if (vphn_get_associativity(cpu, associativity))
> +		goto out;
> +
> +	index = of_read_number(associativity, 1);
> +	if (index > min_common_depth + 1)
> +		return of_read_number(&associativity[index - 1], 1);
> +
> +out:
>  	return cpu_to_core_id(cpu);
>  }
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 10/10] powerpc/smp: Implement cpu_to_coregroup_id
@ 2020-07-22  7:06     ` Gautham R Shenoy
  0 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-22  7:06 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Michael Ellerman, Peter Zijlstra, Jordan Niethe,
	Anton Blanchard, LKML, Ingo Molnar, Nick Piggin, linuxppc-dev,
	Valentin Schneider

On Tue, Jul 21, 2020 at 05:08:14PM +0530, Srikar Dronamraju wrote:
> Look up the coregroup id from the associativity array.
> 
> If unable to detect the coregroup id, fall back to the core id.
> This way, we ensure the sched_domain degenerates and an extra sched
> domain is not created.
> 
> Ideally this function should have been implemented in
> arch/powerpc/kernel/smp.c. However, if it's implemented in mm/numa.c, we
> don't need to find the primary domain again.
> 
> If the device-tree mentions more than one coregroup, then the kernel
> implements only the last or the smallest coregroup, which currently
> corresponds to the penultimate domain in the device-tree.
> 
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

Looks good to me.

Reviewed-by : Gautham R. Shenoy <ego@linux.vnet.ibm.com>


> ---
> Changelog v1 -> v2:
> powerpc/smp: Implement cpu_to_coregroup_id
> 	Move coregroup_enabled before getting associativity (Gautham)
> 
>  arch/powerpc/mm/numa.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index ef8aa580da21..ae57b68beaee 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1218,6 +1218,26 @@ int find_and_online_cpu_nid(int cpu)
> 
>  int cpu_to_coregroup_id(int cpu)
>  {
> +	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> +	int index;
> +
> +	if (cpu < 0 || cpu > nr_cpu_ids)
> +		return -1;
> +
> +	if (!coregroup_enabled)
> +		goto out;
> +
> +	if (!firmware_has_feature(FW_FEATURE_VPHN))
> +		goto out;
> +
> +	if (vphn_get_associativity(cpu, associativity))
> +		goto out;
> +
> +	index = of_read_number(associativity, 1);
> +	if (index > min_common_depth + 1)
> +		return of_read_number(&associativity[index - 1], 1);
> +
> +out:
>  	return cpu_to_core_id(cpu);
>  }
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 09/10] Powerpc/smp: Create coregroup domain
  2020-07-22  7:04     ` Gautham R Shenoy
@ 2020-07-22  7:29       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  7:29 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Vaidyanathan Srinivasan, Jordan Niethe

> > @@ -1386,6 +1421,9 @@ int setup_profiling_timer(unsigned int multiplier)
> > 
> >  static void fixup_topology(void)
> >  {
> > +	if (!has_coregroup_support())
> > +		powerpc_topology[mc_idx].mask = cpu_bigcore_mask;
> > +
> 
> Shouldn't we move this condition after doing the fixup for shared
> caches ? Because if we have shared_caches, but not core_group, then we
> want the coregroup domain to degenerate correctly.
> 

Currently we aren't consolidating, and hence the order doesn't matter for
degeneration. However, if in future we want to consolidate the domains
before calling set_sched_topology(), this order would be ideal.
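
As a sketch (abridged from the hunks quoted in this thread), the fixup
order under discussion reads:

	static void fixup_topology(void)
	{
		/* coregroup fallback: MC mirrors BIGCORE so it degenerates */
		if (!has_coregroup_support())
			powerpc_topology[mc_idx].mask = cpu_bigcore_mask;

		/* shared-cache fixup: BIGCORE effectively becomes CACHE */
		if (shared_caches) {
			...
		}

	#ifdef CONFIG_SCHED_SMT
		/* small-core fixup for the SMT level */
		if (has_big_cores)
			powerpc_topology[smt_idx].mask = smallcore_smt_mask;
	#endif
	}

Swapping the first two fixups would give the same masks today; keeping
the coregroup fallback first merely anticipates future consolidation.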

> 
> >  	if (shared_caches) {
> >  		pr_info("Using shared cache scheduler topology\n");
> >  		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
> 

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 09/10] Powerpc/smp: Create coregroup domain
@ 2020-07-22  7:29       ` Srikar Dronamraju
  0 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  7:29 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Nathan Lynch, Oliver OHalloran, Michael Neuling,
	Michael Ellerman, Peter Zijlstra, Jordan Niethe, Anton Blanchard,
	LKML, Ingo Molnar, Nick Piggin, linuxppc-dev, Valentin Schneider

> > @@ -1386,6 +1421,9 @@ int setup_profiling_timer(unsigned int multiplier)
> > 
> >  static void fixup_topology(void)
> >  {
> > +	if (!has_coregroup_support())
> > +		powerpc_topology[mc_idx].mask = cpu_bigcore_mask;
> > +
> 
> Shouldn't we move this condition after doing the fixup for shared
> caches ? Because if we have shared_caches, but not core_group, then we
> want the coregroup domain to degenerate correctly.
> 

Currently we aren't consolidating, and hence the order doesn't matter for
degeneration. However, if in future we want to consolidate the domains
before calling set_sched_topology(), this order would be ideal.

> 
> >  	if (shared_caches) {
> >  		pr_info("Using shared cache scheduler topology\n");
> >  		powerpc_topology[bigcore_idx].mask = shared_cache_mask;
> 

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-22  6:56     ` Gautham R Shenoy
@ 2020-07-22  7:39       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  7:39 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Vaidyanathan Srinivasan, Jordan Niethe

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 12:26:40]:

> Hello Srikar,
> 
> On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > powerpc_topology. This domain will collapse if cpumask of l2-cache is
> > same as SMT domain. However we could generalize this domain such that it
> > could be either a "CACHE" domain or a "BIGCORE" domain.
> > 
> > While setting up the "CACHE" domain, check if shared_cache is already
> > set.
> > @@ -1339,14 +1345,20 @@ void start_secondary(void *unused)
> >  	/* Update topology CPU masks */
> >  	add_cpu_to_masks(cpu);
> > 
> > -	if (has_big_cores)
> > -		sibling_mask = cpu_smallcore_mask;
> >  	/*
> >  	 * Check for any shared caches. Note that this must be done on a
> >  	 * per-core basis because one core in the pair might be disabled.
> >  	 */
> > -	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
> > -		shared_caches = true;
> > +	if (!shared_caches) {
> > +		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> > +		struct cpumask *mask = cpu_l2_cache_mask(cpu);
> > +
> > +		if (has_big_cores)
> > +			sibling_mask = cpu_smallcore_mask;
> > +
> > +		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
> > +			shared_caches = true;
> 
> At the risk of repeating my comment to the v1 version of the patch, we
> have shared caches only if l2_cache_mask(cpu) is a strict superset of
> sibling_mask(cpu).
> 
> "cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu))" does not
> capture this.

Why would it not? We are setting shared_caches if and only if l2_cache_mask
is a strict superset of sibling/smallcore mask.

> Could we please use
> 
>       if (!cpumask_equal(sibling_mask(cpu), mask) &&
>       	  cpumask_subset(sibling_mask(cpu), mask)) {
>       }
> 

The scheduler mandates that two cpumasks for the same CPU either have to
be equal or one of them has to be a strict superset of the other. If not,
the scheduler would mark our domains as broken. That being the case,
cpumask_weight() will correctly capture what we are looking for. That
said, your condition is also correct, but slightly heavier, and doesn't
provide us with any more information or correctness.
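
A minimal illustration of that equivalence (hypothetical local
variables, assuming the masks are either equal or strictly nested, as
the scheduler requires):

	/*
	 * Under the invariant that sibling_mask(cpu) is either equal to
	 * "mask" or a strict subset of it, these two tests agree:
	 */
	bool heavy = !cpumask_equal(sibling_mask(cpu), mask) &&
		     cpumask_subset(sibling_mask(cpu), mask);
	bool light = cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu));
	/* heavy == light; the weight comparison is the cheaper check */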

> 
> Otherwise the patch looks good to me.
> 
> --
> Thanks and Regards
> gautham.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
@ 2020-07-22  7:39       ` Srikar Dronamraju
  0 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  7:39 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Nathan Lynch, Oliver OHalloran, Michael Neuling,
	Michael Ellerman, Peter Zijlstra, Jordan Niethe, Anton Blanchard,
	LKML, Ingo Molnar, Nick Piggin, linuxppc-dev, Valentin Schneider

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 12:26:40]:

> Hello Srikar,
> 
> On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > powerpc_topology. This domain will collapse if cpumask of l2-cache is
> > same as SMT domain. However we could generalize this domain such that it
> > could be either a "CACHE" domain or a "BIGCORE" domain.
> > 
> > While setting up the "CACHE" domain, check if shared_cache is already
> > set.
> > @@ -1339,14 +1345,20 @@ void start_secondary(void *unused)
> >  	/* Update topology CPU masks */
> >  	add_cpu_to_masks(cpu);
> > 
> > -	if (has_big_cores)
> > -		sibling_mask = cpu_smallcore_mask;
> >  	/*
> >  	 * Check for any shared caches. Note that this must be done on a
> >  	 * per-core basis because one core in the pair might be disabled.
> >  	 */
> > -	if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
> > -		shared_caches = true;
> > +	if (!shared_caches) {
> > +		struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> > +		struct cpumask *mask = cpu_l2_cache_mask(cpu);
> > +
> > +		if (has_big_cores)
> > +			sibling_mask = cpu_smallcore_mask;
> > +
> > +		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
> > +			shared_caches = true;
> 
> At the risk of repeating my comment to the v1 version of the patch, we
> have shared caches only if l2_cache_mask(cpu) is a strict superset of
> sibling_mask(cpu).
> 
> "cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu))" does not
> capture this.

Why would it not? We are setting shared_caches if and only if l2_cache_mask
is a strict superset of sibling/smallcore mask.

> Could we please use
> 
>       if (!cpumask_equal(sibling_mask(cpu), mask) &&
>       	  cpumask_subset(sibling_mask(cpu), mask)) {
>       }
> 

The scheduler mandates that two cpumasks for the same CPU either have to
be equal or one of them has to be a strict superset of the other. If not,
the scheduler would mark our domains as broken. That being the case,
cpumask_weight() will correctly capture what we are looking for. That
said, your condition is also correct, but slightly heavier, and doesn't
provide us with any more information or correctness.

> 
> Otherwise the patch looks good to me.
> 
> --
> Thanks and Regards
> gautham.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 01/10] powerpc/smp: Cache node for reuse
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  7:41     ` Michael Ellerman
  -1 siblings, 0 replies; 56+ messages in thread
From: Michael Ellerman @ 2020-07-22  7:41 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Srikar Dronamraju, linuxppc-dev, LKML, Ingo Molnar,
	Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Gautham R Shenoy, Vaidyanathan Srinivasan, Jordan Niethe

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> While cpu_to_node is inline function with access to per_cpu variable.
> However when using repeatedly, it may be cleaner to cache it in a local
> variable.

It's not clear what "cleaner" is supposed to mean. Are you saying it
makes the source clearer, or the generated code?

I'm not sure it will make any difference to the latter.

> Also fix a build error in some weird config.
> "error: _numa_cpu_lookup_table_ undeclared"

Separate patch please.

> No functional change

The ifdef change means that's not true.

> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/smp.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 73199470c265..680c0edcc59d 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -843,7 +843,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  
>  	DBG("smp_prepare_cpus\n");
>  
> -	/* 
> +	/*
>  	 * setup_cpu may need to be called on the boot cpu. We havent
>  	 * spun any cpus up but lets be paranoid.
>  	 */
> @@ -854,20 +854,24 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  	cpu_callin_map[boot_cpuid] = 1;
>  
>  	for_each_possible_cpu(cpu) {
> +		int node = cpu_to_node(cpu);
> +

Does cpu_to_node() even work here?

Doesn't look like it to me.

More fallout from 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") ?

>  		zalloc_cpumask_var_node(&per_cpu(cpu_sibling_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
>  		zalloc_cpumask_var_node(&per_cpu(cpu_l2_cache_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
>  		zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
> +#ifdef CONFIG_NEED_MULTIPLE_NODES
>  		/*
>  		 * numa_node_id() works after this.
>  		 */
>  		if (cpu_present(cpu)) {
> -			set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
> -			set_cpu_numa_mem(cpu,
> -				local_memory_node(numa_cpu_lookup_table[cpu]));
> +			node = numa_cpu_lookup_table[cpu];
> +			set_cpu_numa_node(cpu, node);
> +			set_cpu_numa_mem(cpu, local_memory_node(node));
>  		}
> +#endif
>  	}
>  
>  	/* Init the cpumasks so the boot CPU is related to itself */
> -- 
> 2.17.1

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 01/10] powerpc/smp: Cache node for reuse
@ 2020-07-22  7:41     ` Michael Ellerman
  0 siblings, 0 replies; 56+ messages in thread
From: Michael Ellerman @ 2020-07-22  7:41 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran,
	Michael Neuling, Srikar Dronamraju, Peter Zijlstra,
	Jordan Niethe, Anton Blanchard, LKML, Valentin Schneider,
	Nick Piggin, linuxppc-dev, Ingo Molnar

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> While cpu_to_node is inline function with access to per_cpu variable.
> However when using repeatedly, it may be cleaner to cache it in a local
> variable.

It's not clear what "cleaner" is supposed to mean. Are you saying it
makes the source clearer, or the generated code?

I'm not sure it will make any difference to the latter.

> Also fix a build error in some weird config.
> "error: _numa_cpu_lookup_table_ undeclared"

Separate patch please.

> No functional change

The ifdef change means that's not true.

> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Cc: LKML <linux-kernel@vger.kernel.org>
> Cc: Michael Ellerman <michaele@au1.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: Nick Piggin <npiggin@au1.ibm.com>
> Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Michael Neuling <mikey@linux.ibm.com>
> Cc: Anton Blanchard <anton@au1.ibm.com>
> Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
> Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
> Cc: Jordan Niethe <jniethe5@gmail.com>
> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/smp.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 73199470c265..680c0edcc59d 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -843,7 +843,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  
>  	DBG("smp_prepare_cpus\n");
>  
> -	/* 
> +	/*
>  	 * setup_cpu may need to be called on the boot cpu. We havent
>  	 * spun any cpus up but lets be paranoid.
>  	 */
> @@ -854,20 +854,24 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  	cpu_callin_map[boot_cpuid] = 1;
>  
>  	for_each_possible_cpu(cpu) {
> +		int node = cpu_to_node(cpu);
> +

Does cpu_to_node() even work here?

Doesn't look like it to me.

More fallout from 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") ?

>  		zalloc_cpumask_var_node(&per_cpu(cpu_sibling_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
>  		zalloc_cpumask_var_node(&per_cpu(cpu_l2_cache_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
>  		zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
> -					GFP_KERNEL, cpu_to_node(cpu));
> +					GFP_KERNEL, node);
> +#ifdef CONFIG_NEED_MULTIPLE_NODES
>  		/*
>  		 * numa_node_id() works after this.
>  		 */
>  		if (cpu_present(cpu)) {
> -			set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
> -			set_cpu_numa_mem(cpu,
> -				local_memory_node(numa_cpu_lookup_table[cpu]));
> +			node = numa_cpu_lookup_table[cpu];
> +			set_cpu_numa_node(cpu, node);
> +			set_cpu_numa_mem(cpu, local_memory_node(node));
>  		}
> +#endif
>  	}
>  
>  	/* Init the cpumasks so the boot CPU is related to itself */
> -- 
> 2.17.1

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-21 11:38   ` Srikar Dronamraju
@ 2020-07-22  7:46     ` peterz
  -1 siblings, 0 replies; 56+ messages in thread
From: peterz @ 2020-07-22  7:46 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Valentin Schneider, Nick Piggin, Oliver OHalloran, Nathan Lynch,
	Michael Neuling, Anton Blanchard, Gautham R Shenoy,
	Vaidyanathan Srinivasan, Jordan Niethe

On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> Currently "CACHE" domain happens to be the 2nd sched domain as per
> powerpc_topology. This domain will collapse if cpumask of l2-cache is
> same as SMT domain. However we could generalize this domain such that it
> could be either a "CACHE" domain or a "BIGCORE" domain.

What's the whole smallcore vs bigcore thing?

Would it make sense to have an actual overview of the various topology
layers somewhere? Because I'm getting lost and can't really make sense
of the code due to that.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
@ 2020-07-22  7:46     ` peterz
  0 siblings, 0 replies; 56+ messages in thread
From: peterz @ 2020-07-22  7:46 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Nathan Lynch, Oliver OHalloran, Michael Neuling,
	Michael Ellerman, Gautham R Shenoy, Jordan Niethe,
	Anton Blanchard, LKML, Valentin Schneider, linuxppc-dev,
	Nick Piggin

On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> Currently "CACHE" domain happens to be the 2nd sched domain as per
> powerpc_topology. This domain will collapse if cpumask of l2-cache is
> same as SMT domain. However we could generalize this domain such that it
> could be either a "CACHE" domain or a "BIGCORE" domain.

What's the whole smallcore vs bigcore thing?

Would it make sense to have an actual overview of the various topology
layers somewhere? Because I'm getting lost and can't really make sense
of the code due to that.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 01/10] powerpc/smp: Cache node for reuse
  2020-07-22  7:41     ` Michael Ellerman
@ 2020-07-22  8:04       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  8:04 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, LKML, Ingo Molnar, Peter Zijlstra,
	Valentin Schneider, Nick Piggin, Oliver OHalloran, Nathan Lynch,
	Michael Neuling, Anton Blanchard, Gautham R Shenoy,
	Vaidyanathan Srinivasan, Jordan Niethe

* Michael Ellerman <michaele@au1.ibm.com> [2020-07-22 17:41:41]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> > While cpu_to_node is inline function with access to per_cpu variable.
> > However when using repeatedly, it may be cleaner to cache it in a local
> > variable.
> 
> It's not clear what "cleaner" is supposed to mean. Are you saying it
> makes the source clearer, or the generated code?
> 
> I'm not sure it will make any difference to the latter.

I meant the source code. I am okay with dropping the hunks that try to
cache cpu_to_node.

> 
> > Also fix a build error in a some weird config.
> > "error: _numa_cpu_lookup_table_ undeclared"
> 
> Separate patch please.

Okay, will do.

> 
> > No functional change
> 
> The ifdef change means that's not true.

Okay

> > @@ -854,20 +854,24 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> >  	cpu_callin_map[boot_cpuid] = 1;
> >  
> >  	for_each_possible_cpu(cpu) {
> > +		int node = cpu_to_node(cpu);
> > +
> 
> Does cpu_to_node() even work here?

Except in the case where NUMA is not enabled (when cpu_to_node would
return -1), it should work here since NUMA initialization would have
happened by now. cpu_to_node(cpu) should work once numa_setup_cpu() /
map_cpu_to_node() gets called, and those are called before this.

> 
> Doesn't look like it to me.
> 
> More fallout from 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") ?
> 
> >  	}
> >  
> >  	/* Init the cpumasks so the boot CPU is related to itself */

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-22  7:46     ` peterz
@ 2020-07-22  8:18       ` Srikar Dronamraju
  -1 siblings, 0 replies; 56+ messages in thread
From: Srikar Dronamraju @ 2020-07-22  8:18 UTC (permalink / raw)
  To: peterz
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Valentin Schneider, Nick Piggin, Oliver OHalloran, Nathan Lynch,
	Michael Neuling, Anton Blanchard, Gautham R Shenoy,
	Vaidyanathan Srinivasan, Jordan Niethe

* peterz@infradead.org <peterz@infradead.org> [2020-07-22 09:46:24]:

> On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > powerpc_topology. This domain will collapse if cpumask of l2-cache is
> > same as SMT domain. However we could generalize this domain such that
> > it could mean either a "CACHE" domain or a "BIGCORE" domain.
> 
> What's the whole smallcore vs bigcore thing?
> 
> Would it make sense to have an actual overview of the various topology
> layers somewhere? Because I'm getting lost and can't really make sense
> of the code due to that.

To illustrate with an example (Power 9):

Power 9 is an SMT8 core by design. However, this 8-thread core can run
as 2 independent cores, with threads 0, 2, 4 and 6 acting as one core
while threads 1, 3, 5 and 7 act as another core.

The firmware can decide to expose them either as 2 independent small
cores or as one big core. However, the LLC (i.e. the last level of
cache) is shared between all the threads of the SMT8 core. On future
Power chips the LLC might change: it may be expanded to be shared with
another SMT8 core, or it could be made specific to an SMT4 core.

The SMT8 core is also known as a fused core / big core.
The smaller SMT4 core is known as a small core.

So on Power9 bare metal, the firmware exposes SMT4 cores, and we have a
CACHE level at SMT8. The CACHE level would be very similar to the MC
domain on x86.

Hope this is clear.
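
To make the interleaving concrete, here is a minimal sketch
(illustration only, not kernel code; it assumes the big-core numbering
described above, i.e. CPUs 8*n .. 8*n+7 form one SMT8 big core):

	#include <stdio.h>

	/*
	 * Which small core (0 or 1) within its big core does a CPU
	 * thread belong to?  Even threads form one SMT4 small core,
	 * odd threads form the other.
	 */
	static int small_core_index(int cpu)
	{
		return cpu & 1;
	}

	/* First thread of the SMT8 big core this CPU belongs to. */
	static int big_core_first_thread(int cpu)
	{
		return cpu & ~7;
	}

	int main(void)
	{
		int cpu;

		for (cpu = 0; cpu < 8; cpu++)
			printf("cpu %d: big core %d, small core %d\n",
			       cpu, big_core_first_thread(cpu) / 8,
			       small_core_index(cpu));
		return 0;
	}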

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-22  8:18       ` Srikar Dronamraju
@ 2020-07-22  8:48         ` Peter Zijlstra
  -1 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2020-07-22  8:48 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Valentin Schneider, Nick Piggin, Oliver OHalloran, Nathan Lynch,
	Michael Neuling, Anton Blanchard, Gautham R Shenoy,
	Vaidyanathan Srinivasan, Jordan Niethe

On Wed, Jul 22, 2020 at 01:48:22PM +0530, Srikar Dronamraju wrote:
> * peterz@infradead.org <peterz@infradead.org> [2020-07-22 09:46:24]:
> 
> > On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> > > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > > powerpc_topology. This domain will collapse if cpumask of l2-cache is
> > > same as SMT domain. However we could generalize this domain such
> > > that it could mean either a "CACHE" domain or a "BIGCORE" domain.
> > 
> > What's the whole smallcore vs bigcore thing?
> > 
> > Would it make sense to have an actual overview of the various topology
> > layers somewhere? Because I'm getting lost and can't really make sense
> > of the code due to that.
> 
> To illustrate with an example (Power 9):
> 
> Power 9 is an SMT8 core by design. However, this 8-thread core can run
> as 2 independent cores, with threads 0, 2, 4 and 6 acting as one core
> while threads 1, 3, 5 and 7 act as another core.
> 
> The firmware can decide to expose them either as 2 independent small
> cores or as one big core. However, the LLC (i.e. the last level of
> cache) is shared between all the threads of the SMT8 core. On future
> Power chips the LLC might change: it may be expanded to be shared with
> another SMT8 core, or it could be made specific to an SMT4 core.
> 
> The SMT8 core is also known as a fused core / big core.
> The smaller SMT4 core is known as a small core.
> 
> So on Power9 bare metal, the firmware exposes SMT4 cores, and we have a
> CACHE level at SMT8. The CACHE level would be very similar to the MC
> domain on x86.
> 
> Hope this is clear.

Ooh, that thing. I thought P9 was in actual fact an SMT4 hardware part,
but to be compatible with P8 it has this fused option where two cores
act like a single SMT8 part.

The interleaving enumeration is done due to P7's asymmetric cores,
resulting in the scheduler preferring to use the lower threads.

Combined, that results in a P9-fused configuration using two independent
full cores when there are only 2 runnable threads.

Which is a subtly different story from yours.

But reading your explanation, it looks like the Linux topology setup
could actually unravel the fused-faux-SMT8 into two independent SMT4
parts, negating that firmware option.

Anyway, a few comments just around there might be helpful.


Thanks!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain
  2020-07-22  8:48         ` Peter Zijlstra
@ 2020-07-22  8:54           ` peterz
  -1 siblings, 0 replies; 56+ messages in thread
From: peterz @ 2020-07-22  8:54 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Michael Ellerman, linuxppc-dev, LKML, Ingo Molnar,
	Valentin Schneider, Nick Piggin, Oliver OHalloran, Nathan Lynch,
	Michael Neuling, Anton Blanchard, Gautham R Shenoy,
	Vaidyanathan Srinivasan, Jordan Niethe

On Wed, Jul 22, 2020 at 10:48:54AM +0200, Peter Zijlstra wrote:
> But reading your explanation, it looks like the Linux topology setup
> could actually unravel the fused-faux-SMT8 into two independent SMT4
> parts, negating that firmware option.

Ah, it looks like that's exactly what you're doing. Nice!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling
  2020-07-22  6:57       ` Srikar Dronamraju
@ 2020-07-24  7:10         ` Gautham R Shenoy
  -1 siblings, 0 replies; 56+ messages in thread
From: Gautham R Shenoy @ 2020-07-24  7:10 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Gautham R Shenoy, Michael Ellerman, linuxppc-dev, LKML,
	Ingo Molnar, Peter Zijlstra, Valentin Schneider, Nick Piggin,
	Oliver OHalloran, Nathan Lynch, Michael Neuling, Anton Blanchard,
	Vaidyanathan Srinivasan, Jordan Niethe

On Wed, Jul 22, 2020 at 12:27:47PM +0530, Srikar Dronamraju wrote:
> * Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-07-22 11:51:14]:
> 
> > Hi Srikar,
> > 
> > > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > > index 72f16dc0cb26..57468877499a 100644
> > > --- a/arch/powerpc/kernel/smp.c
> > > +++ b/arch/powerpc/kernel/smp.c
> > > @@ -1196,6 +1196,7 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
> > >  	if (!l2_cache)
> > >  		return false;
> > > 
> > > +	cpumask_set_cpu(cpu, mask_fn(cpu));
> > 
> > 
> > Ok, we need to do this because "cpu" is not yet set in the
> > cpu_online_mask. Prior to your patch the "cpu" was getting set in
> > cpu_l2_cache_map(cpu) as a side-effect of the code that is removed in
> > the patch.
> > 
> 
> Right.
> 
> > 
> > >  	for_each_cpu(i, cpu_online_mask) {
> > >  		/*
> > >  		 * when updating the marks the current CPU has not been marked
> > > @@ -1278,29 +1279,30 @@ static void add_cpu_to_masks(int cpu)
> > >  	 * add it to it's own thread sibling mask.
> > >  	 */
> > >  	cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));
> > > +	cpumask_set_cpu(cpu, cpu_core_mask(cpu));
> 
> Note: Above, we are explicitly setting the cpu_core_mask.

You are right. I missed this.

> 
> > > 
> > >  	for (i = first_thread; i < first_thread + threads_per_core; i++)
> > >  		if (cpu_online(i))
> > >  			set_cpus_related(i, cpu, cpu_sibling_mask);
> > > 
> > >  	add_cpu_to_smallcore_masks(cpu);
> > > -	/*
> > > -	 * Copy the thread sibling mask into the cache sibling mask
> > > -	 * and mark any CPUs that share an L2 with this CPU.
> > > -	 */
> > > -	for_each_cpu(i, cpu_sibling_mask(cpu))
> > > -		set_cpus_related(cpu, i, cpu_l2_cache_mask);
> > >  	update_mask_by_l2(cpu, cpu_l2_cache_mask);
> > > 
> > > -	/*
> > > -	 * Copy the cache sibling mask into core sibling mask and mark
> > > -	 * any CPUs on the same chip as this CPU.
> > > -	 */
> > > -	for_each_cpu(i, cpu_l2_cache_mask(cpu))
> > > -		set_cpus_related(cpu, i, cpu_core_mask);
> > > +	if (pkg_id == -1) {
> > 
> > I suppose this "if" condition is an optimization, since if pkg_id != -1,
> > we anyway set these CPUs in the cpu_core_mask below.
> > 
> > However...
> 
> This is not just an optimization.
> The removed hunk would only work if cpu_l2_cache_mask is bigger than
> cpu_sibling_mask (this was the previous assumption that we want to
> break). If cpu_sibling_mask is bigger than cpu_l2_cache_mask and
> pkg_id is -1, then setting only cpu_l2_cache_mask in cpu_core_mask
> will result in a broken topology.
> 
> > 
> > > +		struct cpumask *(*mask)(int) = cpu_sibling_mask;
> > > +
> > > +		/*
> > > +		 * Copy the sibling mask into core sibling mask and
> > > +		 * mark any CPUs on the same chip as this CPU.
> > > +		 */
> > > +		if (shared_caches)
> > > +			mask = cpu_l2_cache_mask;
> > > +
> > > +		for_each_cpu(i, mask(cpu))
> > > +			set_cpus_related(cpu, i, cpu_core_mask);
> > > 
> > > -	if (pkg_id == -1)
> > >  		return;
> > > +	}
> > 
> > 
> > ... since "cpu" is not yet set in the cpu_online_mask, do we not miss setting
> > "cpu" in the cpu_core_mask(cpu) in the for-loop below ?
> > 
> > 
> 
> As noted above, we are setting it before, so we don't miss the cpu and
> hence it is no different from before.


Fair enough.
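
For anyone skimming the thread, the ordering subtlety in a nutshell
(a simplified sketch of update_mask_by_l2() with the patch applied;
it assumes cpu_to_l2cache() and set_cpus_related() behave as in
arch/powerpc/kernel/smp.c):

	static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
	{
		struct device_node *l2_cache = cpu_to_l2cache(cpu);
		int i;

		if (!l2_cache)
			return false;

		/*
		 * "cpu" is still being onlined and is not yet in
		 * cpu_online_mask, so the loop below would never mark
		 * it; set it in its own mask explicitly first.
		 */
		cpumask_set_cpu(cpu, mask_fn(cpu));

		for_each_cpu(i, cpu_online_mask) {
			struct device_node *np = cpu_to_l2cache(i);

			/* mark every online CPU sharing this L2 */
			if (np == l2_cache)
				set_cpus_related(cpu, i, mask_fn);
			of_node_put(np);
		}
		of_node_put(l2_cache);

		return true;
	}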

> 
> > --
> > Thanks and Regards
> > gautham.
> 
> -- 
> Thanks and Regards
> Srikar Dronamraju

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2020-07-24  7:12 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-21 11:38 [PATCH v2 00/10] Coregroup support on Powerpc Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 01/10] powerpc/smp: Cache node for reuse Srikar Dronamraju
2020-07-22  7:41   ` Michael Ellerman
2020-07-22  8:04     ` Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 02/10] powerpc/smp: Merge Power9 topology with Power topology Srikar Dronamraju
2020-07-22  5:48   ` Gautham R Shenoy
2020-07-21 11:38 ` [PATCH v2 03/10] powerpc/smp: Move powerpc_topology above Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 04/10] powerpc/smp: Enable small core scheduling sooner Srikar Dronamraju
2020-07-22  5:59   ` Gautham R Shenoy
2020-07-22  6:59     ` Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 05/10] powerpc/smp: Dont assume l2-cache to be superset of sibling Srikar Dronamraju
2020-07-22  6:21   ` Gautham R Shenoy
2020-07-22  6:57     ` Srikar Dronamraju
2020-07-24  7:10       ` Gautham R Shenoy
2020-07-21 11:38 ` [PATCH v2 06/10] powerpc/smp: Generalize 2nd sched domain Srikar Dronamraju
2020-07-22  6:56   ` Gautham R Shenoy
2020-07-22  7:39     ` Srikar Dronamraju
2020-07-22  7:46   ` peterz
2020-07-22  8:18     ` Srikar Dronamraju
2020-07-22  8:48       ` Peter Zijlstra
2020-07-22  8:54         ` peterz
2020-07-21 11:38 ` [PATCH v2 07/10] Powerpc/numa: Detect support for coregroup Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 08/10] powerpc/smp: Allocate cpumask only after searching thread group Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 09/10] Powerpc/smp: Create coregroup domain Srikar Dronamraju
2020-07-22  7:04   ` Gautham R Shenoy
2020-07-22  7:29     ` Srikar Dronamraju
2020-07-21 11:38 ` [PATCH v2 10/10] powerpc/smp: Implement cpu_to_coregroup_id Srikar Dronamraju
2020-07-22  7:06   ` Gautham R Shenoy
