* [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Hi All,

This version updates cacheinfo to populate the cache information early
and uses it from there for all the cache topology. Sorry for posting in
the middle of the merge window, but it is better to get this tested
earlier so that it is ready for the next merge window.

This series intends to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node, which also diverges from the
behaviour on an ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present a consistent view of the CPU topology.

Currently we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong. The device tree bindings for CPU
topology support sockets to infer the socket or physical package identifier
for a given CPU. Also, we don't check whether all the cores/threads belong
to the same cluster before updating their sibling masks, which is fine as
we don't set the cluster id yet.

These changes also assign the cluster identifier as parsed from the device
tree cluster nodes within /cpu-map, without support for nesting of clusters.
Finally, they add support for socket nodes in /cpu-map. With this, the
parsing of the exact same information from the ACPI PPTT and the /cpu-map
DT node aligns well.

The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT, while on DT we need to parse the CPU
cache nodes in the device tree to obtain it.

P.S.: I have not cc-ed Greg and Rafael so that all the users of arch_topology
can agree on the changes first before we include them.

v2[2]->v3:
	- Dropped support to get the device node for the CPU's LLC
	- Updated cacheinfo to support calling detect_cache_attributes
	  early, in the smp_prepare_cpus stage
	- Added support to check if LLC is valid and shared in the cacheinfo
	- Used the same in arch_topology

v1[1]->v2[2]:
	- Updated the ID validity check to include all non-negative values
	- Added support to get the device node for the CPU's last level cache
	- Added support to build llc_sibling on DT platforms

[1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
[2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com

Sudeep Holla (16):
  cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
  cacheinfo: Add helper to access any cache index for a given CPU
  cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
  cacheinfo: Add support to check if last level cache(LLC) is valid or shared
  cacheinfo: Allow early detection and population of cache attributes
  arch_topology: Add support to parse and detect cache attributes
  arch_topology: Use the last level cache information from the cacheinfo
  arm64: topology: Remove redundant setting of llc_id in CPU topology
  arch_topology: Drop LLC identifier stash from the CPU topology
  arch_topology: Set thread sibling cpumask only within the cluster
  arch_topology: Check for non-negative value rather than -1 for IDs validity
  arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
  arch_topology: Don't set cluster identifier as physical package identifier
  arch_topology: Drop unnecessary check for uninitialised package_id
  arch_topology: Set cluster identifier in each core/thread from /cpu-map
  arch_topology: Add support for parsing sockets in /cpu-map

 arch/arm64/kernel/topology.c  |  14 -----
 drivers/base/arch_topology.c  |  92 +++++++++++++++++----------
 drivers/base/cacheinfo.c      | 114 +++++++++++++++++++++-------------
 include/linux/arch_topology.h |   1 -
 include/linux/cacheinfo.h     |   3 +
 5 files changed, 135 insertions(+), 89 deletions(-)

--
2.36.1


* [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

of_cpu_device_node_get takes care of fetching the CPU's device node,
either from the cached cpu_dev->of_node if cpu_dev is initialised, or
via of_get_cpu_node to parse and fetch the node if cpu_dev isn't
available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node, for two reasons:
1. There is no other use of cpu_dev, so the code can be simplified.
2. It enables the use of detect_cache_attributes, and hence
   cache_setup_of_node, much earlier, before the CPUs are registered
   as devices.
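
For reference, of_cpu_device_node_get behaves roughly as below (a
simplified sketch of the existing helper, not part of this patch):

	static inline struct device_node *of_cpu_device_node_get(int cpu)
	{
		struct device *cpu_dev = get_cpu_device(cpu);

		/* no CPU device yet: fall back to parsing the /cpus node */
		if (!cpu_dev)
			return of_get_cpu_node(cpu, NULL);
		return of_node_get(cpu_dev->of_node);
	}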

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index dad296229161..b0bde272e2ae 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -14,7 +14,7 @@
 #include <linux/cpu.h>
 #include <linux/device.h>
 #include <linux/init.h>
-#include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct device *cpu_dev = get_cpu_device(cpu);
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
@@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
 		return 0;
 	}
 
-	if (!cpu_dev) {
-		pr_err("No cpu device for CPU %d\n", cpu);
-		return -ENODEV;
-	}
-	np = cpu_dev->of_node;
+	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
 		return -ENOENT;
-- 
2.36.1


* [PATCH v3 02/16] cacheinfo: Add helper to access any cache index for a given CPU
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

The cacheinfo for a given CPU at a given index is accessed in quite a few
places by fetching the base pointer for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.

Instead, add another helper to fetch the required pointer directly, and
use it to simplify the code and improve readability.
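
As an illustration (call site shown only as an example), the change
turns the open-coded pointer arithmetic:

	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
	struct cacheinfo *this_leaf = this_cpu_ci->info_list + index;

into a single lookup:

	struct cacheinfo *this_leaf = per_cpu_cacheinfo_idx(cpu, index);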

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index b0bde272e2ae..c4547d8ac6f3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -25,6 +25,8 @@ static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
 #define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
 #define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
 #define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
+#define per_cpu_cacheinfo_idx(cpu, idx)		\
+				(per_cpu_cacheinfo(cpu) + (idx))
 
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 {
@@ -172,7 +174,7 @@ static int cache_setup_of_node(unsigned int cpu)
 	}
 
 	while (index < cache_leaves(cpu)) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		if (this_leaf->level != 1)
 			np = of_find_next_cache_node(np);
 		else
@@ -231,7 +233,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		/* skip if shared_cpu_map is already populated */
 		if (!cpumask_empty(&this_leaf->shared_cpu_map))
 			continue;
@@ -242,7 +244,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 
 			if (i == cpu || !sib_cpu_ci->info_list)
 				continue;/* skip if itself or no cacheinfo */
-			sib_leaf = sib_cpu_ci->info_list + index;
+			sib_leaf = per_cpu_cacheinfo_idx(i, index);
 			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
 				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
 				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
@@ -258,12 +260,11 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 
 static void cache_shared_cpu_map_remove(unsigned int cpu)
 {
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	struct cacheinfo *this_leaf, *sib_leaf;
 	unsigned int sibling, index;
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
 			struct cpu_cacheinfo *sib_cpu_ci;
 
@@ -609,7 +610,6 @@ static int cache_add_dev(unsigned int cpu)
 	int rc;
 	struct device *ci_dev, *parent;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	const struct attribute_group **cache_groups;
 
 	rc = cpu_cache_sysfs_init(cpu);
@@ -618,7 +618,7 @@ static int cache_add_dev(unsigned int cpu)
 
 	parent = per_cpu_cache_dev(cpu);
 	for (i = 0; i < cache_leaves(cpu); i++) {
-		this_leaf = this_cpu_ci->info_list + i;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, i);
 		if (this_leaf->disable_sysfs)
 			continue;
 		if (this_leaf->type == CACHE_TYPE_NOCACHE)
-- 
2.36.1


* [PATCH v3 03/16] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

cache_leaves_are_shared is already used even with ACPI and PPTT. It checks
if the cache leaves are shared based on the fw_token pointer. However, it
is defined conditionally only if CONFIG_OF is enabled, which is wrong.

Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It now also handles the case where neither OF nor ACPI is enabled.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index c4547d8ac6f3..417e1ebf5525 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -33,13 +33,21 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 	return ci_cacheinfo(cpu);
 }
 
-#ifdef CONFIG_OF
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
+	/*
+	 * For non DT/ACPI systems, assume unique level 1 caches,
+	 * system-wide shared caches for all other levels. This will be used
+	 * only if arch specific code has not populated shared_cpu_map
+	 */
+	if (!IS_ENABLED(CONFIG_OF) && !(IS_ENABLED(CONFIG_ACPI)))
+		return !(this_leaf->level == 1);
+
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
+#ifdef CONFIG_OF
 /* OF properties to query for a given cache type */
 struct cache_type_info {
 	const char *size_prop;
@@ -193,16 +201,6 @@ static int cache_setup_of_node(unsigned int cpu)
 }
 #else
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
-static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
-					   struct cacheinfo *sib_leaf)
-{
-	/*
-	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
-	 * shared caches for all other levels. This will be used only if
-	 * arch specific code has not populated shared_cpu_map
-	 */
-	return !(this_leaf->level == 1);
-}
 #endif
 
 int __weak cache_setup_acpi(unsigned int cpu)
-- 
2.36.1


* [PATCH v3 04/16] cacheinfo: Add support to check if last level cache(LLC) is valid or shared
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

It is useful to have a helper to check if two given CPUs share the last
level cache. We can do that check by comparing the fw_token or by comparing
the cache ID. Currently we check just the fw_token as the cache ID is
optional.

This helper can be used to build llc_sibling during arch specific topology
parsing and to feed that information to the sched_domains. This also helps
to get rid of llc_id in the CPU topology as it is somewhat duplicate
information.

Also add a helper to check whether the LLC information in cacheinfo is
valid or not.
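
A minimal sketch of how an arch topology parser could use the new helper
to build llc_sibling (hypothetical helper name, assuming the cpu_topology[]
array and its llc_sibling cpumask from arch_topology):

	static void update_llc_siblings(unsigned int cpu)
	{
		unsigned int sibling;

		/* add every online CPU sharing the LLC to our llc_sibling mask */
		for_each_online_cpu(sibling)
			if (last_level_cache_is_shared(cpu, sibling))
				cpumask_set_cpu(sibling,
						&cpu_topology[cpu].llc_sibling);
	}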

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 26 ++++++++++++++++++++++++++
 include/linux/cacheinfo.h |  2 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 417e1ebf5525..ed74db18468f 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -47,6 +47,32 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
+bool last_level_cache_is_valid(unsigned int cpu)
+{
+	struct cacheinfo *llc;
+
+	if (!cache_leaves(cpu))
+		return false;
+
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+
+	return llc->fw_token;
+}
+
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
+{
+	struct cacheinfo *llc_x, *llc_y;
+
+	if (!last_level_cache_is_valid(cpu_x) ||
+	    !last_level_cache_is_valid(cpu_y))
+		return false;
+
+	llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
+	llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
+
+	return cache_leaves_are_shared(llc_x, llc_y);
+}
+
 #ifdef CONFIG_OF
 /* OF properties to query for a given cache type */
 struct cache_type_info {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 4ff37cb763ae..7e429bc5c1a4 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -82,6 +82,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
+bool last_level_cache_is_valid(unsigned int cpu);
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.36.1


* [PATCH v3 05/16] cacheinfo: Allow early detection and population of cache attributes
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Some architectures/platforms may need to set up cache properties very
early in the boot, along with the other CPU topology information, so that
all of it can be used to build the sched_domains used by the scheduler.

Allow detect_cache_attributes to be called quite early during boot.
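
For example, a hypothetical early caller (illustration only; the exact
call site is up to the architecture, e.g. its smp_prepare_cpus
implementation) could now do:

	void __init smp_prepare_cpus(unsigned int max_cpus)
	{
		unsigned int cpu;

		/* populate cacheinfo before the CPU devices are registered */
		for_each_possible_cpu(cpu) {
			if (detect_cache_attributes(cpu))
				pr_info("Early cacheinfo failed for CPU%u\n", cpu);
		}

		/* ... existing SMP preparation continues ... */
	}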

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 45 ++++++++++++++++++++++++---------------
 include/linux/cacheinfo.h |  1 +
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index ed74db18468f..976142f3e81d 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
 	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
 
 unsigned int coherency_max_size;
 
+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;
 
-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;
 
 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
@@ -330,10 +336,13 @@ int __weak populate_cache_leaves(unsigned int cpu)
 	return -ENOENT;
 }
 
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;
 
+	if (per_cpu_cacheinfo(cpu)) /* Already setup */
+		goto update_cpu_map;
+
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
@@ -349,6 +358,8 @@ static int detect_cache_attributes(unsigned int cpu)
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 05/16] cacheinfo: Allow early detection and population of cache attributes
@ 2022-05-25  8:14           ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Some architecture/platforms may need to setup cache properties very
early in the boot along with other cpu topologies so that all these
information can be used to build sched_domains which is used by the
scheduler.

Allow detect_cache_attributes to be called quite early during the boot.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 45 ++++++++++++++++++++++++---------------
 include/linux/cacheinfo.h |  1 +
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index ed74db18468f..976142f3e81d 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
 	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
 
 unsigned int coherency_max_size;
 
+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;
 
-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;
 
 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
@@ -330,10 +336,13 @@ int __weak populate_cache_leaves(unsigned int cpu)
 	return -ENOENT;
 }
 
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;
 
+	if (per_cpu_cacheinfo(cpu)) /* Already setup */
+		goto update_cpu_map;
+
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
@@ -349,6 +358,8 @@ static int detect_cache_attributes(unsigned int cpu)
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.36.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 05/16] cacheinfo: Allow early detection and population of cache attributes
@ 2022-05-25  8:14           ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Some architectures/platforms may need to set up cache properties very
early in the boot along with other CPU topology information so that all
of this information can be used to build sched_domains, which are used
by the scheduler.

Allow detect_cache_attributes to be called quite early during the boot.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 45 ++++++++++++++++++++++++---------------
 include/linux/cacheinfo.h |  1 +
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index ed74db18468f..976142f3e81d 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
 	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
 
 unsigned int coherency_max_size;
 
+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;
 
-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;
 
 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
@@ -330,10 +336,13 @@ int __weak populate_cache_leaves(unsigned int cpu)
 	return -ENOENT;
 }
 
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;
 
+	if (per_cpu_cacheinfo(cpu)) /* Already setup */
+		goto update_cpu_map;
+
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
@@ -349,6 +358,8 @@ static int detect_cache_attributes(unsigned int cpu)
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread
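
A rough usage sketch, not part of the patch: detect_cache_attributes() can
now be called once per possible CPU well before the secondaries are online,
and a later call for the same CPU skips the allocation and only refreshes
the shared cpu_map, because per_cpu_cacheinfo(cpu) is already set up. The
call site below is hypothetical.

#include <linux/cacheinfo.h>
#include <linux/cpumask.h>
#include <linux/init.h>
#include <linux/printk.h>

/* Hypothetical early call site, for illustration only. */
static void __init early_cacheinfo_sketch(void)
{
	unsigned int cpu;

	for_each_possible_cpu(cpu)
		if (detect_cache_attributes(cpu))
			pr_warn("cpu%u: early cacheinfo detection failed\n", cpu);
}

Patch 6 in this series adds exactly this kind of loop to init_cpu_topology().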

* [PATCH v3 06/16] arch_topology: Add support to parse and detect cache attributes
  2022-05-25  8:14           ` Sudeep Holla
  (?)
@ 2022-05-25  8:14             ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.

In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topology parsing both on ACPI and DT platforms.

Note that this change builds the cacheinfo early even on ACPI systems, but
the current mechanism of building llc_sibling mask remains unchanged.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73b836047cf..765723448b10 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/device.h>
@@ -775,15 +776,23 @@ __weak int __init parse_acpi_topology(void)
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
+	int ret, cpu;
+
 	reset_cpu_topology();
+	ret = parse_acpi_topology();
+	if (!ret)
+		ret = of_have_populated_dt() && parse_dt_topology();
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (parse_acpi_topology())
-		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
+	if (ret) {
+		/*
+		 * Discard anything that was parsed if we hit an error so we
+		 * don't use partial information.
+		 */
 		reset_cpu_topology();
+		return;
+	}
+
+	for_each_possible_cpu(cpu)
+		detect_cache_attributes(cpu);
 }
 #endif
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 06/16] arch_topology: Add support to parse and detect cache attributes
@ 2022-05-25  8:14             ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.

In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topology parsing both on ACPI and DT platforms.

Note that this change builds the cacheinfo early even on ACPI systems, but
the current mechanism of building llc_sibling mask remains unchanged.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73b836047cf..765723448b10 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/device.h>
@@ -775,15 +776,23 @@ __weak int __init parse_acpi_topology(void)
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
+	int ret, cpu;
+
 	reset_cpu_topology();
+	ret = parse_acpi_topology();
+	if (!ret)
+		ret = of_have_populated_dt() && parse_dt_topology();
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (parse_acpi_topology())
-		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
+	if (ret) {
+		/*
+		 * Discard anything that was parsed if we hit an error so we
+		 * don't use partial information.
+		 */
 		reset_cpu_topology();
+		return;
+	}
+
+	for_each_possible_cpu(cpu)
+		detect_cache_attributes(cpu);
 }
 #endif
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 06/16] arch_topology: Add support to parse and detect cache attributes
@ 2022-05-25  8:14             ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.

In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topology parsing both on ACPI and DT platforms.

Note that this change builds the cacheinfo early even on ACPI systems, but
the current mechanism of building llc_sibling mask remains unchanged.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73b836047cf..765723448b10 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/device.h>
@@ -775,15 +776,23 @@ __weak int __init parse_acpi_topology(void)
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
+	int ret, cpu;
+
 	reset_cpu_topology();
+	ret = parse_acpi_topology();
+	if (!ret)
+		ret = of_have_populated_dt() && parse_dt_topology();
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (parse_acpi_topology())
-		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
+	if (ret) {
+		/*
+		 * Discard anything that was parsed if we hit an error so we
+		 * don't use partial information.
+		 */
 		reset_cpu_topology();
+		return;
+	}
+
+	for_each_possible_cpu(cpu)
+		detect_cache_attributes(cpu);
 }
 #endif
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread
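
On DT platforms the hierarchy that detect_cache_attributes() fills in is
described by the CPU node itself (level 1) plus the standard
"next-level-cache" phandle chain. Below is a hedged sketch of that walk
using the existing OF helpers; the real parsing is done by
cache_setup_of_node() in cacheinfo, and the function here is illustrative
only.

#include <linux/errno.h>
#include <linux/of.h>
#include <linux/of_device.h>

/* Illustrative: count the cache levels reachable from a CPU's DT node. */
static int sketch_dt_cache_levels(unsigned int cpu)
{
	struct device_node *np = of_cpu_device_node_get(cpu);
	int levels = 0;

	if (!np)
		return -ENOENT;

	while (np) {
		struct device_node *next = of_find_next_cache_node(np);

		of_node_put(np);
		np = next;
		levels++;	/* the CPU node carries the L1 properties */
	}

	return levels;
}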

* [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
  2022-05-25  8:14             ` Sudeep Holla
  (?)
@ 2022-05-25  8:14               ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to using the
equivalent information from cacheinfo.

This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 765723448b10..4c486e4e6f2f 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id != -1) {
+
+	if (last_level_cache_is_valid(cpu)) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
@@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (last_level_cache_is_shared(cpu, cpuid)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
@ 2022-05-25  8:14               ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to using the
equivalent information from cacheinfo.

This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 765723448b10..4c486e4e6f2f 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id != -1) {
+
+	if (last_level_cache_is_valid(cpu)) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
@@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (last_level_cache_is_shared(cpu, cpuid)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
@ 2022-05-25  8:14               ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to using the
equivalent information from cacheinfo.

This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 765723448b10..4c486e4e6f2f 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id != -1) {
+
+	if (last_level_cache_is_valid(cpu)) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
@@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (last_level_cache_is_shared(cpu, cpuid)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread
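
last_level_cache_is_valid() and last_level_cache_is_shared() come from the
cacheinfo patches earlier in this series. As a hedged illustration of the
semantics update_siblings_masks() relies on here (not the literal
implementation): two CPUs are LLC siblings when the deepest cacheinfo leaf
of each records the same firmware token, i.e. the same DT cache node or
ACPI PPTT entry. The accessors used below are local to
drivers/base/cacheinfo.c.

#include <linux/cacheinfo.h>

/* Illustrative only; assumes the accessors from drivers/base/cacheinfo.c. */
static bool sketch_llc_is_shared(unsigned int x, unsigned int y)
{
	struct cacheinfo *llc_x, *llc_y;

	if (!cache_leaves(x) || !cache_leaves(y))
		return false;

	llc_x = per_cpu_cacheinfo_idx(x, cache_leaves(x) - 1);
	llc_y = per_cpu_cacheinfo_idx(y, cache_leaves(y) - 1);

	return llc_x->fw_token && llc_x->fw_token == llc_y->fw_token;
}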

* [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in CPU topology
  2022-05-25  8:14               ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                 ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/arm64/kernel/topology.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
 		return 0;
 
 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}
 
 	return 0;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in CPU topology
@ 2022-05-25  8:14                 ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/arm64/kernel/topology.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
 		return 0;
 
 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}
 
 	return 0;
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in CPU topology
@ 2022-05-25  8:14                 ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/arm64/kernel/topology.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
 		return 0;
 
 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}
 
 	return 0;
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-05-25  8:14                 ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                   ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology information.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c  | 1 -
 include/linux/arch_topology.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 4c486e4e6f2f..76c702c217c5 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -747,7 +747,6 @@ void __init reset_cpu_topology(void)
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
 
 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
@ 2022-05-25  8:14                   ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology information.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c  | 1 -
 include/linux/arch_topology.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 4c486e4e6f2f..76c702c217c5 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -747,7 +747,6 @@ void __init reset_cpu_topology(void)
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
 
 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
@ 2022-05-25  8:14                   ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology information.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c  | 1 -
 include/linux/arch_topology.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 4c486e4e6f2f..76c702c217c5 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -747,7 +747,6 @@ void __init reset_cpu_topology(void)
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
 
 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster
  2022-05-25  8:14                   ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                     ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, that may result in getting the
thread siblings wrong, as the core identifiers can be the same for 2
different CPUs belonging to 2 different clusters.

So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong to are in the same cluster
within the socket. Let us skip updating the thread sibling cpumasks if
the cluster identifier doesn't match.

This change has no effect if the cluster identifiers are not currently
set, but it will avoid any breakage once we set them correctly.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 76c702c217c5..59dc2c80c439 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -703,15 +703,17 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
-		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
-		    cpuid_topo->cluster_id != -1) {
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+			continue;
+
+		if (cpuid_topo->cluster_id != -1) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
 
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
 		if (cpuid_topo->core_id != cpu_topo->core_id)
 			continue;
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster
@ 2022-05-25  8:14                     ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, that may result in getting the
thread siblings wrong, as the core identifiers can be the same for 2
different CPUs belonging to 2 different clusters.

So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong to are in the same cluster
within the socket. Let us skip updating the thread sibling cpumasks if
the cluster identifier doesn't match.

This change has no effect if the cluster identifiers are not currently
set, but it will avoid any breakage once we set them correctly.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 76c702c217c5..59dc2c80c439 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -703,15 +703,17 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
-		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
-		    cpuid_topo->cluster_id != -1) {
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+			continue;
+
+		if (cpuid_topo->cluster_id != -1) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
 
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
 		if (cpuid_topo->core_id != cpu_topo->core_id)
 			continue;
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster
@ 2022-05-25  8:14                     ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, that may result in getting the
thread siblings wrong, as the core identifiers can be the same for 2
different CPUs belonging to 2 different clusters.

So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong to are in the same cluster
within the socket. Let us skip updating the thread sibling cpumasks if
the cluster identifier doesn't match.

This change has no effect if the cluster identifiers are not currently
set, but it will avoid any breakage once we set them correctly.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 76c702c217c5..59dc2c80c439 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -703,15 +703,17 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
-		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
-		    cpuid_topo->cluster_id != -1) {
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+			continue;
+
+		if (cpuid_topo->cluster_id != -1) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
 
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
 		if (cpuid_topo->core_id != cpu_topo->core_id)
 			continue;
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread
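
A self-contained toy example (made-up CPU numbering, plain C rather than
kernel code) of why the cluster check has to come before the core/thread
check: with per-cluster core numbering, two CPUs in different clusters can
share core_id 0 without being SMT siblings.

#include <stdbool.h>
#include <stdio.h>

struct topo_example { int package_id, cluster_id, core_id; };

/* Mirrors the check order introduced by this patch. */
static bool example_thread_siblings(const struct topo_example *a,
				    const struct topo_example *b)
{
	return a->package_id == b->package_id &&
	       a->cluster_id == b->cluster_id &&
	       a->core_id == b->core_id;
}

int main(void)
{
	struct topo_example cpu0 = { 0, 0, 0 };	/* cluster 0, core 0 */
	struct topo_example cpu4 = { 0, 1, 0 };	/* cluster 1, core 0 */

	/* Same core_id, different cluster: prints 0, not thread siblings. */
	printf("%d\n", example_thread_siblings(&cpu0, &cpu4));
	return 0;
}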

* [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity
  2022-05-25  8:14                     ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                       ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring, Andy Shevchenko

Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 59dc2c80c439..f73a5e669e42 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,7 +637,7 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
+		if (cpu_topology[cpu].package_id < 0)
 			ret = -EINVAL;
 
 out_map:
@@ -709,7 +709,7 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
 			continue;
 
-		if (cpuid_topo->cluster_id != -1) {
+		if (cpuid_topo->cluster_id >= 0) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity
@ 2022-05-25  8:14                       ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring, Andy Shevchenko

Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 59dc2c80c439..f73a5e669e42 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,7 +637,7 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
+		if (cpu_topology[cpu].package_id < 0)
 			ret = -EINVAL;
 
 out_map:
@@ -709,7 +709,7 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
 			continue;
 
-		if (cpuid_topo->cluster_id != -1) {
+		if (cpuid_topo->cluster_id >= 0) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity
@ 2022-05-25  8:14                       ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring, Andy Shevchenko

Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 59dc2c80c439..f73a5e669e42 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,7 +637,7 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
+		if (cpu_topology[cpu].package_id < 0)
 			ret = -EINVAL;
 
 out_map:
@@ -709,7 +709,7 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
 			continue;
 
-		if (cpuid_topo->cluster_id != -1) {
+		if (cpuid_topo->cluster_id >= 0) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once an outlier CPU is found
  2022-05-25  8:14                       ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                         ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

There is no point in looping through all the CPUs' physical package
identifiers to check whether they are valid once a CPU which is outside
the topology (i.e. an outlier CPU) is found.

Let us just break out of the loop early in such a case.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73a5e669e42..6ae450ca68bb 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,8 +637,10 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}
 
 out_map:
 	of_node_put(map);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once an outlier CPU is found
@ 2022-05-25  8:14                         ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

There is no point in looping through all the CPUs' physical package
identifiers to check whether they are valid once a CPU which is outside
the topology (i.e. an outlier CPU) is found.

Let us just break out of the loop early in such a case.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73a5e669e42..6ae450ca68bb 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,8 +637,10 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}
 
 out_map:
 	of_node_put(map);
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once an outlier CPU is found
@ 2022-05-25  8:14                         ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

There is no point in looping through all the CPUs' physical package
identifiers to check whether they are valid once a CPU which is outside
the topology (i.e. an outlier CPU) is found.

Let us just break out of the loop early in such a case.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73a5e669e42..6ae450ca68bb 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -637,8 +637,10 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}
 
 out_map:
 	of_node_put(map);
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 13/16] arch_topology: Don't set cluster identifier as physical package identifier
  2022-05-25  8:14                         ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                           ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently, as we parse the CPU topology from the /cpu-map node in the
device tree, we assign the generated cluster count as the physical
package identifier for each CPU, which is wrong.

The device tree bindings for CPU topology support sockets to infer
the socket or physical package identifier for a given CPU. Since this
is fairly new and not supported on most of the old and existing
systems, we can assume all such systems have a single socket/physical
package.

Fix the physical package identifier to 0 by removing the assignment of
the cluster identifier to it.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 6ae450ca68bb..e7876a7a82ec 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -544,7 +544,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
@@ -583,7 +582,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
+				ret = parse_core(c, 0, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -600,9 +599,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	if (leaf && !has_cores)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
-	if (leaf)
-		package_id++;
-
 	return 0;
 }
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 13/16] arch_topology: Don't set cluster identifier as physical package identifier
@ 2022-05-25  8:14                           ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently, as we parse the CPU topology from the /cpu-map node in the
device tree, we assign the generated cluster count as the physical
package identifier for each CPU, which is wrong.

The device tree bindings for CPU topology support sockets to infer
the socket or physical package identifier for a given CPU. Since this
is fairly new and not supported on most of the old and existing
systems, we can assume all such systems have a single socket/physical
package.

Fix the physical package identifier to 0 by removing the assignment of
the cluster identifier to it.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 6ae450ca68bb..e7876a7a82ec 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -544,7 +544,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
@@ -583,7 +582,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
+				ret = parse_core(c, 0, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -600,9 +599,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	if (leaf && !has_cores)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
-	if (leaf)
-		package_id++;
-
 	return 0;
 }
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 13/16] arch_topology: Don't set cluster identifier as physical package identifier
@ 2022-05-25  8:14                           ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Currently, as we parse the CPU topology from the /cpu-map node in the
device tree, we assign the generated cluster count as the physical
package identifier for each CPU, which is wrong.

The device tree bindings for CPU topology support sockets to infer
the socket or physical package identifier for a given CPU. Since this
is fairly new and not supported on most of the old and existing
systems, we can assume all such systems have a single socket/physical
package.

Fix the physical package identifier to 0 by removing the assignment of
the cluster identifier to it.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 6ae450ca68bb..e7876a7a82ec 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -544,7 +544,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
@@ -583,7 +582,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
+				ret = parse_core(c, 0, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -600,9 +599,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	if (leaf && !has_cores)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
-	if (leaf)
-		package_id++;
-
 	return 0;
 }
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread
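
The observable effect, as a hedged sketch (the helper below is
hypothetical): on a DT system without socket nodes every CPU now reports
physical package 0, which is also what
/sys/devices/system/cpu/cpuN/topology/physical_package_id exposes, instead
of the old per-cluster package count, matching the ACPI behaviour.

#include <linux/cpumask.h>
#include <linux/printk.h>
#include <linux/topology.h>

/* Hypothetical debug helper, not part of this patch. */
static void sketch_dump_package_ids(void)
{
	unsigned int cpu;

	for_each_possible_cpu(cpu)
		pr_info("cpu%u: physical package %d\n", cpu,
			topology_physical_package_id(cpu));
}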

* [PATCH v3 14/16] arch_topology: Drop unnecessary check for uninitialised package_id
  2022-05-25  8:14                           ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                             ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

With the support of socket node parsing from the device tree and
assigning 0 as package_id in the absence of socket nodes, there is no need
to check for invalid package_id. It is always initialised to 0 or values
from the device tree socket nodes.

Just drop that now redundant check for uninitialised package_id.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7876a7a82ec..b8f0d72908c8 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -606,7 +606,6 @@ static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
 	int ret = 0;
-	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -628,16 +627,6 @@ static int __init parse_dt_topology(void)
 
 	topology_normalize_cpu_scale();
 
-	/*
-	 * Check that all cores are in the topology; the SMP code will
-	 * only mark cores described in the DT as possible.
-	 */
-	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0) {
-			ret = -EINVAL;
-			break;
-		}
-
 out_map:
 	of_node_put(map);
 out:
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 14/16] arch_topology: Drop unnecessary check for uninitialised package_id
@ 2022-05-25  8:14                             ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

With the support of socket node parsing from the device tree and
assigning 0 as package_id in the absence of socket nodes, there is no need
to check for invalid package_id. It is always initialised to 0 or values
from the device tree socket nodes.

Just drop that now redundant check for uninitialised package_id.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7876a7a82ec..b8f0d72908c8 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -606,7 +606,6 @@ static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
 	int ret = 0;
-	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -628,16 +627,6 @@ static int __init parse_dt_topology(void)
 
 	topology_normalize_cpu_scale();
 
-	/*
-	 * Check that all cores are in the topology; the SMP code will
-	 * only mark cores described in the DT as possible.
-	 */
-	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0) {
-			ret = -EINVAL;
-			break;
-		}
-
 out_map:
 	of_node_put(map);
 out:
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 14/16] arch_topology: Drop unnecessary check for uninitialised package_id
@ 2022-05-25  8:14                             ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

With the support of socket node parsing from the device tree and
assigning 0 as package_id in the absence of socket nodes, there is no need
to check for invalid package_id. It is always initialised to 0 or values
from the device tree socket nodes.

Just drop that now redundant check for uninitialised package_id.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7876a7a82ec..b8f0d72908c8 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -606,7 +606,6 @@ static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
 	int ret = 0;
-	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -628,16 +627,6 @@ static int __init parse_dt_topology(void)
 
 	topology_normalize_cpu_scale();
 
-	/*
-	 * Check that all cores are in the topology; the SMP code will
-	 * only mark cores described in the DT as possible.
-	 */
-	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0) {
-			ret = -EINVAL;
-			break;
-		}
-
 out_map:
 	of_node_put(map);
 out:
-- 
2.36.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-05-25  8:14                             ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                               ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.

We don't support nesting of clusters yet as there is no real hardware
with such clusters of clusters.
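
As an illustration, a /cpu-map fragment of the kind this parsing handles
might look as follows (the cpu0..cpu3 labels are placeholders, not taken
from this series):

	cpu-map {
		cluster0 {
			core0 { cpu = <&cpu0>; };
			core1 { cpu = <&cpu1>; };
		};
		cluster1 {
			core0 { cpu = <&cpu2>; };
			core1 { cpu = <&cpu3>; };
		};
	};

With this patch, the CPUs under cluster0 get cluster_id 0 and those under
cluster1 get cluster_id 1, while package_id is still hard-coded to 0 until
socket parsing is added later in the series.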

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b8f0d72908c8..5f4f148a7769 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -492,7 +492,7 @@ static int __init get_cpu_for_node(struct device_node *node)
 }
 
 static int __init parse_core(struct device_node *core, int package_id,
-			     int core_id)
+			     int cluster_id, int core_id)
 {
 	char name[20];
 	bool leaf = true;
@@ -508,6 +508,7 @@ static int __init parse_core(struct device_node *core, int package_id,
 			cpu = get_cpu_for_node(t);
 			if (cpu >= 0) {
 				cpu_topology[cpu].package_id = package_id;
+				cpu_topology[cpu].cluster_id = cluster_id;
 				cpu_topology[cpu].core_id = core_id;
 				cpu_topology[cpu].thread_id = i;
 			} else if (cpu != -ENODEV) {
@@ -529,6 +530,7 @@ static int __init parse_core(struct device_node *core, int package_id,
 		}
 
 		cpu_topology[cpu].package_id = package_id;
+		cpu_topology[cpu].cluster_id = cluster_id;
 		cpu_topology[cpu].core_id = core_id;
 	} else if (leaf && cpu != -ENODEV) {
 		pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -538,7 +540,8 @@ static int __init parse_core(struct device_node *core, int package_id,
 	return 0;
 }
 
-static int __init parse_cluster(struct device_node *cluster, int depth)
+static int __init
+parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 {
 	char name[20];
 	bool leaf = true;
@@ -558,7 +561,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 		c = of_get_child_by_name(cluster, name);
 		if (c) {
 			leaf = false;
-			ret = parse_cluster(c, depth + 1);
+			ret = parse_cluster(c, i, depth + 1);
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
@@ -582,7 +585,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, 0, core_id++);
+				ret = parse_core(c, 0, cluster_id, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -621,7 +624,7 @@ static int __init parse_dt_topology(void)
 	if (!map)
 		goto out;
 
-	ret = parse_cluster(map, 0);
+	ret = parse_cluster(map, -1, 0);
 	if (ret != 0)
 		goto out_map;
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH v3 16/16] arch_topology: Add support for parsing sockets in /cpu-map
  2022-05-25  8:14                               ` Sudeep Holla
  (?)
@ 2022-05-25  8:14                                 ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-05-25  8:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Finally let us add support for socket nodes in /cpu-map in the device
tree. Since socket nodes may be absent on older platforms and even on
most existing ones, we need to assume that the absence of a socket node
indicates a single socket system and handle it appropriately.

Also it is likely that most single socket systems skip adding the socket
node since it is optional.
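
As an illustration, a two socket /cpu-map using the optional socket nodes
could look as follows (CPU labels are placeholders only):

	cpu-map {
		socket0 {
			cluster0 {
				core0 { cpu = <&cpu0>; };
				core1 { cpu = <&cpu1>; };
			};
		};
		socket1 {
			cluster0 {
				core0 { cpu = <&cpu2>; };
				core1 { cpu = <&cpu3>; };
			};
		};
	};

With this patch, the CPUs under socket0 and socket1 get package_id 0 and 1
respectively, while a /cpu-map without any socketN nodes is treated as a
single socket with package_id 0.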

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 37 +++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 5f4f148a7769..676e0ed333b1 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -540,8 +540,8 @@ static int __init parse_core(struct device_node *core, int package_id,
 	return 0;
 }
 
-static int __init
-parse_cluster(struct device_node *cluster, int cluster_id, int depth)
+static int __init parse_cluster(struct device_node *cluster, int package_id,
+				int cluster_id, int depth)
 {
 	char name[20];
 	bool leaf = true;
@@ -561,7 +561,7 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 		c = of_get_child_by_name(cluster, name);
 		if (c) {
 			leaf = false;
-			ret = parse_cluster(c, i, depth + 1);
+			ret = parse_cluster(c, package_id, i, depth + 1);
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
@@ -585,7 +585,8 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, 0, cluster_id, core_id++);
+				ret = parse_core(c, package_id, cluster_id,
+						 core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -605,6 +606,32 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 	return 0;
 }
 
+static int __init parse_socket(struct device_node *socket)
+{
+	char name[20];
+	struct device_node *c;
+	bool has_socket = false;
+	int package_id = 0, ret;
+
+	do {
+		snprintf(name, sizeof(name), "socket%d", package_id);
+		c = of_get_child_by_name(socket, name);
+		if (c) {
+			has_socket = true;
+			ret = parse_cluster(c, package_id, -1, 0);
+			of_node_put(c);
+			if (ret != 0)
+				return ret;
+		}
+		package_id++;
+	} while (c);
+
+	if (!has_socket)
+		ret = parse_cluster(socket, 0, -1, 0);
+
+	return ret;
+}
+
 static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
@@ -624,7 +651,7 @@ static int __init parse_dt_topology(void)
 	if (!map)
 		goto out;
 
-	ret = parse_cluster(map, -1, 0);
+	ret = parse_socket(map);
 	if (ret != 0)
 		goto out_map;
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 02/16] cacheinfo: Add helper to access any cache index for a given CPU
  2022-05-25  8:14     ` Sudeep Holla
  (?)
@ 2022-06-01  2:44       ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  2:44 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> The cacheinfo for a given CPU at a given index is used at quite a few
> places by fetching the base point for index 0 using the helper
> per_cpu_cacheinfo(cpu) and offseting it by the required index.
> 
> Instead, add another helper to fetch the required pointer directly and
> use it to simplify and improve readability.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/cacheinfo.c | 14 +++++++-------
>   1 file changed, 7 insertions(+), 7 deletions(-)
> 

s/offseting/offsetting

It looks good to me with below nits fixed:

Reviewed-by: Gavin Shan <gshan@redhat.com>


> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index b0bde272e2ae..c4547d8ac6f3 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -25,6 +25,8 @@ static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
>   #define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
>   #define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
>   #define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
> +#define per_cpu_cacheinfo_idx(cpu, idx)		\
> +				(per_cpu_cacheinfo(cpu) + (idx))
>   
>   struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
>   {
> @@ -172,7 +174,7 @@ static int cache_setup_of_node(unsigned int cpu)
>   	}
>   
>   	while (index < cache_leaves(cpu)) {
> -		this_leaf = this_cpu_ci->info_list + index;
> +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
>   		if (this_leaf->level != 1)
>   			np = of_find_next_cache_node(np);
>   		else
> @@ -231,7 +233,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>   	for (index = 0; index < cache_leaves(cpu); index++) {
>   		unsigned int i;
>   
> -		this_leaf = this_cpu_ci->info_list + index;
> +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
>   		/* skip if shared_cpu_map is already populated */
>   		if (!cpumask_empty(&this_leaf->shared_cpu_map))
>   			continue;
> @@ -242,7 +244,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>   
>   			if (i == cpu || !sib_cpu_ci->info_list)
>   				continue;/* skip if itself or no cacheinfo */
> -			sib_leaf = sib_cpu_ci->info_list + index;
> +			sib_leaf = per_cpu_cacheinfo_idx(i, index);
>   			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
>   				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
>   				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
> @@ -258,12 +260,11 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>   
>   static void cache_shared_cpu_map_remove(unsigned int cpu)
>   {
> -	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>   	struct cacheinfo *this_leaf, *sib_leaf;
>   	unsigned int sibling, index;
>   
>   	for (index = 0; index < cache_leaves(cpu); index++) {
> -		this_leaf = this_cpu_ci->info_list + index;
> +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
>   		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
>   			struct cpu_cacheinfo *sib_cpu_ci;
> 

In cache_shared_cpu_map_remove(), the newly introduced macro
can be applied when the sibling CPU's cache info is fetched.

     sib_leaf = sib_cpu_ci->info_list + index;

     to

     sib_leaf = per_cpu_cacheinfo_idx(sibling, index);

   
> @@ -609,7 +610,6 @@ static int cache_add_dev(unsigned int cpu)
>   	int rc;
>   	struct device *ci_dev, *parent;
>   	struct cacheinfo *this_leaf;
> -	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>   	const struct attribute_group **cache_groups;
>   
>   	rc = cpu_cache_sysfs_init(cpu);
> @@ -618,7 +618,7 @@ static int cache_add_dev(unsigned int cpu)
>   
>   	parent = per_cpu_cache_dev(cpu);
>   	for (i = 0; i < cache_leaves(cpu); i++) {
> -		this_leaf = this_cpu_ci->info_list + i;
> +		this_leaf = per_cpu_cacheinfo_idx(cpu, i);
>   		if (this_leaf->disable_sysfs)
>   			continue;
>   		if (this_leaf->type == CACHE_TYPE_NOCACHE)
> 

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
  2022-05-25  8:14   ` Sudeep Holla
  (?)
@ 2022-06-01  2:45     ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  2:45 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> The of_cpu_device_node_get takes care of fetching the CPU's device node
> either from cached cpu_dev->of_node if cpu_dev is initialised or uses
> of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.
> 
> Just use of_cpu_device_node_get instead of getting the cpu device first
> and then using cpu_dev->of_node for two reasons:
> 1. There is no other use of cpu_dev and can be simplified
> 2. It enables the use of detect_cache_attributes and hence cache_setup_of_node
>     much earlier before the CPUs are registered as devices.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/cacheinfo.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index dad296229161..b0bde272e2ae 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -14,7 +14,7 @@
>   #include <linux/cpu.h>
>   #include <linux/device.h>
>   #include <linux/init.h>
> -#include <linux/of.h>
> +#include <linux/of_device.h>
>   #include <linux/sched.h>
>   #include <linux/slab.h>
>   #include <linux/smp.h>
> @@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
>   {
>   	struct device_node *np;
>   	struct cacheinfo *this_leaf;
> -	struct device *cpu_dev = get_cpu_device(cpu);
>   	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>   	unsigned int index = 0;
>   
> @@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
>   		return 0;
>   	}
>   
> -	if (!cpu_dev) {
> -		pr_err("No cpu device for CPU %d\n", cpu);
> -		return -ENODEV;
> -	}
> -	np = cpu_dev->of_node;
> +	np = of_cpu_device_node_get(cpu);
>   	if (!np) {
>   		pr_err("Failed to find cpu%d device node\n", cpu);
>   		return -ENOENT;
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 03/16] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
  2022-05-25  8:14       ` Sudeep Holla
  (?)
@ 2022-06-01  2:51         ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  2:51 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> cache_leaves_are_shared is already used even with ACPI and PPTT. It checks
> if the cache leaves are shared based on the fw_token pointer. However it is
> defined conditionally only if CONFIG_OF is enabled which is wrong.
> 
> Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
> generic. It also handles the case where neither OF nor ACPI is defined.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/cacheinfo.c | 20 +++++++++-----------
>   1 file changed, 9 insertions(+), 11 deletions(-)
> 

With below nits fixed:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index c4547d8ac6f3..417e1ebf5525 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -33,13 +33,21 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
>   	return ci_cacheinfo(cpu);
>   }
>   
> -#ifdef CONFIG_OF
>   static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>   					   struct cacheinfo *sib_leaf)
>   {
> +	/*
> +	 * For non DT/ACPI systems, assume unique level 1 caches,
> +	 * system-wide shared caches for all other levels. This will be used
> +	 * only if arch specific code has not populated shared_cpu_map
> +	 */
> +	if (!IS_ENABLED(CONFIG_OF) && !(IS_ENABLED(CONFIG_ACPI)))
> +		return !(this_leaf->level == 1);
> +
>   	return sib_leaf->fw_token == this_leaf->fw_token;
>   }
>   

	if (!IS_ENABLED(CONFIG_OF) && !IS_ENABLED(CONFIG_ACPI))

         or

	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))

> +#ifdef CONFIG_OF
>   /* OF properties to query for a given cache type */
>   struct cache_type_info {
>   	const char *size_prop;
> @@ -193,16 +201,6 @@ static int cache_setup_of_node(unsigned int cpu)
>   }
>   #else
>   static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
> -static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> -					   struct cacheinfo *sib_leaf)
> -{
> -	/*
> -	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
> -	 * shared caches for all other levels. This will be used only if
> -	 * arch specific code has not populated shared_cpu_map
> -	 */
> -	return !(this_leaf->level == 1);
> -}
>   #endif
>   
>   int __weak cache_setup_acpi(unsigned int cpu)
> 

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 04/16] cacheinfo: Add support to check if last level cache(LLC) is valid or shared
  2022-05-25  8:14         ` Sudeep Holla
  (?)
@ 2022-06-01  3:20           ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:20 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> It is useful to have a helper to check if the given two CPUs share the last level
> cache. We can do that check by comparing fw_token or by comparing the cache
> ID. Currently we check just for fw_token as the cache ID is optional.
> 
> This helper can be used to build the llc_sibling during arch specific
> topology parsing and feeding information to the sched_domains. This also
> helps to get rid of llc_id in the CPU topology as it is sort of duplicate
> information.
> 
> Also add a helper to check if the llc information in cacheinfo is valid or not.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/cacheinfo.c  | 26 ++++++++++++++++++++++++++
>   include/linux/cacheinfo.h |  2 ++
>   2 files changed, 28 insertions(+)
> 

With below nits fixed:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 417e1ebf5525..ed74db18468f 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -47,6 +47,32 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>   	return sib_leaf->fw_token == this_leaf->fw_token;
>   }
>   
> +bool last_level_cache_is_valid(unsigned int cpu)
> +{
> +	struct cacheinfo *llc;
> +
> +	if (!cache_leaves(cpu))
> +		return false;
> +
> +	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
> +
> +	return llc->fw_token;
> +}
> +

	return !!llc->fw_token;

> +bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
> +{
> +	struct cacheinfo *llc_x, *llc_y;
> +
> +	if (!last_level_cache_is_valid(cpu_x) ||
> +	    !last_level_cache_is_valid(cpu_y))
> +		return false;
> +
> +	llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
> +	llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
> +
> +	return cache_leaves_are_shared(llc_x, llc_y);
> +}
> +
>   #ifdef CONFIG_OF
>   /* OF properties to query for a given cache type */
>   struct cache_type_info {
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 4ff37cb763ae..7e429bc5c1a4 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -82,6 +82,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
>   int init_cache_level(unsigned int cpu);
>   int populate_cache_leaves(unsigned int cpu);
>   int cache_setup_acpi(unsigned int cpu);
> +bool last_level_cache_is_valid(unsigned int cpu);
> +bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
>   #ifndef CONFIG_ACPI_PPTT
>   /*
>    * acpi_find_last_cache_level is only called on ACPI enabled
> 

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread
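
To illustrate the fw_token comparison the helpers above rely on, a sketch
of a device tree fragment where two CPUs share their last level cache by
pointing next-level-cache at the same node (all names are placeholders):

	cpu0: cpu@0 {
		device_type = "cpu";
		reg = <0x0>;
		next-level-cache = <&l2_0>;
	};

	cpu1: cpu@100 {
		device_type = "cpu";
		reg = <0x100>;
		next-level-cache = <&l2_0>;
	};

	l2_0: l2-cache0 {
		compatible = "cache";
		cache-level = <2>;
		cache-unified;
	};

Assuming the L2 is the last cache level described, both CPUs' last level
leaves resolve to the same l2-cache0 node, so their fw_token pointers match
and last_level_cache_is_shared() returns true for this pair.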

* Re: [PATCH v3 05/16] cacheinfo: Allow early detection and population of cache attributes
  2022-05-25  8:14           ` Sudeep Holla
  (?)
@ 2022-06-01  3:25             ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:25 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Some architectures/platforms may need to set up cache properties very
> early in boot, along with the other CPU topology information, so that
> all of it can be used to build the sched_domains used by the scheduler.
> 
> Allow detect_cache_attributes to be called quite early during the boot.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/cacheinfo.c  | 45 ++++++++++++++++++++++++---------------
>   include/linux/cacheinfo.h |  1 +
>   2 files changed, 29 insertions(+), 17 deletions(-)
> 

With the comments improved, as below:

Reviewed-by: Gavin Shan <gshan@redhat.com>


> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index ed74db18468f..976142f3e81d 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
>   {
>   	struct device_node *np;
>   	struct cacheinfo *this_leaf;
> -	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>   	unsigned int index = 0;
>   
> -	/* skip if fw_token is already populated */
> -	if (this_cpu_ci->info_list->fw_token) {
> -		return 0;
> -	}
> -
>   	np = of_cpu_device_node_get(cpu);
>   	if (!np) {
>   		pr_err("Failed to find cpu%d device node\n", cpu);
> @@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
>   
>   unsigned int coherency_max_size;
>   
> +static int cache_setup_properties(unsigned int cpu)
> +{
> +	int ret = 0;
> +
> +	if (of_have_populated_dt())
> +		ret = cache_setup_of_node(cpu);
> +	else if (!acpi_disabled)
> +		ret = cache_setup_acpi(cpu);
> +
> +	return ret;
> +}
> +
>   static int cache_shared_cpu_map_setup(unsigned int cpu)
>   {
>   	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> @@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>   	if (this_cpu_ci->cpu_map_populated)
>   		return 0;
>   
> -	if (of_have_populated_dt())
> -		ret = cache_setup_of_node(cpu);
> -	else if (!acpi_disabled)
> -		ret = cache_setup_acpi(cpu);
> -
> -	if (ret)
> -		return ret;
> +	/*
> +	 * skip setting up cache properties if LLC is valid, just need
> +	 * to update the shared cpu_map if the cache attributes were
> +	 * populated early before all the cpus are brought online
> +	 */
> +	if (!last_level_cache_is_valid(cpu)) {
> +		ret = cache_setup_properties(cpu);
> +		if (ret)
> +			return ret;
> +	}
>   
>   	for (index = 0; index < cache_leaves(cpu); index++) {
>   		unsigned int i;
>   
>   		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> -		/* skip if shared_cpu_map is already populated */
> -		if (!cpumask_empty(&this_leaf->shared_cpu_map))
> -			continue;
>   
>   		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
>   		for_each_online_cpu(i) {
> @@ -330,10 +336,13 @@ int __weak populate_cache_leaves(unsigned int cpu)
>   	return -ENOENT;
>   }
>   
> -static int detect_cache_attributes(unsigned int cpu)
> +int detect_cache_attributes(unsigned int cpu)
>   {
>   	int ret;
>   
> +	if (per_cpu_cacheinfo(cpu)) /* Already setup */
> +		goto update_cpu_map;
> +
>   	if (init_cache_level(cpu) || !cache_leaves(cpu))
>   		return -ENOENT;
>  

Here it might be worth explaining when the CPU's cache info has already
been populated, by mentioning that it can be populated both at boot
and at hot-add time.
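
Something along these lines (the wording is only a suggestion, not part
of the patch) would make that explicit:

	/*
	 * The cacheinfo for this CPU may already have been populated,
	 * either early during boot (as this series enables) or on a
	 * previous hot-add of the CPU. In that case only the shared
	 * cpu_map still needs to be updated.
	 */
	if (per_cpu_cacheinfo(cpu)) /* Already setup */
		goto update_cpu_map;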
  
> @@ -349,6 +358,8 @@ static int detect_cache_attributes(unsigned int cpu)
>   	ret = populate_cache_leaves(cpu);
>   	if (ret)
>   		goto free_ci;
> +
> +update_cpu_map:
>   	/*
>   	 * For systems using DT for cache hierarchy, fw_token
>   	 * and shared_cpu_map will be set up here only if they are
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 7e429bc5c1a4..00b7a6ae8617 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
>   int cache_setup_acpi(unsigned int cpu);
>   bool last_level_cache_is_valid(unsigned int cpu);
>   bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
> +int detect_cache_attributes(unsigned int cpu);
>   #ifndef CONFIG_ACPI_PPTT
>   /*
>    * acpi_find_last_cache_level is only called on ACPI enabled
> 

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 06/16] arch_topology: Add support to parse and detect cache attributes
  2022-05-25  8:14             ` Sudeep Holla
  (?)
@ 2022-06-01  3:29               ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:29 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Currently ACPI populates just the minimum information about the last
> level cache from PPTT in order to feed the same to build sched_domains.
> Similar support for DT platforms is not present.
> 
> In order to enable the same, the entire cache hierarchy information can
> be built as part of CPU topology parsing both on ACPI and DT platforms.
> 
> Note that this change builds the cacheinfo early even on ACPI systems, but
> the current mechanism of building llc_sibling mask remains unchanged.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c | 23 ++++++++++++++++-------
>   1 file changed, 16 insertions(+), 7 deletions(-)
> 

With below nits fixed:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index f73b836047cf..765723448b10 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -7,6 +7,7 @@
>    */
>   
>   #include <linux/acpi.h>
> +#include <linux/cacheinfo.h>
>   #include <linux/cpu.h>
>   #include <linux/cpufreq.h>
>   #include <linux/device.h>
> @@ -775,15 +776,23 @@ __weak int __init parse_acpi_topology(void)
>   #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>   void __init init_cpu_topology(void)
>   {
> +	int ret, cpu;
> +
>   	reset_cpu_topology();
> +	ret = parse_acpi_topology();
> +	if (!ret)
> +		ret = of_have_populated_dt() && parse_dt_topology();
>   
> -	/*
> -	 * Discard anything that was parsed if we hit an error so we
> -	 * don't use partial information.
> -	 */
> -	if (parse_acpi_topology())
> -		reset_cpu_topology();
> -	else if (of_have_populated_dt() && parse_dt_topology())
> +	if(ret) {
> +		/*
> +		 * Discard anything that was parsed if we hit an error so we
> +		 * don't use partial information.
> +		 */
>   		reset_cpu_topology();
> +		return;
> +	}
> +
> +	for_each_possible_cpu(cpu)
> +		detect_cache_attributes(cpu);
>   }
>   #endif
> 

# ./scripts/checkpatch.pl --codespell patch/check/0006*
ERROR: space required before the open parenthesis '('
#55: FILE: drivers/base/arch_topology.c:786:
+	if(ret) {
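
The fix is whitespace only; the corrected hunk would read as below
(a sketch, otherwise just the patch's own code):

	if (ret) {
		/*
		 * Discard anything that was parsed if we hit an error so we
		 * don't use partial information.
		 */
		reset_cpu_topology();
		return;
	}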

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
  2022-05-25  8:14               ` Sudeep Holla
  (?)
@ 2022-06-01  3:31                 ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:31 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> The cacheinfo is now initialised early along with the CPU topology
> initialisation. Instead of relying on the LLC ID information parsed
> separately elsewhere, and only with ACPI PPTT, migrate to using the
> same information from the cacheinfo.
> 
> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> parsed separately can now be removed from arch specific code.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 765723448b10..4c486e4e6f2f 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>   		/* not numa in package, lets use the package siblings */
>   		core_mask = &cpu_topology[cpu].core_sibling;
>   	}
> -	if (cpu_topology[cpu].llc_id != -1) {
> +
> +	if (last_level_cache_is_valid(cpu)) {
>   		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>   			core_mask = &cpu_topology[cpu].llc_sibling;
>   	}
> @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
>   	for_each_online_cpu(cpu) {
>   		cpu_topo = &cpu_topology[cpu];
>   
> -		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
> +		if (last_level_cache_is_shared(cpu, cpuid)) {
>   			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
>   			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
>   		}
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-05-25  8:14                   ` Sudeep Holla
  (?)
@ 2022-06-01  3:35                     ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:35 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Since the cacheinfo LLC information is used directly in arch_topology,
> there is no need to parse and store the LLC ID information only for
> ACPI systems in the CPU topology.
> 
> Remove the redundant LLC ID from the generic CPU arch_topology information.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c  | 1 -
>   include/linux/arch_topology.h | 1 -
>   2 files changed, 2 deletions(-)
> 

How about merging these changes into PATCH[08/16]? I don't see why we
need to put them into separate patches.

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 4c486e4e6f2f..76c702c217c5 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -747,7 +747,6 @@ void __init reset_cpu_topology(void)
>   		cpu_topo->core_id = -1;
>   		cpu_topo->cluster_id = -1;
>   		cpu_topo->package_id = -1;
> -		cpu_topo->llc_id = -1;
>   
>   		clear_cpu_topology(cpu);
>   	}
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 58cbe18d825c..a07b510e7dc5 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -68,7 +68,6 @@ struct cpu_topology {
>   	int core_id;
>   	int cluster_id;
>   	int package_id;
> -	int llc_id;
>   	cpumask_t thread_sibling;
>   	cpumask_t core_sibling;
>   	cpumask_t cluster_sibling;
> 

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster
  2022-05-25  8:14                     ` Sudeep Holla
  (?)
@ 2022-06-01  3:36                       ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:36 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Currently the cluster identifier is not set on the DT based platforms.
> The reset or default value is -1 for all the CPUs. Once we assign the
> cluster identifier values correctly, that may result in getting the
> thread siblings wrong, as the core identifiers can be the same for two
> different CPUs belonging to two different clusters.
> 
> So, in order to get the thread sibling cpumasks correct, we need to
> update them only if the cores they belong to are in the same cluster
> within the socket. Let us skip updating the thread sibling cpumasks if
> the cluster identifier doesn't match.
> 
> This change has no effect while the cluster identifiers are not yet set,
> but it will avoid any breakage once we set them correctly.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 76c702c217c5..59dc2c80c439 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -703,15 +703,17 @@ void update_siblings_masks(unsigned int cpuid)
>   		if (cpuid_topo->package_id != cpu_topo->package_id)
>   			continue;
>   
> -		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
> -		    cpuid_topo->cluster_id != -1) {
> +		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> +		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> +
> +		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
> +			continue;
> +
> +		if (cpuid_topo->cluster_id != -1) {
>   			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
>   			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
>   		}
>   
> -		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> -		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> -
>   		if (cpuid_topo->core_id != cpu_topo->core_id)
>   			continue;
>   
> 
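
To see why the cluster check matters, consider a hypothetical /cpu-map
fragment (illustrative only; the labels and node names are made up) where
two clusters in the same socket both number their cores from core0:

	cpu-map {
		cluster0 {
			core0 {
				thread0 { cpu = <&CPU0>; };
				thread1 { cpu = <&CPU1>; };
			};
		};
		cluster1 {
			core0 {
				thread0 { cpu = <&CPU2>; };
				thread1 { cpu = <&CPU3>; };
			};
		};
	};

Since the core identifiers restart in each cluster, CPU0-CPU3 all end up
with core_id 0 in the same package, so without the cluster_id comparison
the thread sibling masks would wrongly span both clusters once cluster_id
is populated.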


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity
  2022-05-25  8:14                       ` Sudeep Holla
  (?)
@ 2022-06-01  3:38                         ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:38 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring, Andy Shevchenko

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Instead of just comparing the cpu topology IDs with -1 to check their
> validity, improve that by checking for a valid non-negative value.
> 
> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 59dc2c80c439..f73a5e669e42 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -637,7 +637,7 @@ static int __init parse_dt_topology(void)
>   	 * only mark cores described in the DT as possible.
>   	 */
>   	for_each_possible_cpu(cpu)
> -		if (cpu_topology[cpu].package_id == -1)
> +		if (cpu_topology[cpu].package_id < 0)
>   			ret = -EINVAL;
>   
>   out_map:
> @@ -709,7 +709,7 @@ void update_siblings_masks(unsigned int cpuid)
>   		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
>   			continue;
>   
> -		if (cpuid_topo->cluster_id != -1) {
> +		if (cpuid_topo->cluster_id >= 0) {
>   			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
>   			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
>   		}
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
  2022-05-25  8:14                         ` Sudeep Holla
  (?)
@ 2022-06-01  3:40                           ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:40 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> There is no point in looping through all the CPUs' physical package
> identifiers to check whether they are valid once a CPU which is outside
> the topology (i.e. an outlier CPU) is found.
> 
> Let us just break out of the loop early in such case.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index f73a5e669e42..6ae450ca68bb 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -637,8 +637,10 @@ static int __init parse_dt_topology(void)
>   	 * only mark cores described in the DT as possible.
>   	 */
>   	for_each_possible_cpu(cpu)
> -		if (cpu_topology[cpu].package_id < 0)
> +		if (cpu_topology[cpu].package_id < 0) {
>   			ret = -EINVAL;
> +			break;
> +		}
>   
>   out_map:
>   	of_node_put(map);
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids
  2022-05-25  8:14 ` Sudeep Holla
  (?)
@ 2022-06-01  3:49   ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-01  3:49 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

Hi Sudeep,

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> This version updates cacheinfo to populate and use the information from
> there for all the cache topology. Sorry for posting in the middle of
> merge window but better to get this tested earlier so that it is ready
> for next merge window.
> 
> This series intends to fix some discrepancies we have in the CPU topology
> parsing from the device tree /cpu-map node. Also this diverges from the
> behaviour on a ACPI enabled platform. The expectation is that both DT
> and ACPI enabled systems must present consistent view of the CPU topology.
> 
> Currently we assign generated cluster count as the physical package identifier
> for each CPU which is wrong. The device tree bindings for CPU topology supports
> sockets to infer the socket or physical package identifier for a given CPU.
> Also we don't check if all the cores/threads belong to the same cluster before
> updating their sibling masks which is fine as we don't set the cluster id yet.
> 
> These changes also assigns the cluster identifier as parsed from the device tree
> cluster nodes within /cpu-map without support for nesting of the clusters.
> Finally, it also add support for socket nodes in /cpu-map. With this the
> parsing of exact same information from ACPI PPTT and /cpu-map DT node
> aligns well.
> 
> The only exception is that the last level cache id information can be
> inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> in the device tree.
> 
> P.S: I have not cc-ed Greg and Rafael so that all the users of arch_topology
> agree with the changes first before we include them.
> 
> v2[2]->v3:
> 	- Dropped support to get the device node for the CPU's LLC
> 	- Updated cacheinfo to support calling of detect_cache_attributes
> 	  early in smp_prepare_cpus stage
> 	- Added support to check if LLC is valid and shared in the cacheinfo
> 	- Used the same in arch_topology
> 
> v1[1]->v2[2]:
> 	- Updated ID validity check include all non-negative value
> 	- Added support to get the device node for the CPU's last level cache
> 	- Added support to build llc_sibling on DT platforms
> 
> [1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
> [2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
> 
> Sudeep Holla (16):
>    cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
>    cacheinfo: Add helper to access any cache index for a given CPU
>    cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
>    cacheinfo: Add support to check if last level cache(LLC) is valid or shared
>    cacheinfo: Allow early detection and population of cache attributes
>    arch_topology: Add support to parse and detect cache attributes
>    arch_topology: Use the last level cache information from the cacheinfo
>    arm64: topology: Remove redundant setting of llc_id in CPU topology
>    arch_topology: Drop LLC identifier stash from the CPU topology
>    arch_topology: Set thread sibling cpumask only within the cluster
>    arch_topology: Check for non-negative value rather than -1 for IDs validity
>    arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
>    arch_topology: Don't set cluster identifier as physical package identifier
>    arch_topology: Drop unnecessary check for uninitialised package_id
>    arch_topology: Set cluster identifier in each core/thread from /cpu-map
>    arch_topology: Add support for parsing sockets in /cpu-map
> 
>   arch/arm64/kernel/topology.c  |  14 -----
>   drivers/base/arch_topology.c  |  92 +++++++++++++++++----------
>   drivers/base/cacheinfo.c      | 114 +++++++++++++++++++++-------------
>   include/linux/arch_topology.h |   1 -
>   include/linux/cacheinfo.h     |   3 +
>   5 files changed, 135 insertions(+), 89 deletions(-)
> 

I tried this series on a virtual machine where ACPI is enabled and it looks good.
In particular, PATCH[10] resolves the issue I had, so I provided my Tested-by
tag for it. Besides, I checked the changes related to the ACPI part and they look
good to me once the mentioned nits are fixed. I leave the changes related to the
device tree to be reviewed by the experts :)

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids
  2022-06-01  3:49   ` Gavin Shan
  (?)
@ 2022-06-01 12:03     ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-01 12:03 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Sudeep Holla, Morten Rasmussen, Dietmar Eggemann, Qing Wang,
	linux-arm-kernel, linux-riscv, Rob Herring

On Wed, Jun 01, 2022 at 11:49:07AM +0800, Gavin Shan wrote:
> Hi Sudeep,
> 
> On 5/25/22 4:14 PM, Sudeep Holla wrote:
> > This version updates cacheinfo to populate and use the information from
> > there for all the cache topology. Sorry for posting in the middle of
> > merge window but better to get this tested earlier so that it is ready
> > for next merge window.
> > 
> > This series intends to fix some discrepancies we have in the CPU topology
> > parsing from the device tree /cpu-map node. Also this diverges from the
> > behaviour on a ACPI enabled platform. The expectation is that both DT
> > and ACPI enabled systems must present consistent view of the CPU topology.
> > 
> > Currently we assign generated cluster count as the physical package identifier
> > for each CPU which is wrong. The device tree bindings for CPU topology supports
> > sockets to infer the socket or physical package identifier for a given CPU.
> > Also we don't check if all the cores/threads belong to the same cluster before
> > updating their sibling masks which is fine as we don't set the cluster id yet.
> > 
> > These changes also assigns the cluster identifier as parsed from the device tree
> > cluster nodes within /cpu-map without support for nesting of the clusters.
> > Finally, it also add support for socket nodes in /cpu-map. With this the
> > parsing of exact same information from ACPI PPTT and /cpu-map DT node
> > aligns well.
> > 
> > The only exception is that the last level cache id information can be
> > inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> > in the device tree.
> > 
> > P.S: I have not cc-ed Greg and Rafael so that all the users of arch_topology
> > agree with the changes first before we include them.
> > 
> > v2[2]->v3:
> > 	- Dropped support to get the device node for the CPU's LLC
> > 	- Updated cacheinfo to support calling of detect_cache_attributes
> > 	  early in smp_prepare_cpus stage
> > 	- Added support to check if LLC is valid and shared in the cacheinfo
> > 	- Used the same in arch_topology
> > 
> > v1[1]->v2[2]:
> > 	- Updated ID validity check include all non-negative value
> > 	- Added support to get the device node for the CPU's last level cache
> > 	- Added support to build llc_sibling on DT platforms
> > 
> > [1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
> > [2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
> > 
> > Sudeep Holla (16):
> >    cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
> >    cacheinfo: Add helper to access any cache index for a given CPU
> >    cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
> >    cacheinfo: Add support to check if last level cache(LLC) is valid or shared
> >    cacheinfo: Allow early detection and population of cache attributes
> >    arch_topology: Add support to parse and detect cache attributes
> >    arch_topology: Use the last level cache information from the cacheinfo
> >    arm64: topology: Remove redundant setting of llc_id in CPU topology
> >    arch_topology: Drop LLC identifier stash from the CPU topology
> >    arch_topology: Set thread sibling cpumask only within the cluster
> >    arch_topology: Check for non-negative value rather than -1 for IDs validity
> >    arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
> >    arch_topology: Don't set cluster identifier as physical package identifier
> >    arch_topology: Drop unnecessary check for uninitialised package_id
> >    arch_topology: Set cluster identifier in each core/thread from /cpu-map
> >    arch_topology: Add support for parsing sockets in /cpu-map
> > 
> >   arch/arm64/kernel/topology.c  |  14 -----
> >   drivers/base/arch_topology.c  |  92 +++++++++++++++++----------
> >   drivers/base/cacheinfo.c      | 114 +++++++++++++++++++++-------------
> >   include/linux/arch_topology.h |   1 -
> >   include/linux/cacheinfo.h     |   3 +
> >   5 files changed, 135 insertions(+), 89 deletions(-)
> > 
> 
> I tried this series on a virtual machine where ACPI is enabled and it looks good.
> In particular, PATCH[10] resolves the issue I had, so I provided my Tested-by
> tag for it. Besides, I checked the changes related to the ACPI part and they look
> good to me once the mentioned nits are fixed. I leave the changes related to the
> device tree to be reviewed by the experts :)
> 

Thanks for the review and testing, much appreciated!
Yes, the ACPI-related changes in this series are very minimal; apart from the
bug fix you are interested in, nothing changes in the system behaviour. The
main aim is to get the same behaviour on a similar DT based system.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-06-01  3:35                     ` Gavin Shan
  (?)
@ 2022-06-01 12:06                       ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-01 12:06 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linux-kernel, Atish Patra, Atish Patra, Sudeep Holla,
	Vincent Guittot, Morten Rasmussen, Dietmar Eggemann, Qing Wang,
	linux-arm-kernel, linux-riscv, Rob Herring

On Wed, Jun 01, 2022 at 11:35:20AM +0800, Gavin Shan wrote:
> Hi Sudeep,
> 
> On 5/25/22 4:14 PM, Sudeep Holla wrote:
> > Since the cacheinfo LLC information is used directly in arch_topology,
> > there is no need to parse and store the LLC ID information only for
> > ACPI systems in the CPU topology.
> > 
> > Remove the redundant LLC ID from the generic CPU arch_topology information.
> > 
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >   drivers/base/arch_topology.c  | 1 -
> >   include/linux/arch_topology.h | 1 -
> >   2 files changed, 2 deletions(-)
> > 
> 
> How about merging these changes into PATCH[08/16]? I don't see why we need
> to put the changes into separate patches.
>

It took me a while to remember, as I was initially of the same opinion as
yours, but I decided to split them for one reason: to keep the arch-specific
change in a separate patch (in case that becomes necessary due to a conflict
or some other non-technical reason).

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 02/16] cacheinfo: Add helper to access any cache index for a given CPU
  2022-06-01  2:44       ` Gavin Shan
  (?)
@ 2022-06-01 12:45         ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-01 12:45 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Wed, Jun 01, 2022 at 10:44:20AM +0800, Gavin Shan wrote:
> Hi Sudeep,
> 
> On 5/25/22 4:14 PM, Sudeep Holla wrote:
> > The cacheinfo for a given CPU at a given index is used at quite a few
> > places by fetching the base point for index 0 using the helper
> > per_cpu_cacheinfo(cpu) and offseting it by the required index.
> > 
> > Instead, add another helper to fetch the required pointer directly and
> > use it to simplify and improve readability.
> > 
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >   drivers/base/cacheinfo.c | 14 +++++++-------
> >   1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> 
> s/offseting/offsetting
> 
> It looks good to me with the below nits fixed:
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> 
> 
> > diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> > index b0bde272e2ae..c4547d8ac6f3 100644
> > --- a/drivers/base/cacheinfo.c
> > +++ b/drivers/base/cacheinfo.c
> > @@ -25,6 +25,8 @@ static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
> >   #define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
> >   #define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
> >   #define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
> > +#define per_cpu_cacheinfo_idx(cpu, idx)		\
> > +				(per_cpu_cacheinfo(cpu) + (idx))
> >   struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
> >   {
> > @@ -172,7 +174,7 @@ static int cache_setup_of_node(unsigned int cpu)
> >   	}
> >   	while (index < cache_leaves(cpu)) {
> > -		this_leaf = this_cpu_ci->info_list + index;
> > +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> >   		if (this_leaf->level != 1)
> >   			np = of_find_next_cache_node(np);
> >   		else
> > @@ -231,7 +233,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> >   	for (index = 0; index < cache_leaves(cpu); index++) {
> >   		unsigned int i;
> > -		this_leaf = this_cpu_ci->info_list + index;
> > +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> >   		/* skip if shared_cpu_map is already populated */
> >   		if (!cpumask_empty(&this_leaf->shared_cpu_map))
> >   			continue;
> > @@ -242,7 +244,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> >   			if (i == cpu || !sib_cpu_ci->info_list)
> >   				continue;/* skip if itself or no cacheinfo */
> > -			sib_leaf = sib_cpu_ci->info_list + index;
> > +			sib_leaf = per_cpu_cacheinfo_idx(i, index);
> >   			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
> >   				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
> >   				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
> > @@ -258,12 +260,11 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> >   static void cache_shared_cpu_map_remove(unsigned int cpu)
> >   {
> > -	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> >   	struct cacheinfo *this_leaf, *sib_leaf;
> >   	unsigned int sibling, index;
> >   	for (index = 0; index < cache_leaves(cpu); index++) {
> > -		this_leaf = this_cpu_ci->info_list + index;
> > +		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> >   		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
> >   			struct cpu_cacheinfo *sib_cpu_ci;
> > 
> 
> In cache_shared_cpu_map_remove(), the newly introduced macro
> can be applied when the sibling CPU's cache info is fetched.
> 
>     sib_leaf = sib_cpu_ci->info_list + index;
> 
>     to
> 
>     sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
> 

Right, I clearly missed this one. Thanks again for the review; all the
other comments are now fixed locally and pushed at [1], and I will post them
as part of the next version.

--
Regards,
Sudeep

[1] https://git.kernel.org/sudeep.holla/h/arch_topology
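
To make the suggested change concrete, the sibling walk in
cache_shared_cpu_map_remove() would then use the helper for both lookups,
roughly as sketched below. This is only an illustration pieced together from
the hunks quoted above; the surrounding sibling checks are paraphrased and
may not match the tree exactly.

	for (index = 0; index < cache_leaves(cpu); index++) {
		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
			if (sibling == cpu)
				continue;	/* skip the CPU itself */

			/* helper applied to the sibling lookup as well */
			sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
		}
	}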

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in CPU topology
  2022-05-25  8:14                 ` Sudeep Holla
  (?)
@ 2022-06-02  6:42                   ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-02  6:42 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Since the cacheinfo LLC information is used directly in arch_topology,
> there is no need to parse and fetch the LLC ID information only for
> ACPI systems.
> 
> Just drop the redundant parsing and setting of llc_id in CPU topology
> from ACPI PPTT.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   arch/arm64/kernel/topology.c | 14 --------------
>   1 file changed, 14 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 9ab78ad826e2..869ffc4d4484 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
>   		return 0;
>   
>   	for_each_possible_cpu(cpu) {
> -		int i, cache_id;
> -
>   		topology_id = find_acpi_cpu_topology(cpu, 0);
>   		if (topology_id < 0)
>   			return topology_id;
> @@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
>   		cpu_topology[cpu].cluster_id = topology_id;
>   		topology_id = find_acpi_cpu_topology_package(cpu);
>   		cpu_topology[cpu].package_id = topology_id;
> -
> -		i = acpi_find_last_cache_level(cpu);
> -
> -		if (i > 0) {
> -			/*
> -			 * this is the only part of cpu_topology that has
> -			 * a direct relationship with the cache topology
> -			 */
> -			cache_id = find_acpi_cpu_cache_topology(cpu, i);
> -			if (cache_id > 0)
> -				cpu_topology[cpu].llc_id = cache_id;
> -		}
>   	}
>   
>   	return 0;
> 
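
With the llc_id parsing gone, the LLC sharing decision is expected to come
straight from cacheinfo instead. A rough sketch of the intended usage on the
arch_topology side is shown below; the helper name is assumed from the patch
title "cacheinfo: Add support to check if last level cache(LLC) is valid or
shared" in this series, and the exact signature may differ.

	/* when building llc_sibling for cpuid against each online cpu */
	if (last_level_cache_is_shared(cpu, cpuid)) {
		cpumask_set_cpu(cpuid, &cpu_topology[cpu].llc_sibling);
		cpumask_set_cpu(cpu, &cpu_topology[cpuid].llc_sibling);
	}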


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-05-25  8:14                   ` Sudeep Holla
  (?)
@ 2022-06-02  6:42                     ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-02  6:42 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Dietmar Eggemann, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 5/25/22 4:14 PM, Sudeep Holla wrote:
> Since the cacheinfo LLC information is used directly in arch_topology,
> there is no need to parse and store the LLC ID information only for
> ACPI systems in the CPU topology.
> 
> Remove the redundant LLC ID from the generic CPU arch_topology information.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   drivers/base/arch_topology.c  | 1 -
>   include/linux/arch_topology.h | 1 -
>   2 files changed, 2 deletions(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 4c486e4e6f2f..76c702c217c5 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -747,7 +747,6 @@ void __init reset_cpu_topology(void)
>   		cpu_topo->core_id = -1;
>   		cpu_topo->cluster_id = -1;
>   		cpu_topo->package_id = -1;
> -		cpu_topo->llc_id = -1;
>   
>   		clear_cpu_topology(cpu);
>   	}
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 58cbe18d825c..a07b510e7dc5 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -68,7 +68,6 @@ struct cpu_topology {
>   	int core_id;
>   	int cluster_id;
>   	int package_id;
> -	int llc_id;
>   	cpumask_t thread_sibling;
>   	cpumask_t core_sibling;
>   	cpumask_t cluster_sibling;
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-06-01 12:06                       ` Sudeep Holla
  (?)
@ 2022-06-02  6:44                         ` Gavin Shan
  -1 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2022-06-02  6:44 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Dietmar Eggemann, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Hi Sudeep,

On 6/1/22 8:06 PM, Sudeep Holla wrote:
> On Wed, Jun 01, 2022 at 11:35:20AM +0800, Gavin Shan wrote:
>> On 5/25/22 4:14 PM, Sudeep Holla wrote:
>>> Since the cacheinfo LLC information is used directly in arch_topology,
>>> there is no need to parse and store the LLC ID information only for
>>> ACPI systems in the CPU topology.
>>>
>>> Remove the redundant LLC ID from the generic CPU arch_topology information.
>>>
>>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
>>> ---
>>>    drivers/base/arch_topology.c  | 1 -
>>>    include/linux/arch_topology.h | 1 -
>>>    2 files changed, 2 deletions(-)
>>>
>>
>> How about merging the changes into PATCH[08/16]? I don't see why we need to put
>> the changes into separate patches.
>>
> 
> It took a while to remember as I was of the same opinion as you, but
> decided to split them for one reason: to keep the arch-specific change in a
> separate patch (if that becomes a need due to some conflict or some other
> non-technical reason).
> 

Ok. Thanks for the explanation, which sounds reasonable to me.

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
  2022-05-25  8:14               ` Sudeep Holla
  (?)
@ 2022-06-02 14:26                 ` Dietmar Eggemann
  -1 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-02 14:26 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Qing Wang, linux-arm-kernel, linux-riscv, Rob Herring

On 25/05/2022 10:14, Sudeep Holla wrote:
> The cacheinfo is now initialised early along with the CPU topology
> initialisation. Instead of relying on the LLC ID information parsed
> separately only with ACPI PPTT elsewhere, migrate to use the similar
> information from the cacheinfo.
> 
> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> parsed separately can now be removed from arch specific code.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>  drivers/base/arch_topology.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 765723448b10..4c486e4e6f2f 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  		/* not numa in package, lets use the package siblings */
>  		core_mask = &cpu_topology[cpu].core_sibling;
>  	}
> -	if (cpu_topology[cpu].llc_id != -1) {
> +
> +	if (last_level_cache_is_valid(cpu)) {
>  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>  			core_mask = &cpu_topology[cpu].llc_sibling;
>  	}
> @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
>  	for_each_online_cpu(cpu) {
>  		cpu_topo = &cpu_topology[cpu];
>  
> -		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
> +		if (last_level_cache_is_shared(cpu, cpuid)) {
>  			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
>  			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
>  		}

I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks
like last_level_cache_is_shared() isn't working as
expected.

I instrumented cpu_coregroup_mask() like:

const struct cpumask *cpu_coregroup_mask(int cpu)
{
        const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

        if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
                core_mask = &cpu_topology[cpu].core_sibling;
                (1)
        }

	(2)

        if (last_level_cache_is_valid(cpu)) {
                if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
                        core_mask = &cpu_topology[cpu].llc_sibling;
                        (3)
        }

        if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
            cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
                core_mask = &cpu_topology[cpu].cluster_sibling;
                (4)

        (5)
        return core_mask;
}

and got:

(A) v3 patch-set:

[   11.561133] (1) cpu_coregroup_mask[0]=0-47
[   11.565670] (2) last_level_cache_is_valid(0)=1
[   11.570587] (3) cpu_coregroup_mask[0]=0    <-- llc_sibling=0 (should be 0-23)
[   11.574833] (4) cpu_coregroup_mask[0]=0-3  <-- Altra hack kicks in!
[   11.579275] (5) cpu_coregroup_mask[0]=0-3

# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
DIE

# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -3
cpu0 0
domain0 00000000,00000000,0000000f
domain1 ffffffff,ffffffff,ffffffff

So the MC domain is missing.

(B) mainline as reference (cpu_coregroup_mask() slightly different):

[   11.585008] (1) cpu_coregroup_mask[0]=0-47
[   11.589544] (3) cpu_coregroup_mask[0]=0-23 <-- !!!
[   11.594079] (5) cpu_coregroup_mask[0]=0-23

# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
MC                                            <-- !!!
DIE

# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -4
cpu0 0
domain0 00000000,00000000,0000000f
domain1 00000000,00000000,00ffffff            <-- !!!
domain2 ffffffff,ffffffff,ffffffff

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-05-25  8:14                               ` Sudeep Holla
  (?)
@ 2022-06-03 12:30                                 ` Dietmar Eggemann
  -1 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-03 12:30 UTC (permalink / raw)
  To: Sudeep Holla, linux-kernel
  Cc: Atish Patra, Atish Patra, Vincent Guittot, Morten Rasmussen,
	Qing Wang, linux-arm-kernel, linux-riscv, Rob Herring

On 25/05/2022 10:14, Sudeep Holla wrote:
> Let us set the cluster identifier as parsed from the device tree
> cluster nodes within /cpu-map.
> 
> We don't support nesting of clusters yet as there are no real hardware
> to support clusters of clusters.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>  drivers/base/arch_topology.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b8f0d72908c8..5f4f148a7769 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -492,7 +492,7 @@ static int __init get_cpu_for_node(struct device_node *node)
>  }
>  
>  static int __init parse_core(struct device_node *core, int package_id,
> -			     int core_id)
> +			     int cluster_id, int core_id)
>  {
>  	char name[20];
>  	bool leaf = true;
> @@ -508,6 +508,7 @@ static int __init parse_core(struct device_node *core, int package_id,
>  			cpu = get_cpu_for_node(t);
>  			if (cpu >= 0) {
>  				cpu_topology[cpu].package_id = package_id;
> +				cpu_topology[cpu].cluster_id = cluster_id;
>  				cpu_topology[cpu].core_id = core_id;
>  				cpu_topology[cpu].thread_id = i;
>  			} else if (cpu != -ENODEV) {
> @@ -529,6 +530,7 @@ static int __init parse_core(struct device_node *core, int package_id,
>  		}
>  
>  		cpu_topology[cpu].package_id = package_id;
> +		cpu_topology[cpu].cluster_id = cluster_id;

I'm still not convinced that this is the right thing to do. Let's take
the juno board as an example here. And I guess our assumption should be
that we want to make CONFIG_SCHED_CLUSTER a default option, like
CONFIG_SCHED_MC is. Simply to avoid an unmanageable zoo of config-option
combinations.

(1) Scheduler Domains (SDs) w/o CONFIG_SCHED_CLUSTER:

MC  <-- !!!
DIE

(2) SDs w/ CONFIG_SCHED_CLUSTER:

CLS <-- !!!
DIE

In (2) MC gets degenerated in sd_parent_degenerate() since CLS and MC
cpumasks are equal and MC does not have any additional flags compared to
CLS.
I'm not convinced that we can change the degeneration rules without
destroying other scenarios of the scheduler so that here MC stays and
CLS gets removed instead.
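
To illustrate the rule (this is only a simplified sketch, not the actual
sd_parent_degenerate() code): a parent domain is dropped when it spans the
same CPUs as its child and carries no extra flags on top of it.

#include <linux/cpumask.h>

/* illustrative helper only; the name and signature are made up */
static bool parent_is_redundant(const struct cpumask *child_span, int child_flags,
				const struct cpumask *parent_span, int parent_flags)
{
	/* same CPU span as the child ... */
	if (!cpumask_equal(child_span, parent_span))
		return false;

	/* ... and no additional flags -> the parent level can be folded away */
	return !(parent_flags & ~child_flags);
}

On juno, CLS and MC end up with identical spans and MC adds no extra flags,
so MC is the level that gets dropped.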

Even though MC and CLS are doing the same right now from the perspective
of the scheduler, we should also see MC and not CLS under (2). CLS only
makes sense longer term if the scheduler also makes use of it (next to
MC) in the wakeup-path for instance. Especially when this happens, a
platform should always construct the same scheduler domain hierarchy, no
matter which CONFIG_SCHED_XXX options are enabled.


You can see this in update_siblings_masks()

    if (last_level_cache_is_shared)
        set llc_sibling

    if (cpuid_topo->package_id != cpu_topo->package_id)
        continue

    set core_sibling

  If llc cache and socket boundaries are congruent, llc_sibling and
  core_sibling are the same.

    if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
        continue

    set cluster_sibling

  Now we potentially set clusters. Since socket=0 is by default and we
  use the existing juno.dts, the cluster nodes end up being congruent to
  the llc cache cpumasks as well.

The problem is that we code `llc cache` and `DT cluster nodes` as the
same thing in juno.dts. `Cluster0/1` are congruent with the llc
information, although they should be actually `socket0/1` right now. But
we can't set-up a cpu-map with a `socketX` containing `coreY` directly.
And then we use llc_sibling and cluster_sibling in two different SD
cpumask functions (cpu_coregroup_mask() and cpu_clustergroup_mask()).
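
For reference, the cluster-level helper in drivers/base/arch_topology.c is
(roughly, at the time of this series) just a plain accessor of that sibling
mask:

const struct cpumask *cpu_clustergroup_mask(int cpu)
{
	return &cpu_topology[cpu].cluster_sibling;
}

while cpu_coregroup_mask() picks llc_sibling when it is a subset of the
package/core mask.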

Remember, CONFIG_SCHED_CLUSTER was introduced in ACPI/PPTT as a cpumask
which is a subset of the cpumasks of CONFIG_SCHED_MC.

---

IMHO we probably could just introduce your changes w/o setting `cpu-map
cluster nodes` in DT for now. We would just have to make sure that for
all `*.dts` affected, the `llc cache` info can take over the old role of
the `cluster nodes`. In this case e.g. Juno ends up with MC, DIE no
matter if CONFIG_SCHED_CLUSTER is set or not.

[...]

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
  2022-06-02 14:26                 ` Dietmar Eggemann
  (?)
@ 2022-06-06  9:54                   ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-06  9:54 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Thu, Jun 02, 2022 at 04:26:00PM +0200, Dietmar Eggemann wrote:
> On 25/05/2022 10:14, Sudeep Holla wrote:
> > The cacheinfo is now initialised early along with the CPU topology
> > initialisation. Instead of relying on the LLC ID information parsed
> > separately only with ACPI PPTT elsewhere, migrate to use the similar
> > information from the cacheinfo.
> > 
> > This is generic for both DT and ACPI systems. The ACPI LLC ID information
> > parsed separately can now be removed from arch specific code.
> > 
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >  drivers/base/arch_topology.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index 765723448b10..4c486e4e6f2f 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
> >  		/* not numa in package, lets use the package siblings */
> >  		core_mask = &cpu_topology[cpu].core_sibling;
> >  	}
> > -	if (cpu_topology[cpu].llc_id != -1) {
> > +
> > +	if (last_level_cache_is_valid(cpu)) {
> >  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> >  			core_mask = &cpu_topology[cpu].llc_sibling;
> >  	}
> > @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
> >  	for_each_online_cpu(cpu) {
> >  		cpu_topo = &cpu_topology[cpu];
> >  
> > -		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
> > +		if (last_level_cache_is_shared(cpu, cpuid)) {
> >  			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
> >  			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
> >  		}
> 
> I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks
> like last_level_cache_is_shared() isn't working as
> expected.
>

Thanks a lot for the detailed instrumentation. I am unable to identify why it is
not working though; I will take a deeper look later.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-03 12:30                                 ` Dietmar Eggemann
  (?)
@ 2022-06-06 10:21                                   ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-06 10:21 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Sudeep Holla, Morten Rasmussen, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Fri, Jun 03, 2022 at 02:30:04PM +0200, Dietmar Eggemann wrote:
> On 25/05/2022 10:14, Sudeep Holla wrote:
> > Let us set the cluster identifier as parsed from the device tree
> > cluster nodes within /cpu-map.
> > 
> > We don't support nesting of clusters yet as there are no real hardware
> > to support clusters of clusters.
> > 
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >  drivers/base/arch_topology.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index b8f0d72908c8..5f4f148a7769 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -492,7 +492,7 @@ static int __init get_cpu_for_node(struct device_node *node)
> >  }
> >  
> >  static int __init parse_core(struct device_node *core, int package_id,
> > -			     int core_id)
> > +			     int cluster_id, int core_id)
> >  {
> >  	char name[20];
> >  	bool leaf = true;
> > @@ -508,6 +508,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> >  			cpu = get_cpu_for_node(t);
> >  			if (cpu >= 0) {
> >  				cpu_topology[cpu].package_id = package_id;
> > +				cpu_topology[cpu].cluster_id = cluster_id;
> >  				cpu_topology[cpu].core_id = core_id;
> >  				cpu_topology[cpu].thread_id = i;
> >  			} else if (cpu != -ENODEV) {
> > @@ -529,6 +530,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> >  		}
> >  
> >  		cpu_topology[cpu].package_id = package_id;
> > +		cpu_topology[cpu].cluster_id = cluster_id;
> 
> I'm still not convinced that this is the right thing to do. Let's take
> the juno board as an example here. And I guess our assumption should be
> that we want to make CONFIG_SCHED_CLUSTER a default option, like
> CONFIG_SCHED_MC is. Simply to avoid an unmanageable zoo of config-option
> combinations.
>

Agreed on the config part.

> (1) Scheduler Domains (SDs) w/o CONFIG_SCHED_CLUSTER:
>
> MC  <-- !!!
> DIE
>
> (2) SDs w/ CONFIG_SCHED_CLUSTER:
>
> CLS <-- !!!
> DIE
>

Yes I have seen this.

> In (2) MC gets degenerated in sd_parent_degenerate() since CLS and MC
> cpumasks are equal and MC does not have any additional flags compared to
> CLS.
> I'm not convinced that we can change the degeneration rules without
> destroying other scenarios of the scheduler so that here MC stays and
> CLS gets removed instead.
>

Why? Are you suggesting that we shouldn't present the hardware cluster
to the topology because of the above reason? If so, sorry, that is not a
valid reason. We could add logic to return NULL or an appropriate value
in cpu_clustergroup_mask() if it matches the MC-level mask, if we can't
deal with that in the generic scheduler code. But the topology code can't be
compromised for that reason as it is user visible.
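
Something along those lines; just a rough, untested sketch of that idea (the
exact condition and return value would need more thought):

const struct cpumask *cpu_clustergroup_mask(int cpu)
{
	/*
	 * If the cluster siblings already cover the whole coregroup
	 * (i.e. CLS would be identical to MC), report only this CPU
	 * so that CLS degenerates instead of MC.
	 */
	if (cpumask_subset(cpu_coregroup_mask(cpu),
			   &cpu_topology[cpu].cluster_sibling))
		return cpumask_of(cpu);

	return &cpu_topology[cpu].cluster_sibling;
}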

> Even though MC and CLS are doing the same right now from the perspective
> of the scheduler, we should also see MC and not CLS under (2). CLS only
> makes sense longer term if the scheduler also makes use of it (next to
> MC) in the wakeup-path for instance. Especially when this happens, a
> platform should always construct the same scheduler domain hierarchy, no
> matter which CONFIG_SCHED_XXX options are enabled.
>
>
> You can see this in update_siblings_masks()
>
>     if (last_level_cache_is_shared)
>         set llc_sibling
>
>     if (cpuid_topo->package_id != cpu_topo->package_id)
>         continue
>
>     set core_sibling
>
>   If llc cache and socket boundaries are congruent, llc_sibling and
>   core_sibling are the same.
>
>     if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
>         continue
>
>     set cluster_sibling
>
>   Now we potentially set clusters. Since socket=0 is by default and we
>   use the existing juno.dts, the cluster nodes end up being congruent to
>   the llc cache cpumasks as well.
>

Correct, and I see no problems as it matches what the hardware is. So I am
not expecting any change in any of the cpumasks there as they are all aligned
with the hardware.

> The problem is that we code `llc cache` and `DT cluster nodes` as the
> same thing in juno.dts. 

Why is that a problem? If so, blame the hardware and deal with it as we have
to 😄; as usual we get all sorts of topologies.

> `Cluster0/1` are congruent with the llc information, although they should 
> be actually `socket0/1` right now.

That was complete nonsense and wrong. Boot and check in ACPI mode.

> But we can't set-up a cpu-map with a `socketX` containing `coreY` directly.
> And then we use llc_sibling and cluster_sibling in two different SD
> cpumask functions (cpu_coregroup_mask() and cpu_clustergroup_mask()).
>

We just need to deal with that. How is that dealt with today with ACPI? My
changes are making these aligned with ACPI. If something is broken with ACPI
as per your understanding, then that needs fixing. The topology presented and
parsed by ACPI is correct and we are aligning DT with that. There is no
question about that.

> Remember, CONFIG_SCHED_CLUSTER was introduced in ACPI/PPTT as a cpumask
> which is a subset of the cpumasks of CONFIG_SCHED_MC.
>

But that change also introduced cluster masks into the topology which again
aligns with my changes.

> IMHO we probably could just introduce your changes w/o setting `cpu-map
> cluster nodes` in DT for now. We would just have to make sure that for
> all `*.dts` affected, the `llc cache` info can take over the old role of
> the `cluster nodes`. In this case e.g. Juno ends up with MC, DIE no
> matter if CONFIG_SCHED_CLUSTER is set or not.

Sure, I can agree with that if Juno ACPI is not broken. But I am sure it is
broken based on your argument above. If it is, that needs fixing, and this
series just gets topology parsing in both ACPI and DT aligned, nothing
more and nothing less. In the process it may be introducing clusters, but
if that is not dealt with correctly in ACPI, then it won't be in DT either
and needs fixing anyway.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
@ 2022-06-06 10:21                                   ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-06 10:21 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, Atish Patra, Atish Patra, Vincent Guittot,
	Sudeep Holla, Morten Rasmussen, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Fri, Jun 03, 2022 at 02:30:04PM +0200, Dietmar Eggemann wrote:
> On 25/05/2022 10:14, Sudeep Holla wrote:
> > Let us set the cluster identifier as parsed from the device tree
> > cluster nodes within /cpu-map.
> > 
> > We don't support nesting of clusters yet as there are no real hardware
> > to support clusters of clusters.
> > 
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >  drivers/base/arch_topology.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index b8f0d72908c8..5f4f148a7769 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -492,7 +492,7 @@ static int __init get_cpu_for_node(struct device_node *node)
> >  }
> >  
> >  static int __init parse_core(struct device_node *core, int package_id,
> > -			     int core_id)
> > +			     int cluster_id, int core_id)
> >  {
> >  	char name[20];
> >  	bool leaf = true;
> > @@ -508,6 +508,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> >  			cpu = get_cpu_for_node(t);
> >  			if (cpu >= 0) {
> >  				cpu_topology[cpu].package_id = package_id;
> > +				cpu_topology[cpu].cluster_id = cluster_id;
> >  				cpu_topology[cpu].core_id = core_id;
> >  				cpu_topology[cpu].thread_id = i;
> >  			} else if (cpu != -ENODEV) {
> > @@ -529,6 +530,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> >  		}
> >  
> >  		cpu_topology[cpu].package_id = package_id;
> > +		cpu_topology[cpu].cluster_id = cluster_id;
> 
> I'm still not convinced that this is the right thing to do. Let's take
> the juno board as an example here. And I guess our assumption should be
> that we want to make CONFIG_SCHED_CLUSTER a default option, like
> CONFIG_SCHED_MC is. Simply to avoid a unmanageable zoo of config-option
> combinations.
>

Agreed on the config part.

> (1) Scheduler Domains (SDs) w/o CONFIG_SCHED_CLUSTER:
>
> MC  <-- !!!
> DIE
>
> (2) SDs w/ CONFIG_SCHED_CLUSTER:
>
> CLS <-- !!!
> DIE
>

Yes I have seen this.

> In (2) MC gets degenerated in sd_parent_degenerate() since CLS and MC
> cpumasks are equal and MC does not have any additional flags compared to
> CLS.
> I'm not convinced that we can change the degeneration rules without
> destroying other scenarios of the scheduler so that here MC stays and
> CLS gets removed instead.
>

Why ? Are you suggesting that we shouldn't present the hardware cluster
to the topology because of the above reason ? If so, sorry that is not a
valid reason. We could add login to return NULL or appropriate value
needed in cpu_clustergroup_mask id it matches MC level mask if we can't
deal that in generic scheduler code. But the topology code can't be
compromised for that reason as it is user visible.

> Even though MC and CLS are doing the same right now from the perspective
> of the scheduler, we should also see MC and not CLS under (2). CLS only
> makes sense longer term if the scheduler also makes use of it (next to
> MC) in the wakeup-path for instance. Especially when this happens, a
> platform should always construct the same scheduler domain hierarchy, no
> matter which CONFIG_SCHED_XXX options are enabled.
>
>
> You can see this in update_siblings_masks()
>
>     if (last_level_cache_is_shared)
>         set llc_sibling
>
>     if (cpuid_topo->package_id != cpu_topo->package_id)
>         continue
>
>     set core_sibling
>
>   If llc cache and socket boundaries are congruent, llc_sibling and
>   core_sibling are the same.
>
>     if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
>         continue
>
>     set cluster_sibling
>
>   Now we potentially set clusters. Since socket=0 is by default and we
>   use the existing juno.dts, the cluster nodes end up being congruent to
>   the llc cache cpumasks as well.
>

Correct and I see no problems as it matches what the hardware is. So I am
not expecting any change in any cpumasks there as they all are aligned with
the hardware.

> The problem is that we code `llc cache` and `DT cluster nodes` as the
> same thing in juno.dts. 

Why is that a problem ? If so, blame hardware and deal with it as we have to
😄 as usual we get all sorts of topology.

> `Cluster0/1` are congruent with the llc information, although they should 
> be actually `socket0/1` right now.

That was complete non-sense and wrong. Boot and check in ACPI mode.

> But we can't set-up a cpu-map with a `socketX` containing `coreY` directly.
> And then we use llc_sibling and cluster_sibling in two different SD
> cpumask functions (cpu_coregroup_mask() and cpu_clustergroup_mask()).
>

We just need to deal with that. How is that dealt today with ACPI. My
changes are making these aligned with ACPI. If something is broken as
per you understanding with ACPI, then that needs fixing. The topology
presented and parsed by ACPI is correct and we are aligning DT with that.
There is no question on that.

> Remember, CONFIG_SCHED_CLUSTER was introduced in ACPI/PPTT as a cpumask
> which is a subset of the cpumasks of CONFIG_SCHED_MC.
>

But that change also introduced cluster masks into the topology which again
aligns with my changes.

> IMHO we probably could just introduce your changes w/o setting `cpu-map
> cluster nodes` in DT for now. We would just have to make sure that for
> all `*.dts` affected, the `llc cache` info can take over the old role of
> the `cluster nodes`. In this case e.g. Juno ends up with MC, DIE no
> matter if CONFIG_SCHED_CLUSTER is set or not.

Sure I can agree with that if Juno ACPI is not broken. But I am sure it is
broken based on your argument above. If it is, that needs fixing and this
series just gets topology parsing in both ACPI and DT aligned, nothing
more or nothing less. In the process it may be introducing clusters, but
if it is not dealt correctly in ACPI, then it won't be in DT too and needs
fixing anyways.

--
Regards,
Sudeep

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-06 10:21                                   ` Sudeep Holla
@ 2022-06-10 10:08                                     ` Vincent Guittot
  -1 siblings, 0 replies; 153+ messages in thread
From: Vincent Guittot @ 2022-06-10 10:08 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Jun 03, 2022 at 02:30:04PM +0200, Dietmar Eggemann wrote:
> > On 25/05/2022 10:14, Sudeep Holla wrote:
> > > Let us set the cluster identifier as parsed from the device tree
> > > cluster nodes within /cpu-map.
> > >
> > > We don't support nesting of clusters yet as there are no real hardware
> > > to support clusters of clusters.
> > >
> > > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > > ---
> > >  drivers/base/arch_topology.c | 13 ++++++++-----
> > >  1 file changed, 8 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > > index b8f0d72908c8..5f4f148a7769 100644
> > > --- a/drivers/base/arch_topology.c
> > > +++ b/drivers/base/arch_topology.c
> > > @@ -492,7 +492,7 @@ static int __init get_cpu_for_node(struct device_node *node)
> > >  }
> > >
> > >  static int __init parse_core(struct device_node *core, int package_id,
> > > -                        int core_id)
> > > +                        int cluster_id, int core_id)
> > >  {
> > >     char name[20];
> > >     bool leaf = true;
> > > @@ -508,6 +508,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> > >                     cpu = get_cpu_for_node(t);
> > >                     if (cpu >= 0) {
> > >                             cpu_topology[cpu].package_id = package_id;
> > > +                           cpu_topology[cpu].cluster_id = cluster_id;
> > >                             cpu_topology[cpu].core_id = core_id;
> > >                             cpu_topology[cpu].thread_id = i;
> > >                     } else if (cpu != -ENODEV) {
> > > @@ -529,6 +530,7 @@ static int __init parse_core(struct device_node *core, int package_id,
> > >             }
> > >
> > >             cpu_topology[cpu].package_id = package_id;
> > > +           cpu_topology[cpu].cluster_id = cluster_id;
> >
> > I'm still not convinced that this is the right thing to do. Let's take
> > the juno board as an example here. And I guess our assumption should be
> > that we want to make CONFIG_SCHED_CLUSTER a default option, like
> > CONFIG_SCHED_MC is. Simply to avoid a unmanageable zoo of config-option
> > combinations.
> >
>
> Agreed on the config part.
>
> > (1) Scheduler Domains (SDs) w/o CONFIG_SCHED_CLUSTER:
> >
> > MC  <-- !!!
> > DIE
> >
> > (2) SDs w/ CONFIG_SCHED_CLUSTER:
> >
> > CLS <-- !!!
> > DIE
> >
>
> Yes I have seen this.
>
> > In (2) MC gets degenerated in sd_parent_degenerate() since CLS and MC
> > cpumasks are equal and MC does not have any additional flags compared to
> > CLS.
> > I'm not convinced that we can change the degeneration rules without
> > destroying other scenarios of the scheduler so that here MC stays and
> > CLS gets removed instead.
> >
>
> Why ? Are you suggesting that we shouldn't present the hardware cluster
> to the topology because of the above reason ? If so, sorry that is not a
> valid reason. We could add login to return NULL or appropriate value
> needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> deal that in generic scheduler code. But the topology code can't be
> compromised for that reason as it is user visible.

I tend to agree with Dietmar. The legacy use of the cluster node in DT
refers to the DynamIQ or legacy big.LITTLE cluster, which is also aligned
with the LLC and the MC scheduling level. The new cluster level that was
introduced recently does not target this level but some intermediate level,
either inside it, as for the Kunpeng 920 or the Armv9 complexes, or outside
it, as for the Ampere Altra. So I would say that there is one cluster node
level in DT that refers to the same MC/LLC level, and only an additional
child/parent cluster node should be used to fill the clustergroup_mask.

IIUC, we don't describe the DynamIQ level in ACPI, which uses the cache
topology instead to define cpu_coregroup_mask, whereas DT describes the
DynamIQ level instead of using the cache topology. If you use the cache
topology now, then you should skip the DynamIQ level.

Finally, even if CLS and MC have the same scheduling behavior for now,
they might end up with different scheduling properties, which would mean
that replacing the MC level with the CLS one for current SoCs would become
wrong.

>
> > Even though MC and CLS are doing the same right now from the perspective
> > of the scheduler, we should also see MC and not CLS under (2). CLS only
> > makes sense longer term if the scheduler also makes use of it (next to
> > MC) in the wakeup-path for instance. Especially when this happens, a
> > platform should always construct the same scheduler domain hierarchy, no
> > matter which CONFIG_SCHED_XXX options are enabled.
> >
> >
> > You can see this in update_siblings_masks()
> >
> >     if (last_level_cache_is_shared)
> >         set llc_sibling
> >
> >     if (cpuid_topo->package_id != cpu_topo->package_id)
> >         continue
> >
> >     set core_sibling
> >
> >   If llc cache and socket boundaries are congruent, llc_sibling and
> >   core_sibling are the same.
> >
> >     if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
> >         continue
> >
> >     set cluster_sibling
> >
> >   Now we potentially set clusters. Since socket=0 is by default and we
> >   use the existing juno.dts, the cluster nodes end up being congruent to
> >   the llc cache cpumasks as well.
> >
>
> Correct and I see no problems as it matches what the hardware is. So I am
> not expecting any change in any cpumasks there as they all are aligned with
> the hardware.
>
> > The problem is that we code `llc cache` and `DT cluster nodes` as the
> > same thing in juno.dts.
>
> Why is that a problem ? If so, blame hardware and deal with it as we have to
> 😄 as usual we get all sorts of topology.
>
> > `Cluster0/1` are congruent with the llc information, although they should
> > be actually `socket0/1` right now.
>
> That was complete non-sense and wrong. Boot and check in ACPI mode.
>
> > But we can't set-up a cpu-map with a `socketX` containing `coreY` directly.
> > And then we use llc_sibling and cluster_sibling in two different SD
> > cpumask functions (cpu_coregroup_mask() and cpu_clustergroup_mask()).
> >
>
> We just need to deal with that. How is that dealt today with ACPI. My
> changes are making these aligned with ACPI. If something is broken as
> per you understanding with ACPI, then that needs fixing. The topology
> presented and parsed by ACPI is correct and we are aligning DT with that.
> There is no question on that.
>
> > Remember, CONFIG_SCHED_CLUSTER was introduced in ACPI/PPTT as a cpumask
> > which is a subset of the cpumasks of CONFIG_SCHED_MC.
> >
>
> But that change also introduced cluster masks into the topology which again
> aligns with my changes.
>
> > IMHO we probably could just introduce your changes w/o setting `cpu-map
> > cluster nodes` in DT for now. We would just have to make sure that for
> > all `*.dts` affected, the `llc cache` info can take over the old role of
> > the `cluster nodes`. In this case e.g. Juno ends up with MC, DIE no
> > matter if CONFIG_SCHED_CLUSTER is set or not.
>
> Sure I can agree with that if Juno ACPI is not broken. But I am sure it is
> broken based on your argument above. If it is, that needs fixing and this
> series just gets topology parsing in both ACPI and DT aligned, nothing
> more or nothing less. In the process it may be introducing clusters, but
> if it is not dealt correctly in ACPI, then it won't be in DT too and needs
> fixing anyways.
>
> --
> Regards,
> Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-10 10:08                                     ` Vincent Guittot
@ 2022-06-10 10:27                                       ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-10 10:27 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Sudeep Holla, Morten Rasmussen, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >

[...]

> > Why ? Are you suggesting that we shouldn't present the hardware cluster
> > to the topology because of the above reason ? If so, sorry that is not a
> > valid reason. We could add login to return NULL or appropriate value
> > needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> > deal that in generic scheduler code. But the topology code can't be
> > compromised for that reason as it is user visible.
> 
> I tend to agree with Dietmar. The legacy use of cluster node in DT
> refers to the dynamiQ or legacy b.L cluster which is also aligned to
> the LLC and the MC scheduling level. The new cluster level that has
> been introduced recently does not target this level but some
> intermediate levels either inside like for the kupeng920 or the v9
> complex or outside like for the ampere altra. So I would say that
> there is one cluster node level in DT that refers to the same MC/LLC
> level and only an additional child/parent cluster node should be used
> to fill the clustergroup_mask.
>

Again, I completely disagree. Let us look at the problems separately.
Tools like lscpu and lstopo expect the hardware topology to describe what
the hardware looks like, not the scheduler's view of the hardware. So the
topology masks that get exposed to user-space need fixing even today; I
have reports from various tooling people about this. E.g. Juno getting
exposed as a dual-socket system is utter nonsense.

Yes, the scheduler uses most of the topology masks as is, but that is not
a must. There are the *group_mask functions that can implement what the
scheduler needs to be fed.

I am not sure why the two issues are getting mixed up, and that is the main
reason I jumped into this: to make sure the topology masks are not tampered
with based on the way they need to be used by the scheduler.

Both ACPI and DT on a platform must present the exact same hardware topology
to user-space; there is no room for argument there.

> IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> topology instead to define cpu_coregroup_mask whereas DT described the
> dynamiQ instead of using cache topology. If you use cache topology
> now, then you should skip the dynamiQ
>

Yes, unless someone can work out a binding to represent that and convince
DT maintainers ;).

> Finally, even if CLS and MC have the same scheduling behavior for now,
> they might ends up with different scheduling properties which would
> mean that replacing MC level by CLS one for current SoC would become
> wrong
>

Again, as I mentioned to Dietmar, that is something we can and must deal
with in those *group_mask functions, and not expect the topology masks to
be altered to meet CLS/MC or whatever the sched domains need. Sorry, that
is my strong opinion, as the topology is already user-space visible and
(tooling) people are complaining that DT systems are broken and don't match
ACPI systems.

So unless someone gives me non-scheduler, topology-specific reasons to
change that, sorry, but my opinion on this matter is not going to change ;).

You will get this view of the topology; find a way to manage with all those
*group_mask functions. By the way, it is already handled for ACPI systems,
so if you are not happy with that, then that needs fixing, as this change
set just aligns the behaviour with similar ACPI systems. So the Juno example
is incorrect, for the reason that the scheduler behaviour there currently
differs between DT and ACPI.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-10 10:27                                       ` Sudeep Holla
@ 2022-06-13  9:19                                         ` Dietmar Eggemann
  -1 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-13  9:19 UTC (permalink / raw)
  To: Sudeep Holla, Vincent Guittot
  Cc: linux-kernel, Atish Patra, Atish Patra, Morten Rasmussen,
	Qing Wang, linux-arm-kernel, linux-riscv, Rob Herring

On 10/06/2022 12:27, Sudeep Holla wrote:
> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
>>>
> 
> [...]
> 
>>> Why ? Are you suggesting that we shouldn't present the hardware cluster
>>> to the topology because of the above reason ? If so, sorry that is not a
>>> valid reason. We could add login to return NULL or appropriate value
>>> needed in cpu_clustergroup_mask id it matches MC level mask if we can't
>>> deal that in generic scheduler code. But the topology code can't be
>>> compromised for that reason as it is user visible.
>>
>> I tend to agree with Dietmar. The legacy use of cluster node in DT
>> refers to the dynamiQ or legacy b.L cluster which is also aligned to
>> the LLC and the MC scheduling level. The new cluster level that has
>> been introduced recently does not target this level but some
>> intermediate levels either inside like for the kupeng920 or the v9
>> complex or outside like for the ampere altra. So I would say that
>> there is one cluster node level in DT that refers to the same MC/LLC
>> level and only an additional child/parent cluster node should be used
>> to fill the clustergroup_mask.
>>
> 
> Again I completely disagree. Let us look at the problems separately.
> The hardware topology that some of the tools like lscpu and lstopo expects
> what the hardware looks like and not the scheduler's view of the hardware.
> So the topology masks that gets exposed to the user-space needs fixing
> even today. I have reports from various tooling people about the same.
> E.g. Juno getting exposed as dual socket system is utter non-sense.
> 
> Yes scheduler uses most of the topology masks as is but that is not a must.
> There are these *group_mask functions that can implement what scheduler
> needs to be fed.
> 
> I am not sure why the 2 issues are getting mixed up and that is the main
> reason why I jumped into this to make sure the topology masks are
> not tampered based on the way it needs to be used for scheduler.

I'm all in favor of not mixing up those 2 issues. But I don't understand
why you have to glue them together.

(1) DT systems broken in userspace (lstopo shows Juno with 2 packages)

(2) Introduce CONFIG_SCHED_CLUSTER for DT systems


(1) This can be solved with your patch set w/o setting `(1st level)
    cpu-map cluster nodes`. The `socket nodes` taking over the
    functionality of the `cluster nodes` sorts out the `Juno is seen as
    having 2 packages` issue.
    This will make core_sibling not suitable for cpu_coregroup_mask()
    anymore. But this is OK since the llc from cacheinfo (i.e. llc_sibling)
    takes over here (see the sketch after (2)).
    There is no need to involve `cluster nodes` anymore.

(2) This will only make sense for Armv9 L2 complexes if we connect `(2nd
    level) cpu-map cluster nodes` with cluster_id and cluster_sibling.
    Only then would clusters mean the same thing in ACPI and DT.
    I guess this was already mentioned a couple of times.
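
(To illustrate (1): roughly how cpu_coregroup_mask() picks the smaller LLC
sibling mask once cacheinfo provides it; a simplified sketch based on this
series, with last_level_cache_is_valid() assumed from the cacheinfo patches,
not the exact driver code:)

    const struct cpumask *cpu_coregroup_mask(int cpu)
    {
            const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

            /* prefer package siblings unless NUMA-in-package splits them */
            if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask))
                    core_mask = &cpu_topology[cpu].core_sibling;

            /* a valid, smaller LLC sibling mask from cacheinfo takes over */
            if (last_level_cache_is_valid(cpu) &&
                cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
                    core_mask = &cpu_topology[cpu].llc_sibling;

            return core_mask;
    }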

> Both ACPI and DT on a platform must present exact same hardware topology
> to the user-space, there is no space for argument there.
> 
>> IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
>> topology instead to define cpu_coregroup_mask whereas DT described the
>> dynamiQ instead of using cache topology. If you use cache topology
>> now, then you should skip the dynamiQ
>>
> 
> Yes, unless someone can work out a binding to represent that and convince
> DT maintainers ;).
> 
>> Finally, even if CLS and MC have the same scheduling behavior for now,
>> they might ends up with different scheduling properties which would
>> mean that replacing MC level by CLS one for current SoC would become
>> wrong
>>
> 
> Again as I mentioned to Dietmar, that is something we can and must deal with
> in those *group_mask and not expect topology mask to be altered to meet
> CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> as the topology is already user-space visible and (tooling) people are
> complaining that DT systems are broken and doesn't match ACPI systems.
> 
> So unless someone gives me non-scheduler and topology specific reasons
> to change that, sorry but my opinion on this matter is not going to change ;).

`lstopo` is fine with a now-correct /sys/.../topology/package_cpus (or
core_siblings, the old filename). It's not reading
/sys/.../topology/cluster_cpus (yet), so why set it (wrongly) to 0x39 for
CPU0 on Juno when it can stay 0x01?
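
(A trivial userspace snippet to dump what such tools would read; the sysfs
paths are the standard topology files, the rest is just an illustrative
sketch:)

    #include <stdio.h>

    int main(void)
    {
            const char *files[] = {
                    "/sys/devices/system/cpu/cpu0/topology/package_cpus",
                    "/sys/devices/system/cpu/cpu0/topology/cluster_cpus",
            };
            char buf[64];

            for (int i = 0; i < 2; i++) {
                    FILE *f = fopen(files[i], "r");

                    /* each file holds the sibling cpumask in hex */
                    if (f && fgets(buf, sizeof(buf), f))
                            printf("%s: %s", files[i], buf);
                    else
                            printf("%s: <unavailable>\n", files[i]);
                    if (f)
                            fclose(f);
            }
            return 0;
    }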

> You will get this view of topology, find a way to manage with all those
> *group_mask functions. By the way it is already handled for ACPI systems,
> so if you are not happy with that, then that needs fixing as this change
> set just aligns the behaviour on similar ACPI system. So the Juno example
> is incorrect for the reason that the behaviour of scheduler there is different
> with DT and ACPI.

[...]


^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
@ 2022-06-13  9:19                                         ` Dietmar Eggemann
  0 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-13  9:19 UTC (permalink / raw)
  To: Sudeep Holla, Vincent Guittot
  Cc: linux-kernel, Atish Patra, Atish Patra, Morten Rasmussen,
	Qing Wang, linux-arm-kernel, linux-riscv, Rob Herring

On 10/06/2022 12:27, Sudeep Holla wrote:
> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
>>>
> 
> [...]
> 
>>> Why ? Are you suggesting that we shouldn't present the hardware cluster
>>> to the topology because of the above reason ? If so, sorry that is not a
>>> valid reason. We could add login to return NULL or appropriate value
>>> needed in cpu_clustergroup_mask id it matches MC level mask if we can't
>>> deal that in generic scheduler code. But the topology code can't be
>>> compromised for that reason as it is user visible.
>>
>> I tend to agree with Dietmar. The legacy use of cluster node in DT
>> refers to the dynamiQ or legacy b.L cluster which is also aligned to
>> the LLC and the MC scheduling level. The new cluster level that has
>> been introduced recently does not target this level but some
>> intermediate levels either inside like for the kupeng920 or the v9
>> complex or outside like for the ampere altra. So I would say that
>> there is one cluster node level in DT that refers to the same MC/LLC
>> level and only an additional child/parent cluster node should be used
>> to fill the clustergroup_mask.
>>
> 
> Again I completely disagree. Let us look at the problems separately.
> The hardware topology that some of the tools like lscpu and lstopo expects
> what the hardware looks like and not the scheduler's view of the hardware.
> So the topology masks that gets exposed to the user-space needs fixing
> even today. I have reports from various tooling people about the same.
> E.g. Juno getting exposed as dual socket system is utter non-sense.
> 
> Yes scheduler uses most of the topology masks as is but that is not a must.
> There are these *group_mask functions that can implement what scheduler
> needs to be fed.
> 
> I am not sure why the 2 issues are getting mixed up and that is the main
> reason why I jumped into this to make sure the topology masks are
> not tampered based on the way it needs to be used for scheduler.

I'm all in favor of not mixing up those 2 issues. But I don't understand
why you have to glue them together.

(1) DT systems broken in userspace (lstopo shows Juno with 2 Packages)

(2) Introduce CONFIG_SCHED_CLUSTER for DT systems


(1) This can be solved with your patch-set w/o setting `(1. level)
    cpu-map cluster nodes`. The `socket nodes` taking over the
    functionality of the `cluster nodes` sorts out the `Juno is seen as
    having 2 packages`.
    This will make core_sibling not suitable for cpu_coregroup_mask()
    anymore. But this is OK since llc from cacheinfo (i.e. llc_sibling)
    takes over here.
    There is no need to involve `cluster nodes` anymore.

(2) This will only make sense for Armv9 L2 complexes if we connect `2.
    level cpu-map cluster nodes` with cluster_id and cluster_sibling.
    And only then clusters would mean the same thing in ACPI and DT.
    I guess this was mentioned already a couple of times.

> Both ACPI and DT on a platform must present exact same hardware topology
> to the user-space, there is no space for argument there.
> 
>> IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
>> topology instead to define cpu_coregroup_mask whereas DT described the
>> dynamiQ instead of using cache topology. If you use cache topology
>> now, then you should skip the dynamiQ
>>
> 
> Yes, unless someone can work out a binding to represent that and convince
> DT maintainers ;).
> 
>> Finally, even if CLS and MC have the same scheduling behavior for now,
>> they might ends up with different scheduling properties which would
>> mean that replacing MC level by CLS one for current SoC would become
>> wrong
>>
> 
> Again as I mentioned to Dietmar, that is something we can and must deal with
> in those *group_mask and not expect topology mask to be altered to meet
> CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> as the topology is already user-space visible and (tooling) people are
> complaining that DT systems are broken and doesn't match ACPI systems.
> 
> So unless someone gives me non-scheduler and topology specific reasons
> to change that, sorry but my opinion on this matter is not going to change ;).

`lstopo` is fine with a now correct /sys/.../topology/package_cpus (or
core_siblings (old filename). It's not reading
/sys/.../topology/cluster_cpus (yet) so why set it (wrongly) to 0x39 for
CPU0 on Juno when it can stay 0x01?

> You will get this view of topology, find a way to manage with all those
> *group_mask functions. By the way it is already handled for ACPI systems,
> so if you are not happy with that, then that needs fixing as this change
> set just aligns the behaviour on similar ACPI system. So the Juno example
> is incorrect for the reason that the behaviour of scheduler there is different
> with DT and ACPI.

[...]


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-13  9:19                                         ` Dietmar Eggemann
  (?)
@ 2022-06-13 11:17                                           ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-13 11:17 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Sudeep Holla, Morten Rasmussen, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
> On 10/06/2022 12:27, Sudeep Holla wrote:
> > On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> >> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >>>
> > 
> > [...]
> > 
> >>> Why ? Are you suggesting that we shouldn't present the hardware cluster
> >>> to the topology because of the above reason ? If so, sorry that is not a
> >>> valid reason. We could add login to return NULL or appropriate value
> >>> needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> >>> deal that in generic scheduler code. But the topology code can't be
> >>> compromised for that reason as it is user visible.
> >>
> >> I tend to agree with Dietmar. The legacy use of cluster node in DT
> >> refers to the dynamiQ or legacy b.L cluster which is also aligned to
> >> the LLC and the MC scheduling level. The new cluster level that has
> >> been introduced recently does not target this level but some
> >> intermediate levels either inside like for the kupeng920 or the v9
> >> complex or outside like for the ampere altra. So I would say that
> >> there is one cluster node level in DT that refers to the same MC/LLC
> >> level and only an additional child/parent cluster node should be used
> >> to fill the clustergroup_mask.
> >>
> > 
> > Again I completely disagree. Let us look at the problems separately.
> > The hardware topology that some of the tools like lscpu and lstopo expects
> > what the hardware looks like and not the scheduler's view of the hardware.
> > So the topology masks that gets exposed to the user-space needs fixing
> > even today. I have reports from various tooling people about the same.
> > E.g. Juno getting exposed as dual socket system is utter non-sense.
> > 
> > Yes scheduler uses most of the topology masks as is but that is not a must.
> > There are these *group_mask functions that can implement what scheduler
> > needs to be fed.
> > 
> > I am not sure why the 2 issues are getting mixed up and that is the main
> > reason why I jumped into this to make sure the topology masks are
> > not tampered based on the way it needs to be used for scheduler.
> 
> I'm all in favor of not mixing up those 2 issues. But I don't understand
> why you have to glue them together.
>

What are you referring to as 'gluing them together'? As I said, this series
just addresses the hardware topology, and if there is any impact on sched
domains then it is down to aligning the ACPI and DT platform behaviour. I am
not adding anything more to glue the topology to the info needed for sched
domains.

> (1) DT systems broken in userspace (lstopo shows Juno with 2 Packages)
>

Correct.

> (2) Introduce CONFIG_SCHED_CLUSTER for DT systems
>

If that is a problem, you need to disable it for DT platforms. Not
supporting the proper hardware topology is not the way to work around the
issues with enabling CONFIG_SCHED_CLUSTER for DT systems, IMO.

>
> (1) This can be solved with your patch-set w/o setting `(1. level)
>     cpu-map cluster nodes`. The `socket nodes` taking over the
>     functionality of the `cluster nodes` sorts out the `Juno is seen as
>     having 2 packages`.
>     This will make core_sibling not suitable for cpu_coregroup_mask()
>     anymore. But this is OK since llc from cacheinfo (i.e. llc_sibling)
>     takes over here.
>     There is no need to involve `cluster nodes` anymore.
>

Again, you are just deferring the introduction of CONFIG_SCHED_CLUSTER on DT,
which is fine, but I don't agree with your approach.

> (2) This will only make sense for Armv9 L2 complexes if we connect `2.
>     level cpu-map cluster nodes` with cluster_id and cluster_sibling.
>     And only then clusters would mean the same thing in ACPI and DT.
>     I guess this was mentioned already a couple of times.
>

Indeed. But I don't get what you mean by the 2nd level here. ACPI puts the
1st level of CPU nodes in the cluster mask, so we are just aligning to that
on DT platforms here. If you are saying that is an issue, please fix it for
ACPI too.
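
As a rough sketch only (the CPU phandles &CPU0..&CPU3 are made up for
illustration), the distinction being discussed is between the 1st-level
cluster nodes in /cpu-map, which this series uses for cluster_id, and a
nested 2nd-level cluster node, which in Dietmar's point (2) would describe
something like an Armv9 L2 complex:

	cpu-map {
		socket0 {
			cluster0 {              /* 1st-level cluster (DSU/b.L style) */
				cluster0 {      /* 2nd-level cluster, e.g. an L2 complex */
					core0 { cpu = <&CPU0>; };
					core1 { cpu = <&CPU1>; };
				};
				cluster1 {
					core0 { cpu = <&CPU2>; };
					core1 { cpu = <&CPU3>; };
				};
			};
		};
	};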

> > Both ACPI and DT on a platform must present exact same hardware topology
> > to the user-space, there is no space for argument there.
> > 
> >> IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> >> topology instead to define cpu_coregroup_mask whereas DT described the
> >> dynamiQ instead of using cache topology. If you use cache topology
> >> now, then you should skip the dynamiQ
> >>
> >
> > Yes, unless someone can work out a binding to represent that and convince
> > DT maintainers ;).
> > 
> >> Finally, even if CLS and MC have the same scheduling behavior for now,
> >> they might ends up with different scheduling properties which would
> >> mean that replacing MC level by CLS one for current SoC would become
> >> wrong
> >>
> > 
> > Again as I mentioned to Dietmar, that is something we can and must deal with
> > in those *group_mask and not expect topology mask to be altered to meet
> > CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> > as the topology is already user-space visible and (tooling) people are
> > complaining that DT systems are broken and doesn't match ACPI systems.
> > 
> > So unless someone gives me non-scheduler and topology specific reasons
> > to change that, sorry but my opinion on this matter is not going to change ;).
> 
> `lstopo` is fine with a now correct /sys/.../topology/package_cpus (or
> core_siblings (old filename). It's not reading
> /sys/.../topology/cluster_cpus (yet) so why set it (wrongly) to 0x39 for
> CPU0 on Juno when it can stay 0x01?
>

On ACPI? If so, it could be the package ID from the ACPI table, which can be
invalid, in which case the kernel uses the table offset as the ID. It is not
ideal but doesn't affect the masks. The current value of 0 or 1 on Juno is the
cluster ID, and that contributes to the wrong package CPU masks.


And yes, lstopo doesn't read cluster IDs. But we expose them on ACPI systems
and not on DT, which was my main point.

As pointed out earlier, have you checked ACPI on Juno with
CONFIG_SCHED_CLUSTER? If the behaviour with my series differs between DT and
ACPI, then it is an issue. But AFAIU it doesn't, and that is my main argument.
You are just assuming that what we have on Juno with DT is correct, which may
be true w.r.t. the scheduler but definitely not with respect to the hardware
topology exposed to the users. So my aim is to get that fixed. If you are not
happy with that, then how can you be happy with the current behaviour on ACPI
with and without CONFIG_SCHED_CLUSTER? I haven't got your opinion on that
matter yet.

--
Regards,
Sudeep

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-10 10:27                                       ` Sudeep Holla
  (?)
@ 2022-06-14 17:59                                         ` Vincent Guittot
  -1 siblings, 0 replies; 153+ messages in thread
From: Vincent Guittot @ 2022-06-14 17:59 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Fri, 10 Jun 2022 at 12:27, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> > On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
>
> [...]
>
> > > Why ? Are you suggesting that we shouldn't present the hardware cluster
> > > to the topology because of the above reason ? If so, sorry that is not a
> > > valid reason. We could add login to return NULL or appropriate value
> > > needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> > > deal that in generic scheduler code. But the topology code can't be
> > > compromised for that reason as it is user visible.
> >
> > I tend to agree with Dietmar. The legacy use of cluster node in DT
> > refers to the dynamiQ or legacy b.L cluster which is also aligned to
> > the LLC and the MC scheduling level. The new cluster level that has
> > been introduced recently does not target this level but some
> > intermediate levels either inside like for the kupeng920 or the v9
> > complex or outside like for the ampere altra. So I would say that
> > there is one cluster node level in DT that refers to the same MC/LLC
> > level and only an additional child/parent cluster node should be used
> > to fill the clustergroup_mask.
> >
>
> Again I completely disagree. Let us look at the problems separately.
> The hardware topology that some of the tools like lscpu and lstopo expects
> what the hardware looks like and not the scheduler's view of the hardware.
> So the topology masks that gets exposed to the user-space needs fixing
> even today. I have reports from various tooling people about the same.
> E.g. Juno getting exposed as dual socket system is utter non-sense.
>
> Yes scheduler uses most of the topology masks as is but that is not a must.
> There are these *group_mask functions that can implement what scheduler
> needs to be fed.
>
> I am not sure why the 2 issues are getting mixed up and that is the main
> reason why I jumped into this to make sure the topology masks are
> not tampered based on the way it needs to be used for scheduler.
>
> Both ACPI and DT on a platform must present exact same hardware topology
> to the user-space, there is no space for argument there.

But that's exactly my point there:
ACPI doesn't show the dynamiQ level anywhere, only the llc (which spans the
same CPUs), whereas your patch makes the dynamiQ level visible for DT in
addition to the llc.

>
> > IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> > topology instead to define cpu_coregroup_mask whereas DT described the
> > dynamiQ instead of using cache topology. If you use cache topology
> > now, then you should skip the dynamiQ
> >
>
> Yes, unless someone can work out a binding to represent that and convince
> DT maintainers ;).
>
> > Finally, even if CLS and MC have the same scheduling behavior for now,
> > they might ends up with different scheduling properties which would
> > mean that replacing MC level by CLS one for current SoC would become
> > wrong
> >
>
> Again as I mentioned to Dietmar, that is something we can and must deal with
> in those *group_mask and not expect topology mask to be altered to meet
> CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> as the topology is already user-space visible and (tooling) people are
> complaining that DT systems are broken and doesn't match ACPI systems.

Again, your proposal doesn't help here because the DT will show a
level that doesn't appear in ACPI.

>
> So unless someone gives me non-scheduler and topology specific reasons
> to change that, sorry but my opinion on this matter is not going to change ;).
>
> You will get this view of topology, find a way to manage with all those
> *group_mask functions. By the way it is already handled for ACPI systems,

AFAICT, no it's not: the cluster described in ACPI is not the dynamiQ
level that you now make visible on DT.

> so if you are not happy with that, then that needs fixing as this change
> set just aligns the behaviour on similar ACPI system. So the Juno example
> is incorrect for the reason that the behaviour of scheduler there is different
> with DT and ACPI.
>
> --
> Regards,
> Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
@ 2022-06-15 17:00                                           ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-15 17:00 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Sudeep Holla, Morten Rasmussen, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

Please note that until we agree on a unified view of the hardware topology, I
will temporarily set aside any scheduler-domain-related issues/concerns, as
this thread/discussion is mixing up too much IMO. I am not ignoring the
sched_domain concerns, just deferring them until we agree on the user-visible
hardware topology view; how that impacts the sched domain topology can be
considered soon after that.

On Tue, Jun 14, 2022 at 07:59:23PM +0200, Vincent Guittot wrote:
> On Fri, 10 Jun 2022 at 12:27, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> > > On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > >
> >
> > [...]
> >
> > > > Why ? Are you suggesting that we shouldn't present the hardware cluster
> > > > to the topology because of the above reason ? If so, sorry that is not a
> > > > valid reason. We could add login to return NULL or appropriate value
> > > > needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> > > > deal that in generic scheduler code. But the topology code can't be
> > > > compromised for that reason as it is user visible.
> > >
> > > I tend to agree with Dietmar. The legacy use of cluster node in DT
> > > refers to the dynamiQ or legacy b.L cluster which is also aligned to
> > > the LLC and the MC scheduling level. The new cluster level that has
> > > been introduced recently does not target this level but some
> > > intermediate levels either inside like for the kupeng920 or the v9
> > > complex or outside like for the ampere altra. So I would say that
> > > there is one cluster node level in DT that refers to the same MC/LLC
> > > level and only an additional child/parent cluster node should be used
> > > to fill the clustergroup_mask.
> > >
> >
> > Again I completely disagree. Let us look at the problems separately.
> > The hardware topology that some of the tools like lscpu and lstopo expects
> > what the hardware looks like and not the scheduler's view of the hardware.
> > So the topology masks that gets exposed to the user-space needs fixing
> > even today. I have reports from various tooling people about the same.
> > E.g. Juno getting exposed as dual socket system is utter non-sense.
> >
> > Yes scheduler uses most of the topology masks as is but that is not a must.
> > There are these *group_mask functions that can implement what scheduler
> > needs to be fed.
> >
> > I am not sure why the 2 issues are getting mixed up and that is the main
> > reason why I jumped into this to make sure the topology masks are
> > not tampered based on the way it needs to be used for scheduler.
> >
> > Both ACPI and DT on a platform must present exact same hardware topology
> > to the user-space, there is no space for argument there.
>
> But that's exactly my point there:
> ACPI doesn't show the dynamiQ level anywhere but only the llc which
> are the same and your patch makes the dynamiQ level visible for DT in
> addition to llc
>

Sorry if I am missing something obvious here, but both ACPI and DT has no
special representation for dynamiQ clusters and hence it is impossible to
deduce the same from either DT or ACPI. Can you provide some details
or example as what you are referring as dynamiQ. Also what you mean by
dynamiQ not shown on ACPI while shown with DT systems. If there is any
discrepancies, we need to fix.

Now, what I refer as discrepancy for example on Juno is below:
(value read from a subset of per cpu sysfs files)
cpu                     0       1       2       3       4       5
cluster_id              -1      -1      -1      -1      -1      -1
physical_package_id     1       0       0       1       1       1
cluster_cpus_list       0       1       2       3       4       5
package_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5

The above one is for DT which is wrong in all the 4 entries above.
The below one is on ACPI and after applying my series on Juno.

cpu                     0       1       2       3       4       5
cluster_id              1       0       0       1       1       1
physical_package_id     0       0       0       0       0       0
cluster_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5
package_cpus_list       0-5     0-5     0-5     0-5     0-5     0-5

This matches the expectation from the various userspace tools like lscpu,
lstopo,..etc.

> >
> > > IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> > > topology instead to define cpu_coregroup_mask whereas DT described the
> > > dynamiQ instead of using cache topology. If you use cache topology
> > > now, then you should skip the dynamiQ
> > >
> >
> > Yes, unless someone can work out a binding to represent that and convince
> > DT maintainers ;).
> >
> > > Finally, even if CLS and MC have the same scheduling behavior for now,
> > > they might ends up with different scheduling properties which would
> > > mean that replacing MC level by CLS one for current SoC would become
> > > wrong
> > >
> >
> > Again as I mentioned to Dietmar, that is something we can and must deal with
> > in those *group_mask and not expect topology mask to be altered to meet
> > CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> > as the topology is already user-space visible and (tooling) people are
> > complaining that DT systems are broken and doesn't match ACPI systems.
> 
> again, your proposal doesn't help here because the DT will show a
> level that doesn't appears in ACPI
>

Which level exactly ? It matches exactly for Juno, the sysfs files are
exact match after my changes. Again don't mix the scheduler domains for
arguments here.

> >
> > So unless someone gives me non-scheduler and topology specific reasons
> > to change that, sorry but my opinion on this matter is not going to change ;).
> >
> > You will get this view of topology, find a way to manage with all those
> > *group_mask functions. By the way it is already handled for ACPI systems,
> 
> AFAICT, no it's not, the cluster described in ACPI is not the dynamiQ
> level that you make now visible to DT

Again, no. There is no binding for dynamiQ level either in DT or ACPI and
hence there is no way it can become visible on DT. So I have no idea why
there is a thought process or assumption about existence of dynamiQ level
in the DT. It doesn't exist. If that is wrong, can you point me to the
bindings as well as existing device tree ? If you are referring to the
phantom domains Dietmar mentioned in earlier threads, then they don't exist.
It is made up and one need to get the bindings pushed before we can address
such a system.

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-15 17:00                                           ` Sudeep Holla
  (?)
@ 2022-06-15 22:44                                             ` Vincent Guittot
  -1 siblings, 0 replies; 153+ messages in thread
From: Vincent Guittot @ 2022-06-15 22:44 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Wed, 15 Jun 2022 at 19:01, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> Please note until we agree on unified view for hardware topology, I will
> temporarily ignore any scheduler domain related issues/concerns as this
> thread/discussion is mixing up too much IMO. I am not ignoring sched_domain
> concerns, but deferring it until we agree on the hardware topology view
> which is user visible and how that impacts sched domain topology can be
> considered soon following that.
>
> On Tue, Jun 14, 2022 at 07:59:23PM +0200, Vincent Guittot wrote:
> > On Fri, 10 Jun 2022 at 12:27, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> > > > On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > > >
> > >
> > > [...]
> > >
> > > > > Why ? Are you suggesting that we shouldn't present the hardware cluster
> > > > > to the topology because of the above reason ? If so, sorry that is not a
> > > > > valid reason. We could add login to return NULL or appropriate value
> > > > > needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> > > > > deal that in generic scheduler code. But the topology code can't be
> > > > > compromised for that reason as it is user visible.
> > > >
> > > > I tend to agree with Dietmar. The legacy use of cluster node in DT
> > > > refers to the dynamiQ or legacy b.L cluster which is also aligned to
> > > > the LLC and the MC scheduling level. The new cluster level that has
> > > > been introduced recently does not target this level but some
> > > > intermediate levels either inside like for the kupeng920 or the v9
> > > > complex or outside like for the ampere altra. So I would say that
> > > > there is one cluster node level in DT that refers to the same MC/LLC
> > > > level and only an additional child/parent cluster node should be used
> > > > to fill the clustergroup_mask.
> > > >
> > >
> > > Again I completely disagree. Let us look at the problems separately.
> > > The hardware topology that some of the tools like lscpu and lstopo expects
> > > what the hardware looks like and not the scheduler's view of the hardware.
> > > So the topology masks that gets exposed to the user-space needs fixing
> > > even today. I have reports from various tooling people about the same.
> > > E.g. Juno getting exposed as dual socket system is utter non-sense.
> > >
> > > Yes scheduler uses most of the topology masks as is but that is not a must.
> > > There are these *group_mask functions that can implement what scheduler
> > > needs to be fed.
> > >
> > > I am not sure why the 2 issues are getting mixed up and that is the main
> > > reason why I jumped into this to make sure the topology masks are
> > > not tampered based on the way it needs to be used for scheduler.
> > >
> > > Both ACPI and DT on a platform must present exact same hardware topology
> > > to the user-space, there is no space for argument there.
> >
> > But that's exactly my point there:
> > ACPI doesn't show the dynamiQ level anywhere but only the llc which
> > are the same and your patch makes the dynamiQ level visible for DT in
> > addition to llc
> >
>
> Sorry if I am missing something obvious here, but both ACPI and DT has no
> special representation for dynamiQ clusters and hence it is impossible to
> deduce the same from either DT or ACPI. Can you provide some details
> or example as what you are referring as dynamiQ. Also what you mean by
> dynamiQ not shown on ACPI while shown with DT systems. If there is any
> discrepancies, we need to fix.

The cpu-map node in DT is following the dynamiQ or the legacy
big.LITTLE topology. As an example, the hikey6220 has 2 clusters, the
hikey960 has 2 clusters that reflect big.LITTLE, and the sdm845 has one
cluster that reflects the dynamiQ cluster.

Now, my mistake was to assume that core_sibling in arch_topology was used
to reflect the cores sharing a cache, but after looking more deeply into
the code it seems to be a lucky coincidence.
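
As a minimal sketch of why that is only a coincidence (assuming the
mainline arch_topology fields and a hypothetical LLC-validity helper along
the lines of what this series adds to cacheinfo; illustration only, not
code from the series), the coregroup mask handed to the scheduler can
shrink to the cacheinfo LLC siblings regardless of what core_sibling says:

#include <linux/arch_topology.h>
#include <linux/cpumask.h>

/* Hypothetical stand-in for the cacheinfo LLC-validity check added by
 * this series; the exact helper name is an assumption. */
bool example_llc_is_valid(unsigned int cpu);

/*
 * The mask used as "coregroup" can shrink from the package-derived
 * core_sibling down to the LLC siblings taken from cacheinfo whenever
 * that information is valid, so core_sibling itself does not need to
 * encode cache sharing.
 */
static const struct cpumask *example_coregroup_mask(int cpu)
{
	const struct cpumask *mask = &cpu_topology[cpu].core_sibling;

	if (example_llc_is_valid(cpu) &&
	    cpumask_subset(&cpu_topology[cpu].llc_sibling, mask))
		mask = &cpu_topology[cpu].llc_sibling;

	return mask;
}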

>
> Now, what I refer as discrepancy for example on Juno is below:
> (value read from a subset of per cpu sysfs files)
> cpu                     0       1       2       3       4       5
> cluster_id              -1      -1      -1      -1      -1      -1
> physical_package_id     1       0       0       1       1       1
> cluster_cpus_list       0       1       2       3       4       5
> package_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5
>
> The above one is for DT which is wrong in all the 4 entries above.
> The below one is on ACPI and after applying my series on Juno.
>
> cpu                     0       1       2       3       4       5
> cluster_id              1       0       0       1       1       1
> physical_package_id     0       0       0       0       0       0
> cluster_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5
> package_cpus_list       0-5     0-5     0-5     0-5     0-5     0-5
>
> This matches the expectation from the various userspace tools like lscpu,
> lstopo,..etc.
>
> > >
> > > > IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> > > > topology instead to define cpu_coregroup_mask whereas DT described the
> > > > dynamiQ instead of using cache topology. If you use cache topology
> > > > now, then you should skip the dynamiQ
> > > >
> > >
> > > Yes, unless someone can work out a binding to represent that and convince
> > > DT maintainers ;).
> > >
> > > > Finally, even if CLS and MC have the same scheduling behavior for now,
> > > > they might ends up with different scheduling properties which would
> > > > mean that replacing MC level by CLS one for current SoC would become
> > > > wrong
> > > >
> > >
> > > Again as I mentioned to Dietmar, that is something we can and must deal with
> > > in those *group_mask and not expect topology mask to be altered to meet
> > > CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> > > as the topology is already user-space visible and (tooling) people are
> > > complaining that DT systems are broken and doesn't match ACPI systems.
> >
> > again, your proposal doesn't help here because the DT will show a
> > level that doesn't appears in ACPI
> >
>
> Which level exactly ? It matches exactly for Juno, the sysfs files are
> exact match after my changes. Again don't mix the scheduler domains for
> arguments here.
>
> > >
> > > So unless someone gives me non-scheduler and topology specific reasons
> > > to change that, sorry but my opinion on this matter is not going to change ;).
> > >
> > > You will get this view of topology, find a way to manage with all those
> > > *group_mask functions. By the way it is already handled for ACPI systems,
> >
> > AFAICT, no it's not, the cluster described in ACPI is not the dynamiQ
> > level that you make now visible to DT
>
> Again, no. There is no binding for dynamiQ level either in DT or ACPI and
> hence there is no way it can become visible on DT. So I have no idea why
> there is a thought process or assumption about existence of dynamiQ level
> in the DT. It doesn't exist. If that is wrong, can you point me to the
> bindings as well as existing device tree ? If you are referring to the
> phantom domains Dietmar mentioned in earlier threads, then they don't exist.
> It is made up and one need to get the bindings pushed before we can address
> such a system.
>
> --
> Regards,
> Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-15 17:00                                           ` Sudeep Holla
  (?)
@ 2022-06-15 22:45                                             ` Vincent Guittot
  -1 siblings, 0 replies; 153+ messages in thread
From: Vincent Guittot @ 2022-06-15 22:45 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Dietmar Eggemann, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Wed, 15 Jun 2022 at 19:01, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> Please note until we agree on unified view for hardware topology, I will
> temporarily ignore any scheduler domain related issues/concerns as this
> thread/discussion is mixing up too much IMO. I am not ignoring sched_domain
> concerns, but deferring it until we agree on the hardware topology view
> which is user visible and how that impacts sched domain topology can be
> considered soon following that.

On my side, what I'm really interested in is the hardware topology
reported to the scheduler.

>
> On Tue, Jun 14, 2022 at 07:59:23PM +0200, Vincent Guittot wrote:
> > On Fri, 10 Jun 2022 at 12:27, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> > > > On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > > >
> > >
> > > [...]
> > >
> > > > > Why ? Are you suggesting that we shouldn't present the hardware cluster
> > > > > to the topology because of the above reason ? If so, sorry that is not a
> > > > > valid reason. We could add login to return NULL or appropriate value
> > > > > needed in cpu_clustergroup_mask id it matches MC level mask if we can't
> > > > > deal that in generic scheduler code. But the topology code can't be
> > > > > compromised for that reason as it is user visible.
> > > >
> > > > I tend to agree with Dietmar. The legacy use of cluster node in DT
> > > > refers to the dynamiQ or legacy b.L cluster which is also aligned to
> > > > the LLC and the MC scheduling level. The new cluster level that has
> > > > been introduced recently does not target this level but some
> > > > intermediate levels either inside like for the kupeng920 or the v9
> > > > complex or outside like for the ampere altra. So I would say that
> > > > there is one cluster node level in DT that refers to the same MC/LLC
> > > > level and only an additional child/parent cluster node should be used
> > > > to fill the clustergroup_mask.
> > > >
> > >
> > > Again I completely disagree. Let us look at the problems separately.
> > > The hardware topology that some of the tools like lscpu and lstopo expects
> > > what the hardware looks like and not the scheduler's view of the hardware.
> > > So the topology masks that gets exposed to the user-space needs fixing
> > > even today. I have reports from various tooling people about the same.
> > > E.g. Juno getting exposed as dual socket system is utter non-sense.
> > >
> > > Yes scheduler uses most of the topology masks as is but that is not a must.
> > > There are these *group_mask functions that can implement what scheduler
> > > needs to be fed.
> > >
> > > I am not sure why the 2 issues are getting mixed up and that is the main
> > > reason why I jumped into this to make sure the topology masks are
> > > not tampered based on the way it needs to be used for scheduler.
> > >
> > > Both ACPI and DT on a platform must present exact same hardware topology
> > > to the user-space, there is no space for argument there.
> >
> > But that's exactly my point there:
> > ACPI doesn't show the dynamiQ level anywhere but only the llc which
> > are the same and your patch makes the dynamiQ level visible for DT in
> > addition to llc
> >
>
> Sorry if I am missing something obvious here, but both ACPI and DT has no
> special representation for dynamiQ clusters and hence it is impossible to
> deduce the same from either DT or ACPI. Can you provide some details
> or example as what you are referring as dynamiQ. Also what you mean by
> dynamiQ not shown on ACPI while shown with DT systems. If there is any
> discrepancies, we need to fix.
>
> Now, what I refer as discrepancy for example on Juno is below:
> (value read from a subset of per cpu sysfs files)
> cpu                     0       1       2       3       4       5
> cluster_id              -1      -1      -1      -1      -1      -1
> physical_package_id     1       0       0       1       1       1
> cluster_cpus_list       0       1       2       3       4       5
> package_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5
>
> The above one is for DT which is wrong in all the 4 entries above.
> The below one is on ACPI and after applying my series on Juno.
>
> cpu                     0       1       2       3       4       5
> cluster_id              1       0       0       1       1       1
> physical_package_id     0       0       0       0       0       0
> cluster_cpus_list       0,3-5   1-2     1-2     0,3-5   0,3-5   0,3-5
> package_cpus_list       0-5     0-5     0-5     0-5     0-5     0-5
>
> This matches the expectation from the various userspace tools like lscpu,
> lstopo,..etc.
>
> > >
> > > > IIUC, we don't describe the dynamiQ level in ACPI which  uses cache
> > > > topology instead to define cpu_coregroup_mask whereas DT described the
> > > > dynamiQ instead of using cache topology. If you use cache topology
> > > > now, then you should skip the dynamiQ
> > > >
> > >
> > > Yes, unless someone can work out a binding to represent that and convince
> > > DT maintainers ;).
> > >
> > > > Finally, even if CLS and MC have the same scheduling behavior for now,
> > > > they might ends up with different scheduling properties which would
> > > > mean that replacing MC level by CLS one for current SoC would become
> > > > wrong
> > > >
> > >
> > > Again as I mentioned to Dietmar, that is something we can and must deal with
> > > in those *group_mask and not expect topology mask to be altered to meet
> > > CLS/MC or whatever sched domains needs. Sorry, that is my strong opinion
> > > as the topology is already user-space visible and (tooling) people are
> > > complaining that DT systems are broken and doesn't match ACPI systems.
> >
> > again, your proposal doesn't help here because the DT will show a
> > level that doesn't appears in ACPI
> >
>
> Which level exactly ? It matches exactly for Juno, the sysfs files are
> exact match after my changes. Again don't mix the scheduler domains for
> arguments here.
>
> > >
> > > So unless someone gives me non-scheduler and topology specific reasons
> > > to change that, sorry but my opinion on this matter is not going to change ;).
> > >
> > > You will get this view of topology, find a way to manage with all those
> > > *group_mask functions. By the way it is already handled for ACPI systems,
> >
> > AFAICT, no it's not, the cluster described in ACPI is not the dynamiQ
> > level that you make now visible to DT
>
> Again, no. There is no binding for dynamiQ level either in DT or ACPI and
> hence there is no way it can become visible on DT. So I have no idea why
> there is a thought process or assumption about existence of dynamiQ level
> in the DT. It doesn't exist. If that is wrong, can you point me to the
> bindings as well as existing device tree ? If you are referring to the
> phantom domains Dietmar mentioned in earlier threads, then they don't exist.
> It is made up and one need to get the bindings pushed before we can address
> such a system.
>
> --
> Regards,
> Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-13 11:17                                           ` Sudeep Holla
  (?)
@ 2022-06-16 16:02                                             ` Dietmar Eggemann
  -1 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-16 16:02 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 13/06/2022 12:17, Sudeep Holla wrote:
> On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
>> On 10/06/2022 12:27, Sudeep Holla wrote:
>>> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
>>>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:

[...]

>>> Again I completely disagree. Let us look at the problems separately.
>>> The hardware topology that some of the tools like lscpu and lstopo expects
>>> what the hardware looks like and not the scheduler's view of the hardware.
>>> So the topology masks that gets exposed to the user-space needs fixing
>>> even today. I have reports from various tooling people about the same.
>>> E.g. Juno getting exposed as dual socket system is utter non-sense.
>>>
>>> Yes scheduler uses most of the topology masks as is but that is not a must.
>>> There are these *group_mask functions that can implement what scheduler
>>> needs to be fed.
>>>
>>> I am not sure why the 2 issues are getting mixed up and that is the main
>>> reason why I jumped into this to make sure the topology masks are
>>> not tampered based on the way it needs to be used for scheduler.
>>
>> I'm all in favor of not mixing up those 2 issues. But I don't understand
>> why you have to glue them together.
>>
> 
> What are you referring as 'glue them together'. As I said this series just
> address the hardware topology and if there is any impact on sched domains
> then it is do with alignment with ACPI and DT platform behaviour. I am not
> adding anything more to glue topology and info needed for sched domains.

You can fix (1) without (2), i.e. without parsing 1st-level cluster nodes
as cluster_siblings.

>> (1) DT systems broken in userspace (lstopo shows Juno with 2 Packages)
>>
> 
> Correct.
> 
>> (2) Introduce CONFIG_SCHED_CLUSTER for DT systems
>>
> 
> If that is a problem, you need to disable it for DT platforms. Not
> supporting proper hardware topology is not the way to workaround the
> issues enabling CONFIG_SCHED_CLUSTER for DT systems IMO.
> 
>>
>> (1) This can be solved with your patch-set w/o setting `(1. level)
>>     cpu-map cluster nodes`. The `socket nodes` taking over the
>>     functionality of the `cluster nodes` sorts out the `Juno is seen as
>>     having 2 packages`.
>>     This will make core_sibling not suitable for cpu_coregroup_mask()
>>     anymore. But this is OK since llc from cacheinfo (i.e. llc_sibling)
>>     takes over here.
>>     There is no need to involve `cluster nodes` anymore.
>>
> 
> Again you are just deferring introducing CONFIG_SCHED_CLUSTER on DT
> which is fine but I don't agree with your approach.
> 
>> (2) This will only make sense for Armv9 L2 complexes if we connect `2.
>>     level cpu-map cluster nodes` with cluster_id and cluster_sibling.
>>     And only then clusters would mean the same thing in ACPI and DT.
>>     I guess this was mentioned already a couple of times.
>>
> 
> Indeed. But I don't get what you mean by 2 level here. ACPI puts 1st level

cpu-map {
  socket0 {
    cluster0 {      <-- 1. level cluster
      cluster0 {    <-- 2. level cluster (3 -->)
        core0 {

        };
        core1 {

        };
      };
      cluster1 {
  ...

Armv9 L2 complexes: e.g. QC SM8450:

      .---------------.
CPU   |0 1 2 3 4 5 6 7|
      +---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
      +---------------+
  L2  |   |   | | | | | <-- 2. level cluster, Armv9 L2 complexes (<-- 3)
      +---------------+
  L3  |<--         -->|
      +---------------+
      |<-- cluster -->|
      +---------------+
      |<--   DSU   -->|
      '---------------'

Only if we map (i) into cluster_sibling, we get the same hardware
representation (for the task scheduler) for ACPI (4) and DT (5) systems.

(4) examples:

Kunpeng920 - 24 CPUs sharing LLC (cpu_coregroup_mask()), 4 CPUs sharing
L3-tag (cpu_clustergroup_mask()).

X86 Jacobsville - 24 CPUs sharing LLC (L3), 4 CPUs sharing L2

Armv9 L2 complexes: e.g. QC SM8450 - 8 CPUs sharing LLC (L3), (for A510
(little CPUs)) 2 CPUs sharing L2
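
Aside, roughly speaking cluster_sibling reaches the task scheduler through a
helper of the following shape (a simplified sketch along the lines of the
existing cpu_clustergroup_mask() in arch_topology, not code added by this
series):

#include <linux/arch_topology.h>
#include <linux/cpumask.h>

/*
 * Sketch: the per-CPU cluster sibling mask, filled in from DT /cpu-map
 * cluster nodes or from ACPI PPTT, is what CONFIG_SCHED_CLUSTER ends up
 * consuming for its CLS sched domain.
 */
const struct cpumask *cpu_clustergroup_mask(int cpu)
{
        return &cpu_topology[cpu].cluster_sibling;
}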

> cpu nodes in cluster mask. So we are just aligning to it on DT platforms
> here. So if you are saying that is an issue, please fix that for ACPI too.
> 

[...]

>>> So unless someone gives me non-scheduler and topology specific reasons
>>> to change that, sorry but my opinion on this matter is not going to change ;).
>>
>> `lstopo` is fine with a now correct /sys/.../topology/package_cpus (or
>> core_siblings (old filename). It's not reading
>> /sys/.../topology/cluster_cpus (yet) so why set it (wrongly) to 0x39 for
>> CPU0 on Juno when it can stay 0x01?
>>
> 
> On ACPI ? If so, it could be the package ID in the ACPI table which can be
> invalid and kernel use the table offset as ID. It is not ideal but doesn't
> affect the masks. The current value 0 or 1 on Juno is cluster ID and they
> contribute to wrong package cpu masks.
> 
> 
> And yes lstopo doesn't read cluster IDs. But we expose them in ACPI system
> and not on DT which was my main point.

Understood. But a Kunpeng920 `cluster_cpus_list` file would contain
logically different information from a Juno `cluster_cpus_list` file.

Kunpeng920 `cluster_cpus_list` contains 4 CPUs sharing the L3-tag (less
than the LLC) whereas Juno `cluster_cpus_list` contains 2 or 4 CPUs
(which is the LLC).
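
Purely for illustration, a trivial user-space probe of the file in question
would look something like this (the sysfs path is the one the topology code
exports; the C around it is a throwaway example):

#include <stdio.h>

int main(void)
{
        char buf[64];
        FILE *f = fopen("/sys/devices/system/cpu/cpu0/topology/cluster_cpus_list", "r");

        if (f && fgets(buf, sizeof(buf), f))
                printf("cpu0 cluster_cpus_list: %s", buf);
        if (f)
                fclose(f);
        return 0;
}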

I think the key is to agree on what a CLUSTER actually represents,
especially in the case when `the level of topology above CPUs` is
congruent with LLC boundaries. My feeling is that people didn't pay
attention to this detail when they introduced CONFIG_SCHED_CLUSTER. A
related issue is the Ampere Altra hack in cpu_coregroup_mask().
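
For reference, that hack sits in selection logic of roughly this shape (a
paraphrased sketch, not a verbatim copy of the arm64/arch_topology code):

#include <linux/arch_topology.h>
#include <linux/cpumask.h>
#include <linux/topology.h>

/* Paraphrased sketch of the cpu_coregroup_mask() selection logic. */
const struct cpumask *cpu_coregroup_mask(int cpu)
{
        const struct cpumask *core_mask = cpumask_of_node(cpu_to_node(cpu));

        /* Prefer the package siblings if they fit inside the NUMA node. */
        if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask))
                core_mask = &cpu_topology[cpu].core_sibling;

        /* Shrink further to the LLC siblings (populated from cacheinfo). */
        if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
                core_mask = &cpu_topology[cpu].llc_sibling;

        /*
         * Ampere Altra workaround: never let the core group become smaller
         * than the cluster group, otherwise the CLS/MC hierarchy breaks.
         */
        if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
            cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
                core_mask = &cpu_topology[cpu].cluster_sibling;

        return core_mask;
}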

> As pointed out earlier, have you checked ACPI on Juno and with 
> CONFIG_SCHED_CLUSTER ? If the behaviour with my series on DT and ACPI
> differs, then it is an issue. But AFAIU, it doesn't and that is my main
> argument. You are just assuming what we have on Juno with DT is correct
> which may be w.r.t to scheduler but definitely not with respect to the
> hardware topology exposed to the users. So my aim is to get that fixed.

I have never run Juno w/ ACPI. Are you saying that
find_acpi_cpu_topology_cluster() returns cpu_topology[cpu].cluster_id's
which match the `1. level cluster nodes`?

The function header of find_acpi_cpu_topology_cluster() says that `...
the cluster, if present is the level of topology above CPUs. ...`.

From this perspective I can see your point. But this is then still
clearly poorly designed. How would we ever support CONFIG_SCHED_CLUSTER
in DT when it really (potentially) would bring a benefit (i.e. in the
Armv9 L2-complex case) and not only create trouble for the task
scheduler to set up its domains correctly? Also, in case we stick to
setting CONFIG_SCHED_CLUSTER=1 by default, CLS should be the default LLC
sched domain and MC the exceptional one, just to respect the way the
task scheduler removes non-useful domains today.

> If you are not happy with that, then how can be be happy with what is the
> current behaviour on ACPI + and - CONFIG_SCHED_CLUSTER. I haven't got
> your opinion yet on that matter.

I guess it's clear now that ACPI + CONFIG_SCHED_CLUSTER, with `the level
of topology above CPUs` being congruent with the LLC, creates trouble for
the scheduler. So I don't see why we should replicate this for DT. Let's
discuss further tomorrow in person.

[...]





^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-16 16:02                                             ` Dietmar Eggemann
  (?)
@ 2022-06-17 11:16                                               ` Sudeep Holla
  -1 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-17 11:16 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On Thu, Jun 16, 2022 at 05:02:28PM +0100, Dietmar Eggemann wrote:
> On 13/06/2022 12:17, Sudeep Holla wrote:
> > On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
> >> On 10/06/2022 12:27, Sudeep Holla wrote:
> >>> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> >>>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> 
> [...]
> 
> >>> Again I completely disagree. Let us look at the problems separately.
> >>> The hardware topology that some of the tools like lscpu and lstopo expects
> >>> what the hardware looks like and not the scheduler's view of the hardware.
> >>> So the topology masks that gets exposed to the user-space needs fixing
> >>> even today. I have reports from various tooling people about the same.
> >>> E.g. Juno getting exposed as dual socket system is utter non-sense.
> >>>
> >>> Yes scheduler uses most of the topology masks as is but that is not a must.
> >>> There are these *group_mask functions that can implement what scheduler
> >>> needs to be fed.
> >>>
> >>> I am not sure why the 2 issues are getting mixed up and that is the main
> >>> reason why I jumped into this to make sure the topology masks are
> >>> not tampered based on the way it needs to be used for scheduler.
> >>
> >> I'm all in favor of not mixing up those 2 issues. But I don't understand
> >> why you have to glue them together.
> >>
> > 
> > What are you referring as 'glue them together'. As I said this series just
> > address the hardware topology and if there is any impact on sched domains
> > then it is do with alignment with ACPI and DT platform behaviour. I am not
> > adding anything more to glue topology and info needed for sched domains.
> 
> You can fix (1) without (2) parsing 1. level cluster nodes as
> cluster_siblings.
>

Technically yes, but I see no point in delaying it, as it has been broken
from the moment ACPI started exposing the correct value while DT ended up
exposing an incorrect one. I am referring to the same change that introduced
SCHED_CLUSTER. The damage is done and it needs repairing ASAP.

> > Indeed. But I don't get what you mean by 2 level here. ACPI puts 1st level
> 
> cpu_map {
>   socket0 {
>     cluster0 {    <-- 1. level cluster
>       cluster0 {  <-- 2. level cluster (3 -->)

Oh, I had misunderstood this level nomenclature; I refer to it as the leaf
cluster node, which is the 2nd-level cluster in this DT snippet.

>         core0 {
> 
>         };
>         core1 {
> 
>         };
>       cluster1 {
>   ...
> 
> Armv9 L2 complexes: e.g. QC SM8450:
> 
>       .---------------.
> CPU   |0 1 2 3 4 5 6 7|
>       +---------------+
> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>       +---------------+
>   L2  |   |   | | | | | <-- 2. level cluster, Armv9 L2 complexes (<-- 3)

Again, before I assume: what exactly do `<-- 3` here and `(3 -->)` in the above snippet mean?

>       +---------------+
>   L3  |<--         -->|
>       +---------------+
>       |<-- cluster -->|

I think this is the 2nd-level cluster, or rather the only cluster in this
system w.r.t. the hardware. So let's not confuse things with multiple levels
if not necessary.

>       +---------------+
>       |<--   DSU   -->|
>       '---------------'
>
> Only if we map (i) into cluster_sibling, we get the same hardware
> representation (for the task scheduler) for ACPI (4) and DT (5) systems.
>

What is (i) above?

> (4) examples:
>
> Kunpeng920 - 24 CPUs sharing LLC (cpu_coregroup_mask()), 4 CPUs sharing
> L3-tag (cpu_clustergroup_mask()).
>

Again, decouple the cache info and the cluster info from the h/w; you have
all the info. You can couple them together if that helps when you feed the
sched_domains.
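
For the record, both kinds of info already sit side by side in the per-CPU
topology record. A condensed sketch, abbreviated from
include/linux/arch_topology.h (no new fields implied):

#include <linux/cpumask.h>

struct cpu_topology {
        int thread_id;                  /* CPU topology: DT /cpu-map or ACPI PPTT */
        int core_id;
        int cluster_id;
        int package_id;
        cpumask_t thread_sibling;
        cpumask_t core_sibling;
        cpumask_t cluster_sibling;
        cpumask_t llc_sibling;          /* cache side: CPUs sharing the LLC */
};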

> X86 Jacobsville - 24 CPUs sharing LLC (L3), 4 CPUs sharing L2
>
> Armv9 L2 complexes: e.g. QC SM8450 - 8 CPUs sharing LLC (L3), (for A510
> (little CPUs)) 2 CPUs sharing L2

[...]

> > And yes lstopo doesn't read cluster IDs. But we expose them in ACPI system
> > and not on DT which was my main point.
> 
> Understood. But a Kunpeng920 `cluster_cpus_list` file would contain
> logically different information than a Juno `cluster_cpus_list` file.
>

And why is that?

> Kunpeng920 `cluster_cpus_list` contain 4 CPUs sharing L3-tag (less than
> LLC) whereas Juno cluster_cpus_list contain 2 or 4 CPUs (which is LLC).
>

Correct, because that is how the hardware clusters are designed on those
SoCs. The cache topology is different.

> I think key is to agree what a CLUSTER actually represent and especially
> in case when `the level of topology above CPUs` is congruent with LLC
> boundaries. Because my feeling is that people didn't pay attention to
> this detail when they introduced SCHED_CONFIG_CLUSTER. A related issue
> is the Ampere Altra hack in cpu_coregroup_mask().
>

That is defined by the hardware, and DT/ACPI have bindings for it. We can't
redefine CLUSTER in a way that breaks those definitions. Again, the cluster is
part of the CPU topology, and the cache topology can be different; it need not
be congruent with the CPU topology in some ways. Of course they are congruent
in other ways, but there is no single line of alignment across SoCs, which is
quite evident from the examples you have listed. Just add Ampere Altra to make
it more fun.

> > As pointed out earlier, have you checked ACPI on Juno and with 
> > CONFIG_SCHED_CLUSTER ? If the behaviour with my series on DT and ACPI
> > differs, then it is an issue. But AFAIU, it doesn't and that is my main
> > argument. You are just assuming what we have on Juno with DT is correct
> > which may be w.r.t to scheduler but definitely not with respect to the
> > hardware topology exposed to the users. So my aim is to get that fixed.
> 
> I never run Juno w/ ACPI. Are you saying that
> find_acpi_cpu_topology_cluster() returns cpu_topology[cpu].cluster_id's
> which match the `1. level cluster nodes`?
>

Again I am totally confused as to why this is now a 1st-level cluster whereas
above the SDM was a 2nd-level cluster. Both SoCs have only 1 level of cluster.
While the SDM has 1 DSU cluster, Juno has 2 clusters.

> The function header of find_acpi_cpu_topology_cluster() says that `...
> the cluster, if present is the level of topology above CPUs. ...`.
>

Exactly and that's how sysfs is also defined and we can't go back and change
that now. I don't see any issue TBH.

> From this perspective I can see your point. But this is then still
> clearly poorly designed.

Not really as per the definition.

> How would we ever support CONFIG_SCHED_CLUSTER
> in DT when it really (potentially) would bring a benefit (i.e. in the
> Armv9 L2-complex case) and not only create trouble for the task
> scheduler to setup its domains correctly?

Indeed, that is the next problem once we get all these aligned across
DT and ACPI. They have diverged, and I prefer not to allow that anymore
by adding more divergence, e.g. to support the Armv9 L2-complex case.
Please consider that as something on top of this series; I am not
addressing it at the moment. In fact I am not addressing any sched_domain
topics or issues. I have made that clear 😉.

> Also in case we stick to
> setting CONFIG_SCHED_CLUSTER=1 by default, CLS should be the default LLC
> sched domain and MC the exceptional one. Just to respect the way how the
> task scheduler removes not useful domains today.
>

Fix cpu_clustergroup_mask or any other cpu_*group_mask as per your taste.
The topology masks are just inputs to these and will not be changed or made
to diverge for these reasons. Sorry if that is not helpful, but that is the
reality with sysfs exposed to user-space.

> > If you are not happy with that, then how can be be happy with what is the
> > current behaviour on ACPI + and - CONFIG_SCHED_CLUSTER. I haven't got
> > your opinion yet on that matter.
> 
> I guess it's clear now that ACPI + CONFIG_SCHED_CLUSTER with ``the level
> of topology above CPUs` is congruent with LLC` creates trouble to the
> scheduler. So I don't see why we should replicate this for DT. Let's
> discuss further tomorrow in person.

I see it differently. If that creates trouble, fix it and you will not have
any issues with DT either.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
@ 2022-06-20 13:27                                                 ` Dietmar Eggemann
  0 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-20 13:27 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 17/06/2022 13:16, Sudeep Holla wrote:
> On Thu, Jun 16, 2022 at 05:02:28PM +0100, Dietmar Eggemann wrote:
>> On 13/06/2022 12:17, Sudeep Holla wrote:
>>> On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
>>>> On 10/06/2022 12:27, Sudeep Holla wrote:
>>>>> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
>>>>>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:

[...]

>>> What are you referring as 'glue them together'. As I said this series just
>>> address the hardware topology and if there is any impact on sched domains
>>> then it is do with alignment with ACPI and DT platform behaviour. I am not
>>> adding anything more to glue topology and info needed for sched domains.
>>
>> You can fix (1) without (2) parsing 1. level cluster nodes as
>> cluster_siblings.
>>
> 
> Technically yes, but I see no point in delaying it as it is considered as
> broken with respect to the moment ACPI exposed the correct value and at the
> same time resulted in exposing incorrect value in case of DT. I am referring
> to the same change that introduced SCHED_CLUSTER. The damage is done and it
> needs repairing ASAP.

OK, then let's have `/sys/.../topology/cluster_cpus` refer to pure
topology-based cluster information. This can be DT cluster-node
information or ACPI L3-tag information, e.g. for KP920.

>>> Indeed. But I don't get what you mean by 2 level here. ACPI puts 1st level
>>
>> cpu_map {
>>   socket0 {
>>     cluster0 {    <-- 1. level cluster
>>       cluster0 {  <-- 2. level cluster (3 -->)
> 
> Oh I had misunderstood this level nomenclature, I refer it as leaf cluster
> node which is 2. level cluster in this DT snippet.
> 
>>         core0 {
>>
>>         };
>>         core1 {
>>
>>         };
>>       cluster1 {
>>   ...
>>
>> Armv9 L2 complexes: e.g. QC SM8450:
>>
>>       .---------------.
>> CPU   |0 1 2 3 4 5 6 7|
>>       +---------------+
>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>       +---------------+
>>   L2  |   |   | | | | | <-- 2. level cluster, Armv9 L2 complexes (<-- 3)
> 
> Again before I assume, what exactly <--3 here and in above snippet mean ?

I wanted to show that we could encode `2. level cluster` as `Armv9 L2
complexes`. But since we agreed (after the email was sent) not to
support `nested cluster-nodes`, this idea does not hold anymore.

>>       +---------------+
>>   L3  |<--         -->|
>>       +---------------+
>>       |<-- cluster -->|
> 
> I think this is 2 level cluster or only cluster in this system w.r.t hardware.
> So lets not confuse with multi-level if not necessary.

No need, we said no `nested cluster-node` support in DT.

>>       +---------------+
>>       |<--   DSU   -->|
>>       '---------------'
>>
>> Only if we map (i) into cluster_sibling, we get the same hardware
>> representation (for the task scheduler) for ACPI (4) and DT (5) systems.
>>
> 
> What is (i) above ?

Sorry, (i) was meant to be `3 -->`.

>> (4) examples:
>>
>> Kunpeng920 - 24 CPUs sharing LLC (cpu_coregroup_mask()), 4 CPUs sharing
>> L3-tag (cpu_clustergroup_mask()).
>>
> 
> Again decouple cache info and cluster info from h/w, you have all the info.
> You can couple them together if that helps when you feed sched_domains.

OK, this is what we finally agreed.

>> X86 Jacobsville - 24 CPUs sharing LLC (L3), 4 CPUs sharing L2
>>
>> Armv9 L2 complexes: e.g. QC SM8450 - 8 CPUs sharing LLC (L3), (for A510
>> (little CPUs)) 2 CPUs sharing L2
> 
> [...]
> 
>>> And yes lstopo doesn't read cluster IDs. But we expose them in ACPI system
>>> and not on DT which was my main point.

OK, no further discussion needed here. `/sys/.../topology/cluster_cpus`
shows L2 (LLC) on Juno, L3-tag an KP920 or L2 on X86 Jacobsville. The
cpu_xxx_mask()s of (e.g.) default_topology[] have to make sure that the
scheduler sees the correct information (the hierarchy of cpumasks).

>> Understood. But a Kunpeng920 `cluster_cpus_list` file would contain
>> logically different information than a Juno `cluster_cpus_list` file.
>>
> And why is that ?

If we see it from the angle of the definition of SCHED_CONFIG_CLUSTER
(`... the level of topology above CPUs ...` then I can see that both
definitions are the same. (CPUS should be rather `cores` here, I guess?).

[...]

>>> As pointed out earlier, have you checked ACPI on Juno and with 
>>> CONFIG_SCHED_CLUSTER ? If the behaviour with my series on DT and ACPI
>>> differs, then it is an issue. But AFAIU, it doesn't and that is my main
>>> argument. You are just assuming what we have on Juno with DT is correct
>>> which may be w.r.t to scheduler but definitely not with respect to the
>>> hardware topology exposed to the users. So my aim is to get that fixed.
>>
>> I never run Juno w/ ACPI. Are you saying that
>> find_acpi_cpu_topology_cluster() returns cpu_topology[cpu].cluster_id's
>> which match the `1. level cluster nodes`?
>>
> 
> Again I am totally confused as why this is now 1.level cluster where as above
> SDM was 2.level cluster. Both SoCs have only 1 level of cluster. While SDM
> has 1 DSU cluster, Juno has 2 clusters.

No need in agreeing what level (could) stand(s) here for. We said `no
nested cluster-node`.

>> The function header of find_acpi_cpu_topology_cluster() says that `...
>> the cluster, if present is the level of topology above CPUs. ...`.
>>
> 
> Exactly and that's how sysfs is also defined and we can't go back and change
> that now. I don't see any issue TBH.

OK.

>> From this perspective I can see your point. But this is then still
>> clearly poorly designed.
> 
> Not really as per the definition.

Not from the viewpoint of topology and cacheinfo. But from the scheduler
and the whole thing gets activated by a scheduler CONFIG option.

>> How would we ever support CONFIG_SCHED_CLUSTER
>> in DT when it really (potentially) would bring a benefit (i.e. in the
>> Armv9 L2-complex case) and not only create trouble for the task
>> scheduler to setup its domains correctly?
> 
> Indeed, that is the next problem once we get all these aligned across
> DT and ACPI. They have diverged and I prefer not to allow that anymore
> by adding more divergence e.g. to support Armv9 L2-complex case. Please
> consider that on top of these, I am not addressing that at the moment.
> In fact I am not addressing any sched_domain topics or issues. I have made
> that clear 😉.

If I recall the content of our discussion correctly, `Armv9 L2
complexes` support could come from L2 (probably `LLC - 1` ?) detection
from cacheinfo. But this is not part of this patch-set.

>> Also in case we stick to
>> setting CONFIG_SCHED_CLUSTER=1 by default, CLS should be the default LLC
>> sched domain and MC the exceptional one. Just to respect the way how the
>> task scheduler removes not useful domains today.
> 
> Fix the cpu_clustergroup_mask or any other cpu_*group_mask as per your
> taste. The topology masks are just inputs to these and will not be changed
> or diverged for these reasons. Sorry if that is not helpful, but that is the
> reality with sysfs exposed to the user-space.

Agreed. We don't have to rely on the task scheduler to build the right
sched domain hierarchy out of every cpu_xxx_mask() functions. We can
tweak cpu_xxx_mask() to get this done.

>>> If you are not happy with that, then how can be be happy with what is the
>>> current behaviour on ACPI + and - CONFIG_SCHED_CLUSTER. I haven't got
>>> your opinion yet on that matter.
>>
>> I guess it's clear now that ACPI + CONFIG_SCHED_CLUSTER with ``the level
>> of topology above CPUs` is congruent with LLC` creates trouble to the
>> scheduler. So I don't see why we should replicate this for DT. Let's
>> discuss further tomorrow in person.
> 
> I see it differently. If that creates a trouble, fix that and you will not
> have any issues with DT too.

OK, I think we (arm64) found a way to support a default
CONFIG_SCHED_CLUSTER and fixing the `Juno lstopo` issue with a possible
way to include `Armv9 L2 complexes` support via cacheinfo later.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
@ 2022-06-20 13:27                                                 ` Dietmar Eggemann
  0 siblings, 0 replies; 153+ messages in thread
From: Dietmar Eggemann @ 2022-06-20 13:27 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Qing Wang, linux-arm-kernel, linux-riscv,
	Rob Herring

On 17/06/2022 13:16, Sudeep Holla wrote:
> On Thu, Jun 16, 2022 at 05:02:28PM +0100, Dietmar Eggemann wrote:
>> On 13/06/2022 12:17, Sudeep Holla wrote:
>>> On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
>>>> On 10/06/2022 12:27, Sudeep Holla wrote:
>>>>> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
>>>>>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:

[...]

>>> What are you referring as 'glue them together'. As I said this series just
>>> address the hardware topology and if there is any impact on sched domains
>>> then it is do with alignment with ACPI and DT platform behaviour. I am not
>>> adding anything more to glue topology and info needed for sched domains.
>>
>> You can fix (1) without (2) parsing 1. level cluster nodes as
>> cluster_siblings.
>>
> 
> Technically yes, but I see no point in delaying it as it is considered as
> broken with respect to the moment ACPI exposed the correct value and at the
> same time resulted in exposing incorrect value in case of DT. I am referring
> to the same change that introduced SCHED_CLUSTER. The damage is done and it
> needs repairing ASAP.

OK, then lets `/sys/.../topology/cluster_cpus` refer to pure
topology-based cluster information. This can be DT cluster-node
information or ACPI L3-tag information e.g. for KP920.

>>> Indeed. But I don't get what you mean by 2 level here. ACPI puts 1st level
>>
>> cpu_map {
>>   socket0 {
>>     cluster0 {    <-- 1. level cluster
>>       cluster0 {  <-- 2. level cluster (3 -->)
> 
> Oh I had misunderstood this level nomenclature, I refer it as leaf cluster
> node which is 2. level cluster in this DT snippet.
> 
>>         core0 {
>>
>>         };
>>         core1 {
>>
>>         };
>>       cluster1 {
>>   ...
>>
>> Armv9 L2 complexes: e.g. QC SM8450:
>>
>>       .---------------.
>> CPU   |0 1 2 3 4 5 6 7|
>>       +---------------+
>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>       +---------------+
>>   L2  |   |   | | | | | <-- 2. level cluster, Armv9 L2 complexes (<-- 3)
> 
> Again before I assume, what exactly <--3 here and in above snippet mean ?

I wanted to show that we could encode `2. level cluster` as `Armv9 L2
complexes`. But since we agreed (after the email was sent) not to
support `nested cluster-nodes`, this idea does not hold anymore.

>>       +---------------+
>>   L3  |<--         -->|
>>       +---------------+
>>       |<-- cluster -->|
> 
> I think this is 2 level cluster or only cluster in this system w.r.t hardware.
> So lets not confuse with multi-level if not necessary.

No need, we said no `nested cluster-node` support in DT.

>>       +---------------+
>>       |<--   DSU   -->|
>>       '---------------'
>>
>> Only if we map (i) into cluster_sibling, we get the same hardware
>> representation (for the task scheduler) for ACPI (4) and DT (5) systems.
>>
> 
> What is (i) above ?

Sorry, (i) was meant to be `3 -->`.

>> (4) examples:
>>
>> Kunpeng920 - 24 CPUs sharing LLC (cpu_coregroup_mask()), 4 CPUs sharing
>> L3-tag (cpu_clustergroup_mask()).
>>
> 
> Again decouple cache info and cluster info from h/w, you have all the info.
> You can couple them together if that helps when you feed sched_domains.

OK, this is what we finally agreed.

>> X86 Jacobsville - 24 CPUs sharing LLC (L3), 4 CPUs sharing L2
>>
>> Armv9 L2 complexes: e.g. QC SM8450 - 8 CPUs sharing LLC (L3), (for A510
>> (little CPUs)) 2 CPUs sharing L2
> 
> [...]
> 
>>> And yes lstopo doesn't read cluster IDs. But we expose them in ACPI system
>>> and not on DT which was my main point.

OK, no further discussion needed here. `/sys/.../topology/cluster_cpus`
shows L2 (LLC) on Juno, L3-tag an KP920 or L2 on X86 Jacobsville. The
cpu_xxx_mask()s of (e.g.) default_topology[] have to make sure that the
scheduler sees the correct information (the hierarchy of cpumasks).

>> Understood. But a Kunpeng920 `cluster_cpus_list` file would contain
>> logically different information than a Juno `cluster_cpus_list` file.
>>
> And why is that ?

If we see it from the angle of the definition of SCHED_CONFIG_CLUSTER
(`... the level of topology above CPUs ...` then I can see that both
definitions are the same. (CPUS should be rather `cores` here, I guess?).

[...]

>>> As pointed out earlier, have you checked ACPI on Juno and with 
>>> CONFIG_SCHED_CLUSTER ? If the behaviour with my series on DT and ACPI
>>> differs, then it is an issue. But AFAIU, it doesn't and that is my main
>>> argument. You are just assuming what we have on Juno with DT is correct
>>> which may be w.r.t to scheduler but definitely not with respect to the
>>> hardware topology exposed to the users. So my aim is to get that fixed.
>>
>> I never run Juno w/ ACPI. Are you saying that
>> find_acpi_cpu_topology_cluster() returns cpu_topology[cpu].cluster_id's
>> which match the `1. level cluster nodes`?
>>
> 
> Again I am totally confused as why this is now 1.level cluster where as above
> SDM was 2.level cluster. Both SoCs have only 1 level of cluster. While SDM
> has 1 DSU cluster, Juno has 2 clusters.

No need in agreeing what level (could) stand(s) here for. We said `no
nested cluster-node`.

>> The function header of find_acpi_cpu_topology_cluster() says that `...
>> the cluster, if present is the level of topology above CPUs. ...`.
>>
> 
> Exactly and that's how sysfs is also defined and we can't go back and change
> that now. I don't see any issue TBH.

OK.

>> From this perspective I can see your point. But this is then still
>> clearly poorly designed.
> 
> Not really as per the definition.

Not from the viewpoint of topology and cacheinfo. But from the scheduler
and the whole thing gets activated by a scheduler CONFIG option.

>> How would we ever support CONFIG_SCHED_CLUSTER
>> in DT when it really (potentially) would bring a benefit (i.e. in the
>> Armv9 L2-complex case) and not only create trouble for the task
>> scheduler to setup its domains correctly?
> 
> Indeed, that is the next problem once we get all these aligned across
> DT and ACPI. They have diverged and I prefer not to allow that anymore
> by adding more divergence e.g. to support Armv9 L2-complex case. Please
> consider that on top of these, I am not addressing that at the moment.
> In fact I am not addressing any sched_domain topics or issues. I have made
> that clear 😉.

If I recall the content of our discussion correctly, `Armv9 L2
complexes` support could come from L2 (probably `LLC - 1` ?) detection
from cacheinfo. But this is not part of this patch-set.

>> Also in case we stick to
>> setting CONFIG_SCHED_CLUSTER=1 by default, CLS should be the default LLC
>> sched domain and MC the exceptional one. Just to respect the way how the
>> task scheduler removes not useful domains today.
> 
> Fix the cpu_clustergroup_mask or any other cpu_*group_mask as per your
> taste. The topology masks are just inputs to these and will not be changed
> or diverged for these reasons. Sorry if that is not helpful, but that is the
> reality with sysfs exposed to the user-space.

Agreed. We don't have to rely on the task scheduler to build the right
sched domain hierarchy out of every cpu_xxx_mask() functions. We can
tweak cpu_xxx_mask() to get this done.

>>> If you are not happy with that, then how can be be happy with what is the
>>> current behaviour on ACPI + and - CONFIG_SCHED_CLUSTER. I haven't got
>>> your opinion yet on that matter.
>>
>> I guess it's clear now that ACPI + CONFIG_SCHED_CLUSTER with ``the level
>> of topology above CPUs` is congruent with LLC` creates trouble to the
>> scheduler. So I don't see why we should replicate this for DT. Let's
>> discuss further tomorrow in person.
> 
> I see it differently. If that creates a trouble, fix that and you will not
> have any issues with DT too.

OK, I think we (arm64) found a way to support a default
CONFIG_SCHED_CLUSTER and fixing the `Juno lstopo` issue with a possible
way to include `Armv9 L2 complexes` support via cacheinfo later.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-06-20 13:27                                                 ` Dietmar Eggemann
@ 2022-06-21 16:00                                                   ` Sudeep Holla
  0 siblings, 0 replies; 153+ messages in thread
From: Sudeep Holla @ 2022-06-21 16:00 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Vincent Guittot, linux-kernel, Atish Patra, Atish Patra,
	Morten Rasmussen, Sudeep Holla, Qing Wang, linux-arm-kernel,
	linux-riscv, Rob Herring

On Mon, Jun 20, 2022 at 03:27:33PM +0200, Dietmar Eggemann wrote:
> On 17/06/2022 13:16, Sudeep Holla wrote:
> > On Thu, Jun 16, 2022 at 05:02:28PM +0100, Dietmar Eggemann wrote:
> >> On 13/06/2022 12:17, Sudeep Holla wrote:
> >>> On Mon, Jun 13, 2022 at 11:19:36AM +0200, Dietmar Eggemann wrote:
> >>>> On 10/06/2022 12:27, Sudeep Holla wrote:
> >>>>> On Fri, Jun 10, 2022 at 12:08:44PM +0200, Vincent Guittot wrote:
> >>>>>> On Mon, 6 Jun 2022 at 12:22, Sudeep Holla <sudeep.holla@arm.com> wrote:
> 
> [...]
> 
> >>> What are you referring to as 'glue them together'? As I said, this series just
> >>> addresses the hardware topology, and if there is any impact on sched domains
> >>> then it is to do with aligning the ACPI and DT platform behaviour. I am not
> >>> adding anything more to glue topology and the info needed for sched domains.
> >>
> >> You can fix (1) without (2) parsing 1. level cluster nodes as
> >> cluster_siblings.
> >>
> > 
> > Technically yes, but I see no point in delaying it, as it has been broken
> > since the moment ACPI started exposing the correct value while DT ended up
> > exposing an incorrect one. I am referring
> > to the same change that introduced SCHED_CLUSTER. The damage is done and it
> > needs repairing ASAP.
> 
> OK, then lets `/sys/.../topology/cluster_cpus` refer to pure
> topology-based cluster information. This can be DT cluster-node
> information or ACPI L3-tag information e.g. for KP920.
>

Agreed. All the information under /sys/.../topology/ is the hardware view
and not the scheduler's view of the hardware for the purpose of building
sched domains.

> >>> Indeed. But I don't get what you mean by 2 level here. ACPI puts 1st level
> >>
> >> cpu_map {
> >>   socket0 {
> >>     cluster0 {    <-- 1. level cluster
> >>       cluster0 {  <-- 2. level cluster (3 -->)
> > 
> > Oh I had misunderstood this level nomenclature, I refer it as leaf cluster
> > node which is 2. level cluster in this DT snippet.
> > 
> >>         core0 {
> >>
> >>         };
> >>         core1 {
> >>
> >>         };
> >>       cluster1 {
> >>   ...
> >>
> >> Armv9 L2 complexes: e.g. QC SM8450:
> >>
> >>       .---------------.
> >> CPU   |0 1 2 3 4 5 6 7|
> >>       +---------------+
> >> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
> >>       +---------------+
> >>   L2  |   |   | | | | | <-- 2. level cluster, Armv9 L2 complexes (<-- 3)
> > 
> > Again before I assume, what exactly <--3 here and in above snippet mean ?
> 
> I wanted to show that we could encode `2. level cluster` as `Armv9 L2
> complexes`. But since we agreed (after the email was sent) not to
> support `nested cluster-nodes`, this idea does not hold anymore.
>

Yes, I plan to emit a warning if we encounter nested clusters; that will be
part of the next version (https://git.kernel.org/sudeep.holla/c/94fae12d64bb).
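
Purely as an illustration (the helper name below is made up and the queued
commit may look quite different), such a check could be as simple as flagging
a cluster node that itself contains a first-level "cluster0" child while
walking /cpu-map:

#include <linux/of.h>
#include <linux/printk.h>

/*
 * Hypothetical sketch: warn once if a cluster node in /cpu-map contains
 * another cluster node, since nested clusters are not supported.
 */
static void warn_on_nested_cluster(struct device_node *cluster)
{
	struct device_node *child;

	child = of_get_child_by_name(cluster, "cluster0");
	if (child) {
		pr_warn_once("%pOF: nested cluster nodes are not supported\n",
			     child);
		of_node_put(child);
	}
}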

> >>       +---------------+
> >>   L3  |<--         -->|
> >>       +---------------+
> >>       |<-- cluster -->|
> > 
> > I think this is 2 level cluster or only cluster in this system w.r.t hardware.
> > So lets not confuse with multi-level if not necessary.
>
> No need, we said no `nested cluster-node` support in DT.
>

👍

> >>       +---------------+
> >>       |<--   DSU   -->|
> >>       '---------------'
> >>
> >> Only if we map (i) into cluster_sibling, we get the same hardware
> >> representation (for the task scheduler) for ACPI (4) and DT (5) systems.
> >>
> > 
> > What is (i) above ?
> 
> Sorry, (i) was meant to be `3 -->`.
>

Ah ok

> >> (4) examples:
> >>
> >> Kunpeng920 - 24 CPUs sharing LLC (cpu_coregroup_mask()), 4 CPUs sharing
> >> L3-tag (cpu_clustergroup_mask()).
> >>
> > 
> > Again decouple cache info and cluster info from h/w, you have all the info.
> > You can couple them together if that helps when you feed sched_domains.
>
> OK, this is what we finally agreed.
>

👍

> >> X86 Jacobsville - 24 CPUs sharing LLC (L3), 4 CPUs sharing L2
> >>
> >> Armv9 L2 complexes: e.g. QC SM8450 - 8 CPUs sharing LLC (L3), (for A510
> >> (little CPUs)) 2 CPUs sharing L2
> >
> > [...]
> > 
> >>> And yes lstopo doesn't read cluster IDs. But we expose them in ACPI system
> >>> and not on DT which was my main point.
> 
> OK, no further discussion needed here. `/sys/.../topology/cluster_cpus`
> shows L2 (LLC) on Juno, L3-tag on KP920 or L2 on X86 Jacobsville. The
> cpu_xxx_mask()s of (e.g.) default_topology[] have to make sure that the
> scheduler sees the correct information (the hierarchy of cpumasks).
>

Correct
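
For reference, this is roughly the table being referred to, simplified from
kernel/sched/topology.c; the exact entries depend on the kernel version and
on which CONFIG_SCHED_* options are enabled, so treat it as a sketch rather
than the authoritative definition:

static struct sched_domain_topology_level default_topology[] = {
#ifdef CONFIG_SCHED_SMT
	{ cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
#endif
#ifdef CONFIG_SCHED_CLUSTER
	{ cpu_clustergroup_mask, cpu_cluster_flags, SD_INIT_NAME(CLS) },
#endif
#ifdef CONFIG_SCHED_MC
	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
#endif
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};

For CLS and MC to both survive, cpu_clustergroup_mask() has to return a span
that is strictly contained in the cpu_coregroup_mask() span; otherwise the
redundant level gets degenerated when the domains are built.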

> >> Understood. But a Kunpeng920 `cluster_cpus_list` file would contain
> >> logically different information than a Juno `cluster_cpus_list` file.
> >>
> > And why is that ?
> 
> If we see it from the angle of the definition of CONFIG_SCHED_CLUSTER
> (`... the level of topology above CPUs ...`) then I can see that both
> definitions are the same. (CPUs should rather be `cores` here, I guess?).
> 
> [...]
>

Yes, I have used cores and CPUs interchangeably in several places so far; I
will try to be more specific in the future.

> >>> As pointed out earlier, have you checked ACPI on Juno and with 
> >>> CONFIG_SCHED_CLUSTER ? If the behaviour with my series on DT and ACPI
> >>> differs, then it is an issue. But AFAIU, it doesn't and that is my main
> >>> argument. You are just assuming what we have on Juno with DT is correct
> >>> which may be w.r.t to scheduler but definitely not with respect to the
> >>> hardware topology exposed to the users. So my aim is to get that fixed.
> >>
> >> I never run Juno w/ ACPI. Are you saying that
> >> find_acpi_cpu_topology_cluster() returns cpu_topology[cpu].cluster_id's
> >> which match the `1. level cluster nodes`?
> >>
> >
> > Again I am totally confused as to why this is now 1. level cluster whereas above
> > SDM was 2. level cluster. Both SoCs have only 1 level of cluster. While SDM
> > has 1 DSU cluster, Juno has 2 clusters.
> 
> No need in agreeing what level (could) stand(s) here for. We said `no
> nested cluster-node`.
>

👍

> >> The function header of find_acpi_cpu_topology_cluster() says that `...
> >> the cluster, if present is the level of topology above CPUs. ...`.
> >>
> > 
> > Exactly and that's how sysfs is also defined and we can't go back and change
> > that now. I don't see any issue TBH.
> 
> OK.
> 
> >> From this perspective I can see your point. But this is then still
> >> clearly poorly designed.
> > 
> > Not really as per the definition.
> 
> Not from the viewpoint of topology and cacheinfo. But from the scheduler
> and the whole thing gets activated by a scheduler CONFIG option.
>

Agreed and I too think it must be enabled by default.

> >> How would we ever support CONFIG_SCHED_CLUSTER
> >> in DT when it really (potentially) would bring a benefit (i.e. in the
> >> Armv9 L2-complex case) and not only create trouble for the task
> >> scheduler to setup its domains correctly?
> > 
> > Indeed, that is the next problem once we get all these aligned across
> > DT and ACPI. They have diverged and I prefer not to allow that anymore
> > by adding more divergence e.g. to support Armv9 L2-complex case. Please
> > consider that on top of these, I am not addressing that at the moment.
> > In fact I am not addressing any sched_domain topics or issues. I have made
> > that clear 😉.
> 
> If I recall the content of our discussion correctly, `Armv9 L2
> complexes` support could come from L2 (probably `LLC - 1` ?) detection
> from cacheinfo. But this is not part of this patch-set.
>

Correct and thanks for making that clear here. I have intentionally not
mentioned it so far as I am not addressing anything specific to such a
topology.
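
Purely as a sketch of that possible future direction (nothing below is part of
this series, and the helper name is made up), an `Armv9 L2 complex` span could
be derived from cacheinfo roughly as follows, assuming detect_cache_attributes()
has already populated the cache leaves:

#include <linux/cacheinfo.h>
#include <linux/cpumask.h>

/*
 * Hypothetical helper: take the shared_cpu_map of the CPU's unified
 * level-2 cache leaf as its "L2 complex" span, falling back to the CPU
 * itself if no such leaf is found.
 */
static void build_l2_sibling_mask(unsigned int cpu, struct cpumask *mask)
{
	struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
	unsigned int i;

	cpumask_clear(mask);
	cpumask_set_cpu(cpu, mask);

	for (i = 0; i < ci->num_leaves; i++) {
		struct cacheinfo *leaf = &ci->info_list[i];

		if (leaf->level == 2 && leaf->type == CACHE_TYPE_UNIFIED)
			cpumask_copy(mask, &leaf->shared_cpu_map);
	}
}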

> >> Also in case we stick to
> >> setting CONFIG_SCHED_CLUSTER=1 by default, CLS should be the default LLC
> >> sched domain and MC the exceptional one. Just to respect the way how the
> >> task scheduler removes not useful domains today.
> > 
> > Fix the cpu_clustergroup_mask or any other cpu_*group_mask as per your
> > taste. The topology masks are just inputs to these and will not be changed
> > or diverged for these reasons. Sorry if that is not helpful, but that is the
> > reality with sysfs exposed to the user-space.
> 
> Agreed. We don't have to rely on the task scheduler to build the right
> sched domain hierarchy out of the cpu_xxx_mask() functions as they stand. We
> can tweak the cpu_xxx_mask() functions to get this done.
>

Correct. I am not saying it is as simple as that, but the point is to keep the
hardware topology separate from the info fed to the scheduler, i.e. the view
from the sched_domains perspective. The former changes with the hardware,
while the latter changes with the scheduler's view. Based on the logic it uses
to optimise, the scheduler could try different views of the same hardware, but
that is not where we want to be. We want to provide a generic view based on
all the hardware CPU and cache topology information. It may not always be
optimal, but we can try our best.

> >>> If you are not happy with that, then how can we be happy with what is the
> >>> current behaviour on ACPI + and - CONFIG_SCHED_CLUSTER? I haven't got
> >>> your opinion yet on that matter.
> >>
> >> I guess it's clear now that ACPI + CONFIG_SCHED_CLUSTER with `the level
> >> of topology above CPUs is congruent with LLC` creates trouble for the
> >> scheduler. So I don't see why we should replicate this for DT. Let's
> >> discuss further tomorrow in person.
> > 
> > I see it differently. If that creates a trouble, fix that and you will not
> > have any issues with DT too.
> 
> OK, I think we (arm64) found a way to support a default
> CONFIG_SCHED_CLUSTER and fix the `Juno lstopo` issue, with a possible
> way to include `Armv9 L2 complexes` support via cacheinfo later.

Thanks for all the discussion. I hope we won't have to revisit this topic in
such depth until another topology that breaks all our current assumptions
arrives in the wild, which may not be too long.

I remember talking with Morten a couple of years back and concluding that LLC
is sufficient for arm64, but that can now be questioned with the Armv9 L2
complexes.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 153+ messages in thread

end of thread, other threads:[~2022-06-21 16:01 UTC | newest]

Thread overview: 153+ messages
-- links below jump to the message on this page --
2022-05-25  8:14 [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
2022-05-25  8:14 ` Sudeep Holla
2022-05-25  8:14 ` Sudeep Holla
2022-05-25  8:14 ` [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node Sudeep Holla
2022-05-25  8:14   ` Sudeep Holla
2022-05-25  8:14   ` Sudeep Holla
2022-05-25  8:14   ` [PATCH v3 02/16] cacheinfo: Add helper to access any cache index for a given CPU Sudeep Holla
2022-05-25  8:14     ` Sudeep Holla
2022-05-25  8:14     ` Sudeep Holla
2022-05-25  8:14     ` [PATCH v3 03/16] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF Sudeep Holla
2022-05-25  8:14       ` Sudeep Holla
2022-05-25  8:14       ` Sudeep Holla
2022-05-25  8:14       ` [PATCH v3 04/16] cacheinfo: Add support to check if last level cache(LLC) is valid or shared Sudeep Holla
2022-05-25  8:14         ` Sudeep Holla
2022-05-25  8:14         ` Sudeep Holla
2022-05-25  8:14         ` [PATCH v3 05/16] cacheinfo: Allow early detection and population of cache attributes Sudeep Holla
2022-05-25  8:14           ` Sudeep Holla
2022-05-25  8:14           ` Sudeep Holla
2022-05-25  8:14           ` [PATCH v3 06/16] arch_topology: Add support to parse and detect " Sudeep Holla
2022-05-25  8:14             ` Sudeep Holla
2022-05-25  8:14             ` Sudeep Holla
2022-05-25  8:14             ` [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo Sudeep Holla
2022-05-25  8:14               ` Sudeep Holla
2022-05-25  8:14               ` Sudeep Holla
2022-05-25  8:14               ` [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in CPU topology Sudeep Holla
2022-05-25  8:14                 ` Sudeep Holla
2022-05-25  8:14                 ` Sudeep Holla
2022-05-25  8:14                 ` [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the " Sudeep Holla
2022-05-25  8:14                   ` Sudeep Holla
2022-05-25  8:14                   ` Sudeep Holla
2022-05-25  8:14                   ` [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster Sudeep Holla
2022-05-25  8:14                     ` Sudeep Holla
2022-05-25  8:14                     ` Sudeep Holla
2022-05-25  8:14                     ` [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity Sudeep Holla
2022-05-25  8:14                       ` Sudeep Holla
2022-05-25  8:14                       ` Sudeep Holla
2022-05-25  8:14                       ` [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Sudeep Holla
2022-05-25  8:14                         ` Sudeep Holla
2022-05-25  8:14                         ` Sudeep Holla
2022-05-25  8:14                         ` [PATCH v3 13/16] arch_topology: Don't set cluster identifier as physical package identifier Sudeep Holla
2022-05-25  8:14                           ` Sudeep Holla
2022-05-25  8:14                           ` Sudeep Holla
2022-05-25  8:14                           ` [PATCH v3 14/16] arch_topology: Drop unnecessary check for uninitialised package_id Sudeep Holla
2022-05-25  8:14                             ` Sudeep Holla
2022-05-25  8:14                             ` Sudeep Holla
2022-05-25  8:14                             ` [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map Sudeep Holla
2022-05-25  8:14                               ` Sudeep Holla
2022-05-25  8:14                               ` Sudeep Holla
2022-05-25  8:14                               ` [PATCH v3 16/16] arch_topology: Add support for parsing sockets in /cpu-map Sudeep Holla
2022-05-25  8:14                                 ` Sudeep Holla
2022-05-25  8:14                                 ` Sudeep Holla
2022-06-03 12:30                               ` [PATCH v3 15/16] arch_topology: Set cluster identifier in each core/thread from /cpu-map Dietmar Eggemann
2022-06-03 12:30                                 ` Dietmar Eggemann
2022-06-03 12:30                                 ` Dietmar Eggemann
2022-06-06 10:21                                 ` Sudeep Holla
2022-06-06 10:21                                   ` Sudeep Holla
2022-06-06 10:21                                   ` Sudeep Holla
2022-06-10 10:08                                   ` Vincent Guittot
2022-06-10 10:08                                     ` Vincent Guittot
2022-06-10 10:08                                     ` Vincent Guittot
2022-06-10 10:27                                     ` Sudeep Holla
2022-06-10 10:27                                       ` Sudeep Holla
2022-06-10 10:27                                       ` Sudeep Holla
2022-06-13  9:19                                       ` Dietmar Eggemann
2022-06-13  9:19                                         ` Dietmar Eggemann
2022-06-13  9:19                                         ` Dietmar Eggemann
2022-06-13 11:17                                         ` Sudeep Holla
2022-06-13 11:17                                           ` Sudeep Holla
2022-06-13 11:17                                           ` Sudeep Holla
2022-06-16 16:02                                           ` Dietmar Eggemann
2022-06-16 16:02                                             ` Dietmar Eggemann
2022-06-16 16:02                                             ` Dietmar Eggemann
2022-06-17 11:16                                             ` Sudeep Holla
2022-06-17 11:16                                               ` Sudeep Holla
2022-06-17 11:16                                               ` Sudeep Holla
2022-06-20 13:27                                               ` Dietmar Eggemann
2022-06-20 13:27                                                 ` Dietmar Eggemann
2022-06-20 13:27                                                 ` Dietmar Eggemann
2022-06-21 16:00                                                 ` Sudeep Holla
2022-06-21 16:00                                                   ` Sudeep Holla
2022-06-21 16:00                                                   ` Sudeep Holla
2022-06-14 17:59                                       ` Vincent Guittot
2022-06-14 17:59                                         ` Vincent Guittot
2022-06-14 17:59                                         ` Vincent Guittot
2022-06-15 17:00                                         ` Sudeep Holla
2022-06-15 17:00                                           ` Sudeep Holla
2022-06-15 17:00                                           ` Sudeep Holla
2022-06-15 22:44                                           ` Vincent Guittot
2022-06-15 22:44                                             ` Vincent Guittot
2022-06-15 22:44                                             ` Vincent Guittot
2022-06-15 22:45                                           ` Vincent Guittot
2022-06-15 22:45                                             ` Vincent Guittot
2022-06-15 22:45                                             ` Vincent Guittot
2022-06-01  3:40                         ` [PATCH v3 12/16] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Gavin Shan
2022-06-01  3:40                           ` Gavin Shan
2022-06-01  3:40                           ` Gavin Shan
2022-06-01  3:38                       ` [PATCH v3 11/16] arch_topology: Check for non-negative value rather than -1 for IDs validity Gavin Shan
2022-06-01  3:38                         ` Gavin Shan
2022-06-01  3:38                         ` Gavin Shan
2022-06-01  3:36                     ` [PATCH v3 10/16] arch_topology: Set thread sibling cpumask only within the cluster Gavin Shan
2022-06-01  3:36                       ` Gavin Shan
2022-06-01  3:36                       ` Gavin Shan
2022-06-01  3:35                   ` [PATCH v3 09/16] arch_topology: Drop LLC identifier stash from the CPU topology Gavin Shan
2022-06-01  3:35                     ` Gavin Shan
2022-06-01  3:35                     ` Gavin Shan
2022-06-01 12:06                     ` Sudeep Holla
2022-06-01 12:06                       ` Sudeep Holla
2022-06-01 12:06                       ` Sudeep Holla
2022-06-02  6:44                       ` Gavin Shan
2022-06-02  6:44                         ` Gavin Shan
2022-06-02  6:44                         ` Gavin Shan
2022-06-02  6:42                   ` Gavin Shan
2022-06-02  6:42                     ` Gavin Shan
2022-06-02  6:42                     ` Gavin Shan
2022-06-02  6:42                 ` [PATCH v3 08/16] arm64: topology: Remove redundant setting of llc_id in " Gavin Shan
2022-06-02  6:42                   ` Gavin Shan
2022-06-02  6:42                   ` Gavin Shan
2022-06-01  3:31               ` [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo Gavin Shan
2022-06-01  3:31                 ` Gavin Shan
2022-06-01  3:31                 ` Gavin Shan
2022-06-02 14:26               ` Dietmar Eggemann
2022-06-02 14:26                 ` Dietmar Eggemann
2022-06-02 14:26                 ` Dietmar Eggemann
2022-06-06  9:54                 ` Sudeep Holla
2022-06-06  9:54                   ` Sudeep Holla
2022-06-06  9:54                   ` Sudeep Holla
2022-06-01  3:29             ` [PATCH v3 06/16] arch_topology: Add support to parse and detect cache attributes Gavin Shan
2022-06-01  3:29               ` Gavin Shan
2022-06-01  3:29               ` Gavin Shan
2022-06-01  3:25           ` [PATCH v3 05/16] cacheinfo: Allow early detection and population of " Gavin Shan
2022-06-01  3:25             ` Gavin Shan
2022-06-01  3:25             ` Gavin Shan
2022-06-01  3:20         ` [PATCH v3 04/16] cacheinfo: Add support to check if last level cache(LLC) is valid or shared Gavin Shan
2022-06-01  3:20           ` Gavin Shan
2022-06-01  3:20           ` Gavin Shan
2022-06-01  2:51       ` [PATCH v3 03/16] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF Gavin Shan
2022-06-01  2:51         ` Gavin Shan
2022-06-01  2:51         ` Gavin Shan
2022-06-01  2:44     ` [PATCH v3 02/16] cacheinfo: Add helper to access any cache index for a given CPU Gavin Shan
2022-06-01  2:44       ` Gavin Shan
2022-06-01  2:44       ` Gavin Shan
2022-06-01 12:45       ` Sudeep Holla
2022-06-01 12:45         ` Sudeep Holla
2022-06-01 12:45         ` Sudeep Holla
2022-06-01  2:45   ` [PATCH v3 01/16] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node Gavin Shan
2022-06-01  2:45     ` Gavin Shan
2022-06-01  2:45     ` Gavin Shan
2022-06-01  3:49 ` [PATCH v3 00/16] arch_topology: Updates to add socket support and fix cluster ids Gavin Shan
2022-06-01  3:49   ` Gavin Shan
2022-06-01  3:49   ` Gavin Shan
2022-06-01 12:03   ` Sudeep Holla
2022-06-01 12:03     ` Sudeep Holla
2022-06-01 12:03     ` Sudeep Holla
