linux-kernel.vger.kernel.org archive mirror
* [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
@ 2022-07-04 10:15 Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 01/21] ACPI: PPTT: Use table offset as fw_token instead of virtual address Sudeep Holla
                   ` (21 more replies)
  0 siblings, 22 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

Hi Greg,

Let me know whether you prefer to apply the patches directly or would
rather have a pull request. The series has been in -next for a while now.

Hi All,

This version updates cacheinfo so that all the cache topology information
is populated there and used from there.

This series intends to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node. The current parsing also diverges
from the behaviour on an ACPI-enabled platform. The expectation is that both
DT and ACPI-enabled systems must present a consistent view of the CPU topology.

Currently we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong. The device tree bindings for CPU
topology support sockets to infer the socket or physical package identifier
for a given CPU. Also, we don't check if all the cores/threads belong to the
same cluster before updating their sibling masks, which is fine as we don't
set the cluster id yet.

These changes also assign the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map, without support for nesting of the clusters.
Finally, the series also adds support for socket nodes in /cpu-map. With this,
the parsing of the exact same information from the ACPI PPTT and the /cpu-map
DT node aligns well.
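
A minimal userspace sketch of the ID assignment described above, not the
kernel code. struct toy_socket, struct toy_cpu and toy_assign_ids() are
hypothetical names invented for this illustration. The only point is that
the package ID follows the socket and the cluster ID follows the cluster
within that socket, instead of a running cluster count being reused as the
package ID:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CLUSTERS 4

/* Hypothetical flat description of one socket: cores per cluster. */
struct toy_socket {
	int nr_clusters;
	int cores_in_cluster[TOY_MAX_CLUSTERS];
};

/* Hypothetical, trimmed-down per-CPU topology IDs. */
struct toy_cpu {
	int package_id;
	int cluster_id;
};

/*
 * Walk sockets -> clusters -> cores and fill in the IDs.
 * Returns the number of CPUs described, or -1 if out[] is too small.
 */
static int toy_assign_ids(const struct toy_socket *sockets, int nr_sockets,
			  struct toy_cpu *out, int max_cpus)
{
	int cpu = 0;

	assert(sockets != NULL && out != NULL);

	for (int s = 0; s < nr_sockets; s++) {
		for (int c = 0; c < sockets[s].nr_clusters; c++) {
			for (int k = 0; k < sockets[s].cores_in_cluster[c]; k++) {
				if (cpu >= max_cpus)
					return -1;
				out[cpu].package_id = s; /* socket index */
				out[cpu].cluster_id = c; /* index within socket */
				cpu++;
			}
		}
	}

	return cpu;
}
```

With two sockets, the first holding two clusters of two cores and the second
one cluster of four cores, CPUs 0-3 get package 0 and CPUs 4-7 get package 1,
while the cluster ID restarts at 0 in the second socket.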

The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree.


v5[5]->v6:
	- Handled the out-of-memory case in early detection correctly after
	  Conor reported boot failures on some RISC-V platform. Also
	  added a log message to report failure of early cacheinfo detection.
	- Added "Remove the unused find_acpi_cpu_cache_topology()" which
	  was missed earlier and posted separately
	- Added all the additional tags received

v4[4]->v5[5]:
	- Added all the tags received so far. Rafael has acked only the change
	  in ACPI and Catalin has acked only the change in arm64.
	- Addressed all the typos pointed out by Ionela, dropped the patch
	  removing the checks for invalid package id as discussed, and updated
	  the depth in the nested cluster warning check.

v3[3]->v4[4]:
	- Updated ACPI PPTT fw_token to use table offset instead of virtual
	  address as it could change every time the table is mapped before
	  the global acpi_permanent_mmap is set
	- Added warning for the topology with nested clusters
	- Added update to cpu_clustergroup_mask so that introduction of
	  correct cluster_id doesn't break existing platforms by limiting
	  the span of clustergroup_mask (by Ionela)

v2[2]->v3[3]:
        - Dropped support to get the device node for the CPU's LLC
        - Updated cacheinfo to support calling of detect_cache_attributes
          early in smp_prepare_cpus stage
        - Added support to check if LLC is valid and shared in the cacheinfo
        - Used the same in arch_topology

v1[1]->v2[2]:
        - Updated the ID validity check to include all non-negative values
        - Added support to get the device node for the CPU's last level cache
        - Added support to build llc_sibling on DT platforms

[1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
[2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
[3] https://lore.kernel.org/lkml/20220525081416.3306043-1-sudeep.holla@arm.com
[4] https://lore.kernel.org/lkml/20220621192034.3332546-1-sudeep.holla@arm.com
[5] https://lore.kernel.org/lkml/20220627165047.336669-1-sudeep.holla@arm.com

Ionela Voinescu (1):
  arch_topology: Limit span of cpu_clustergroup_mask()

Sudeep Holla (20):
  ACPI: PPTT: Use table offset as fw_token instead of virtual address
  cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
  cacheinfo: Add helper to access any cache index for a given CPU
  cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
  cacheinfo: Add support to check if last level cache(LLC) is valid or shared
  cacheinfo: Allow early detection and population of cache attributes
  cacheinfo: Use cache identifiers to check if the caches are shared if available
  cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
  arch_topology: Add support to parse and detect cache attributes
  arch_topology: Use the last level cache information from the cacheinfo
  arm64: topology: Remove redundant setting of llc_id in CPU topology
  arch_topology: Drop LLC identifier stash from the CPU topology
  arch_topology: Set thread sibling cpumask only within the cluster
  arch_topology: Check for non-negative value rather than -1 for IDs validity
  arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
  arch_topology: Don't set cluster identifier as physical package identifier
  arch_topology: Set cluster identifier in each core/thread from /cpu-map
  arch_topology: Add support for parsing sockets in /cpu-map
  arch_topology: Warn that topology for nested clusters is not supported
  ACPI: Remove the unused find_acpi_cpu_cache_topology()

 arch/arm64/kernel/topology.c  |  14 ----
 drivers/acpi/pptt.c           |  40 +---------
 drivers/base/arch_topology.c  | 102 ++++++++++++++++++------
 drivers/base/cacheinfo.c      | 143 ++++++++++++++++++++++------------
 include/linux/acpi.h          |   5 --
 include/linux/arch_topology.h |   1 -
 include/linux/cacheinfo.h     |   3 +
 7 files changed, 175 insertions(+), 133 deletions(-)

--
2.37.0



* [PATCH v6 01/21] ACPI: PPTT: Use table offset as fw_token instead of virtual address
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 02/21] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node Sudeep Holla
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, linux-acpi

There is a need to use the cache sharing information quite early during
boot, before the secondary cores are up and running. The permanent
memory map for all the ACPI tables (via acpi_permanent_mmap) is turned
on in acpi_early_init(), which is quite late for the above requirement.

As a result, there is a possibility that the ACPI PPTT gets mapped to
different virtual addresses. In such scenarios, using the virtual address
as fw_token before acpi_permanent_mmap is enabled results in different
fw_tokens for the same cache entity, and hence wrong cache sharing
information will be deduced from them.

Instead of using virtual address, just use the table offset as the
unique firmware token for the caches. The same offset is used as
ACPI identifiers if the firmware has not set a valid one for other
entries in the ACPI PPTT.
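
A minimal userspace sketch of why an offset is a stable token while an
address is not. This is not the kernel code: struct toy_table and the two
toy_*_token() helpers are hypothetical, and copying the table into a second
buffer merely stands in for the table being mapped again at a different
virtual address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a firmware table: a header followed by nodes. */
struct toy_table {
	uint32_t length;
	uint8_t body[60];
};

/*
 * Token derived from the node's byte offset inside the table. The result
 * depends only on the table layout, not on where the table is mapped,
 * which is the property the patch relies on.
 */
static uintptr_t toy_offset_token(const struct toy_table *table,
				  const void *node)
{
	const uint8_t *base = (const uint8_t *)table;
	const uint8_t *pos = (const uint8_t *)node;

	assert(pos >= base);
	return (uintptr_t)(pos - base);
}

/* Token derived from the node's address: differs for every mapping. */
static uintptr_t toy_address_token(const void *node)
{
	return (uintptr_t)node;
}
```

Taking the same node in two copies of the table gives two different address
tokens but one and the same offset token, so only the latter can be compared
across mappings.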

Cc: linux-acpi@vger.kernel.org
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/acpi/pptt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 701f61c01359..763f021d45e6 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -437,7 +437,8 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
 		pr_debug("found = %p %p\n", found_cache, cpu_node);
 		if (found_cache)
 			update_cache_properties(this_leaf, found_cache,
-			                        cpu_node, table->revision);
+						ACPI_TO_POINTER(ACPI_PTR_DIFF(cpu_node, table)),
+						table->revision);
 
 		index++;
 	}
-- 
2.37.0



* [PATCH v6 02/21] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 01/21] ACPI: PPTT: Use table offset as fw_token instead of virtual address Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 03/21] cacheinfo: Add helper to access any cache index for a given CPU Sudeep Holla
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

of_cpu_device_node_get takes care of fetching the CPU's device node
either from the cached cpu_dev->of_node if cpu_dev is initialised, or by
using of_get_cpu_node to parse and fetch the node if cpu_dev isn't
available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node, for two reasons:
1. There is no other use of cpu_dev, so the code can be simplified
2. It enables the use of detect_cache_attributes, and hence
   cache_setup_of_node, much earlier, before the CPUs are registered as
   devices.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index dad296229161..b0bde272e2ae 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -14,7 +14,7 @@
 #include <linux/cpu.h>
 #include <linux/device.h>
 #include <linux/init.h>
-#include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct device *cpu_dev = get_cpu_device(cpu);
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
@@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
 		return 0;
 	}
 
-	if (!cpu_dev) {
-		pr_err("No cpu device for CPU %d\n", cpu);
-		return -ENODEV;
-	}
-	np = cpu_dev->of_node;
+	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
 		return -ENOENT;
-- 
2.37.0



* [PATCH v6 03/21] cacheinfo: Add helper to access any cache index for a given CPU
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 01/21] ACPI: PPTT: Use table offset as fw_token instead of virtual address Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 02/21] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 04/21] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF Sudeep Holla
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

The cacheinfo for a given CPU at a given index is used in quite a few
places by fetching the base pointer for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.

Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index b0bde272e2ae..e13ef41763e4 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -25,6 +25,8 @@ static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
 #define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
 #define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
 #define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
+#define per_cpu_cacheinfo_idx(cpu, idx)		\
+				(per_cpu_cacheinfo(cpu) + (idx))
 
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 {
@@ -172,7 +174,7 @@ static int cache_setup_of_node(unsigned int cpu)
 	}
 
 	while (index < cache_leaves(cpu)) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		if (this_leaf->level != 1)
 			np = of_find_next_cache_node(np);
 		else
@@ -231,7 +233,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		/* skip if shared_cpu_map is already populated */
 		if (!cpumask_empty(&this_leaf->shared_cpu_map))
 			continue;
@@ -242,7 +244,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 
 			if (i == cpu || !sib_cpu_ci->info_list)
 				continue;/* skip if itself or no cacheinfo */
-			sib_leaf = sib_cpu_ci->info_list + index;
+			sib_leaf = per_cpu_cacheinfo_idx(i, index);
 			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
 				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
 				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
@@ -258,12 +260,11 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 
 static void cache_shared_cpu_map_remove(unsigned int cpu)
 {
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	struct cacheinfo *this_leaf, *sib_leaf;
 	unsigned int sibling, index;
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
-		this_leaf = this_cpu_ci->info_list + index;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
 			struct cpu_cacheinfo *sib_cpu_ci;
 
@@ -274,7 +275,7 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 			if (!sib_cpu_ci->info_list)
 				continue;
 
-			sib_leaf = sib_cpu_ci->info_list + index;
+			sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
 		}
@@ -609,7 +610,6 @@ static int cache_add_dev(unsigned int cpu)
 	int rc;
 	struct device *ci_dev, *parent;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	const struct attribute_group **cache_groups;
 
 	rc = cpu_cache_sysfs_init(cpu);
@@ -618,7 +618,7 @@ static int cache_add_dev(unsigned int cpu)
 
 	parent = per_cpu_cache_dev(cpu);
 	for (i = 0; i < cache_leaves(cpu); i++) {
-		this_leaf = this_cpu_ci->info_list + i;
+		this_leaf = per_cpu_cacheinfo_idx(cpu, i);
 		if (this_leaf->disable_sysfs)
 			continue;
 		if (this_leaf->type == CACHE_TYPE_NOCACHE)
-- 
2.37.0



* [PATCH v6 04/21] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (2 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 03/21] cacheinfo: Add helper to access any cache index for a given CPU Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 05/21] cacheinfo: Add support to check if last level cache(LLC) is valid or shared Sudeep Holla
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are shared based on the fw_token pointer.
However, it is defined only if CONFIG_OF is enabled, which is wrong.

Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where neither OF nor ACPI is enabled.
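
A minimal userspace sketch of the generic check, not the kernel code.
struct toy_leaf and toy_leaves_are_shared() are hypothetical, and the
fw_described argument is an assumption standing in for the build-time test
IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down cache leaf: only the fields used here. */
struct toy_leaf {
	unsigned int level;
	const void *fw_token;
};

/*
 * fw_described models "the platform is described by DT or ACPI". The
 * kernel decides this at build time; a runtime flag is used here only
 * to make both paths easy to exercise.
 */
static bool toy_leaves_are_shared(bool fw_described,
				  const struct toy_leaf *this_leaf,
				  const struct toy_leaf *sib_leaf)
{
	assert(this_leaf != NULL && sib_leaf != NULL);

	/* No firmware description: L1 is private, other levels shared. */
	if (!fw_described)
		return this_leaf->level != 1;

	/* Otherwise two leaves are shared if they carry the same token. */
	return sib_leaf->fw_token == this_leaf->fw_token;
}
```

Without a firmware description, two level 1 leaves are never reported as
shared and two level 2 leaves always are; with one, the answer depends only
on whether the tokens match.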

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index e13ef41763e4..2cea9201f31c 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -33,13 +33,21 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 	return ci_cacheinfo(cpu);
 }
 
-#ifdef CONFIG_OF
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
+	/*
+	 * For non DT/ACPI systems, assume unique level 1 caches,
+	 * system-wide shared caches for all other levels. This will be used
+	 * only if arch specific code has not populated shared_cpu_map
+	 */
+	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
+		return !(this_leaf->level == 1);
+
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
+#ifdef CONFIG_OF
 /* OF properties to query for a given cache type */
 struct cache_type_info {
 	const char *size_prop;
@@ -193,16 +201,6 @@ static int cache_setup_of_node(unsigned int cpu)
 }
 #else
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
-static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
-					   struct cacheinfo *sib_leaf)
-{
-	/*
-	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
-	 * shared caches for all other levels. This will be used only if
-	 * arch specific code has not populated shared_cpu_map
-	 */
-	return !(this_leaf->level == 1);
-}
 #endif
 
 int __weak cache_setup_acpi(unsigned int cpu)
-- 
2.37.0



* [PATCH v6 05/21] cacheinfo: Add support to check if last level cache(LLC) is valid or shared
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (3 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 04/21] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 06/21] cacheinfo: Allow early detection and population of cache attributes Sudeep Holla
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

It is useful to have a helper to check if two given CPUs share the last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just fw_token as the cache ID is
optional.

This helper can be used to build the llc_sibling mask during arch specific
topology parsing and to feed the information to the sched_domains. This
also helps to get rid of llc_id in the CPU topology as it is essentially
duplicate information.

Also add a helper to check if the LLC information in cacheinfo is valid
or not.
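
A minimal userspace sketch of how such helpers could be used to build an
LLC sibling mask, not the kernel code. struct llc_leaf, struct llc_cpu and
the llc_*() functions are hypothetical; a plain unsigned int (at most 32
CPUs) stands in for a cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct llc_leaf {
	unsigned int level;
	const void *fw_token;
};

/* Hypothetical per-CPU cache description; the last leaf is the LLC. */
struct llc_cpu {
	unsigned int num_leaves;
	const struct llc_leaf *leaves;
};

static const struct llc_leaf *llc_of(const struct llc_cpu *ci)
{
	return ci->num_leaves ? &ci->leaves[ci->num_leaves - 1] : NULL;
}

/* The LLC is usable only if there are leaves and the last has a token. */
static bool llc_is_valid(const struct llc_cpu *ci)
{
	const struct llc_leaf *llc = llc_of(ci);

	return llc != NULL && llc->fw_token != NULL;
}

/* Two CPUs share the LLC if both are valid and the tokens match. */
static bool llc_is_shared(const struct llc_cpu *x, const struct llc_cpu *y)
{
	if (!llc_is_valid(x) || !llc_is_valid(y))
		return false;

	return llc_of(x)->fw_token == llc_of(y)->fw_token;
}

/* Bitmask of CPUs sharing the LLC with 'cpu', always including itself. */
static unsigned int llc_sibling_mask(const struct llc_cpu *cpus,
				     unsigned int nr_cpus, unsigned int cpu)
{
	unsigned int mask;

	assert(nr_cpus <= 32 && cpu < nr_cpus);

	mask = 1u << cpu;
	for (unsigned int i = 0; i < nr_cpus; i++)
		if (llc_is_shared(&cpus[cpu], &cpus[i]))
			mask |= 1u << i;

	return mask;
}
```

For four CPUs where CPUs 0-1 and CPUs 2-3 each share one L2, the mask for
CPU 0 covers CPUs 0 and 1 only, and a CPU without any leaves is never
reported as sharing its LLC with anyone.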

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 26 ++++++++++++++++++++++++++
 include/linux/cacheinfo.h |  2 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 2cea9201f31c..fdc1baa342f1 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -47,6 +47,32 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
+bool last_level_cache_is_valid(unsigned int cpu)
+{
+	struct cacheinfo *llc;
+
+	if (!cache_leaves(cpu))
+		return false;
+
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+
+	return !!llc->fw_token;
+}
+
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
+{
+	struct cacheinfo *llc_x, *llc_y;
+
+	if (!last_level_cache_is_valid(cpu_x) ||
+	    !last_level_cache_is_valid(cpu_y))
+		return false;
+
+	llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
+	llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
+
+	return cache_leaves_are_shared(llc_x, llc_y);
+}
+
 #ifdef CONFIG_OF
 /* OF properties to query for a given cache type */
 struct cache_type_info {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 4ff37cb763ae..7e429bc5c1a4 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -82,6 +82,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
+bool last_level_cache_is_valid(unsigned int cpu);
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.37.0



* [PATCH v6 06/21] cacheinfo: Allow early detection and population of cache attributes
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (4 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 05/21] cacheinfo: Add support to check if last level cache(LLC) is valid or shared Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 07/21] cacheinfo: Use cache identifiers to check if the caches are shared if available Sudeep Holla
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

Some architectures/platforms may need to set up cache properties very
early in the boot along with other CPU topology information, so that all
this information can be used to build the sched_domains used by the
scheduler.

Allow detect_cache_attributes to be called quite early during the boot.
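
A minimal userspace sketch of the "detect once, refresh the map on every
call" pattern that makes an early call and a later hotplug call coexist.
This is not the kernel code: struct early_cacheinfo, early_detect() and the
two counters are hypothetical and exist only to make the control flow
observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct early_cacheinfo {
	unsigned int num_leaves;
	int *info_list;           /* stands in for the per-leaf array */
	unsigned int init_count;  /* times the one-off setup ran */
	unsigned int map_updates; /* times the sharing map was refreshed */
};

/*
 * First call: allocate and populate. Later calls (e.g. from a CPU online
 * callback) skip straight to refreshing the sharing map.
 */
static int early_detect(struct early_cacheinfo *ci, unsigned int leaves)
{
	assert(ci != NULL);

	if (ci->info_list)
		goto update_map;

	if (!leaves)
		return -ENOENT;

	ci->num_leaves = leaves;
	ci->info_list = calloc(leaves, sizeof(*ci->info_list));
	if (!ci->info_list) {
		/* Leave no stale leaf count behind on allocation failure. */
		ci->num_leaves = 0;
		return -ENOMEM;
	}
	ci->init_count++;

update_map:
	ci->map_updates++;
	return 0;
}
```

Calling early_detect() twice on the same object runs the setup once and
refreshes the map twice. Resetting num_leaves on allocation failure keeps
later users from indexing into a list that was never allocated.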

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c  | 55 ++++++++++++++++++++++++++-------------
 include/linux/cacheinfo.h |  1 +
 2 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index fdc1baa342f1..4d21a1022fa9 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
 	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
 
 unsigned int coherency_max_size;
 
+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;
 
-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}
 
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;
 
 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;
 
 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
@@ -330,17 +336,28 @@ int __weak populate_cache_leaves(unsigned int cpu)
 	return -ENOENT;
 }
 
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;
 
+	/* Since early detection of the cacheinfo is allowed via this
+	 * function and this also gets called as CPU hotplug callbacks via
+	 * cacheinfo_cpu_online, the initialisation can be skipped and only
+	 * CPU maps can be updated as the CPU online status would be update
+	 * if called via cacheinfo_cpu_online path.
+	 */
+	if (per_cpu_cacheinfo(cpu))
+		goto update_cpu_map;
+
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
 	per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
 					 sizeof(struct cacheinfo), GFP_KERNEL);
-	if (per_cpu_cacheinfo(cpu) == NULL)
+	if (per_cpu_cacheinfo(cpu) == NULL) {
+		cache_leaves(cpu) = 0;
 		return -ENOMEM;
+	}
 
 	/*
 	 * populate_cache_leaves() may completely setup the cache leaves and
@@ -349,6 +366,8 @@ static int detect_cache_attributes(unsigned int cpu)
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
-- 
2.37.0



* [PATCH v6 07/21] cacheinfo: Use cache identifiers to check if the caches are shared if available
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (5 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 06/21] cacheinfo: Allow early detection and population of cache attributes Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 08/21] cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability Sudeep Holla
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

The cache identifier is an optional property on most platforms.
The presence of one must be indicated by the CACHE_ID valid bit in the
attributes.

We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.
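
A minimal userspace sketch of the "prefer firmware IDs, fall back to
fw_token" comparison, not the kernel code. struct id_leaf, the
ID_LEAF_HAS_ID flag (a stand-in for the CACHE_ID valid bit) and both
helpers are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CACHE_ID valid bit in the attributes. */
#define ID_LEAF_HAS_ID (1u << 0)

struct id_leaf {
	unsigned int attributes;
	unsigned int id;
	const void *fw_token;
};

/* Compare by ID only when both leaves carry a valid one. */
static bool id_leaves_are_shared(const struct id_leaf *a,
				 const struct id_leaf *b)
{
	assert(a != NULL && b != NULL);

	if ((a->attributes & ID_LEAF_HAS_ID) &&
	    (b->attributes & ID_LEAF_HAS_ID))
		return a->id == b->id;

	return a->fw_token == b->fw_token;
}

/* A leaf is usable if it has either a valid ID or a token. */
static bool id_leaf_is_valid(const struct id_leaf *leaf)
{
	assert(leaf != NULL);

	return (leaf->attributes & ID_LEAF_HAS_ID) || leaf->fw_token != NULL;
}
```

Two leaves with valid, equal IDs are treated as shared even if their tokens
differ. As soon as one side lacks a valid ID, the comparison falls back to
the tokens and the ID value is ignored.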

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 4d21a1022fa9..e331b399adeb 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -44,6 +44,10 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
 		return !(this_leaf->level == 1);
 
+	if ((sib_leaf->attributes & CACHE_ID) &&
+	    (this_leaf->attributes & CACHE_ID))
+		return sib_leaf->id == this_leaf->id;
+
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
@@ -56,7 +60,8 @@ bool last_level_cache_is_valid(unsigned int cpu)
 
 	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
 
-	return !!llc->fw_token;
+	return (llc->attributes & CACHE_ID) || !!llc->fw_token;
+
 }
 
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
-- 
2.37.0



* [PATCH v6 08/21] cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (6 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 07/21] cacheinfo: Use cache identifiers to check if the caches are shared if available Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes Sudeep Holla
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

The checks to skip the CPU itself or the no-cacheinfo case are implemented
a bit differently, though the effect is exactly the same. Align the
implementation in both cache_shared_cpu_map_{setup,remove} for
improved readability. No functional change.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/cacheinfo.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index e331b399adeb..65d566ff24c4 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -279,6 +279,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 
 			if (i == cpu || !sib_cpu_ci->info_list)
 				continue;/* skip if itself or no cacheinfo */
+
 			sib_leaf = per_cpu_cacheinfo_idx(i, index);
 			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
 				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
@@ -301,14 +302,11 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 	for (index = 0; index < cache_leaves(cpu); index++) {
 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
 		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
-			struct cpu_cacheinfo *sib_cpu_ci;
-
-			if (sibling == cpu) /* skip itself */
-				continue;
+			struct cpu_cacheinfo *sib_cpu_ci =
+						get_cpu_cacheinfo(sibling);
 
-			sib_cpu_ci = get_cpu_cacheinfo(sibling);
-			if (!sib_cpu_ci->info_list)
-				continue;
+			if (sibling == cpu || !sib_cpu_ci->info_list)
+				continue;/* skip if itself or no cacheinfo */
 
 			sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
-- 
2.37.0



* [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (7 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 08/21] cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-19 14:22   ` Geert Uytterhoeven
  2022-07-04 10:15 ` [PATCH v6 10/21] arch_topology: Use the last level cache information from the cacheinfo Sudeep Holla
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed it into the building of
sched_domains. Similar support for DT platforms is not present.

To enable the same on DT, the entire cache hierarchy information can be
built as part of CPU topology parsing on both ACPI and DT platforms.

Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building llc_sibling mask remains unchanged.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 579c851a2bd7..e2f7d9ea558e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/device.h>
@@ -780,15 +781,28 @@ __weak int __init parse_acpi_topology(void)
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
+	int ret, cpu;
+
 	reset_cpu_topology();
+	ret = parse_acpi_topology();
+	if (!ret)
+		ret = of_have_populated_dt() && parse_dt_topology();
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (parse_acpi_topology())
-		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
+	if (ret) {
+		/*
+		 * Discard anything that was parsed if we hit an error so we
+		 * don't use partial information.
+		 */
 		reset_cpu_topology();
+		return;
+	}
+
+	for_each_possible_cpu(cpu) {
+		ret = detect_cache_attributes(cpu);
+		if (ret) {
+			pr_info("Early cacheinfo failed, ret = %d\n", ret);
+			break;
+		}
+	}
 }
 #endif
-- 
2.37.0



* [PATCH v6 10/21] arch_topology: Use the last level cache information from the cacheinfo
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (8 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 11/21] arm64: topology: Remove redundant setting of llc_id in CPU topology Sudeep Holla
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately, and only with ACPI PPTT, elsewhere, switch to using the
equivalent information from the cacheinfo.

This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e2f7d9ea558e..4f936c984fb6 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -668,7 +668,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
 	}
-	if (cpu_topology[cpu].llc_id != -1) {
+
+	if (last_level_cache_is_valid(cpu)) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
 	}
@@ -699,7 +700,7 @@ void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if (last_level_cache_is_shared(cpu, cpuid)) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
-- 
2.37.0



* [PATCH v6 11/21] arm64: topology: Remove redundant setting of llc_id in CPU topology
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (9 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 10/21] arch_topology: Use the last level cache information from the cacheinfo Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 12/21] arch_topology: Drop LLC identifier stash from the " Sudeep Holla
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Will Deacon, Catalin Marinas,
	Gavin Shan

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/arm64/kernel/topology.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
 		return 0;
 
 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}
 
 	return 0;
-- 
2.37.0



* [PATCH v6 12/21] arch_topology: Drop LLC identifier stash from the CPU topology
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (10 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 11/21] arm64: topology: Remove redundant setting of llc_id in CPU topology Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 13/21] arch_topology: Set thread sibling cpumask only within the cluster Sudeep Holla
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology
information.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c  | 1 -
 include/linux/arch_topology.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 4f936c984fb6..8206990c679f 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -752,7 +752,6 @@ void __init reset_cpu_topology(void)
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
 
 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
-- 
2.37.0



* [PATCH v6 13/21] arch_topology: Set thread sibling cpumask only within the cluster
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (11 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 12/21] arch_topology: Drop LLC identifier stash from the " Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 14/21] arch_topology: Check for non-negative value rather than -1 for IDs validity Sudeep Holla
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the thread siblings may come out
wrong, as the core identifiers can be the same for 2 different CPUs
belonging to 2 different clusters.

So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong to are in the same cluster
within the socket. Let us skip updating the thread sibling cpumasks if
the cluster identifier doesn't match.

This change has no effect while the cluster identifiers are not set, but
it will avoid any breakage once we set them correctly.
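
For illustration only (not part of this patch), the matching rule can be
sketched as a standalone helper. The struct and function names below are
hypothetical, simplified stand-ins for struct cpu_topology and the checks
in update_siblings_masks(); the point is that core IDs restart at 0 in
every cluster, so core_id alone cannot identify thread siblings.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct cpu_topology. */
struct toy_topo {
	int package_id;
	int cluster_id;
	int core_id;
};

/*
 * Two CPUs are treated as thread siblings only if the package, the
 * cluster and the core identifiers all match. Comparing core_id alone
 * would wrongly pair core 0 of cluster 0 with core 0 of cluster 1.
 */
bool toy_thread_siblings(const struct toy_topo *a, const struct toy_topo *b)
{
	if (a->package_id != b->package_id)
		return false;
	if (a->cluster_id != b->cluster_id)
		return false;
	return a->core_id == b->core_id;
}
```

With cluster_id left at -1 on every CPU the cluster comparison always
passes, which is why the check is harmless before cluster IDs are set.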

Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 8206990c679f..6ab173caf1dc 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -708,15 +708,17 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
-		if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
-		    cpuid_topo->cluster_id != -1) {
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+			continue;
+
+		if (cpuid_topo->cluster_id != -1) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
 
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
 		if (cpuid_topo->core_id != cpu_topo->core_id)
 			continue;
 
-- 
2.37.0



* [PATCH v6 14/21] arch_topology: Check for non-negative value rather than -1 for IDs validity
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (12 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 13/21] arch_topology: Set thread sibling cpumask only within the cluster Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:15 ` [PATCH v6 15/21] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Sudeep Holla
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Andy Shevchenko, Gavin Shan

Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.
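
As a minimal sketch (the helper name is hypothetical and does not exist in
arch_topology.c), the validity test amounts to the following. It assumes
that any negative value, for example an errno-style code that ended up in
an ID field, should count as "not set", which a comparison against -1
alone would miss.

```c
#include <stdbool.h>

/* Hypothetical helper: any negative value, not just -1, means "not set". */
bool toy_topology_id_valid(int id)
{
	return id >= 0;
}
```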

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 6ab173caf1dc..c0b0ee64a79d 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -642,7 +642,7 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
+		if (cpu_topology[cpu].package_id < 0)
 			ret = -EINVAL;
 
 out_map:
@@ -714,7 +714,7 @@ void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
 			continue;
 
-		if (cpuid_topo->cluster_id != -1) {
+		if (cpuid_topo->cluster_id >= 0) {
 			cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
 		}
-- 
2.37.0



* [PATCH v6 15/21] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (13 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 14/21] arch_topology: Check for non-negative value rather than -1 for IDs validity Sudeep Holla
@ 2022-07-04 10:15 ` Sudeep Holla
  2022-07-04 10:16 ` [PATCH v6 16/21] arch_topology: Don't set cluster identifier as physical package identifier Sudeep Holla
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:15 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Gavin Shan

There is no point in looping through the physical package identifiers
of all the CPUs to check whether they are valid once a CPU which is
outside the topology (i.e. an outlier CPU) is found.

Let us just break out of the loop early in such a case.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index c0b0ee64a79d..8f6a964d2512 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -642,8 +642,10 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}
 
 out_map:
 	of_node_put(map);
-- 
2.37.0



* [PATCH v6 16/21] arch_topology: Don't set cluster identifier as physical package identifier
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (14 preceding siblings ...)
  2022-07-04 10:15 ` [PATCH v6 15/21] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-04 10:16 ` [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask() Sudeep Holla
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

Currently as we parse the CPU topology from the /cpu-map node in the
device tree, we assign the generated cluster count as the physical
package identifier for each CPU, which is wrong.

The device tree bindings for CPU topology support sockets to infer
the socket or physical package identifier for a given CPU. Since this is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have a single socket/physical package.

Fix the physical package identifier to 0 by removing the assignment of
the cluster identifier to it.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 8f6a964d2512..e384afb6cac7 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -549,7 +549,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
@@ -588,7 +587,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
+				ret = parse_core(c, 0, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -605,9 +604,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	if (leaf && !has_cores)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
-	if (leaf)
-		package_id++;
-
 	return 0;
 }
 
-- 
2.37.0



* [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (15 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 16/21] arch_topology: Don't set cluster identifier as physical package identifier Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-08  0:10   ` Darren Hart
  2022-07-04 10:16 ` [PATCH v6 18/21] arch_topology: Set cluster identifier in each core/thread from /cpu-map Sudeep Holla
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Darren Hart

From: Ionela Voinescu <ionela.voinescu@arm.com>

Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute to the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.

To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().

This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.

While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.
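
For illustration only, the span restriction can be modelled with plain
64-bit bitmasks instead of struct cpumask. The function name and the
uint64_t representation are assumptions made for this sketch (it only
covers CPUs 0..63); the subset test mirrors what cpumask_subset() does in
the hunk below.

```c
#include <stdint.h>

/*
 * Toy model of cpu_clustergroup_mask(): if every CPU in the coregroup
 * mask is also in the cluster mask, the cluster level adds nothing below
 * MC, so fall back to a mask containing only this CPU.
 */
uint64_t toy_clustergroup_mask(unsigned int cpu, uint64_t coregroup,
			       uint64_t cluster_sibling)
{
	/* coregroup is a subset of cluster_sibling */
	if ((coregroup & ~cluster_sibling) == 0)
		return UINT64_C(1) << cpu;

	return cluster_sibling;
}
```

A cluster mask that is strictly smaller than the coregroup mask is
returned unchanged, which is the only case where CLS sits cleanly below
MC.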

Cc: Darren Hart <darren@os.amperecomputing.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e384afb6cac7..591c1f8e15e2 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -686,6 +686,14 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 
 const struct cpumask *cpu_clustergroup_mask(int cpu)
 {
+	/*
+	 * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
+	 * cpu_coregroup_mask().
+	 */
+	if (cpumask_subset(cpu_coregroup_mask(cpu),
+			   &cpu_topology[cpu].cluster_sibling))
+		return get_cpu_mask(cpu);
+
 	return &cpu_topology[cpu].cluster_sibling;
 }
 
-- 
2.37.0



* [PATCH v6 18/21] arch_topology: Set cluster identifier in each core/thread from /cpu-map
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (16 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask() Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-04 10:16 ` [PATCH v6 19/21] arch_topology: Add support for parsing sockets in /cpu-map Sudeep Holla
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.

We don't support nesting of clusters yet as there is no real hardware
to support clusters of clusters.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 591c1f8e15e2..217a91fc1f59 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -497,7 +497,7 @@ static int __init get_cpu_for_node(struct device_node *node)
 }
 
 static int __init parse_core(struct device_node *core, int package_id,
-			     int core_id)
+			     int cluster_id, int core_id)
 {
 	char name[20];
 	bool leaf = true;
@@ -513,6 +513,7 @@ static int __init parse_core(struct device_node *core, int package_id,
 			cpu = get_cpu_for_node(t);
 			if (cpu >= 0) {
 				cpu_topology[cpu].package_id = package_id;
+				cpu_topology[cpu].cluster_id = cluster_id;
 				cpu_topology[cpu].core_id = core_id;
 				cpu_topology[cpu].thread_id = i;
 			} else if (cpu != -ENODEV) {
@@ -534,6 +535,7 @@ static int __init parse_core(struct device_node *core, int package_id,
 		}
 
 		cpu_topology[cpu].package_id = package_id;
+		cpu_topology[cpu].cluster_id = cluster_id;
 		cpu_topology[cpu].core_id = core_id;
 	} else if (leaf && cpu != -ENODEV) {
 		pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -543,7 +545,8 @@ static int __init parse_core(struct device_node *core, int package_id,
 	return 0;
 }
 
-static int __init parse_cluster(struct device_node *cluster, int depth)
+static int __init
+parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 {
 	char name[20];
 	bool leaf = true;
@@ -563,7 +566,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 		c = of_get_child_by_name(cluster, name);
 		if (c) {
 			leaf = false;
-			ret = parse_cluster(c, depth + 1);
+			ret = parse_cluster(c, i, depth + 1);
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
@@ -587,7 +590,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, 0, core_id++);
+				ret = parse_core(c, 0, cluster_id, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -627,7 +630,7 @@ static int __init parse_dt_topology(void)
 	if (!map)
 		goto out;
 
-	ret = parse_cluster(map, 0);
+	ret = parse_cluster(map, -1, 0);
 	if (ret != 0)
 		goto out_map;
 
-- 
2.37.0



* [PATCH v6 19/21] arch_topology: Add support for parsing sockets in /cpu-map
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (17 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 18/21] arch_topology: Set cluster identifier in each core/thread from /cpu-map Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-04 10:16 ` [PATCH v6 20/21] arch_topology: Warn that topology for nested clusters is not supported Sudeep Holla
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

Finally let us add support for socket nodes in /cpu-map in the device
tree. Since this may not be present in all the old platforms and even
most of the existing platforms, we need to assume that the absence of the
socket node indicates a single socket system and handle it appropriately.

Also it is likely that most single socket systems omit the node since it
is optional.
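
A rough userspace sketch of the fallback, for illustration only: the
child node names are passed in as a string array instead of being looked
up with of_get_child_by_name(), and both function names are hypothetical.
Like the loop in this patch, it stops at the first missing socketN index.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for a device tree child lookup. */
bool toy_has_child(const char *const *children, int n, const char *name)
{
	for (int i = 0; i < n; i++)
		if (strcmp(children[i], name) == 0)
			return true;
	return false;
}

/* Returns the number of packages seen under a toy /cpu-map node. */
int toy_count_packages(const char *const *children, int n)
{
	char name[20];
	int package_id = 0;

	for (;;) {
		snprintf(name, sizeof(name), "socket%d", package_id);
		if (!toy_has_child(children, n, name))
			break;
		package_id++;
	}

	/* No socketN child at all: assume a single package with ID 0. */
	return package_id ? package_id : 1;
}
```

A /cpu-map that lists only cluster0 and cluster1 therefore counts as one
package, the same as a /cpu-map with a single socket0 node.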

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 37 +++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 217a91fc1f59..8719c4458df9 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -545,8 +545,8 @@ static int __init parse_core(struct device_node *core, int package_id,
 	return 0;
 }
 
-static int __init
-parse_cluster(struct device_node *cluster, int cluster_id, int depth)
+static int __init parse_cluster(struct device_node *cluster, int package_id,
+				int cluster_id, int depth)
 {
 	char name[20];
 	bool leaf = true;
@@ -566,7 +566,7 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 		c = of_get_child_by_name(cluster, name);
 		if (c) {
 			leaf = false;
-			ret = parse_cluster(c, i, depth + 1);
+			ret = parse_cluster(c, package_id, i, depth + 1);
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
@@ -590,7 +590,8 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, 0, cluster_id, core_id++);
+				ret = parse_core(c, package_id, cluster_id,
+						 core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -610,6 +611,32 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
 	return 0;
 }
 
+static int __init parse_socket(struct device_node *socket)
+{
+	char name[20];
+	struct device_node *c;
+	bool has_socket = false;
+	int package_id = 0, ret;
+
+	do {
+		snprintf(name, sizeof(name), "socket%d", package_id);
+		c = of_get_child_by_name(socket, name);
+		if (c) {
+			has_socket = true;
+			ret = parse_cluster(c, package_id, -1, 0);
+			of_node_put(c);
+			if (ret != 0)
+				return ret;
+		}
+		package_id++;
+	} while (c);
+
+	if (!has_socket)
+		ret = parse_cluster(socket, 0, -1, 0);
+
+	return ret;
+}
+
 static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
@@ -630,7 +657,7 @@ static int __init parse_dt_topology(void)
 	if (!map)
 		goto out;
 
-	ret = parse_cluster(map, -1, 0);
+	ret = parse_socket(map);
 	if (ret != 0)
 		goto out_map;
 
-- 
2.37.0



* [PATCH v6 20/21] arch_topology: Warn that topology for nested clusters is not supported
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (18 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 19/21] arch_topology: Add support for parsing sockets in /cpu-map Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-04 10:16 ` [PATCH v6 21/21] ACPI: Remove the unused find_acpi_cpu_cache_topology() Sudeep Holla
  2022-07-04 15:10 ` [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Conor.Dooley
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv

We don't support the topology for clusters of CPU clusters, although the
DT and ACPI bindings theoretically support it. Warn about this so that it
is clear to the users of arch_topology that nested clusters are not yet
supported.

Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/arch_topology.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 8719c4458df9..441e14ac33a4 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -567,6 +567,8 @@ static int __init parse_cluster(struct device_node *cluster, int package_id,
 		if (c) {
 			leaf = false;
 			ret = parse_cluster(c, package_id, i, depth + 1);
+			if (depth > 0)
+				pr_warn("Topology for clusters of clusters not yet supported\n");
 			of_node_put(c);
 			if (ret != 0)
 				return ret;
-- 
2.37.0
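
The depth check in the hunk above can be sketched outside the kernel as
follows. This is an assumption-laden illustration, not the kernel code:
struct cluster and has_nested_clusters() are hypothetical, and the tree is
built by hand instead of being parsed from /cpu-map. Depth 0 is the map (or
socket) node, so a child cluster found at depth greater than 0 is a cluster
inside a cluster.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cpu-map node that may contain cluster nodes. */
struct cluster {
	const struct cluster *children;	/* nested clusterN nodes */
	size_t nr_children;
};

/*
 * Returns 1 when a cluster node is found inside another cluster, which is
 * the condition under which the patch emits its warning.
 */
static int has_nested_clusters(const struct cluster *node, int depth)
{
	assert(node != NULL);
	for (size_t i = 0; i < node->nr_children; i++) {
		if (depth > 0)
			return 1;	/* child cluster inside a cluster */
		if (has_nested_clusters(&node->children[i], depth + 1))
			return 1;
	}
	return 0;
}
```

A map whose clusters contain only cores yields 0, while a map with a cluster
that itself contains a cluster yields 1.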


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v6 21/21] ACPI: Remove the unused find_acpi_cpu_cache_topology()
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (19 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 20/21] arch_topology: Warn that topology for nested clusters is not supported Sudeep Holla
@ 2022-07-04 10:16 ` Sudeep Holla
  2022-07-04 15:10 ` [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Conor.Dooley
  21 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 10:16 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman
  Cc: Sudeep Holla, conor.dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois,
	linux-arm-kernel, linux-riscv, Rafael J . Wysocki

The sole user of find_acpi_cpu_cache_topology() was the arm64 topology
code, which is now consolidated into the generic arch_topology and no
longer needs this function.

Drop the unused function find_acpi_cpu_cache_topology().

Reported-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/acpi/pptt.c  | 37 -------------------------------------
 include/linux/acpi.h |  5 -----
 2 files changed, 42 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 763f021d45e6..dd3222a15c9c 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -691,43 +691,6 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
 	return find_acpi_cpu_topology_tag(cpu, level, 0);
 }
 
-/**
- * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
- * @cpu: Kernel logical CPU number
- * @level: The cache level for which we would like a unique ID
- *
- * Determine a unique ID for each unified cache in the system
- *
- * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
- * Otherwise returns a value which represents a unique topological feature.
- */
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
-{
-	struct acpi_table_header *table;
-	struct acpi_pptt_cache *found_cache;
-	acpi_status status;
-	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
-	struct acpi_pptt_processor *cpu_node = NULL;
-	int ret = -1;
-
-	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
-	if (ACPI_FAILURE(status)) {
-		acpi_pptt_warn_missing();
-		return -ENOENT;
-	}
-
-	found_cache = acpi_find_cache_node(table, acpi_cpu_id,
-					   CACHE_TYPE_UNIFIED,
-					   level,
-					   &cpu_node);
-	if (found_cache)
-		ret = ACPI_PTR_DIFF(cpu_node, table);
-
-	acpi_put_table(table);
-
-	return ret;
-}
-
 /**
  * find_acpi_cpu_topology_package() - Determine a unique CPU package value
  * @cpu: Kernel logical CPU number
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 4f82a5bc6d98..7b96a8bff6d2 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1429,7 +1429,6 @@ int find_acpi_cpu_topology(unsigned int cpu, int level);
 int find_acpi_cpu_topology_cluster(unsigned int cpu);
 int find_acpi_cpu_topology_package(unsigned int cpu);
 int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
 #else
 static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
 {
@@ -1451,10 +1450,6 @@ static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
 {
 	return -EINVAL;
 }
-static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
-{
-	return -EINVAL;
-}
 #endif
 
 #ifdef CONFIG_ACPI_PCC
-- 
2.37.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
  2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
                   ` (20 preceding siblings ...)
  2022-07-04 10:16 ` [PATCH v6 21/21] ACPI: Remove the unused find_acpi_cpu_cache_topology() Sudeep Holla
@ 2022-07-04 15:10 ` Conor.Dooley
  2022-07-04 15:20   ` Sudeep Holla
       [not found]   ` <507c6b64-fc23-3eea-e4c1-4d426025d658@inria.fr>
  21 siblings, 2 replies; 36+ messages in thread
From: Conor.Dooley @ 2022-07-04 15:10 UTC (permalink / raw)
  To: sudeep.holla, linux-kernel, gregkh
  Cc: Valentina.FernandezAlanis, vincent.guittot, dietmar.eggemann,
	wangqing, robh+dt, rafael, ionela.voinescu, pierre.gondois,
	linux-arm-kernel, linux-riscv

On 04/07/2022 11:15, Sudeep Holla wrote:
> Hi Greg,
> 
> Let me know if you prefer to pull the patches directly or prefer pull
> request. It has been in -next for a while now.
> 
> Hi All,
> 
> This version updates cacheinfo to populate and use the information from
> there for all the cache topology.
> 
> This series intends to fix some discrepancies we have in the CPU topology
> parsing from the device tree /cpu-map node. The current parsing also diverges
> from the behaviour on an ACPI enabled platform. The expectation is that both
> DT and ACPI enabled systems must present a consistent view of the CPU topology.
> 
> Currently we assign the generated cluster count as the physical package
> identifier for each CPU, which is wrong. The device tree bindings for CPU
> topology support sockets to infer the socket or physical package identifier
> for a given CPU. Also, we don't check whether all the cores/threads belong to
> the same cluster before updating their sibling masks, which is fine as we
> don't set the cluster id yet.
> 
> These changes also assign the cluster identifier as parsed from the device
> tree cluster nodes within /cpu-map, without support for nesting of the
> clusters. Finally, they also add support for socket nodes in /cpu-map. With
> this, the parsing of the exact same information from ACPI PPTT and the
> /cpu-map DT node aligns well.
> 
> The only exception is that the last level cache id information can be
> inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> in the device tree.

For DT + RISC-V on PolarFire SoC and SiFive fu540
Tested-by: Conor Dooley <conor.dooley@microchip.com>

Anecdotally, v5 was tested on the !SMP D1 which worked fine when
CONFIG_SMP was enabled.

Thanks,
Conor.

> 
> 
> v5[5]->v6:
>         - Handled the out of memory case in early detection correctly after
>           Conor reported boot failures on some RISC-V platforms. Also
>           added a log to show failure of early cacheinfo detection.
>         - Added "Remove the unused find_acpi_cpu_cache_topology()" which
>           was missed earlier and posted separately
>         - Added all the additional tags received
> 
> v4[4]->v5[5]:
>         - Added all the tags received so far. Rafael has acked only the
>           change in ACPI and Catalin has acked only the change in arm64.
>         - Addressed all the typos pointed out by Ionela, dropped the patch
>           removing the checks for invalid package id as discussed, and
>           updated the depth in the nested cluster warning check.
> 
> v3[3]->v4[4]:
>         - Updated ACPI PPTT fw_token to use table offset instead of virtual
>           address, as it could change every time it is mapped before
>           the global acpi_permanent_mmap is set
>         - Added warning for the topology with nested clusters
>         - Added update to cpu_clustergroup_mask so that introduction of
>           correct cluster_id doesn't break existing platforms by limiting
>           the span of clustergroup_mask(by Ionela)
> 
> v2[2]->v3[3]:
>         - Dropped support to get the device node for the CPU's LLC
>         - Updated cacheinfo to support calling of detect_cache_attributes
>           early in smp_prepare_cpus stage
>         - Added support to check if LLC is valid and shared in the cacheinfo
>         - Used the same in arch_topology
> 
> v1[1]->v2[2]:
>         - Updated ID validity check to include all non-negative values
>         - Added support to get the device node for the CPU's last level cache
>         - Added support to build llc_sibling on DT platforms
> 
> [1] https://lore.kernel.org/lkml/20220513095559.1034633-1-sudeep.holla@arm.com
> [2] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
> [3] https://lore.kernel.org/lkml/20220525081416.3306043-1-sudeep.holla@arm.com
> [4] https://lore.kernel.org/lkml/20220621192034.3332546-1-sudeep.holla@arm.com
> [5] https://lore.kernel.org/lkml/20220627165047.336669-1-sudeep.holla@arm.com
> 
> Ionela Voinescu (1):
>   arch_topology: Limit span of cpu_clustergroup_mask()
> 
> Sudeep Holla (20):
>   ACPI: PPTT: Use table offset as fw_token instead of virtual address
>   cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
>   cacheinfo: Add helper to access any cache index for a given CPU
>   cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
>   cacheinfo: Add support to check if last level cache(LLC) is valid or shared
>   cacheinfo: Allow early detection and population of cache attributes
>   cacheinfo: Use cache identifiers to check if the caches are shared if available
>   cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
>   arch_topology: Add support to parse and detect cache attributes
>   arch_topology: Use the last level cache information from the cacheinfo
>   arm64: topology: Remove redundant setting of llc_id in CPU topology
>   arch_topology: Drop LLC identifier stash from the CPU topology
>   arch_topology: Set thread sibling cpumask only within the cluster
>   arch_topology: Check for non-negative value rather than -1 for IDs validity
>   arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
>   arch_topology: Don't set cluster identifier as physical package identifier
>   arch_topology: Set cluster identifier in each core/thread from /cpu-map
>   arch_topology: Add support for parsing sockets in /cpu-map
>   arch_topology: Warn that topology for nested clusters is not supported
>   ACPI: Remove the unused find_acpi_cpu_cache_topology()
> 
>  arch/arm64/kernel/topology.c  |  14 ----
>  drivers/acpi/pptt.c           |  40 +---------
>  drivers/base/arch_topology.c  | 102 ++++++++++++++++++------
>  drivers/base/cacheinfo.c      | 143 ++++++++++++++++++++++------------
>  include/linux/acpi.h          |   5 --
>  include/linux/arch_topology.h |   1 -
>  include/linux/cacheinfo.h     |   3 +
>  7 files changed, 175 insertions(+), 133 deletions(-)
> 
> --
> 2.37.0
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
  2022-07-04 15:10 ` [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Conor.Dooley
@ 2022-07-04 15:20   ` Sudeep Holla
       [not found]   ` <507c6b64-fc23-3eea-e4c1-4d426025d658@inria.fr>
  1 sibling, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-04 15:20 UTC (permalink / raw)
  To: Conor.Dooley
  Cc: linux-kernel, gregkh, Valentina.FernandezAlanis, vincent.guittot,
	dietmar.eggemann, wangqing, robh+dt, rafael, ionela.voinescu,
	pierre.gondois, linux-arm-kernel, linux-riscv

On Mon, Jul 04, 2022 at 03:10:30PM +0000, Conor.Dooley@microchip.com wrote:
> On 04/07/2022 11:15, Sudeep Holla wrote:
> > Hi Greg,
> > 
> > Let me know if you prefer to pull the patches directly or prefer pull
> > request. It has been in -next for a while now.
> > 
> > Hi All,
> > 
> > This version updates cacheinfo to populate and use the information from
> > there for all the cache topology.
> > 
> > This series intends to fix some discrepancies we have in the CPU topology
> > parsing from the device tree /cpu-map node. The current parsing also diverges
> > from the behaviour on an ACPI enabled platform. The expectation is that both
> > DT and ACPI enabled systems must present a consistent view of the CPU topology.
> > 
> > Currently we assign the generated cluster count as the physical package
> > identifier for each CPU, which is wrong. The device tree bindings for CPU
> > topology support sockets to infer the socket or physical package identifier
> > for a given CPU. Also, we don't check whether all the cores/threads belong to
> > the same cluster before updating their sibling masks, which is fine as we
> > don't set the cluster id yet.
> > 
> > These changes also assign the cluster identifier as parsed from the device
> > tree cluster nodes within /cpu-map, without support for nesting of the
> > clusters. Finally, they also add support for socket nodes in /cpu-map. With
> > this, the parsing of the exact same information from ACPI PPTT and the
> > /cpu-map DT node aligns well.
> > 
> > The only exception is that the last level cache id information can be
> > inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> > in the device tree.
> 
> For DT + RISC-V on PolarFire SoC and SiFive fu540
> Tested-by: Conor Dooley <conor.dooley@microchip.com>
> 
> Anecdotally, v5 was tested on the !SMP D1 which worked fine when
> CONFIG_SMP was enabled.
> 

Thanks a lot for testing on RISC-V, much appreciated! Thanks for your
patience and help with v5 so that we could figure out the silly issue
finally.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
       [not found]   ` <507c6b64-fc23-3eea-e4c1-4d426025d658@inria.fr>
@ 2022-07-05 19:06     ` Conor.Dooley
  2022-07-05 20:07       ` Sudeep Holla
  0 siblings, 1 reply; 36+ messages in thread
From: Conor.Dooley @ 2022-07-05 19:06 UTC (permalink / raw)
  To: Brice.Goglin
  Cc: linux-kernel, gregkh, Valentina.FernandezAlanis, vincent.guittot,
	dietmar.eggemann, wangqing, robh+dt, rafael, ionela.voinescu,
	pierre.gondois, linux-arm-kernel, linux-riscv, sudeep.holla,
	kernel

[Adding back the CC list from the original thread]

On 05/07/2022 13:27, Brice Goglin wrote:
> Hello Conor
> 
> I am the main developer of hwloc [1] which is used by many people to
> detect the topology of servers. We've started to see some users of hwloc
> on RISC-V and we got some reports about the topology exposed by
> Linux/sysfs being wrong on some platforms.
> 
> For instance https://github.com/open-mpi/hwloc/issues/536 says HiFive
> Unmatched with SiFive Freedom U740 running Linux 5.15 exposes a single
> core with 4 threads instead of 4 cores, while StarFive VisionFive v1
> with JH7100 running 5.18.5 correctly exposes 2 cores.

And with Sudeep's patches applied I get (next-20220704):
# hwloc-calc -N core all
1
# hwloc-calc -N pu all
4
On a PolarFire SoC (so the same as a SiFive U540).
So unfortunately, these patches are not the fix you seek!
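
The counts above (1 core, 4 PUs) are what a tool would report if every CPU
lists all four CPUs as its thread siblings. A minimal sketch of that grouping
follows, assuming cores are derived from per-CPU sibling lists such as
/sys/devices/system/cpu/cpuN/topology/thread_siblings_list. hwloc's real logic
is not shown in this thread, and cpulist_to_mask() and count_cores() are
hypothetical helpers that work on strings rather than reading sysfs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a cpulist such as "0-3" or "0,2-3" into a bitmask (CPUs 0..63). */
static uint64_t cpulist_to_mask(const char *list)
{
	uint64_t mask = 0;
	const char *p = list;

	assert(list != NULL);
	while (*p != '\0' && *p != '\n') {
		char *end;
		long first = strtol(p, &end, 10);
		long last = first;

		if (end == p || first < 0)
			break;		/* not a CPU number: stop parsing */
		p = end;
		if (*p == '-') {	/* range "first-last" */
			last = strtol(p + 1, &end, 10);
			if (end == p + 1)
				break;
			p = end;
		}
		for (long cpu = first; cpu <= last && cpu < 64; cpu++)
			mask |= UINT64_C(1) << cpu;
		if (*p == ',')
			p++;
	}
	return mask;
}

/* Each distinct sibling mask counts as one core (at most 64 CPUs). */
static int count_cores(const char *const siblings[], int ncpus)
{
	uint64_t seen[64];
	int ncores = 0;

	for (int i = 0; i < ncpus && i < 64; i++) {
		uint64_t m = cpulist_to_mask(siblings[i]);
		int known = 0;

		for (int j = 0; j < ncores; j++)
			if (seen[j] == m)
				known = 1;
		if (!known)
			seen[ncores++] = m;
	}
	return ncores;
}
```

Four CPUs that each report "0-3" collapse into a single core, whereas four
CPUs reporting "0", "1", "2" and "3" give four cores.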

Wracked my brains for a bit, but could not see any differences
between the U740 and the JH7100. Culprit seems to be the lack
of a cpu-map node (which is only present in the downstream dt).

I've sent patches for the upstream devicetrees:
https://lore.kernel.org/linux-riscv/20220705190435.1790466-1-mail@conchuod.ie/

> 
> Can you tell me what's the overall status of the CPU/NUMA topology
> exposed in sysfs on RISC-V?

Heh, you've got the wrong man. I don't know.

> Does it depend a lot on the platform because
> device-tree and/or ACPI aren't always properly filled by vendors? Does
> it depend a lot on the Linux kernel version? Should I expect significant
> improvements for both in the next months?
> 
I don't know that either. This is why it's a good idea to preserve
the CC list!


> Thanks
> 
> Brice
> 
> [1] https://www.open-mpi.org/projects/hwloc/


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
  2022-07-05 19:06     ` Conor.Dooley
@ 2022-07-05 20:07       ` Sudeep Holla
  2022-07-05 20:14         ` Conor.Dooley
  0 siblings, 1 reply; 36+ messages in thread
From: Sudeep Holla @ 2022-07-05 20:07 UTC (permalink / raw)
  To: Conor.Dooley
  Cc: Brice.Goglin, linux-kernel, gregkh, Valentina.FernandezAlanis,
	vincent.guittot, dietmar.eggemann, wangqing, robh+dt, rafael,
	ionela.voinescu, pierre.gondois, linux-arm-kernel, linux-riscv,
	kernel

On Tue, Jul 05, 2022 at 07:06:17PM +0000, Conor.Dooley@microchip.com wrote:
> [Adding back the CC list from the original thread]
> 
> On 05/07/2022 13:27, Brice Goglin wrote:
> > Hello Conor
> > 
> > I am the main developer of hwloc [1] which is used by many people to
> > detect the topology of servers. We've started to see some users of hwloc
> > on RISC-V and we got some reports about the topology exposed by
> > Linux/sysfs being wrong on some platforms.
> > 
> > For instance https://github.com/open-mpi/hwloc/issues/536 says HiFive
> > Unmatched with SiFive Freedom U740 running Linux 5.15 exposes a single
> > core with 4 threads instead of 4 cores, while StarFive VisionFive v1
> > with JH7100 running 5.18.5 correctly exposes 2 cores.
> 
> And with Sudeep's patches applied I get (next-20220704):
> # hwloc-calc -N core all
> 1
> # hwloc-calc -N pu all
> 4
> On a PolarFire SoC (so the same as a SiFive U540).
> So unfortunately, these patches are not the fix you seek!
>

Not sure what you mean by that?

> Wracked my brains for a bit, but could not see any differences
> between the U740 and the JH7100. Culprit seems to be the lack
> of a cpu-map node (which is only present in the downstream dt).
>

Indeed, the topology depends on /cpu-map node. However on ARM64 we do
have fallback settings in absence of /cpu-map node so that it is handled
correctly. I wasn't sure what was or can be done on RISC-V as /cpu-map
is optional.

> I've sent patches for the upstream devicetrees:
> https://lore.kernel.org/linux-riscv/20220705190435.1790466-1-mail@conchuod.ie/
>

I will take a look.

> > Does it depend a lot on the platform because
> > device-tree and/or ACPI aren't always properly filled by vendors?

Absolutely.

> > Does it depend a lot on the Linux kernel version?

Ideally not much, but hey we had some issues on Arm64 too which this series
is addressing.

> > Should I expect significant improvements for both in the next months?

Not much in topology, or at least nothing is planned. I have no idea about NUMA.


Hi Conor,

I would have preferred that you add me to the original thread and refer to
this thread from there. I don't want to derail the discussion in this
thread as nothing much can be done here.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
  2022-07-05 20:07       ` Sudeep Holla
@ 2022-07-05 20:14         ` Conor.Dooley
  2022-07-05 20:22           ` Sudeep Holla
  0 siblings, 1 reply; 36+ messages in thread
From: Conor.Dooley @ 2022-07-05 20:14 UTC (permalink / raw)
  To: sudeep.holla, Conor.Dooley
  Cc: Brice.Goglin, linux-kernel, gregkh, Valentina.FernandezAlanis,
	vincent.guittot, dietmar.eggemann, wangqing, robh+dt, rafael,
	ionela.voinescu, pierre.gondois, linux-arm-kernel, linux-riscv,
	kernel



On 05/07/2022 21:07, Sudeep Holla wrote:
> On Tue, Jul 05, 2022 at 07:06:17PM +0000, Conor.Dooley@microchip.com wrote:
>> [Adding back the CC list from the original thread]
>>
>> On 05/07/2022 13:27, Brice Goglin wrote:
>>> Hello Conor
>>>
>>> I am the main developer of hwloc [1] which is used by many people to
>>> detect the topology of servers. We've started to see some users of hwloc
>>> on RISC-V and we got some reports about the topology exposed by
>>> Linux/sysfs being wrong on some platforms.
>>>
>>> For instance https://github.com/open-mpi/hwloc/issues/536 says HiFive
>>> Unmatched with SiFive Freedom U740 running Linux 5.15 exposes a single
>>> core with 4 threads instead of 4 cores, while StarFive VisionFive v1
>>> with JH7100 running 5.18.5 correctly exposes 2 cores.
>>
>> And with Sudeep's patches applied I get (next-20220704):
>> # hwloc-calc -N core all
>> 1
>> # hwloc-calc -N pu all
>> 4
>> On a PolarFire SoC (so the same as a SiFive U540).
>> So unfortunately, these patches are not the fix you seek!
>>
> 
> Not sure what you mean by that?

Nothing meaningful really, just saying that this patchset
was unrelated to the problem he reported in his response to
it.

> 
>> Wracked my brains for a bit, but could not see any differences
>> between the U740 and the JH7100. Culprit seems to be the lack
>> of a cpu-map node (which is only present in the downstream dt).
>>
> 
> Indeed, the topology depends on /cpu-map node. However on ARM64 we do
> have fallback settings in absence of /cpu-map node so that it is handled
> correctly. I wasn't sure what was or can be done on RISC-V as /cpu-map
> is optional.
> 
>> I've sent patches for the upstream devicetrees:
>> https://lore.kernel.org/linux-riscv/20220705190435.1790466-1-mail@conchuod.ie/
>>
> 
> I will take a look.
> 
>>> Does it depend a lot on the platform because
>>> device-tree and/or ACPI aren't always properly filled by vendors?
> 
> Absolutely.
> 
>>> Does it depend a lot on the Linux kernel version?
> 
> Ideally not much, but hey we had some issues on Arm64 too which this series
> is addressing.
> 
>>> Should I expect significant improvements for both in the next months?
> 
> Not much in topology, or at least nothing is planned. I have no idea about NUMA.
> 
> 
> Hi Conor,
> 
> I would have preferred that you add me to the original thread and refer to
> this thread from there. I don't want to derail the discussion in this
> thread as nothing much can be done here.

This is the original thread! It was just one off-list email, a to-me-only
response to this arch_topology thread, which you can see here.

But yeah - should have CCed you on the cpu-map stuff too.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids
  2022-07-05 20:14         ` Conor.Dooley
@ 2022-07-05 20:22           ` Sudeep Holla
  0 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-05 20:22 UTC (permalink / raw)
  To: Conor.Dooley
  Cc: Brice.Goglin, linux-kernel, gregkh, Valentina.FernandezAlanis,
	vincent.guittot, dietmar.eggemann, wangqing, robh+dt, rafael,
	ionela.voinescu, pierre.gondois, linux-arm-kernel, Sudeep Holla,
	linux-riscv, kernel

On Tue, Jul 05, 2022 at 08:14:38PM +0000, Conor.Dooley@microchip.com wrote:
> 
> On 05/07/2022 21:07, Sudeep Holla wrote:

[...]

> > 
> > I would have preferred that you add me to the original thread and refer to
> > this thread from there. I don't want to derail the discussion in this
> > thread as nothing much can be done here.
> 
> This is the original thread! It was just one off-list email, a to-me-only
> response to this arch_topology thread, which you can see here.
>

Ah OK, private mail. I missed that and assumed it was another
thread, sorry about that.

> But yeah - should have CCed you on the cpu-map stuff too.

No worries, I spotted that and responded.

-- 
Regards,
Sudeep


* Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-04 10:16 ` [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask() Sudeep Holla
@ 2022-07-08  0:10   ` Darren Hart
  2022-07-08  8:04     ` Sudeep Holla
  2022-07-08  9:05     ` Ionela Voinescu
  0 siblings, 2 replies; 36+ messages in thread
From: Darren Hart @ 2022-07-08  0:10 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, Greg Kroah-Hartman, conor.dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Ionela Voinescu,
	Pierre Gondois, linux-arm-kernel, linux-riscv

On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> From: Ionela Voinescu <ionela.voinescu@arm.com>

Hi Sudeep and Ionela,

> 
> Currently the cluster identifier is not set on DT based platforms.
> The reset or default value is -1 for all the CPUs. Once we assign the
> cluster identifier values correctly, the cluster_sibling mask will be
> populated and returned by cpu_clustergroup_mask() to contribute in the
> creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> enabled.
> 
> To avoid topologies that will result in questionable or incorrect
> scheduling domains, impose restrictions regarding the span of clusters,

Can you provide a specific example of a valid topology that results in
the wrong thing currently?

> as presented to scheduling domains building code: cluster_sibling should
> not span more or the same CPUs as cpu_coregroup_mask().
> 
> This is needed in order to obtain a strict separation between the MC and
> CLS levels, and maintain the same domains for existing platforms in
> the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
> is redundant and irrelevant for the scheduler.

Unfortunately, I believe this changes the behavior for the existing
Ampere Altra systems, resulting in degraded performance, particularly for
latency-sensitive workloads, by effectively reverting:

  db1e59483d topology: make core_mask include at least cluster_siblings

and ensuring the clustergroup_mask will return with just one CPU for the
condition the above commit addresses.

> 
> While previously the scheduling domain builder code would have removed MC
> as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> now CLS will be removed and MC kept.
> 

This is not desirable for all systems, particularly those which don't
have an L3 but do share other resources - such as the snoop filter in
the case of the Ampere Altra.

While not universally supported, we agreed in the discussion on the
above patch to allow systems to define clusters independently from the
L3 as an LLC since this is also independently defined in PPTT.

Going back to my first comment - does this fix an existing system with a
valid topology? It's not clear to me what that would look like. The
Ampere Altra presents a cluster level in PPTT because that is the
desirable topology for the system. If it's not desirable for another
system to have the cluster topology - shouldn't it not present that
layer to the kernel in the first place?

Thanks,

> Cc: Darren Hart <darren@os.amperecomputing.com>
> Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>  drivers/base/arch_topology.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index e384afb6cac7..591c1f8e15e2 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -686,6 +686,14 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  
>  const struct cpumask *cpu_clustergroup_mask(int cpu)
>  {
> +	/*
> +	 * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
> +	 * cpu_coregroup_mask().
> +	 */
> +	if (cpumask_subset(cpu_coregroup_mask(cpu),
> +			   &cpu_topology[cpu].cluster_sibling))
> +		return get_cpu_mask(cpu);
> +
>  	return &cpu_topology[cpu].cluster_sibling;
>  }
>  
> -- 
> 2.37.0
> 

-- 
Darren Hart
Ampere Computing / OS and Kernel


* Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-08  0:10   ` Darren Hart
@ 2022-07-08  8:04     ` Sudeep Holla
  2022-07-08 16:27       ` Darren Hart
  2022-07-08  9:05     ` Ionela Voinescu
  1 sibling, 1 reply; 36+ messages in thread
From: Sudeep Holla @ 2022-07-08  8:04 UTC (permalink / raw)
  To: Darren Hart
  Cc: linux-kernel, Greg Kroah-Hartman, conor.dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Ionela Voinescu,
	Pierre Gondois, Sudeep Holla, linux-arm-kernel, linux-riscv

Hi Darren,

I will let Ionela or Dietmar cover some of the scheduler aspects as
I don't have much knowledge in that area.

On Thu, Jul 07, 2022 at 05:10:19PM -0700, Darren Hart wrote:
> On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> > From: Ionela Voinescu <ionela.voinescu@arm.com>
> 
> Hi Sudeep and Ionela,
> 
> > 
> > Currently the cluster identifier is not set on DT based platforms.
> > The reset or default value is -1 for all the CPUs. Once we assign the
> > cluster identifier values correctly, the cluster_sibling mask will be
> > populated and returned by cpu_clustergroup_mask() to contribute in the
> > creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> > enabled.
> > 
> > To avoid topologies that will result in questionable or incorrect
> > scheduling domains, impose restrictions regarding the span of clusters,
> 
> Can you provide a specific example of a valid topology that results in
> the wrong thing currently?
>

As a simple example, Juno with 2 clusters and L2 for each cluster. IIUC
MC is preferred instead of CLS and both MC and CLS domains are an exact
match.

> > 
> > While previously the scheduling domain builder code would have removed MC
> > as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> > cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> > now CLS will be removed and MC kept.
> > 
> 
> This is not desirable for all systems, particularly those which don't
> have an L3 but do share other resources - such as the snoop filter in
> the case of the Ampere Altra.
> 
> While not universally supported, we agreed in the discussion on the
> above patch to allow systems to define clusters independently from the
> L3 as an LLC since this is also independently defined in PPTT.
>
> Going back to my first comment - does this fix an existing system with a
> valid topology? 

Yes, Juno, as mentioned above.

> It's not clear to me what that would look like. The Ampere Altra presents
> a cluster level in PPTT because that is the desirable topology for the
> system.

Absolutely wrong reason. It should present it because the hardware is so,
not because some OSPM desires something in some way. Sorry, that's not what
DT/ACPI is designed for. If 2 different OSPMs desire different things, then
one ACPI will not be sufficient.

> If it's not desirable for another system to have the cluster topology -
> shouldn't it not present that layer to the kernel in the first place?

Absolutely 100% yes, it must present it if the hardware is designed so.
No ifs or buts.

-- 
Regards,
Sudeep


* Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-08  0:10   ` Darren Hart
  2022-07-08  8:04     ` Sudeep Holla
@ 2022-07-08  9:05     ` Ionela Voinescu
  2022-07-08 16:14       ` Darren Hart
  1 sibling, 1 reply; 36+ messages in thread
From: Ionela Voinescu @ 2022-07-08  9:05 UTC (permalink / raw)
  To: Darren Hart
  Cc: Sudeep Holla, linux-kernel, Greg Kroah-Hartman, conor.dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Pierre Gondois,
	linux-arm-kernel, linux-riscv

Hi Darren,

On Thursday 07 Jul 2022 at 17:10:19 (-0700), Darren Hart wrote:
> On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> > From: Ionela Voinescu <ionela.voinescu@arm.com>
> 
> Hi Sudeep and Ionela,
> 
> > 
> > Currently the cluster identifier is not set on DT based platforms.
> > The reset or default value is -1 for all the CPUs. Once we assign the
> > cluster identifier values correctly, the cluster_sibling mask will be
> > populated and returned by cpu_clustergroup_mask() to contribute in the
> > creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> > enabled.
> > 
> > To avoid topologies that will result in questionable or incorrect
> > scheduling domains, impose restrictions regarding the span of clusters,
> 
> Can you provide a specific example of a valid topology that results in
> the wrong thing currently?
> 

When CONFIG_SCHED_CLUSTER=y, all typical big.LITTLE platforms will end up
having a CLS level instead of MC, with an extra flag for the CLS level:
SD_PREFER_SIBLING. In addition to this, potentially broken cluster
descriptions in DT (let's say clusters spanning more CPUs than the LLC
domain) will result in broken scheduler topologies.

This drew our attention to the fact that the span of clusters should be
restricted to ensure they always span fewer CPUs than the LLC, if LLC
information exists and the LLC spans more than 1 core. But the Ampere Altra
functionality you introduced is maintained. I'll detail this below.

> > as presented to scheduling domains building code: cluster_sibling should
> > not span more or the same CPUs as cpu_coregroup_mask().
> > 
> > This is needed in order to obtain a strict separation between the MC and
> > CLS levels, and maintain the same domains for existing platforms in
> > the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
> > is redundant and irrelevant for the scheduler.
> 
> Unfortunately, I believe this changes the behavior for the existing
> Ampere Altra systems, resulting in degraded performance particularly
> latency sensitive workloads by effectively reverting:
> 
>   db1e59483d topology: make core_mask include at least cluster_siblings
> 
> and ensuring the clustergroup_mask will return with just one CPU for the
> condition the above commit addresses.
> 

It does not change the functionality on Ampere Altra. cpu_coregroup_mask()
will still return 2 CPUs (cluster span). The difference is that
cpu_clustergroup_mask() will see that cpu_coregroup_mask() spans the same
CPUs and it will return a single CPU. This results in the CLS level
being invalidated, and the MC level maintained. But MC will span 2 CPUs,
instead of 1, which was the case before your fix. This is alright as
MC and CLS have the same flags so the existing functionality is fully
maintained.
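
The interaction between the two masks can be modelled outside the kernel.
The following is a minimal sketch, not the kernel implementation: plain
64-bit bitmasks stand in for struct cpumask, the names mask_t, topo,
set_cpu, coregroup_mask and clustergroup_mask are hypothetical, and the
NUMA and package handling of the real cpu_coregroup_mask() is left out.
It assumes that the MC span starts from the LLC siblings and is widened to
the cluster siblings when the cluster covers them (the effect of
db1e59483d), and that the CLS span collapses to the CPU itself when the
cluster covers the MC span (the check added by this patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Bit n set means CPU n is part of the mask (stand-in for struct cpumask). */
typedef uint64_t mask_t;

/* Simplified per-CPU topology record; the kernel structure has more fields. */
struct cpu_topo {
	mask_t llc_sibling;     /* CPUs sharing the last level cache */
	mask_t cluster_sibling; /* CPUs in the same cluster */
};

struct cpu_topo topo[NR_CPUS];

/* Hypothetical helper to describe one CPU of the modelled system. */
void set_cpu(int cpu, mask_t llc, mask_t cluster)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	topo[cpu].llc_sibling = llc;
	topo[cpu].cluster_sibling = cluster;
}

/* True if every CPU in a is also in b, like cpumask_subset(a, b). */
bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/* MC span: LLC siblings, widened to the cluster if the cluster covers them. */
mask_t coregroup_mask(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	mask_t core = topo[cpu].llc_sibling;

	if (mask_subset(core, topo[cpu].cluster_sibling))
		core = topo[cpu].cluster_sibling;
	return core;
}

/* CLS span: only the CPU itself when the cluster covers the MC span. */
mask_t clustergroup_mask(int cpu)
{
	if (mask_subset(coregroup_mask(cpu), topo[cpu].cluster_sibling))
		return (mask_t)1 << cpu;
	return topo[cpu].cluster_sibling;
}
```

With an Altra-like input (no shared LLC, clusters of 2 CPUs),
set_cpu(0, 0x1, 0x3) makes coregroup_mask(0) return 0x3 and
clustergroup_mask(0) return 0x1, i.e. MC spans 2 CPUs and CLS degenerates.
With a cluster that is strictly smaller than the LLC, set_cpu(0, 0xf, 0x3),
the model keeps the 2-CPU cluster span for CLS and the 4-CPU LLC span for MC.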

I've tested on Ampere Altra:

without patch:

cpu0/domain0/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
cpu0/domain0/min_interval:2
cpu0/domain0/name:CLS

cpu0/domain1/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_PREFER_SIBLING
cpu0/domain1/min_interval:80
cpu0/domain1/name:DIE

cpu0/domain2/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SERIALIZE SD_OVERLAP SD_NUMA
cpu0/domain2/min_interval:160
cpu0/domain2/name:NUMA

with this patch:

cpu0/domain0/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
cpu0/domain0/min_interval:2
cpu0/domain0/name:MC

cpu0/domain1/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_PREFER_SIBLING
cpu0/domain1/min_interval:80
cpu0/domain1/name:DIE

cpu0/domain2/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SERIALIZE SD_OVERLAP SD_NUMA
cpu0/domain2/min_interval:160
cpu0/domain2/name:NUMA

> > 
> > While previously the scheduling domain builder code would have removed MC
> > as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> > cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> > now CLS will be removed and MC kept.
> > 
> 
> This is not desirable for all systems, particularly those which don't
> have an L3 but do share other resources - such as the snoop filter in
> the case of the Ampere Altra.
> 
> While not universally supported, we agreed in the discussion on the
> above patch to allow systems to define clusters independently from the
> L3 as an LLC since this is also independently defined in PPTT.
> 
> Going back to my first comment - does this fix an existing system with a
> valid topology? It's not clear to me what that would look like. The
> Ampere Altra presents a cluster level in PPTT because that is the
> desirable topology for the system. If it's not desirable for another
> system to have the cluster topology - shouldn't it not present that
> layer to the kernel in the first place?
> 

Hopefully my comments above have clarified these.

Thanks,
Ionela.

> Thanks,
> 
> > Cc: Darren Hart <darren@os.amperecomputing.com>
> > Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
> > Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> > ---
> >  drivers/base/arch_topology.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index e384afb6cac7..591c1f8e15e2 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -686,6 +686,14 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
> >  
> >  const struct cpumask *cpu_clustergroup_mask(int cpu)
> >  {
> > +	/*
> > +	 * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
> > +	 * cpu_coregroup_mask().
> > +	 */
> > +	if (cpumask_subset(cpu_coregroup_mask(cpu),
> > +			   &cpu_topology[cpu].cluster_sibling))
> > +		return get_cpu_mask(cpu);
> > +
> >  	return &cpu_topology[cpu].cluster_sibling;
> >  }
> >  
> > -- 
> > 2.37.0
> > 
> 
> -- 
> Darren Hart
> Ampere Computing / OS and Kernel


* Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-08  9:05     ` Ionela Voinescu
@ 2022-07-08 16:14       ` Darren Hart
  0 siblings, 0 replies; 36+ messages in thread
From: Darren Hart @ 2022-07-08 16:14 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: Sudeep Holla, linux-kernel, Greg Kroah-Hartman, conor.dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Pierre Gondois,
	linux-arm-kernel, linux-riscv

On Fri, Jul 08, 2022 at 10:05:32AM +0100, Ionela Voinescu wrote:
> Hi Darren,
> 
> On Thursday 07 Jul 2022 at 17:10:19 (-0700), Darren Hart wrote:
> > On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> > > From: Ionela Voinescu <ionela.voinescu@arm.com>
> > 
> > Hi Sudeep and Ionela,
> > 
> > > 
> > > Currently the cluster identifier is not set on DT based platforms.
> > > The reset or default value is -1 for all the CPUs. Once we assign the
> > > cluster identifier values correctly, the cluster_sibling mask will be
> > > populated and returned by cpu_clustergroup_mask() to contribute in the
> > > creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> > > enabled.
> > > 
> > > To avoid topologies that will result in questionable or incorrect
> > > scheduling domains, impose restrictions regarding the span of clusters,
> > 
> > Can you provide a specific example of a valid topology that results in
> > the wrong thing currently?
> > 
> 
> When CONFIG_SCHED_CLUSTER=y, all typical big.LITTLE platforms will end up
> having a CLS level instead of MC, with an extra flag for the CLS level:
> SD_PREFER_SIBLING. Additional to this, potentially broken cluster
> descriptions in DT (let's say clusters spanning more CPUs than the LLC
> domain) will result in broken scheduler topologies.

You addressed my primary concern below, thank you. Re this point, I was
concerned that we were prioritizing correcting "broken cluster
descriptions" over "correct, but unusual cluster descriptions". Your
solution seems to elegantly address both.

> 
> This drew our attention that the span of clusters should be restricted
> to ensure they always span less CPUs than LLC, if LLC information exists
> and LLC spans more than 1 core. But the Ampere Altra functionality you
> introduced is maintained. I'll detail this below.
> 
> > > as presented to scheduling domains building code: cluster_sibling should
> > > not span more or the same CPUs as cpu_coregroup_mask().
> > > 
> > > This is needed in order to obtain a strict separation between the MC and
> > > CLS levels, and maintain the same domains for existing platforms in
> > > the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
> > > is redundant and irrelevant for the scheduler.
> > 
> > Unfortunately, I believe this changes the behavior for the existing
> > Ampere Altra systems, resulting in degraded performance particularly
> > latency sensitive workloads by effectively reverting:
> > 
> >   db1e59483d topology: make core_mask include at least cluster_siblings
> > 
> > and ensuring the clustergroup_mask will return with just one CPU for the
> > condition the above commit addresses.
> > 
> 
> It does not change the functionality on Ampere Altra. cpu_coregroup_mask
> will still return 2 CPUs (cluster span). The difference is that
> cpu_clustergroup_mask will see that cpu_coregroup_masks spans the same
> CPUs and it will return a single CPU. This results in the CLS level
> being invalidated, and the MC level maintained. But MC will span 2 CPUs,
> instead of 1, which was the case before your fix. This is alright as
> MC and CLS have the same flags so the existing functionality is fully
> maintained.

Ah, of course. I missed the combined impact of my earlier change plus
yours, which is to first expand MC and then to collapse CLS. It's a
little roundabout for the Altra, but that seems reasonable as it's a
bit of a corner case in terms of topologies.

Thank you for the explanation.

-- 
Darren Hart
Ampere Computing / OS and Kernel


* Re: [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask()
  2022-07-08  8:04     ` Sudeep Holla
@ 2022-07-08 16:27       ` Darren Hart
  0 siblings, 0 replies; 36+ messages in thread
From: Darren Hart @ 2022-07-08 16:27 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-kernel, Greg Kroah-Hartman, conor.dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Ionela Voinescu,
	Pierre Gondois, linux-arm-kernel, linux-riscv

On Fri, Jul 08, 2022 at 09:04:24AM +0100, Sudeep Holla wrote:
> Hi Darren,
> 
> I will let Ionela or Dietmar cover some of the scheduler aspects as
> I don't have much knowledge in that area.
> 
> On Thu, Jul 07, 2022 at 05:10:19PM -0700, Darren Hart wrote:
> > On Mon, Jul 04, 2022 at 11:16:01AM +0100, Sudeep Holla wrote:
> > > From: Ionela Voinescu <ionela.voinescu@arm.com>
> > 
> > Hi Sudeep and Ionela,
> > 
> > > 
> > > Currently the cluster identifier is not set on DT based platforms.
> > > The reset or default value is -1 for all the CPUs. Once we assign the
> > > cluster identifier values correctly, the cluster_sibling mask will be
> > > populated and returned by cpu_clustergroup_mask() to contribute in the
> > > creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> > > enabled.
> > > 
> > > To avoid topologies that will result in questionable or incorrect
> > > scheduling domains, impose restrictions regarding the span of clusters,
> > 
> > Can you provide a specific example of a valid topology that results in
> > the wrong thing currently?
> >
> 
> As a simple example, Juno with 2 clusters and L2 for each cluster. IIUC
> MC is preferred instead of CLS and both MC and CLS domains are exact
> match.
> 
> > > 
> > > While previously the scheduling domain builder code would have removed MC
> > > as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> > > cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> > > now CLS will be removed and MC kept.
> > > 
> > 
> > This is not desirable for all systems, particularly those which don't
> > have an L3 but do share other resources - such as the snoop filter in
> > the case of the Ampere Altra.

I was wrong here. This patch also modifies the coregroup; the MC after
this patch is equivalent to the CLS before the patch. The Altra is not
negatively impacted here.

> > 
> > While not universally supported, we agreed in the discussion on the
> > above patch to allow systems to define clusters independently from the
> > L3 as an LLC since this is also independently defined in PPTT.
> >
> > Going back to my first comment - does this fix an existing system with a
> > valid topology? 
> 
> Yes as mentioned above Juno.
> 
> > It's not clear to me what that would look like. The Ampere Altra presents
> > a cluster level in PPTT because that is the desirable topology for the
> > system.
> 
> Absolutely wrong reason. It should present because the hardware is so,
> not because some OSPM desires something in someway. Sorry that's not how
> DT/ACPI is designed for. If 2 different OSPM desires different things, then
> one ACPI will not be sufficient.

Agree. I worded that badly. I should have said the Altra presents a PPTT
topology that accurately reflects the hardware. There is no shared
cpu-side LLC, and there is an affinity between the DSU pairs which share
a snoop filter.

I do think the general assumption that MC shares a cpu-side LLC will
continue to present challenges to the Altra topology in terms of ongoing
changes to the code. I don't have a good solution to that at the
moment, something I'll continue to think on.

> 
> > If it's not desirable for another system to have the cluster topology -
> > shouldn't it not present that layer to the kernel in the first place?
> 
> Absolutely 100% yes, it must present it if the hardware is designed so.
> No if or but.
> 
> -- 
> Regards,
> Sudeep

Thanks Sudeep,

-- 
Darren Hart
Ampere Computing / OS and Kernel


* Re: [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes
  2022-07-04 10:15 ` [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes Sudeep Holla
@ 2022-07-19 14:22   ` Geert Uytterhoeven
  2022-07-19 14:37     ` Conor Dooley
  0 siblings, 1 reply; 36+ messages in thread
From: Geert Uytterhoeven @ 2022-07-19 14:22 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Conor Dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Ionela Voinescu,
	Pierre Gondois, Linux ARM, linux-riscv, Gavin Shan

Hi Sudeep,

On Mon, Jul 4, 2022 at 12:19 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> Currently ACPI populates just the minimum information about the last
> level cache from PPTT in order to feed the same to build sched_domains.
> Similar support for DT platforms is not present.
>
> In order to enable the same, the entire cache hierarchy information can
> be built as part of CPU topology parsing both on ACPI and DT platforms.
>
> Note that this change builds the cacheinfo early even on ACPI systems,
> but the current mechanism of building llc_sibling mask remains unchanged.
>
> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Thanks for your patch!

> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -7,6 +7,7 @@
>   */
>
>  #include <linux/acpi.h>
> +#include <linux/cacheinfo.h>
>  #include <linux/cpu.h>
>  #include <linux/cpufreq.h>
>  #include <linux/device.h>
> @@ -780,15 +781,28 @@ __weak int __init parse_acpi_topology(void)
>  #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>  void __init init_cpu_topology(void)
>  {
> +       int ret, cpu;
> +
>         reset_cpu_topology();
> +       ret = parse_acpi_topology();
> +       if (!ret)
> +               ret = of_have_populated_dt() && parse_dt_topology();
>
> -       /*
> -        * Discard anything that was parsed if we hit an error so we
> -        * don't use partial information.
> -        */
> -       if (parse_acpi_topology())
> -               reset_cpu_topology();
> -       else if (of_have_populated_dt() && parse_dt_topology())
> +       if (ret) {
> +               /*
> +                * Discard anything that was parsed if we hit an error so we
> +                * don't use partial information.
> +                */
>                 reset_cpu_topology();
> +               return;
> +       }
> +
> +       for_each_possible_cpu(cpu) {
> +               ret = detect_cache_attributes(cpu);
> +               if (ret) {
> +                       pr_info("Early cacheinfo failed, ret = %d\n", ret);

This is triggered

    Early cacheinfo failed, ret = -12

on all my RV64 platforms (K210, PolarFire, StarLight).
-12 = -ENOMEM.

The boot continues regardless, and the K210 even has enough spare
RAM after boot to run "ls", unlike two weeks ago ;-)
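
For reference, the value in the log line can be decoded in user space. A
small sketch, assuming a Linux or other POSIX libc where ENOMEM is 12;
decode_ret is a hypothetical helper name, not a kernel function:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Turn a kernel-style return value (0 or a negated errno) into the
 * libc description of that errno.
 */
const char *decode_ret(int ret)
{
	assert(ret <= 0);
	return strerror(-ret);
}
```

Passing -12 to decode_ret() yields the same text as strerror(ENOMEM), which
glibc renders as "Cannot allocate memory".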

> +                       break;
> +               }
> +       }
>  }

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes
  2022-07-19 14:22   ` Geert Uytterhoeven
@ 2022-07-19 14:37     ` Conor Dooley
  2022-07-19 15:05       ` Sudeep Holla
  0 siblings, 1 reply; 36+ messages in thread
From: Conor Dooley @ 2022-07-19 14:37 UTC (permalink / raw)
  To: Geert Uytterhoeven, Sudeep Holla
  Cc: Linux Kernel Mailing List, Greg Kroah-Hartman, Conor Dooley,
	valentina.fernandezalanis, Vincent Guittot, Dietmar Eggemann,
	Qing Wang, Rob Herring, Rafael J . Wysocki, Ionela Voinescu,
	Pierre Gondois, Linux ARM, linux-riscv, Gavin Shan

On 19/07/2022 15:22, Geert Uytterhoeven wrote:
> Hi Sudeep,
>

Hey Geert,

  
> On Mon, Jul 4, 2022 at 12:19 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>> Currently ACPI populates just the minimum information about the last
>> level cache from PPTT in order to feed the same to build sched_domains.
>> Similar support for DT platforms is not present.
>>
>> In order to enable the same, the entire cache hierarchy information can
>> be built as part of CPU topology parsing both on ACPI and DT platforms.
>>
>> Note that this change builds the cacheinfo early even on ACPI systems,
>> but the current mechanism of building llc_sibling mask remains unchanged.
>>
>> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
>> Reviewed-by: Gavin Shan <gshan@redhat.com>
>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> 
> Thanks for your patch!
> 
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -7,6 +7,7 @@
>>    */
>>
>>   #include <linux/acpi.h>
>> +#include <linux/cacheinfo.h>
>>   #include <linux/cpu.h>
>>   #include <linux/cpufreq.h>
>>   #include <linux/device.h>
>> @@ -780,15 +781,28 @@ __weak int __init parse_acpi_topology(void)
>>   #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>>   void __init init_cpu_topology(void)
>>   {
>> +       int ret, cpu;
>> +
>>          reset_cpu_topology();
>> +       ret = parse_acpi_topology();
>> +       if (!ret)
>> +               ret = of_have_populated_dt() && parse_dt_topology();
>>
>> -       /*
>> -        * Discard anything that was parsed if we hit an error so we
>> -        * don't use partial information.
>> -        */
>> -       if (parse_acpi_topology())
>> -               reset_cpu_topology();
>> -       else if (of_have_populated_dt() && parse_dt_topology())
>> +       if (ret) {
>> +               /*
>> +                * Discard anything that was parsed if we hit an error so we
>> +                * don't use partial information.
>> +                */
>>                  reset_cpu_topology();
>> +               return;
>> +       }
>> +
>> +       for_each_possible_cpu(cpu) {
>> +               ret = detect_cache_attributes(cpu);
>> +               if (ret) {
>> +                       pr_info("Early cacheinfo failed, ret = %d\n", ret);
> 
> This is triggered
> 
>      Early cacheinfo failed, ret = -12
> 
> on all my RV64 platforms (K210, PolarFire, StarLight).

This should be fixed by Sudeep's most recent patchset, at least
it was when I tested it!
https://lore.kernel.org/all/20220713133344.1201247-1-sudeep.holla@arm.com/

> -12 = -ENOMEM.
> 
> The boot continues regardless, and the K210 even has enough spare
> RAM after boot to run "ls", unlike two weeks ago ;-)
> 
>> +                       break;
>> +               }
>> +       }
>>   }
> 
> Gr{oetje,eeting}s,
> 
>                          Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                  -- Linus Torvalds
> 


* Re: [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes
  2022-07-19 14:37     ` Conor Dooley
@ 2022-07-19 15:05       ` Sudeep Holla
  0 siblings, 0 replies; 36+ messages in thread
From: Sudeep Holla @ 2022-07-19 15:05 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Geert Uytterhoeven, Sudeep Holla, Linux Kernel Mailing List,
	Greg Kroah-Hartman, Conor Dooley, valentina.fernandezalanis,
	Vincent Guittot, Dietmar Eggemann, Qing Wang, Rob Herring,
	Rafael J . Wysocki, Ionela Voinescu, Pierre Gondois, Linux ARM,
	linux-riscv, Gavin Shan

On Tue, Jul 19, 2022 at 03:37:22PM +0100, Conor Dooley wrote:
> On 19/07/2022 15:22, Geert Uytterhoeven wrote:
> > Hi Sudeep,
> > 
> 
> Hey Geert,
> 
[...]

> > 
> > This is triggered
> > 
> >      Early cacheinfo failed, ret = -12
> > 
> > on all my RV64 platforms (K210, PolarFire, StarLight).
> 
> This should be fixed by Sudeep's most recent patchset, at least
> it was when I tested it!
> https://lore.kernel.org/all/20220713133344.1201247-1-sudeep.holla@arm.com/
>

Conor, you beat me on response speed :).

> > -12 = -ENOMEM.
> > 
> > The boot continues regardless, and the K210 even has enough spare
> > RAM after boot to run "ls", unlike two weeks ago ;-)
> >

Yes, Conor initially reported this and I suspected it had something to do
with per-cpu allocation, as the early cacheinfo failed but succeeded at the
device initcall level. However, when fixing a hotplug issue, I moved the
detection of cache attributes for all CPUs from the boot CPU to the
individual CPUs in the secondary startup path, which seems to fix the issue,
as I assume the per-cpu allocation is ready to use at that stage.

However we still have one pending issue[0] to address even after [1], but
that doesn't affect DT platforms.

-- 
Regards,
Sudeep

[0] https://lore.kernel.org/all/20220718174151.GA462603@roeck-us.net/
[1] https://lore.kernel.org/all/20220715102609.2160689-1-sudeep.holla@arm.com/

end of thread (newest message: 2022-07-19 15:05 UTC)

Thread overview: 36+ messages
2022-07-04 10:15 [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 01/21] ACPI: PPTT: Use table offset as fw_token instead of virtual address Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 02/21] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 03/21] cacheinfo: Add helper to access any cache index for a given CPU Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 04/21] cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 05/21] cacheinfo: Add support to check if last level cache(LLC) is valid or shared Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 06/21] cacheinfo: Allow early detection and population of cache attributes Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 07/21] cacheinfo: Use cache identifiers to check if the caches are shared if available Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 08/21] cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 09/21] arch_topology: Add support to parse and detect cache attributes Sudeep Holla
2022-07-19 14:22   ` Geert Uytterhoeven
2022-07-19 14:37     ` Conor Dooley
2022-07-19 15:05       ` Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 10/21] arch_topology: Use the last level cache information from the cacheinfo Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 11/21] arm64: topology: Remove redundant setting of llc_id in CPU topology Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 12/21] arch_topology: Drop LLC identifier stash from the " Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 13/21] arch_topology: Set thread sibling cpumask only within the cluster Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 14/21] arch_topology: Check for non-negative value rather than -1 for IDs validity Sudeep Holla
2022-07-04 10:15 ` [PATCH v6 15/21] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found Sudeep Holla
2022-07-04 10:16 ` [PATCH v6 16/21] arch_topology: Don't set cluster identifier as physical package identifier Sudeep Holla
2022-07-04 10:16 ` [PATCH v6 17/21] arch_topology: Limit span of cpu_clustergroup_mask() Sudeep Holla
2022-07-08  0:10   ` Darren Hart
2022-07-08  8:04     ` Sudeep Holla
2022-07-08 16:27       ` Darren Hart
2022-07-08  9:05     ` Ionela Voinescu
2022-07-08 16:14       ` Darren Hart
2022-07-04 10:16 ` [PATCH v6 18/21] arch_topology: Set cluster identifier in each core/thread from /cpu-map Sudeep Holla
2022-07-04 10:16 ` [PATCH v6 19/21] arch_topology: Add support for parsing sockets in /cpu-map Sudeep Holla
2022-07-04 10:16 ` [PATCH v6 20/21] arch_topology: Warn that topology for nested clusters is not supported Sudeep Holla
2022-07-04 10:16 ` [PATCH v6 21/21] ACPI: Remove the unused find_acpi_cpu_cache_topology() Sudeep Holla
2022-07-04 15:10 ` [PATCH v6 00/21] arch_topology: Updates to add socket support and fix cluster ids Conor.Dooley
2022-07-04 15:20   ` Sudeep Holla
     [not found]   ` <507c6b64-fc23-3eea-e4c1-4d426025d658@inria.fr>
2022-07-05 19:06     ` Conor.Dooley
2022-07-05 20:07       ` Sudeep Holla
2022-07-05 20:14         ` Conor.Dooley
2022-07-05 20:22           ` Sudeep Holla