[PATCH 0/3] support for describing cache topology from DT

* [PATCH 0/3] support for describing cache topology from DT
From: Qing Wang @ 2022-03-29  9:15 UTC
  To: Catalin Marinas, Will Deacon, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Sudeep Holla,
	Greg Kroah-Hartman, Rafael J. Wysocki, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	linux-arm-kernel, linux-kernel, linuxppc-dev
  Cc: Wang Qing

From: Wang Qing <wangqing@vivo.com>

Without ACPI we currently know nothing about the cache topology, but it
can in fact be obtained from the DT, for example:
*		cpu0: cpu@000 {
*			next-level-cache = <&L2_1>;
*			L2_1: l2-cache {
*				compatible = "cache";
*				next-level-cache = <&L3_1>;
*			};
*			L3_1: l3-cache {
*				compatible = "cache";
*			};
*		};
*
*		cpu1: cpu@001 {
*			next-level-cache = <&L2_1>;
*			cpu-idle-states = <&clusteroff_l &mcusysoff
*						&system_mem &system_pll &system_bus
*						&s2idle>;
*		};
*		cpu2: cpu@002 {
*			L2_2: l2-cache {
*				compatible = "cache";
*				next-level-cache = <&L3_1>;
*			};
*		};
*
*		cpu3: cpu@003 {
*			next-level-cache = <&L2_2>;
*		};
Building the cache topology has many benefits; this series shows part of
its usage.

Wang Qing (3):
  sched: topology: add input parameter for sched_domain_flags_f()
  arch_topology: support for describing cache topology from DT
  arm64: add arm64 default topology

 arch/arm64/kernel/smp.c        | 56 ++++++++++++++++++++++++++
 arch/powerpc/kernel/smp.c      |  4 +-
 arch/x86/kernel/smpboot.c      |  8 ++--
 drivers/base/arch_topology.c   | 89 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/arch_topology.h  |  4 ++
 include/linux/sched/topology.h | 10 ++---
 kernel/sched/topology.c        |  2 +-
 7 files changed, 160 insertions(+), 13 deletions(-)

-- 
2.7.4


* [PATCH 1/3] sched: topology: add input parameter for sched_domain_flags_f()
From: Qing Wang @ 2022-03-29  9:15 UTC
  To: Catalin Marinas, Will Deacon, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Sudeep Holla,
	Greg Kroah-Hartman, Rafael J. Wysocki, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	linux-arm-kernel, linux-kernel, linuxppc-dev
  Cc: Wang Qing

From: Wang Qing <wangqing@vivo.com>

The flags returned by sched_domain_flags_f() callbacks are currently static,
but a lot of useful information can be derived from the cpu_map, e.g. whether
its CPUs share a cache.

Pass the cpu_map to the callback. This allows custom extensions without
affecting current behavior.
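
As a rough userspace illustration of the idea (a sketch, not kernel code), a
flags callback that receives the span of the domain can decide per span
whether to report a shared cache. struct cpumask_model, the MODEL_ names and
the shared_cache_groups table are assumptions made up for this example; only
the shape of the callback follows the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
struct cpumask_model {
	uint64_t bits;
};

/* Hypothetical flag value, standing in for SD_SHARE_PKG_RESOURCES. */
#define MODEL_SD_SHARE_PKG_RESOURCES 0x1

/* Callback type in the new style: it receives the span being built. */
typedef int (*model_sd_flags_f)(const struct cpumask_model *cpu_map);

/* Made-up platform data: CPUs 0-1 share one cache, CPUs 2-3 another. */
static const uint64_t shared_cache_groups[] = { 0x3, 0xc };

/* Report a shared cache only when the span matches a known cache group. */
static int model_cluster_flags(const struct cpumask_model *cpu_map)
{
	size_t i;

	assert(cpu_map != NULL);
	for (i = 0; i < sizeof(shared_cache_groups) / sizeof(shared_cache_groups[0]); i++) {
		if (cpu_map->bits == shared_cache_groups[i])
			return MODEL_SD_SHARE_PKG_RESOURCES;
	}
	return 0;
}

/* Mirrors the sd_init() call site: the level's mask is handed to the callback. */
static int model_sd_init_flags(model_sd_flags_f fn, const struct cpumask_model *span)
{
	return fn ? fn(span) : 0;
}
```

With the old void signature the callback above could only return one fixed
value for every span; here a span covering CPUs 0-1 and a span covering all
four CPUs can get different flags from the same function.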

Signed-off-by: Wang Qing <wangqing@vivo.com>
---
 arch/powerpc/kernel/smp.c      |  4 ++--
 arch/x86/kernel/smpboot.c      |  8 ++++----
 include/linux/sched/topology.h | 10 +++++-----
 kernel/sched/topology.c        |  2 +-
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index de0f6f0..e503d23
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1000,7 +1000,7 @@ static bool shared_caches;
 
 #ifdef CONFIG_SCHED_SMT
 /* cpumask of CPUs with asymmetric SMT dependency */
-static int powerpc_smt_flags(void)
+static int powerpc_smt_flags(const struct cpumask *cpu_map)
 {
 	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
 
@@ -1018,7 +1018,7 @@ static int powerpc_smt_flags(void)
  * since the migrated task remains cache hot. We want to take advantage of this
  * at the scheduler level so an extra topology level is required.
  */
-static int powerpc_shared_cache_flags(void)
+static int powerpc_shared_cache_flags(const struct cpumask *cpu_map)
 {
 	return SD_SHARE_PKG_RESOURCES;
 }
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 2ef1477..c005a8e
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -535,25 +535,25 @@ static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 
 
 #if defined(CONFIG_SCHED_SMT) || defined(CONFIG_SCHED_CLUSTER) || defined(CONFIG_SCHED_MC)
-static inline int x86_sched_itmt_flags(void)
+static inline int x86_sched_itmt_flags(const struct cpumask *cpu_map)
 {
 	return sysctl_sched_itmt_enabled ? SD_ASYM_PACKING : 0;
 }
 
 #ifdef CONFIG_SCHED_MC
-static int x86_core_flags(void)
+static int x86_core_flags(const struct cpumask *cpu_map)
 {
 	return cpu_core_flags() | x86_sched_itmt_flags();
 }
 #endif
 #ifdef CONFIG_SCHED_SMT
-static int x86_smt_flags(void)
+static int x86_smt_flags(const struct cpumask *cpu_map)
 {
 	return cpu_smt_flags() | x86_sched_itmt_flags();
 }
 #endif
 #ifdef CONFIG_SCHED_CLUSTER
-static int x86_cluster_flags(void)
+static int x86_cluster_flags(const struct cpumask *cpu_map)
 {
 	return cpu_cluster_flags() | x86_sched_itmt_flags();
 }
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 56cffe4..6aa985a
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -36,28 +36,28 @@ extern const struct sd_flag_debug sd_flag_debug[];
 #endif
 
 #ifdef CONFIG_SCHED_SMT
-static inline int cpu_smt_flags(void)
+static inline int cpu_smt_flags(const struct cpumask *cpu_map)
 {
 	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
 }
 #endif
 
 #ifdef CONFIG_SCHED_CLUSTER
-static inline int cpu_cluster_flags(void)
+static inline int cpu_cluster_flags(const struct cpumask *cpu_map)
 {
 	return SD_SHARE_PKG_RESOURCES;
 }
 #endif
 
 #ifdef CONFIG_SCHED_MC
-static inline int cpu_core_flags(void)
+static inline int cpu_core_flags(const struct cpumask *cpu_map)
 {
 	return SD_SHARE_PKG_RESOURCES;
 }
 #endif
 
 #ifdef CONFIG_NUMA
-static inline int cpu_numa_flags(void)
+static inline int cpu_numa_flags(const struct cpumask *cpu_map)
 {
 	return SD_NUMA;
 }
@@ -180,7 +180,7 @@ void free_sched_domains(cpumask_var_t doms[], unsigned int ndoms);
 bool cpus_share_cache(int this_cpu, int that_cpu);
 
 typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
-typedef int (*sched_domain_flags_f)(void);
+typedef int (*sched_domain_flags_f)(const struct cpumask *cpu_map);
 
 #define SDTL_OVERLAP	0x01
 
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 05b6c2a..34dfec4
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1556,7 +1556,7 @@ sd_init(struct sched_domain_topology_level *tl,
 	sd_weight = cpumask_weight(tl->mask(cpu));
 
 	if (tl->sd_flags)
-		sd_flags = (*tl->sd_flags)();
+		sd_flags = (*tl->sd_flags)(tl->mask(cpu));
 	if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS,
 			"wrong sd_flags in topology description\n"))
 		sd_flags &= TOPOLOGY_SD_FLAGS;
-- 
2.7.4


* [PATCH 2/3] arch_topology: support for describing cache topology from DT
From: Qing Wang @ 2022-03-29  9:15 UTC
  To: Catalin Marinas, Will Deacon, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Sudeep Holla,
	Greg Kroah-Hartman, Rafael J. Wysocki, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	linux-arm-kernel, linux-kernel, linuxppc-dev
  Cc: Wang Qing

From: Wang Qing <wangqing@vivo.com>

When ACPI is not enabled, we can get the cache topology from the DT, for
example:
*		cpu0: cpu@000 {
*			next-level-cache = <&L2_1>;
*			L2_1: l2-cache {
*				compatible = "cache";
*				next-level-cache = <&L3_1>;
*			};
*			L3_1: l3-cache {
*				compatible = "cache";
*			};
*		};
*
*		cpu1: cpu@001 {
*			next-level-cache = <&L2_1>;
*			cpu-idle-states = <&clusteroff_l &mcusysoff
*						&system_mem &system_pll &system_bus
*						&s2idle>;
*		};
*		cpu2: cpu@002 {
*			L2_2: l2-cache {
*				compatible = "cache";
*				next-level-cache = <&L3_1>;
*			};
*		};
*
*		cpu3: cpu@003 {
*			next-level-cache = <&L2_2>;
*		};
cache_topology holds the device_node pointers obtained by following
"next-level-cache", so it can describe the cache topology at every level.
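
The table can be pictured with a small userspace model (a sketch under
stated assumptions, not the kernel implementation). struct cache_node and all
model_ names are made up for this example, and the model assumes that cpu2
also points at L2_2, which the DT fragment above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS		4
#define MODEL_MAX_CACHE_LEVEL	7

/* Hypothetical stand-in for a DT cache node; "next" plays next-level-cache. */
struct cache_node {
	const char *name;
	const struct cache_node *next;
};

static const struct cache_node l3_1 = { "L3_1", NULL };
static const struct cache_node l2_1 = { "L2_1", &l3_1 };
static const struct cache_node l2_2 = { "L2_2", &l3_1 };

/* First cache reached from each CPU node (assumption: cpu2 uses L2_2). */
static const struct cache_node *const model_cpu_cache[MODEL_NR_CPUS] = {
	&l2_1, &l2_1, &l2_2, &l2_2,
};

/* Per-CPU chain of caches, index 0 being the first next-level-cache. */
static const struct cache_node *model_cache_topology[MODEL_NR_CPUS][MODEL_MAX_CACHE_LEVEL];

/* Follow the chain from each CPU and record one pointer per level. */
static void model_init_cache_topology(void)
{
	int cpu, level;
	const struct cache_node *node;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		level = 0;
		node = model_cpu_cache[cpu];
		while (node && level < MODEL_MAX_CACHE_LEVEL) {
			model_cache_topology[cpu][level++] = node;
			node = node->next;
		}
	}
}

/* Two CPUs share a level when both chains hold the same node there. */
static bool model_share_cache(int cpu1, int cpu2, int level)
{
	assert(cpu1 >= 0 && cpu1 < MODEL_NR_CPUS);
	assert(cpu2 >= 0 && cpu2 < MODEL_NR_CPUS);
	assert(level >= 0 && level < MODEL_MAX_CACHE_LEVEL);

	return model_cache_topology[cpu1][level] &&
	       model_cache_topology[cpu1][level] == model_cache_topology[cpu2][level];
}
```

Because sharing is decided by pointer identity, no cache IDs have to be
invented: two CPUs share a level exactly when their chains reach the same
node at that index.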

Signed-off-by: Wang Qing <wangqing@vivo.com>
---
 drivers/base/arch_topology.c  | 89 ++++++++++++++++++++++++++++++++++++++++++-
 include/linux/arch_topology.h |  4 ++
 2 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636e..41e0301
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -647,6 +647,92 @@ static int __init parse_dt_topology(void)
 }
 #endif
 
+
+/*
+ * cpu cache topology table
+ */
+#define MAX_CACHE_LEVEL 7
+struct device_node *cache_topology[NR_CPUS][MAX_CACHE_LEVEL];
+
+void init_cpu_cache_topology(void)
+{
+	struct device_node *node_cpu, *node_cache;
+	int cpu;
+	int level = 0;
+
+	for_each_possible_cpu(cpu) {
+		node_cpu = of_get_cpu_node(cpu, NULL);
+		if (!node_cpu)
+			continue;
+
+		level = 0;
+		node_cache = node_cpu;
+		while (level < MAX_CACHE_LEVEL) {
+			node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
+			if (!node_cache)
+				break;
+
+			cache_topology[cpu][level++] = node_cache;
+		}
+		of_node_put(node_cpu);
+	}
+}
+
+/*
+ * "private" means shared only by the CPUs in cpu_mask.
+ * Returns -1 if not described in DT.
+ */
+int cpu_share_private_cache(const struct cpumask *cpu_mask)
+{
+	int cache_level, cpu_id;
+	struct cpumask cache_mask;
+	int cpu = cpumask_first(cpu_mask);
+
+	for (cache_level = 0; cache_level < MAX_CACHE_LEVEL; cache_level++) {
+		if (!cache_topology[cpu][cache_level])
+			return -1;
+
+		cpumask_clear(&cache_mask);
+		for (cpu_id = 0; cpu_id < NR_CPUS; cpu_id++) {
+			if (cache_topology[cpu][cache_level] == cache_topology[cpu_id][cache_level])
+				cpumask_set_cpu(cpu_id, &cache_mask);
+		}
+
+		if (cpumask_equal(cpu_mask, &cache_mask))
+			return 1;
+	}
+
+	return 0;
+}
+
+bool cpu_share_llc(int cpu1, int cpu2)
+{
+	int cache_level;
+
+	for (cache_level = MAX_CACHE_LEVEL - 1; cache_level >= 0; cache_level--) {
+		if (!cache_topology[cpu1][cache_level])
+			continue;
+
+		if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
+			return true;
+
+		return false;
+	}
+
+	return false;
+}
+
+bool cpu_share_l2c(int cpu1, int cpu2)
+{
+	if (!cache_topology[cpu1][0])
+		return false;
+
+	if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
+		return true;
+
+	return false;
+}
+
 /*
  * cpu topology table
  */
@@ -684,7 +770,8 @@ void update_siblings_masks(unsigned int cpuid)
 	for_each_online_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
+		if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
+			|| (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
 			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
 		}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18..a402ff6
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -86,6 +86,10 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
 #define topology_cluster_cpumask(cpu)	(&cpu_topology[cpu].cluster_sibling)
 #define topology_llc_cpumask(cpu)	(&cpu_topology[cpu].llc_sibling)
 void init_cpu_topology(void);
+void init_cpu_cache_topology(void);
+int cpu_share_private_cache(const struct cpumask *cpu_mask);
+bool cpu_share_llc(int cpu1, int cpu2);
+bool cpu_share_l2c(int cpu1, int cpu2);
 void store_cpu_topology(unsigned int cpuid);
 const struct cpumask *cpu_coregroup_mask(int cpu);
 const struct cpumask *cpu_clustergroup_mask(int cpu);
-- 
2.7.4


* [PATCH 3/3] arm64: add arm64 default topology
From: Qing Wang @ 2022-03-29  9:15 UTC
  To: Catalin Marinas, Will Deacon, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Sudeep Holla,
	Greg Kroah-Hartman, Rafael J. Wysocki, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	linux-arm-kernel, linux-kernel, linuxppc-dev
  Cc: Wang Qing

From: Wang Qing <wangqing@vivo.com>

default_topology does not fit arm64 well, especially its CPU and cache
topology. Add arm64_topology so that more can be done on top of
CONFIG_GENERIC_ARCH_TOPOLOGY.

The arm64_xxx_flags() callbacks prefer the cache attributes described in
the DT.
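
The three-way decision those callbacks make can be sketched in isolation (a
userspace sketch; model_adjust_flags and the MODEL_ flag values are
hypothetical names used only here). The input is the result of the
shared-cache lookup: 1 if a cache private to the span was found, 0 if none
was found, -1 if the DT does not describe it.

```c
#include <assert.h>

/* Hypothetical flag values, standing in for the scheduler's SD_* bits. */
#define MODEL_SD_SHARE_PKG_RESOURCES	0x1
#define MODEL_SD_SHARE_CPUCAPACITY	0x2

/*
 * Adjust the generic default flags with what the DT says:
 *   share ==  1: set the shared-cache flag
 *   share ==  0: clear the shared-cache flag
 *   share == -1: DT is silent, keep the default untouched
 */
static int model_adjust_flags(int default_flags, int share)
{
	assert(share >= -1 && share <= 1);

	if (share == 1)
		return default_flags | MODEL_SD_SHARE_PKG_RESOURCES;
	if (share == 0)
		return default_flags & ~MODEL_SD_SHARE_PKG_RESOURCES;
	return default_flags;
}
```

Keeping the default when the lookup returns -1 is what lets platforms with
no cache description in the DT behave as before.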

Signed-off-by: Wang Qing <wangqing@vivo.com>
---
 arch/arm64/kernel/smp.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 27df5c1..d245012
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -715,6 +715,60 @@ void __init smp_init_cpus(void)
 	}
 }
 
+#ifdef CONFIG_SCHED_CLUSTER
+static int arm64_cluster_flags(const struct cpumask *cpu_map)
+{
+	int flag = cpu_cluster_flags(cpu_map);
+	int ret = cpu_share_private_cache(cpu_map);
+	if (ret == 1)
+		flag |= SD_SHARE_PKG_RESOURCES;
+	else if (ret == 0)
+		flag &= ~SD_SHARE_PKG_RESOURCES;
+
+	return flag;
+}
+#endif
+
+#ifdef CONFIG_SCHED_MC
+static int arm64_core_flags(const struct cpumask *cpu_map)
+{
+	int flag = cpu_core_flags(cpu_map);
+	int ret = cpu_share_private_cache(cpu_map);
+	if (ret == 1)
+		flag |= SD_SHARE_PKG_RESOURCES;
+	else if (ret == 0)
+		flag &= ~SD_SHARE_PKG_RESOURCES;
+
+	return flag;
+}
+#endif
+
+static int arm64_die_flags(const struct cpumask *cpu_map)
+{
+	int flag = 0;
+	int ret = cpu_share_private_cache(cpu_map);
+	if (ret == 1)
+		flag |= SD_SHARE_PKG_RESOURCES;
+	else if (ret == 0)
+		flag &= ~SD_SHARE_PKG_RESOURCES;
+
+	return flag;
+}
+
+static struct sched_domain_topology_level arm64_topology[] = {
+#ifdef CONFIG_SCHED_SMT
+	{ cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
+#endif
+#ifdef CONFIG_SCHED_CLUSTER
+	{ cpu_clustergroup_mask, arm64_cluster_flags, SD_INIT_NAME(CLS) },
+#endif
+#ifdef CONFIG_SCHED_MC
+	{ cpu_coregroup_mask, arm64_core_flags, SD_INIT_NAME(MC) },
+#endif
+	{ cpu_cpu_mask, arm64_die_flags, SD_INIT_NAME(DIE) },
+	{ NULL, },
+};
+
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	const struct cpu_operations *ops;
@@ -723,6 +777,8 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	unsigned int this_cpu;
 
 	init_cpu_topology();
+	init_cpu_cache_topology();
+	set_sched_topology(arm64_topology);
 
 	this_cpu = smp_processor_id();
 	store_cpu_topology(this_cpu);
-- 
2.7.4


* Re: [PATCH 1/3] sched: topology: add input parameter for sched_domain_flags_f()
From: Peter Zijlstra @ 2022-03-29  9:23 UTC
  To: Qing Wang
  Cc: Catalin Marinas, Will Deacon, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Sudeep Holla,
	Greg Kroah-Hartman, Rafael J. Wysocki, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-arm-kernel,
	linux-kernel, linuxppc-dev

On Tue, Mar 29, 2022 at 02:15:19AM -0700, Qing Wang wrote:
> From: Wang Qing <wangqing@vivo.com>
> 
> sched_domain_flags_f() are statically set now, but actually, we can get a lot
> of necessary information based on the cpu_map. e.g. we can know whether its
> cache is shared.
> 
> Allows custom extension without affecting current.

Still NAK
