linux-kernel.vger.kernel.org archive mirror
* [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up
@ 2014-01-20 12:39 dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 01/11] sched: define sched_domain_topology_info dietmar.eggemann
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

This patch-set cleans up the scheduler domain (sd) level initialisation
code.  It is based on the idea of Peter Zijlstra to use a single sd
init function sketched here: https://lkml.org/lkml/2013/11/5/239

For the conventional (SMT, MC, BOOK, CPU) sd levels, the data which can
be set by the arch comprises the cpu mask and the sd topology flags for
each level.

An arch can either use the existing sd level hierarchy defined in the
scheduler core or specify its own complete sd level hierarchy in its
topology shim layer.

This clean-up is necessary because some of the arch specific settings
are just bitrot or accidents.  So far, additional arch specific sd
topology flags, like SD_ASYM_PACKING, are set via flag specific weak
functions.  If, in the future, we want to further distinguish the
functionality of the different sd levels based on more topology
information (e.g. Vincent's SD_SHARE_POWERDOMAIN flag), we need a
better interface than this.

Changes in v2:

1) Export struct sched_domain_topology_info and set_sd_topology_info()
   towards the arch to enable it to customize sched domain topology
   data.

2) Keep a default topology setup inside the scheduler for the
   conventional sched domain levels.

3) Sched domain level topology flags can be specified per cpu and can
   depend on runtime information (e.g. a cpu feature).

4) Move struct sched_domain from interface header file into private
   scheduler header file for data encapsulation purposes.  With the
   deletion of the arch specific sd init macros and the sd ptr arg in
   arch_scale_freq_power(), struct sched_domain is no longer needed
   outside the scheduler core code.

What does the patch-set try to achieve:

1) Let the arch define the conventional (SMT, MC, BOOK, CPU) sched
   domain hierarchy.  For each sched domain level, the arch specifies
   the pointer to the getter function of the corresponding cpu mask,
   the topology flags, and the name of the sched domain level, the
   latter for debug purposes.

2) Unify the set-up code for conventional and NUMA scheduler domains.
   All scheduler domain topology levels are now allocated in the same
   function sched_init_topology().  All scheduler domains now use a
   common init function sd_init() which makes the existing SD_FOO_INIT
   macros redundant.

3) The arch is no longer limited to the default sched domain levels
   (SMT, MC, BOOK, CPU) but can easily define its own sched domain level
   hierarchy.

4) Prepare the mechanics to make it easier to integrate the provision of
   additional topology related data (e.g. energy information) to the
   scheduler.

Current limitations:

1) The patch-set has only been tested on ARM TC2 (2 clusters, one with 2
   Cortex A15 and the other with 3 Cortex A7) and on an Intel i5-520M (2
   cores with 2 threads each) platform.

2) For other archs it has only been compile tested for certain
   configurations (powerpc: chroma_defconfig, mips: ip27_defconfig,
   s390: defconfig, tile: tilegx_defconfig).

The patch-set is against v3.13-rc8.

Dietmar Eggemann (11):
  sched: define sched_domain_topology_info
  sched: export cpu_smt_mask() and cpu_cpu_mask()
  sched: define TOPOLOGY_SD_FLAGS
  sched: replace SD_INIT_FUNC with sd_init()
  sched: add a name to sched_domain_topology_info
  sched: delete redundant sd init macros
  sched: consolidate sched_init_numa() and sched_init_conv()
  sched: introduce a func ptr for sd topology flags
  sched: provide SD_ASYM_PACKING via topology info table
  sched: delete sd ptr arg in arch_scale_freq_power()
  sched: un-export struct sched_domain

 arch/arm/kernel/topology.c        |    4 +-
 arch/ia64/include/asm/topology.h  |   24 ---
 arch/metag/include/asm/topology.h |   27 ---
 arch/powerpc/kernel/smp.c         |   34 +++-
 arch/s390/include/asm/topology.h  |    2 -
 arch/tile/include/asm/topology.h  |   33 ----
 include/linux/sched.h             |  119 ++++---------
 include/linux/topology.h          |  127 ++------------
 kernel/sched/core.c               |  351 ++++++++++++++++++-------------------
 kernel/sched/fair.c               |   10 +-
 kernel/sched/sched.h              |  169 ++++++++++++++----
 11 files changed, 384 insertions(+), 516 deletions(-)

-- 
1.7.9.5




* [RFC v2 PATCH 01/11] sched: define sched_domain_topology_info
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 02/11] sched: export cpu_smt_mask() and cpu_cpu_mask() dietmar.eggemann
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

The idea behind creating struct sched_domain_topology_info is to export
only the scheduler data which can be changed by the arch.

Exporting the existing struct sched_domain_topology_level would mean
that we also have to expose struct sd_data and other scheduler
internals.

The extra step of allocating the sched_domain_topology array for a
purely conventional sched domain set-up does no harm, since it will
later be consolidated with the existing code in the sched_init_numa()
function.

This patch extracts the topology information which can be set by the arch
(cpu mask and sd topology flags) from the existing struct
sched_domain_topology and puts it in struct sched_domain_topology_info.

Struct sched_domain_topology_info is exported in include/linux/sched.h.
The patch also defines the default array of struct
sched_domain_topology_info, default_topology_info[], for all
conventional scheduler domain levels (SMT, MC, BOOK, CPU).

In case an arch wants to use a different default topology info array,
it can override the pointer to this table via the new function
set_sd_topology_info().

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 include/linux/sched.h |   10 ++++++++++
 kernel/sched/core.c   |   29 ++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 53f97eb8dbc7..bf2ee608af67 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2780,4 +2780,14 @@ static inline unsigned long rlimit_max(unsigned int limit)
 	return task_rlimit_max(current, limit);
 }
 
+typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
+
+struct sched_domain_topology_info {
+	sched_domain_mask_f mask;
+	int		    flags;
+};
+
+extern void
+set_sd_topology_info(struct sched_domain_topology_info *ti, unsigned int s);
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a88f4a485c5e..8b8a37697a7d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5014,7 +5014,6 @@ enum s_alloc {
 struct sched_domain_topology_level;
 
 typedef struct sched_domain *(*sched_domain_init_f)(struct sched_domain_topology_level *tl, int cpu);
-typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
 
 #define SDTL_OVERLAP	0x01
 
@@ -5393,7 +5392,35 @@ static struct sched_domain_topology_level default_topology[] = {
 	{ NULL, },
 };
 
+/*
+ * Topology info list, bottom-up.
+ */
+static struct sched_domain_topology_info default_topology_info[] = {
+#ifdef CONFIG_SCHED_SMT
+	{ cpu_smt_mask, SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES },
+#endif
+#ifdef CONFIG_SCHED_MC
+	{ cpu_coregroup_mask, SD_SHARE_PKG_RESOURCES },
+#endif
+#ifdef CONFIG_SCHED_BOOK
+	{ cpu_book_mask, },
+#endif
+	{ cpu_cpu_mask, },
+	{ NULL, },
+};
+
 static struct sched_domain_topology_level *sched_domain_topology = default_topology;
+static struct sched_domain_topology_info *sched_domain_topology_info =
+		default_topology_info;
+static unsigned int sched_domain_topology_info_size =
+		ARRAY_SIZE(default_topology_info);
+
+void
+set_sd_topology_info(struct sched_domain_topology_info *ti, unsigned int s)
+{
+	sched_domain_topology_info = ti;
+	sched_domain_topology_info_size = s;
+}
 
 #define for_each_sd_topology(tl)			\
 	for (tl = sched_domain_topology; tl->init; tl++)
-- 
1.7.9.5




* [RFC v2 PATCH 02/11] sched: export cpu_smt_mask() and cpu_cpu_mask()
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 01/11] sched: define sched_domain_topology_info dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 03/11] sched: define TOPOLOGY_SD_FLAGS dietmar.eggemann
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

Move cpu_smt_mask() and cpu_cpu_mask() from scheduler code into
include/linux/topology.h so that an arch can use them to specify the
cpu masks in its topology info array.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 include/linux/topology.h |   12 ++++++++++++
 kernel/sched/core.c      |   12 ------------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/topology.h b/include/linux/topology.h
index 12ae6ce997d6..0bfcead0ace7 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -295,4 +295,16 @@ static inline int cpu_to_mem(int cpu)
 #define topology_core_cpumask(cpu)		cpumask_of(cpu)
 #endif
 
+#ifdef CONFIG_SCHED_SMT
+static inline const struct cpumask *cpu_smt_mask(int cpu)
+{
+	return topology_thread_cpumask(cpu);
+}
+#endif
+
+static inline const struct cpumask *cpu_cpu_mask(int cpu)
+{
+	return cpumask_of_node(cpu_to_node(cpu));
+}
+
 #endif /* _LINUX_TOPOLOGY_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8b8a37697a7d..523bb43756d6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4988,11 +4988,6 @@ static int __init isolated_cpu_setup(char *str)
 
 __setup("isolcpus=", isolated_cpu_setup);
 
-static const struct cpumask *cpu_cpu_mask(int cpu)
-{
-	return cpumask_of_node(cpu_to_node(cpu));
-}
-
 struct sd_data {
 	struct sched_domain **__percpu sd;
 	struct sched_group **__percpu sg;
@@ -5368,13 +5363,6 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
 		*per_cpu_ptr(sdd->sgp, cpu) = NULL;
 }
 
-#ifdef CONFIG_SCHED_SMT
-static const struct cpumask *cpu_smt_mask(int cpu)
-{
-	return topology_thread_cpumask(cpu);
-}
-#endif
-
 /*
  * Topology list, bottom-up.
  */
-- 
1.7.9.5




* [RFC v2 PATCH 03/11] sched: define TOPOLOGY_SD_FLAGS
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 01/11] sched: define sched_domain_topology_info dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 02/11] sched: export cpu_smt_mask() and cpu_cpu_mask() dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 04/11] sched: replace SD_INIT_FUNC with sd_init() dietmar.eggemann
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

TOPOLOGY_SD_FLAGS contains all SD flags which provide topology related
information towards the scheduler. All other SD flags describe scheduler
behavioural aspects.  The aim of TOPOLOGY_SD_FLAGS is to be able to check
that the arch only specifies topology related flags in the topology info
array.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/sched.h |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 88c85b21d633..fcf2d4317217 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1414,3 +1414,19 @@ static inline u64 irq_time_read(int cpu)
 }
 #endif /* CONFIG_64BIT */
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
+
+/*
+ * SD_flags allowed in topology descriptions.
+ *
+ * SD_SHARE_CPUPOWER      - describes SMT topologies
+ * SD_SHARE_PKG_RESOURCES - describes shared caches
+ * SD_NUMA                - describes NUMA topologies
+ *
+ * Odd one out:
+ * SD_ASYM_PACKING        - describes SMT quirks
+ */
+#define TOPOLOGY_SD_FLAGS         \
+	(SD_SHARE_CPUPOWER |      \
+	 SD_SHARE_PKG_RESOURCES | \
+	 SD_NUMA |                \
+	 SD_ASYM_PACKING)
-- 
1.7.9.5




* [RFC v2 PATCH 04/11] sched: replace SD_INIT_FUNC with sd_init()
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (2 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 03/11] sched: define TOPOLOGY_SD_FLAGS dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 05/11] sched: add a name to sched_domain_topology_info dietmar.eggemann
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

This patch incorporates struct sched_domain_topology_info as member
'info' into struct sched_domain_topology_level.  It updates
sched_init_numa() to reflect the change that conventional (SMT, MC,
BOOK, CPU) level initialization now relies on the topology_info[] array
and no longer on default_topology[].

Moreover a counterpart function sched_init_conv() is introduced to handle
the allocation of the topology array for a !CONFIG_NUMA system.

The patch deletes the default topology array default_topology[] and the
SD_INIT_FUNC() macro, which are no longer used.  The function
sd_local_flags() is deleted too and its functionality is incorporated
directly into the NUMA specific condition path in sd_init().

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/core.c |  247 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 135 insertions(+), 112 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 523bb43756d6..90aa7c3d3a00 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5006,15 +5006,10 @@ enum s_alloc {
 	sa_none,
 };
 
-struct sched_domain_topology_level;
-
-typedef struct sched_domain *(*sched_domain_init_f)(struct sched_domain_topology_level *tl, int cpu);
-
 #define SDTL_OVERLAP	0x01
 
 struct sched_domain_topology_level {
-	sched_domain_init_f init;
-	sched_domain_mask_f mask;
+	struct sched_domain_topology_info info;
 	int		    flags;
 	int		    numa_level;
 	struct sd_data      data;
@@ -5254,28 +5249,6 @@ int __weak arch_sd_sibling_asym_packing(void)
 # define SD_INIT_NAME(sd, type)		do { } while (0)
 #endif
 
-#define SD_INIT_FUNC(type)						\
-static noinline struct sched_domain *					\
-sd_init_##type(struct sched_domain_topology_level *tl, int cpu) 	\
-{									\
-	struct sched_domain *sd = *per_cpu_ptr(tl->data.sd, cpu);	\
-	*sd = SD_##type##_INIT;						\
-	SD_INIT_NAME(sd, type);						\
-	sd->private = &tl->data;					\
-	return sd;							\
-}
-
-SD_INIT_FUNC(CPU)
-#ifdef CONFIG_SCHED_SMT
- SD_INIT_FUNC(SIBLING)
-#endif
-#ifdef CONFIG_SCHED_MC
- SD_INIT_FUNC(MC)
-#endif
-#ifdef CONFIG_SCHED_BOOK
- SD_INIT_FUNC(BOOK)
-#endif
-
 static int default_relax_domain_level = -1;
 int sched_domain_level_max;
 
@@ -5364,23 +5337,6 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
 }
 
 /*
- * Topology list, bottom-up.
- */
-static struct sched_domain_topology_level default_topology[] = {
-#ifdef CONFIG_SCHED_SMT
-	{ sd_init_SIBLING, cpu_smt_mask, },
-#endif
-#ifdef CONFIG_SCHED_MC
-	{ sd_init_MC, cpu_coregroup_mask, },
-#endif
-#ifdef CONFIG_SCHED_BOOK
-	{ sd_init_BOOK, cpu_book_mask, },
-#endif
-	{ sd_init_CPU, cpu_cpu_mask, },
-	{ NULL, },
-};
-
-/*
  * Topology info list, bottom-up.
  */
 static struct sched_domain_topology_info default_topology_info[] = {
@@ -5394,10 +5350,9 @@ static struct sched_domain_topology_info default_topology_info[] = {
 	{ cpu_book_mask, },
 #endif
 	{ cpu_cpu_mask, },
-	{ NULL, },
 };
 
-static struct sched_domain_topology_level *sched_domain_topology = default_topology;
+static struct sched_domain_topology_level *sched_domain_topology;
 static struct sched_domain_topology_info *sched_domain_topology_info =
 		default_topology_info;
 static unsigned int sched_domain_topology_info_size =
@@ -5411,7 +5366,7 @@ set_sd_topology_info(struct sched_domain_topology_info *ti, unsigned int s)
 }
 
 #define for_each_sd_topology(tl)			\
-	for (tl = sched_domain_topology; tl->init; tl++)
+	for (tl = sched_domain_topology; tl->info.mask; tl++)
 
 #ifdef CONFIG_NUMA
 
@@ -5420,61 +5375,6 @@ static int *sched_domains_numa_distance;
 static struct cpumask ***sched_domains_numa_masks;
 static int sched_domains_curr_level;
 
-static inline int sd_local_flags(int level)
-{
-	if (sched_domains_numa_distance[level] > RECLAIM_DISTANCE)
-		return 0;
-
-	return SD_BALANCE_EXEC | SD_BALANCE_FORK | SD_WAKE_AFFINE;
-}
-
-static struct sched_domain *
-sd_numa_init(struct sched_domain_topology_level *tl, int cpu)
-{
-	struct sched_domain *sd = *per_cpu_ptr(tl->data.sd, cpu);
-	int level = tl->numa_level;
-	int sd_weight = cpumask_weight(
-			sched_domains_numa_masks[level][cpu_to_node(cpu)]);
-
-	*sd = (struct sched_domain){
-		.min_interval		= sd_weight,
-		.max_interval		= 2*sd_weight,
-		.busy_factor		= 32,
-		.imbalance_pct		= 125,
-		.cache_nice_tries	= 2,
-		.busy_idx		= 3,
-		.idle_idx		= 2,
-		.newidle_idx		= 0,
-		.wake_idx		= 0,
-		.forkexec_idx		= 0,
-
-		.flags			= 1*SD_LOAD_BALANCE
-					| 1*SD_BALANCE_NEWIDLE
-					| 0*SD_BALANCE_EXEC
-					| 0*SD_BALANCE_FORK
-					| 0*SD_BALANCE_WAKE
-					| 0*SD_WAKE_AFFINE
-					| 0*SD_SHARE_CPUPOWER
-					| 0*SD_SHARE_PKG_RESOURCES
-					| 1*SD_SERIALIZE
-					| 0*SD_PREFER_SIBLING
-					| 1*SD_NUMA
-					| sd_local_flags(level)
-					,
-		.last_balance		= jiffies,
-		.balance_interval	= sd_weight,
-	};
-	SD_INIT_NAME(sd, NUMA);
-	sd->private = &tl->data;
-
-	/*
-	 * Ugly hack to pass state to sd_numa_mask()...
-	 */
-	sched_domains_curr_level = tl->numa_level;
-
-	return sd;
-}
-
 static const struct cpumask *sd_numa_mask(int cpu)
 {
 	return sched_domains_numa_masks[sched_domains_curr_level][cpu_to_node(cpu)];
@@ -5520,6 +5420,7 @@ static void sched_init_numa(void)
 {
 	int next_distance, curr_distance = node_distance(0, 0);
 	struct sched_domain_topology_level *tl;
+	struct sched_domain_topology_info *ti = sched_domain_topology_info;
 	int level = 0;
 	int i, j, k;
 
@@ -5618,24 +5519,29 @@ static void sched_init_numa(void)
 		}
 	}
 
-	tl = kzalloc((ARRAY_SIZE(default_topology) + level) *
-			sizeof(struct sched_domain_topology_level), GFP_KERNEL);
+	/*
+	 * An extra empty struct sched_domain_topology_level element at the end
+	 * of the array is needed to let for_each_sd_topology() work correctly.
+	 */
+	tl = kzalloc((sched_domain_topology_info_size + level + 1) *
+			sizeof(struct sched_domain_topology_level),
+			GFP_KERNEL);
 	if (!tl)
 		return;
 
 	/*
-	 * Copy the default topology bits..
+	 * Copy the topology info bits..
 	 */
-	for (i = 0; default_topology[i].init; i++)
-		tl[i] = default_topology[i];
+	for (i = 0; i < sched_domain_topology_info_size; i++)
+		tl[i].info = ti[i];
 
 	/*
 	 * .. and append 'j' levels of NUMA goodness.
 	 */
 	for (j = 0; j < level; i++, j++) {
 		tl[i] = (struct sched_domain_topology_level){
-			.init = sd_numa_init,
-			.mask = sd_numa_mask,
+			.info.mask = sd_numa_mask,
+			.info.flags = SD_NUMA,
 			.flags = SDTL_OVERLAP,
 			.numa_level = j,
 		};
@@ -5646,6 +5552,10 @@ static void sched_init_numa(void)
 	sched_domains_numa_levels = level;
 }
 
+static void sched_init_conv(void)
+{
+}
+
 static void sched_domains_numa_masks_set(int cpu)
 {
 	int i, j;
@@ -5698,6 +5608,31 @@ static inline void sched_init_numa(void)
 {
 }
 
+static void sched_init_conv(void)
+{
+	struct sched_domain_topology_level *tl;
+	struct sched_domain_topology_info *ti = sched_domain_topology_info;
+	int i;
+
+	/*
+	 * An extra empty struct sched_domain_topology_level element at the end
+	 * of the array is needed to let for_each_sd_topology() work correctly.
+	 */
+	tl = kzalloc((sched_domain_topology_info_size + 1) *
+		sizeof(struct sched_domain_topology_level),
+		GFP_KERNEL);
+	if (!tl)
+		return;
+
+	/*
+	 * Copy the topology info bits..
+	 */
+	for (i = 0; i < sched_domain_topology_info_size; i++)
+		tl[i].info = ti[i];
+
+	sched_domain_topology = tl;
+}
+
 static int sched_domains_numa_masks_update(struct notifier_block *nfb,
 					   unsigned long action,
 					   void *hcpu)
@@ -5706,6 +5641,93 @@ static int sched_domains_numa_masks_update(struct notifier_block *nfb,
 }
 #endif /* CONFIG_NUMA */
 
+static struct sched_domain *
+sd_init(struct sched_domain_topology_level *tl, int cpu)
+{
+	struct sched_domain *sd = *per_cpu_ptr(tl->data.sd, cpu);
+	int sd_weight;
+
+#ifdef CONFIG_NUMA
+	/*
+	 * Ugly hack to pass state to sd_numa_mask()...
+	 */
+	sched_domains_curr_level = tl->numa_level;
+#endif
+
+	sd_weight = cpumask_weight(tl->info.mask(cpu));
+
+	if (WARN_ONCE(tl->info.flags & ~TOPOLOGY_SD_FLAGS,
+			"wrong flags in topology info\n"))
+		tl->info.flags &= ~TOPOLOGY_SD_FLAGS;
+
+	*sd = (struct sched_domain){
+				.min_interval  = sd_weight,
+				.max_interval  = 2*sd_weight,
+				.busy_factor   = 64,
+				.imbalance_pct = 125,
+
+				.flags =  1*SD_LOAD_BALANCE
+						| 1*SD_BALANCE_NEWIDLE
+						| 1*SD_BALANCE_EXEC
+						| 1*SD_BALANCE_FORK
+						| 1*SD_WAKE_AFFINE
+						| tl->info.flags
+						,
+
+				.last_balance     = jiffies,
+				.balance_interval = sd_weight,
+	};
+
+	/*
+	 * Convert topological properties into behaviour.
+	 */
+
+	if (sd->flags & SD_SHARE_CPUPOWER) {
+		sd->imbalance_pct = 110;
+		sd->smt_gain = 1178; /* ~15% */
+
+		/*
+		 * Call SMT specific arch topology function.
+		 * This goes away once the powerpc arch uses
+		 * the new interface for scheduler domain
+		 * setup.
+		 */
+		sd->flags |= arch_sd_sibling_asym_packing();
+
+		SD_INIT_NAME(sd, SMT);
+	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
+		sd->cache_nice_tries = 1;
+		sd->busy_idx = 2;
+
+		SD_INIT_NAME(sd, MC);
+#ifdef CONFIG_NUMA
+	} else if (sd->flags & SD_NUMA) {
+		sd->busy_factor = 32,
+		sd->cache_nice_tries = 2;
+		sd->busy_idx = 3;
+		sd->idle_idx = 2;
+		sd->flags |= SD_SERIALIZE;
+		if (sched_domains_numa_distance[tl->numa_level]
+				> RECLAIM_DISTANCE) {
+			sd->flags &= ~(SD_BALANCE_EXEC |
+				       SD_BALANCE_FORK |
+				       SD_WAKE_AFFINE);
+		}
+#endif
+	} else {
+		sd->cache_nice_tries = 1;
+		sd->busy_idx = 2;
+		sd->idle_idx = 1;
+		sd->flags |= SD_PREFER_SIBLING;
+
+		SD_INIT_NAME(sd, CPU);
+	}
+
+	sd->private = &tl->data;
+
+	return sd;
+}
+
 static int __sdt_alloc(const struct cpumask *cpu_map)
 {
 	struct sched_domain_topology_level *tl;
@@ -5795,11 +5817,11 @@ struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl,
 		const struct cpumask *cpu_map, struct sched_domain_attr *attr,
 		struct sched_domain *child, int cpu)
 {
-	struct sched_domain *sd = tl->init(tl, cpu);
+	struct sched_domain *sd = sd_init(tl, cpu);
 	if (!sd)
 		return child;
 
-	cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
+	cpumask_and(sched_domain_span(sd), cpu_map, tl->info.mask(cpu));
 	if (child) {
 		sd->level = child->level + 1;
 		sched_domain_level_max = max(sched_domain_level_max, sd->level);
@@ -6138,6 +6160,7 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	sched_init_conv();
 	sched_init_numa();
 
 	/*
-- 
1.7.9.5




* [RFC v2 PATCH 05/11] sched: add a name to sched_domain_topology_info
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (3 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 04/11] sched: replace SD_INIT_FUNC with sd_init() dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 06/11] sched: delete redundant sd init macros dietmar.eggemann
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

With this patch, the name of a sd level has to be specified in the
topology info table instead of being hard-coded in the scheduler via
SD_INIT_NAME().  As before, this feature is only active if
CONFIG_SCHED_DEBUG is set.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 include/linux/sched.h |   11 ++++++++++-
 kernel/sched/core.c   |   23 +++++++----------------
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index bf2ee608af67..f79a0d5041fb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2784,9 +2784,18 @@ typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
 
 struct sched_domain_topology_info {
 	sched_domain_mask_f mask;
-	int		    flags;
+	int flags;
+#ifdef CONFIG_SCHED_DEBUG
+	char *name;
+#endif
 };
 
+#ifdef CONFIG_SCHED_DEBUG
+# define SD_NAME(n)		.name = #n
+#else
+# define SD_NAME(n)
+#endif
+
 extern void
 set_sd_topology_info(struct sched_domain_topology_info *ti, unsigned int s);
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 90aa7c3d3a00..798a4d2c9d7b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5243,12 +5243,6 @@ int __weak arch_sd_sibling_asym_packing(void)
  * Non-inlined to reduce accumulated stack pressure in build_sched_domains()
  */
 
-#ifdef CONFIG_SCHED_DEBUG
-# define SD_INIT_NAME(sd, type)		sd->name = #type
-#else
-# define SD_INIT_NAME(sd, type)		do { } while (0)
-#endif
-
 static int default_relax_domain_level = -1;
 int sched_domain_level_max;
 
@@ -5341,15 +5335,15 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
  */
 static struct sched_domain_topology_info default_topology_info[] = {
 #ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES },
+	{ cpu_smt_mask, SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES, SD_NAME(SIBLING) },
 #endif
 #ifdef CONFIG_SCHED_MC
-	{ cpu_coregroup_mask, SD_SHARE_PKG_RESOURCES },
+	{ cpu_coregroup_mask, SD_SHARE_PKG_RESOURCES, SD_NAME(MC) },
 #endif
 #ifdef CONFIG_SCHED_BOOK
-	{ cpu_book_mask, },
+	{ cpu_book_mask, SD_NAME(BOOK) },
 #endif
-	{ cpu_cpu_mask, },
+	{ cpu_cpu_mask, SD_NAME(CPU) },
 };
 
 static struct sched_domain_topology_level *sched_domain_topology;
@@ -5676,6 +5670,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 
 				.last_balance     = jiffies,
 				.balance_interval = sd_weight,
+#ifdef CONFIG_SCHED_DEBUG
+				.name             = tl->info.name,
+#endif
 	};
 
 	/*
@@ -5693,13 +5690,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 		 * setup.
 		 */
 		sd->flags |= arch_sd_sibling_asym_packing();
-
-		SD_INIT_NAME(sd, SMT);
 	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
 		sd->cache_nice_tries = 1;
 		sd->busy_idx = 2;
-
-		SD_INIT_NAME(sd, MC);
 #ifdef CONFIG_NUMA
 	} else if (sd->flags & SD_NUMA) {
 		sd->busy_factor = 32,
@@ -5719,8 +5712,6 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 		sd->busy_idx = 2;
 		sd->idle_idx = 1;
 		sd->flags |= SD_PREFER_SIBLING;
-
-		SD_INIT_NAME(sd, CPU);
 	}
 
 	sd->private = &tl->data;
-- 
1.7.9.5




* [RFC v2 PATCH 06/11] sched: delete redundant sd init macros
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (4 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 05/11] sched: add a name to sched_domain_topology_info dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 07/11] sched: consolidate sched_init_numa() and sched_init_conv() dietmar.eggemann
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

This patch deletes all occurrences of SD_SIBLING_INIT, SD_MC_INIT,
SD_BOOK_INIT and SD_CPU_INIT.

The SD_NODE_INIT in arch/metag/include/asm/topology.h seems to be a
leftover, probably because the metag arch was introduced at the same
time (v3.9) the other archs moved away from SD_NODE_INIT.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/ia64/include/asm/topology.h  |   24 --------
 arch/metag/include/asm/topology.h |   27 ---------
 arch/s390/include/asm/topology.h  |    2 -
 arch/tile/include/asm/topology.h  |   33 -----------
 include/linux/topology.h          |  115 -------------------------------------
 5 files changed, 201 deletions(-)

diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h
index a2496e449b75..20d12fa7e0cd 100644
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -46,30 +46,6 @@
 
 void build_cpu_to_node_map(void);
 
-#define SD_CPU_INIT (struct sched_domain) {		\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
-	.min_interval		= 1,			\
-	.max_interval		= 4,			\
-	.busy_factor		= 64,			\
-	.imbalance_pct		= 125,			\
-	.cache_nice_tries	= 2,			\
-	.busy_idx		= 2,			\
-	.idle_idx		= 1,			\
-	.newidle_idx		= 0,			\
-	.wake_idx		= 0,			\
-	.forkexec_idx		= 0,			\
-	.flags			= SD_LOAD_BALANCE	\
-				| SD_BALANCE_NEWIDLE	\
-				| SD_BALANCE_EXEC	\
-				| SD_BALANCE_FORK	\
-				| SD_WAKE_AFFINE,	\
-	.last_balance		= jiffies,		\
-	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
-}
-
 #endif /* CONFIG_NUMA */
 
 #ifdef CONFIG_SMP
diff --git a/arch/metag/include/asm/topology.h b/arch/metag/include/asm/topology.h
index 8e9c0b3b9691..e95f874ded1b 100644
--- a/arch/metag/include/asm/topology.h
+++ b/arch/metag/include/asm/topology.h
@@ -3,33 +3,6 @@
 
 #ifdef CONFIG_NUMA
 
-/* sched_domains SD_NODE_INIT for Meta machines */
-#define SD_NODE_INIT (struct sched_domain) {		\
-	.parent			= NULL,			\
-	.child			= NULL,			\
-	.groups			= NULL,			\
-	.min_interval		= 8,			\
-	.max_interval		= 32,			\
-	.busy_factor		= 32,			\
-	.imbalance_pct		= 125,			\
-	.cache_nice_tries	= 2,			\
-	.busy_idx		= 3,			\
-	.idle_idx		= 2,			\
-	.newidle_idx		= 0,			\
-	.wake_idx		= 0,			\
-	.forkexec_idx		= 0,			\
-	.flags			= SD_LOAD_BALANCE	\
-				| SD_BALANCE_FORK	\
-				| SD_BALANCE_EXEC	\
-				| SD_BALANCE_NEWIDLE	\
-				| SD_SERIALIZE,		\
-	.last_balance		= jiffies,		\
-	.balance_interval	= 1,			\
-	.nr_balance_failed	= 0,			\
-	.max_newidle_lb_cost	= 0,			\
-	.next_decay_max_lb_cost	= jiffies,		\
-}
-
 #define cpu_to_node(cpu)	((void)(cpu), 0)
 #define parent_node(node)	((void)(node), 0)
 
diff --git a/arch/s390/include/asm/topology.h b/arch/s390/include/asm/topology.h
index 05425b18c0aa..07763bdb408d 100644
--- a/arch/s390/include/asm/topology.h
+++ b/arch/s390/include/asm/topology.h
@@ -64,8 +64,6 @@ static inline void s390_init_cpu_topology(void)
 };
 #endif
 
-#define SD_BOOK_INIT	SD_CPU_INIT
-
 #include <asm-generic/topology.h>
 
 #endif /* _ASM_S390_TOPOLOGY_H */
diff --git a/arch/tile/include/asm/topology.h b/arch/tile/include/asm/topology.h
index d15c0d8d550f..938311844233 100644
--- a/arch/tile/include/asm/topology.h
+++ b/arch/tile/include/asm/topology.h
@@ -44,39 +44,6 @@ static inline const struct cpumask *cpumask_of_node(int node)
 /* For now, use numa node -1 for global allocation. */
 #define pcibus_to_node(bus)		((void)(bus), -1)
 
-/*
- * TILE architecture has many cores integrated in one processor, so we need
- * setup bigger balance_interval for both CPU/NODE scheduling domains to
- * reduce process scheduling costs.
- */
-
-/* sched_domains SD_CPU_INIT for TILE architecture */
-#define SD_CPU_INIT (struct sched_domain) {				\
-	.min_interval		= 4,					\
-	.max_interval		= 128,					\
-	.busy_factor		= 64,					\
-	.imbalance_pct		= 125,					\
-	.cache_nice_tries	= 1,					\
-	.busy_idx		= 2,					\
-	.idle_idx		= 1,					\
-	.newidle_idx		= 0,					\
-	.wake_idx		= 0,					\
-	.forkexec_idx		= 0,					\
-									\
-	.flags			= 1*SD_LOAD_BALANCE			\
-				| 1*SD_BALANCE_NEWIDLE			\
-				| 1*SD_BALANCE_EXEC			\
-				| 1*SD_BALANCE_FORK			\
-				| 0*SD_BALANCE_WAKE			\
-				| 0*SD_WAKE_AFFINE			\
-				| 0*SD_SHARE_CPUPOWER			\
-				| 0*SD_SHARE_PKG_RESOURCES		\
-				| 0*SD_SERIALIZE			\
-				,					\
-	.last_balance		= jiffies,				\
-	.balance_interval	= 32,					\
-}
-
 /* By definition, we create nodes based on online memory. */
 #define node_has_online_mem(nid) 1
 
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 0bfcead0ace7..0a7d6142f242 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -66,121 +66,6 @@ int arch_update_cpu_topology(void);
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
 #endif
 
-/*
- * Below are the 3 major initializers used in building sched_domains:
- * SD_SIBLING_INIT, for SMT domains
- * SD_CPU_INIT, for SMP domains
- *
- * Any architecture that cares to do any tuning to these values should do so
- * by defining their own arch-specific initializer in include/asm/topology.h.
- * A definition there will automagically override these default initializers
- * and allow arch-specific performance tuning of sched_domains.
- * (Only non-zero and non-null fields need be specified.)
- */
-
-#ifdef CONFIG_SCHED_SMT
-/* MCD - Do we really need this?  It is always on if CONFIG_SCHED_SMT is,
- * so can't we drop this in favor of CONFIG_SCHED_SMT?
- */
-#define ARCH_HAS_SCHED_WAKE_IDLE
-/* Common values for SMT siblings */
-#ifndef SD_SIBLING_INIT
-#define SD_SIBLING_INIT (struct sched_domain) {				\
-	.min_interval		= 1,					\
-	.max_interval		= 2,					\
-	.busy_factor		= 64,					\
-	.imbalance_pct		= 110,					\
-									\
-	.flags			= 1*SD_LOAD_BALANCE			\
-				| 1*SD_BALANCE_NEWIDLE			\
-				| 1*SD_BALANCE_EXEC			\
-				| 1*SD_BALANCE_FORK			\
-				| 0*SD_BALANCE_WAKE			\
-				| 1*SD_WAKE_AFFINE			\
-				| 1*SD_SHARE_CPUPOWER			\
-				| 1*SD_SHARE_PKG_RESOURCES		\
-				| 0*SD_SERIALIZE			\
-				| 0*SD_PREFER_SIBLING			\
-				| arch_sd_sibling_asym_packing()	\
-				,					\
-	.last_balance		= jiffies,				\
-	.balance_interval	= 1,					\
-	.smt_gain		= 1178,	/* 15% */			\
-	.max_newidle_lb_cost	= 0,					\
-	.next_decay_max_lb_cost	= jiffies,				\
-}
-#endif
-#endif /* CONFIG_SCHED_SMT */
-
-#ifdef CONFIG_SCHED_MC
-/* Common values for MC siblings. for now mostly derived from SD_CPU_INIT */
-#ifndef SD_MC_INIT
-#define SD_MC_INIT (struct sched_domain) {				\
-	.min_interval		= 1,					\
-	.max_interval		= 4,					\
-	.busy_factor		= 64,					\
-	.imbalance_pct		= 125,					\
-	.cache_nice_tries	= 1,					\
-	.busy_idx		= 2,					\
-	.wake_idx		= 0,					\
-	.forkexec_idx		= 0,					\
-									\
-	.flags			= 1*SD_LOAD_BALANCE			\
-				| 1*SD_BALANCE_NEWIDLE			\
-				| 1*SD_BALANCE_EXEC			\
-				| 1*SD_BALANCE_FORK			\
-				| 0*SD_BALANCE_WAKE			\
-				| 1*SD_WAKE_AFFINE			\
-				| 0*SD_SHARE_CPUPOWER			\
-				| 1*SD_SHARE_PKG_RESOURCES		\
-				| 0*SD_SERIALIZE			\
-				,					\
-	.last_balance		= jiffies,				\
-	.balance_interval	= 1,					\
-	.max_newidle_lb_cost	= 0,					\
-	.next_decay_max_lb_cost	= jiffies,				\
-}
-#endif
-#endif /* CONFIG_SCHED_MC */
-
-/* Common values for CPUs */
-#ifndef SD_CPU_INIT
-#define SD_CPU_INIT (struct sched_domain) {				\
-	.min_interval		= 1,					\
-	.max_interval		= 4,					\
-	.busy_factor		= 64,					\
-	.imbalance_pct		= 125,					\
-	.cache_nice_tries	= 1,					\
-	.busy_idx		= 2,					\
-	.idle_idx		= 1,					\
-	.newidle_idx		= 0,					\
-	.wake_idx		= 0,					\
-	.forkexec_idx		= 0,					\
-									\
-	.flags			= 1*SD_LOAD_BALANCE			\
-				| 1*SD_BALANCE_NEWIDLE			\
-				| 1*SD_BALANCE_EXEC			\
-				| 1*SD_BALANCE_FORK			\
-				| 0*SD_BALANCE_WAKE			\
-				| 1*SD_WAKE_AFFINE			\
-				| 0*SD_SHARE_CPUPOWER			\
-				| 0*SD_SHARE_PKG_RESOURCES		\
-				| 0*SD_SERIALIZE			\
-				| 1*SD_PREFER_SIBLING			\
-				,					\
-	.last_balance		= jiffies,				\
-	.balance_interval	= 1,					\
-	.max_newidle_lb_cost	= 0,					\
-	.next_decay_max_lb_cost	= jiffies,				\
-}
-#endif
-
-#ifdef CONFIG_SCHED_BOOK
-#ifndef SD_BOOK_INIT
-#error Please define an appropriate SD_BOOK_INIT in include/asm/topology.h!!!
-#endif
-#endif /* CONFIG_SCHED_BOOK */
-
 #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
 DECLARE_PER_CPU(int, numa_node);
 
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC v2 PATCH 07/11] sched: consolidate sched_init_numa() and sched_init_conv()
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (5 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 06/11] sched: delete redundant sd init macros dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 08/11] sched: introduce a func ptr for sd topology flags dietmar.eggemann
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

Consolidate sched_init_numa() and sched_init_conv() into one function
sched_init_topology().

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/core.c |  164 +++++++++++++++++++++------------------------------
 1 file changed, 68 insertions(+), 96 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 798a4d2c9d7b..9edd1d511f3c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5410,13 +5410,72 @@ static bool find_numa_distance(int distance)
 	return false;
 }
 
-static void sched_init_numa(void)
+static void sched_domains_numa_masks_set(int cpu)
+{
+	int i, j;
+	int node = cpu_to_node(cpu);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		for (j = 0; j < nr_node_ids; j++) {
+			if (node_distance(j, node) <= sched_domains_numa_distance[i])
+				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
+		}
+	}
+}
+
+static void sched_domains_numa_masks_clear(int cpu)
+{
+	int i, j;
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		for (j = 0; j < nr_node_ids; j++)
+			cpumask_clear_cpu(cpu, sched_domains_numa_masks[i][j]);
+	}
+}
+
+/*
+ * Update sched_domains_numa_masks[level][node] array when new cpus
+ * are onlined.
+ */
+static int sched_domains_numa_masks_update(struct notifier_block *nfb,
+					   unsigned long action,
+					   void *hcpu)
+{
+	int cpu = (long)hcpu;
+
+	switch (action & ~CPU_TASKS_FROZEN) {
+	case CPU_ONLINE:
+		sched_domains_numa_masks_set(cpu);
+		break;
+
+	case CPU_DEAD:
+		sched_domains_numa_masks_clear(cpu);
+		break;
+
+	default:
+		return NOTIFY_DONE;
+	}
+
+	return NOTIFY_OK;
+}
+#else
+static int sched_domains_numa_masks_update(struct notifier_block *nfb,
+					   unsigned long action,
+					   void *hcpu)
+{
+	return 0;
+}
+#endif /* CONFIG_NUMA */
+
+static void sched_init_topology(void)
 {
-	int next_distance, curr_distance = node_distance(0, 0);
 	struct sched_domain_topology_level *tl;
 	struct sched_domain_topology_info *ti = sched_domain_topology_info;
 	int level = 0;
-	int i, j, k;
+	int i;
+
+#ifdef CONFIG_NUMA
+	int next_distance, curr_distance = node_distance(0, 0);
+	int j, k;
 
 	sched_domains_numa_distance = kzalloc(sizeof(int) * nr_node_ids, GFP_KERNEL);
 	if (!sched_domains_numa_distance)
@@ -5512,6 +5571,7 @@ static void sched_init_numa(void)
 			}
 		}
 	}
+#endif /* CONFIG_NUMA */
 
 	/*
 	 * An extra empty struct sched_domain_topology_level element at the end
@@ -5529,6 +5589,9 @@ static void sched_init_numa(void)
 	for (i = 0; i < sched_domain_topology_info_size; i++)
 		tl[i].info = ti[i];
 
+	sched_domain_topology = tl;
+
+#ifdef CONFIG_NUMA
 	/*
 	 * .. and append 'j' levels of NUMA goodness.
 	 */
@@ -5541,99 +5604,9 @@ static void sched_init_numa(void)
 		};
 	}
 
-	sched_domain_topology = tl;
-
 	sched_domains_numa_levels = level;
-}
-
-static void sched_init_conv(void)
-{
-}
-
-static void sched_domains_numa_masks_set(int cpu)
-{
-	int i, j;
-	int node = cpu_to_node(cpu);
-
-	for (i = 0; i < sched_domains_numa_levels; i++) {
-		for (j = 0; j < nr_node_ids; j++) {
-			if (node_distance(j, node) <= sched_domains_numa_distance[i])
-				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
-		}
-	}
-}
-
-static void sched_domains_numa_masks_clear(int cpu)
-{
-	int i, j;
-	for (i = 0; i < sched_domains_numa_levels; i++) {
-		for (j = 0; j < nr_node_ids; j++)
-			cpumask_clear_cpu(cpu, sched_domains_numa_masks[i][j]);
-	}
-}
-
-/*
- * Update sched_domains_numa_masks[level][node] array when new cpus
- * are onlined.
- */
-static int sched_domains_numa_masks_update(struct notifier_block *nfb,
-					   unsigned long action,
-					   void *hcpu)
-{
-	int cpu = (long)hcpu;
-
-	switch (action & ~CPU_TASKS_FROZEN) {
-	case CPU_ONLINE:
-		sched_domains_numa_masks_set(cpu);
-		break;
-
-	case CPU_DEAD:
-		sched_domains_numa_masks_clear(cpu);
-		break;
-
-	default:
-		return NOTIFY_DONE;
-	}
-
-	return NOTIFY_OK;
-}
-#else
-static inline void sched_init_numa(void)
-{
-}
-
-static void sched_init_conv(void)
-{
-	struct sched_domain_topology_level *tl;
-	struct sched_domain_topology_info *ti = sched_domain_topology_info;
-	int i;
-
-	/*
-	 * An extra empty struct sched_domain_topology_level element at the end
-	 * of the array is needed to let for_each_sd_topology() work correctly.
-	 */
-	tl = kzalloc((sched_domain_topology_info_size + 1) *
-		sizeof(struct sched_domain_topology_level),
-		GFP_KERNEL);
-	if (!tl)
-		return;
-
-	/*
-	 * Copy the topology info bits..
-	 */
-	for (i = 0; i < sched_domain_topology_info_size; i++)
-		tl[i].info = ti[i];
-
-	sched_domain_topology = tl;
-}
-
-static int sched_domains_numa_masks_update(struct notifier_block *nfb,
-					   unsigned long action,
-					   void *hcpu)
-{
-	return 0;
-}
 #endif /* CONFIG_NUMA */
+}
 
 static struct sched_domain *
 sd_init(struct sched_domain_topology_level *tl, int cpu)
@@ -6151,8 +6124,7 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
-	sched_init_conv();
-	sched_init_numa();
+	sched_init_topology();
 
 	/*
 	 * There's no userspace yet to cause hotplug operations; hence all the
-- 
1.7.9.5




* [RFC v2 PATCH 08/11] sched: introduce a func ptr for sd topology flags
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (6 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 07/11] sched: consolidate sched_init_numa() and sched_init_conv() dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 09/11] sched: provide SD_ASYM_PACKING via topology info table dietmar.eggemann
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

To be able to set sd topology flags via the topology_info[] table based
on runtime information (e.g. a cpu feature) and per cpu, this patch
changes the provision of the sd topology flags from a simple int to a
func ptr.

The default flags func ptrs for the SMT and MC levels are defined in
include/linux/sched.h.  Since there are no sd topology flags for the
BOOK and CPU levels in the default topology info table, no default flags
func ptrs are defined for these levels.  The function sd_init() can
handle the fact that no sd level flags func ptr is defined.

The sd topology flags func ptr definition has an int cpu argument which is
only necessary when we start to set up sd topology flags differently for
different sched groups.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 include/linux/sched.h |   15 ++++++++++++++-
 kernel/sched/core.c   |   19 +++++++++++++------
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f79a0d5041fb..055d79e594ef 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2781,10 +2781,11 @@ static inline unsigned long rlimit_max(unsigned int limit)
 }
 
 typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
+typedef int (*sched_domain_flags_f)(int cpu);
 
 struct sched_domain_topology_info {
 	sched_domain_mask_f mask;
-	int flags;
+	sched_domain_flags_f flags;
 #ifdef CONFIG_SCHED_DEBUG
 	char *name;
 #endif
@@ -2799,4 +2800,16 @@ struct sched_domain_topology_info {
 extern void
 set_sd_topology_info(struct sched_domain_topology_info *ti, unsigned int s);
 
+#ifdef CONFIG_SCHED_SMT
+static inline int cpu_smt_flags(int cpu)
+{
+	return SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES;
+}
+#endif
+
+static inline int cpu_coregroup_flags(int cpu)
+{
+	return SD_SHARE_PKG_RESOURCES;
+}
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9edd1d511f3c..79f34cc5f547 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5335,10 +5335,10 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
  */
 static struct sched_domain_topology_info default_topology_info[] = {
 #ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES, SD_NAME(SIBLING) },
+	{ cpu_smt_mask, cpu_smt_flags, SD_NAME(SIBLING) },
 #endif
 #ifdef CONFIG_SCHED_MC
-	{ cpu_coregroup_mask, SD_SHARE_PKG_RESOURCES, SD_NAME(MC) },
+	{ cpu_coregroup_mask, cpu_coregroup_flags, SD_NAME(MC) },
 #endif
 #ifdef CONFIG_SCHED_BOOK
 	{ cpu_book_mask, SD_NAME(BOOK) },
@@ -5374,6 +5374,11 @@ static const struct cpumask *sd_numa_mask(int cpu)
 	return sched_domains_numa_masks[sched_domains_curr_level][cpu_to_node(cpu)];
 }
 
+static int sd_numa_flags(int cpu)
+{
+	return SD_NUMA;
+}
+
 static void sched_numa_warn(const char *str)
 {
 	static int done = false;
@@ -5598,7 +5603,7 @@ static void sched_init_topology(void)
 	for (j = 0; j < level; i++, j++) {
 		tl[i] = (struct sched_domain_topology_level){
 			.info.mask = sd_numa_mask,
-			.info.flags = SD_NUMA,
+			.info.flags = sd_numa_flags,
 			.flags = SDTL_OVERLAP,
 			.numa_level = j,
 		};
@@ -5613,6 +5618,7 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 {
 	struct sched_domain *sd = *per_cpu_ptr(tl->data.sd, cpu);
 	int sd_weight;
+	int flags;
 
 #ifdef CONFIG_NUMA
 	/*
@@ -5622,10 +5628,11 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 #endif
 
 	sd_weight = cpumask_weight(tl->info.mask(cpu));
+	flags = tl->info.flags ? tl->info.flags(cpu) : 0;
 
-	if (WARN_ONCE(tl->info.flags & ~TOPOLOGY_SD_FLAGS,
+	if (WARN_ONCE(flags & ~TOPOLOGY_SD_FLAGS,
 			"wrong flags in topology info\n"))
-		tl->info.flags &= ~TOPOLOGY_SD_FLAGS;
+		flags &= ~TOPOLOGY_SD_FLAGS;
 
 	*sd = (struct sched_domain){
 				.min_interval  = sd_weight,
@@ -5638,7 +5645,7 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 						| 1*SD_BALANCE_EXEC
 						| 1*SD_BALANCE_FORK
 						| 1*SD_WAKE_AFFINE
-						| tl->info.flags
+						| flags
 						,
 
 				.last_balance     = jiffies,
-- 
1.7.9.5




* [RFC v2 PATCH 09/11] sched: provide SD_ASYM_PACKING via topology info table
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (7 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 08/11] sched: introduce a func ptr for sd topology flags dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 10/11] sched: delete sd ptr arg in arch_scale_freq_power() dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 11/11] sched: un-export struct sched_domain dietmar.eggemann
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

The provision of the sd topology flag SD_ASYM_PACKING is moved from the
weak function arch_sd_sibling_asym_packing() to an arch-specific
topology info table.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/powerpc/kernel/smp.c |   34 +++++++++++++++++++++++++---------
 kernel/sched/core.c       |   13 -------------
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c1cf4a1522d9..f8ba79dd9147 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -760,6 +760,30 @@ int setup_profiling_timer(unsigned int multiplier)
 	return 0;
 }
 
+#ifdef CONFIG_SCHED_SMT
+static inline int arch_cpu_smt_flags(int cpu)
+{
+	int flags = SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES;
+
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
+		flags |= SD_ASYM_PACKING;
+	}
+
+	return flags;
+}
+#endif
+
+static struct sched_domain_topology_info topology_info[] = {
+#ifdef CONFIG_SCHED_SMT
+	{ cpu_smt_mask, arch_cpu_smt_flags, SD_NAME(SIBLING) },
+#endif
+#ifdef CONFIG_SCHED_MC
+	{ cpu_coregroup_mask, cpu_coregroup_flags, SD_NAME(MC) },
+#endif
+	{ cpu_cpu_mask, SD_NAME(CPU) },
+};
+
 void __init smp_cpus_done(unsigned int max_cpus)
 {
 	cpumask_var_t old_mask;
@@ -784,15 +808,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 
 	dump_numa_cpu_topology();
 
-}
-
-int arch_sd_sibling_asym_packing(void)
-{
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
-		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
-		return SD_ASYM_PACKING;
-	}
-	return 0;
+	set_sd_topology_info(topology_info, ARRAY_SIZE(topology_info));
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 79f34cc5f547..a3e945021e97 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5233,11 +5233,6 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
 	atomic_set(&sg->sgp->nr_busy_cpus, sg->group_weight);
 }
 
-int __weak arch_sd_sibling_asym_packing(void)
-{
-       return 0*SD_ASYM_PACKING;
-}
-
 /*
  * Initializers for schedule domains
  * Non-inlined to reduce accumulated stack pressure in build_sched_domains()
@@ -5662,14 +5657,6 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 	if (sd->flags & SD_SHARE_CPUPOWER) {
 		sd->imbalance_pct = 110;
 		sd->smt_gain = 1178; /* ~15% */
-
-		/*
-		 * Call SMT specific arch topology function.
-		 * This goes away once the powerpc arch uses
-		 * the new interface for scheduler domain
-		 * setup.
-		 */
-		sd->flags |= arch_sd_sibling_asym_packing();
 	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
 		sd->cache_nice_tries = 1;
 		sd->busy_idx = 2;
-- 
1.7.9.5




* [RFC v2 PATCH 10/11] sched: delete sd ptr arg in arch_scale_freq_power()
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (8 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 09/11] sched: provide SD_ASYM_PACKING via topology info table dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  2014-01-20 12:39 ` [RFC v2 PATCH 11/11] sched: un-export struct sched_domain dietmar.eggemann
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

The ARM implementation of arch_scale_freq_power() is the only user of
struct sched_domain outside the core scheduler code, and the argument is
unused in the call chain.  This patch deletes it to make it possible to
un-export struct sched_domain.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/topology.c |    4 ++--
 kernel/sched/fair.c        |   10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 85a87370f144..ae209ce7d78f 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -42,7 +42,7 @@
  */
 static DEFINE_PER_CPU(unsigned long, cpu_scale);
 
-unsigned long arch_scale_freq_power(struct sched_domain *sd, int cpu)
+unsigned long arch_scale_freq_power(int cpu)
 {
 	return per_cpu(cpu_scale, cpu);
 }
@@ -166,7 +166,7 @@ void update_cpu_power(unsigned int cpu)
 	set_power_scale(cpu, cpu_capacity(cpu) / middle_capacity);
 
 	printk(KERN_INFO "CPU%u: update cpu_power %lu\n",
-		cpu, arch_scale_freq_power(NULL, cpu));
+		cpu, arch_scale_freq_power(cpu));
 }
 
 #else
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c7395d97e4cb..cb93cb09caf8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5257,14 +5257,14 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
 	return load_idx;
 }
 
-static unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu)
+static unsigned long default_scale_freq_power(int cpu)
 {
 	return SCHED_POWER_SCALE;
 }
 
-unsigned long __weak arch_scale_freq_power(struct sched_domain *sd, int cpu)
+unsigned long __weak arch_scale_freq_power(int cpu)
 {
-	return default_scale_freq_power(sd, cpu);
+	return default_scale_freq_power(cpu);
 }
 
 static unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu)
@@ -5329,9 +5329,9 @@ static void update_cpu_power(struct sched_domain *sd, int cpu)
 	sdg->sgp->power_orig = power;
 
 	if (sched_feat(ARCH_POWER))
-		power *= arch_scale_freq_power(sd, cpu);
+		power *= arch_scale_freq_power(cpu);
 	else
-		power *= default_scale_freq_power(sd, cpu);
+		power *= default_scale_freq_power(cpu);
 
 	power >>= SCHED_POWER_SHIFT;
 
-- 
1.7.9.5




* [RFC v2 PATCH 11/11] sched: un-export struct sched_domain
  2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
                   ` (9 preceding siblings ...)
  2014-01-20 12:39 ` [RFC v2 PATCH 10/11] sched: delete sd ptr arg in arch_scale_freq_power() dietmar.eggemann
@ 2014-01-20 12:39 ` dietmar.eggemann
  10 siblings, 0 replies; 12+ messages in thread
From: dietmar.eggemann @ 2014-01-20 12:39 UTC (permalink / raw)
  To: peterz, mingo, vincent.guittot, morten.rasmussen, chris.redpath
  Cc: linux-kernel, dietmar.eggemann

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

Since all occurrences of SD_FOO_INIT have been deleted, there is no
need to export struct sched_domain any more.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 include/linux/sched.h |   87 ----------------------------
 kernel/sched/sched.h  |  153 ++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 119 insertions(+), 121 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 055d79e594ef..cd86c651f476 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -801,93 +801,6 @@ struct sched_domain_attr {
 
 extern int sched_domain_level_max;
 
-struct sched_group;
-
-struct sched_domain {
-	/* These fields must be setup */
-	struct sched_domain *parent;	/* top domain must be null terminated */
-	struct sched_domain *child;	/* bottom domain must be null terminated */
-	struct sched_group *groups;	/* the balancing groups of the domain */
-	unsigned long min_interval;	/* Minimum balance interval ms */
-	unsigned long max_interval;	/* Maximum balance interval ms */
-	unsigned int busy_factor;	/* less balancing by factor if busy */
-	unsigned int imbalance_pct;	/* No balance until over watermark */
-	unsigned int cache_nice_tries;	/* Leave cache hot tasks for # tries */
-	unsigned int busy_idx;
-	unsigned int idle_idx;
-	unsigned int newidle_idx;
-	unsigned int wake_idx;
-	unsigned int forkexec_idx;
-	unsigned int smt_gain;
-
-	int nohz_idle;			/* NOHZ IDLE status */
-	int flags;			/* See SD_* */
-	int level;
-
-	/* Runtime fields. */
-	unsigned long last_balance;	/* init to jiffies. units in jiffies */
-	unsigned int balance_interval;	/* initialise to 1. units in ms. */
-	unsigned int nr_balance_failed; /* initialise to 0 */
-
-	/* idle_balance() stats */
-	u64 max_newidle_lb_cost;
-	unsigned long next_decay_max_lb_cost;
-
-#ifdef CONFIG_SCHEDSTATS
-	/* load_balance() stats */
-	unsigned int lb_count[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_failed[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_balanced[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_imbalance[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_gained[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_hot_gained[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_nobusyg[CPU_MAX_IDLE_TYPES];
-	unsigned int lb_nobusyq[CPU_MAX_IDLE_TYPES];
-
-	/* Active load balancing */
-	unsigned int alb_count;
-	unsigned int alb_failed;
-	unsigned int alb_pushed;
-
-	/* SD_BALANCE_EXEC stats */
-	unsigned int sbe_count;
-	unsigned int sbe_balanced;
-	unsigned int sbe_pushed;
-
-	/* SD_BALANCE_FORK stats */
-	unsigned int sbf_count;
-	unsigned int sbf_balanced;
-	unsigned int sbf_pushed;
-
-	/* try_to_wake_up() stats */
-	unsigned int ttwu_wake_remote;
-	unsigned int ttwu_move_affine;
-	unsigned int ttwu_move_balance;
-#endif
-#ifdef CONFIG_SCHED_DEBUG
-	char *name;
-#endif
-	union {
-		void *private;		/* used during construction */
-		struct rcu_head rcu;	/* used during destruction */
-	};
-
-	unsigned int span_weight;
-	/*
-	 * Span of all CPUs in this domain.
-	 *
-	 * NOTE: this field is variable length. (Allocated dynamically
-	 * by attaching extra space to the end of the structure,
-	 * depending on how many CPUs the kernel has booted up with)
-	 */
-	unsigned long span[0];
-};
-
-static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
-{
-	return to_cpumask(sd->span);
-}
-
 extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 				    struct sched_domain_attr *dattr_new);
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index fcf2d4317217..796b7f99743d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -585,40 +585,6 @@ extern int migrate_swap(struct task_struct *, struct task_struct *);
 
 #define for_each_lower_domain(sd) for (; sd; sd = sd->child)
 
-/**
- * highest_flag_domain - Return highest sched_domain containing flag.
- * @cpu:	The cpu whose highest level of sched domain is to
- *		be returned.
- * @flag:	The flag to check for the highest sched_domain
- *		for the given cpu.
- *
- * Returns the highest sched_domain of a cpu which contains the given flag.
- */
-static inline struct sched_domain *highest_flag_domain(int cpu, int flag)
-{
-	struct sched_domain *sd, *hsd = NULL;
-
-	for_each_domain(cpu, sd) {
-		if (!(sd->flags & flag))
-			break;
-		hsd = sd;
-	}
-
-	return hsd;
-}
-
-static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
-{
-	struct sched_domain *sd;
-
-	for_each_domain(cpu, sd) {
-		if (sd->flags & flag)
-			break;
-	}
-
-	return sd;
-}
-
 DECLARE_PER_CPU(struct sched_domain *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
@@ -660,11 +626,130 @@ struct sched_group {
 	unsigned long cpumask[0];
 };
 
+struct sched_domain {
+	/* These fields must be setup */
+	struct sched_domain *parent;	/* top domain must be null terminated */
+	struct sched_domain *child;	/* bottom domain must be null terminated */
+	struct sched_group *groups;	/* the balancing groups of the domain */
+	unsigned long min_interval;	/* Minimum balance interval ms */
+	unsigned long max_interval;	/* Maximum balance interval ms */
+	unsigned int busy_factor;	/* less balancing by factor if busy */
+	unsigned int imbalance_pct;	/* No balance until over watermark */
+	unsigned int cache_nice_tries;	/* Leave cache hot tasks for # tries */
+	unsigned int busy_idx;
+	unsigned int idle_idx;
+	unsigned int newidle_idx;
+	unsigned int wake_idx;
+	unsigned int forkexec_idx;
+	unsigned int smt_gain;
+
+	int nohz_idle;			/* NOHZ IDLE status */
+	int flags;			/* See SD_* */
+	int level;
+
+	/* Runtime fields. */
+	unsigned long last_balance;	/* init to jiffies. units in jiffies */
+	unsigned int balance_interval;	/* initialise to 1. units in ms. */
+	unsigned int nr_balance_failed; /* initialise to 0 */
+
+	/* idle_balance() stats */
+	u64 max_newidle_lb_cost;
+	unsigned long next_decay_max_lb_cost;
+
+#ifdef CONFIG_SCHEDSTATS
+	/* load_balance() stats */
+	unsigned int lb_count[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_failed[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_balanced[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_imbalance[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_gained[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_hot_gained[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_nobusyg[CPU_MAX_IDLE_TYPES];
+	unsigned int lb_nobusyq[CPU_MAX_IDLE_TYPES];
+
+	/* Active load balancing */
+	unsigned int alb_count;
+	unsigned int alb_failed;
+	unsigned int alb_pushed;
+
+	/* SD_BALANCE_EXEC stats */
+	unsigned int sbe_count;
+	unsigned int sbe_balanced;
+	unsigned int sbe_pushed;
+
+	/* SD_BALANCE_FORK stats */
+	unsigned int sbf_count;
+	unsigned int sbf_balanced;
+	unsigned int sbf_pushed;
+
+	/* try_to_wake_up() stats */
+	unsigned int ttwu_wake_remote;
+	unsigned int ttwu_move_affine;
+	unsigned int ttwu_move_balance;
+#endif
+#ifdef CONFIG_SCHED_DEBUG
+	char *name;
+#endif
+	union {
+		void *private;		/* used during construction */
+		struct rcu_head rcu;	/* used during destruction */
+	};
+
+	unsigned int span_weight;
+	/*
+	 * Span of all CPUs in this domain.
+	 *
+	 * NOTE: this field is variable length. (Allocated dynamically
+	 * by attaching extra space to the end of the structure,
+	 * depending on how many CPUs the kernel has booted up with)
+	 */
+	unsigned long span[0];
+};
+
+static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
+{
+	return to_cpumask(sd->span);
+}
+
 static inline struct cpumask *sched_group_cpus(struct sched_group *sg)
 {
 	return to_cpumask(sg->cpumask);
 }
 
+/**
+ * highest_flag_domain - Return highest sched_domain containing flag.
+ * @cpu:	The cpu whose highest level of sched domain is to
+ *		be returned.
+ * @flag:	The flag to check for the highest sched_domain
+ *		for the given cpu.
+ *
+ * Returns the highest sched_domain of a cpu which contains the given flag.
+ */
+static inline struct sched_domain *highest_flag_domain(int cpu, int flag)
+{
+	struct sched_domain *sd, *hsd = NULL;
+
+	for_each_domain(cpu, sd) {
+		if (!(sd->flags & flag))
+			break;
+		hsd = sd;
+	}
+
+	return hsd;
+}
+
+/**
+ * lowest_flag_domain - Return lowest sched_domain containing flag.
+ * @cpu:	The cpu whose lowest level of sched domain is to
+ *		be returned.
+ * @flag:	The flag to check for the lowest sched_domain
+ *		for the given cpu.
+ *
+ * Returns the lowest sched_domain of a cpu which contains the given flag.
+ */
+static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
+{
+	struct sched_domain *sd;
+
+	for_each_domain(cpu, sd) {
+		if (sd->flags & flag)
+			break;
+	}
+
+	return sd;
+}
+
 /*
  * cpumask masking which cpus in the group are allowed to iterate up the domain
  * tree.
-- 
1.7.9.5




Thread overview: 12+ messages
2014-01-20 12:39 [RFC v2 PATCH 00/11] change scheduler domain hierarchy set-up dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 01/11] sched: define sched_domain_topology_info dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 02/11] sched: export cpu_smt_mask() and cpu_cpu_mask() dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 03/11] sched: define TOPOLOGY_SD_FLAGS dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 04/11] sched: replace SD_INIT_FUNC with sd_init() dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 05/11] sched: add a name to sched_domain_topology_info dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 06/11] sched: delete redundant sd init macros dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 07/11] sched: consolidate sched_init_numa() and sched_init_conv() dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 08/11] sched: introduce a func ptr for sd topology flags dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 09/11] sched: provide SD_ASYM_PACKING via topology info table dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 10/11] sched: delete sd ptr arg in arch_scale_freq_power() dietmar.eggemann
2014-01-20 12:39 ` [RFC v2 PATCH 11/11] sched: un-export struct sched_domain dietmar.eggemann
