linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables
@ 2012-01-16 16:22 Vaidyanathan Srinivasan
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-16 16:22 UTC (permalink / raw)
  To: Vincent Guittot, Peter Zijlstra, Indan Zupancic, Youquan Song,
	Ingo Molnar, Arjan van de Ven, Suresh Siddha
  Cc: Linux Kernel

Hi,

I have created the following RFC patch series based on the recent
discussions and consensus on simplifying the power-aware scheduler in
the kernel.

Ref: LWN: Rethinking power-aware scheduling
     http://lwn.net/Articles/474915/

The goals of the unified tunable are as follows:

* A single, simple tunable for all topologies
* Good power-saving defaults in the kernel
* Potential to map this setting to other subsystems like cpufreq and
  cpuidle

What this patchset does (first step):

* Create a single sched_powersavings tunable in sysfs
* Enable the current sched_mc and sched_smt features based on the
  value of this single tunable

What this patchset is yet to do:

* Tune the default power savings to pack packages only up to
  a threshold of, say, 50%
* Add notifiers to change the setting on battery/AC transitions
  (a hypothetical sketch follows this list)
* Feed the thresholds from arch-specific code so that different archs
  can do 'optimal' packing based on topology
* Maybe move this to /sys/devices/system/powersavings and add additional
  platform tunables like x86_energy_perf_policy?
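
For the battery/AC item above, a purely hypothetical sketch of what such
a hook could look like (the notifier event and its registration are
assumptions, not existing API; only sched_power_savings, the
POWERSAVINGS_* levels and reinit_sched_domains() come from patch 1/2):

	/* Hypothetical: called with on_ac != 0 on a battery->AC
	 * transition.  Registration against a power supply event
	 * source is omitted and assumed; struct notifier_block and
	 * NOTIFY_OK come from <linux/notifier.h>. */
	static int sched_ps_ac_notify(struct notifier_block *nb,
				      unsigned long on_ac, void *unused)
	{
		sched_power_savings = on_ac ? POWERSAVINGS_DEFAULT
					    : POWERSAVINGS_MAX;
		reinit_sched_domains();
		return NOTIFY_OK;
	}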

This RFC has only x86 changes and has been tested on a dual-socket,
quad-core, HT configuration.

Please let me know your comments and feedback.

Thanks,
Vaidy

---

Vaidyanathan Srinivasan (2):
      sched: unified sched_powersavings sysfs tunable
      sched: fix group_capacity for thread level consolidation


 arch/x86/Kconfig          |   20 ++++--------
 arch/x86/kernel/smpboot.c |    2 +
 block/blk.h               |   11 ++++---
 drivers/base/cpu.c        |    2 +
 include/linux/sched.h     |   29 +++++++++--------
 include/linux/topology.h  |    9 +----
 kernel/sched/core.c       |   75 +++++++++++----------------------------------
 kernel/sched/fair.c       |   38 ++++++++++++++++-------
 8 files changed, 77 insertions(+), 109 deletions(-)



* [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-16 16:22 [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables Vaidyanathan Srinivasan
@ 2012-01-16 16:22 ` Vaidyanathan Srinivasan
  2012-01-25 14:53   ` Peter Zijlstra
                     ` (2 more replies)
  2012-01-16 16:22 ` [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation Vaidyanathan Srinivasan
  2012-01-17 18:44 ` [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables Vaidyanathan Srinivasan
  2 siblings, 3 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-16 16:22 UTC (permalink / raw)
  To: Vincent Guittot, Peter Zijlstra, Indan Zupancic, Youquan Song,
	Ingo Molnar, Arjan van de Ven, Suresh Siddha
  Cc: Linux Kernel

Combine the sched_mc_powersavings and sched_smt_powersavings sysfs
tunables into a single sysfs tunable:

/sys/devices/system/cpu/sched_powersavings={0,1,2}

		0 - Power savings disabled (performance mode)
		1 - Default kernel settings.  Automatic powersave
		    vs performance tradeoff by the kernel
		2 - Maximum power savings

The kernel will default to '1', which is equivalent to
sched_mc_powersavings=1, i.e. consolidation at the package level.

The maximum power savings setting '2' consolidates onto sibling threads
and also does aggressive active balancing.
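
For illustration, a minimal userspace sketch that drives the new
tunable (the sysfs path is the one added by this patch; the helper
itself is hypothetical and needs root):

	#include <stdio.h>

	static int set_sched_powersavings(int level)	/* 0, 1 or 2 */
	{
		FILE *f = fopen("/sys/devices/system/cpu/sched_powersavings", "w");

		if (!f)
			return -1;	/* tunable absent: no CONFIG_SCHED_POWERSAVE? */
		fprintf(f, "%d\n", level);
		return fclose(f);
	}

E.g. set_sched_powersavings(2) requests consolidation onto sibling
threads.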

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
---
 arch/x86/Kconfig          |   20 ++++--------
 arch/x86/kernel/smpboot.c |    2 +
 block/blk.h               |   11 ++++---
 drivers/base/cpu.c        |    2 +
 include/linux/sched.h     |   29 +++++++++--------
 include/linux/topology.h  |    9 +----
 kernel/sched/core.c       |   75 +++++++++++----------------------------------
 kernel/sched/fair.c       |   23 +++++++-------
 8 files changed, 62 insertions(+), 109 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6c14ecd..ee615af 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -788,23 +788,15 @@ config NR_CPUS
 	  This is purely to save memory - each supported CPU adds
 	  approximately eight kilobytes to the kernel image.
 
-config SCHED_SMT
-	bool "SMT (Hyperthreading) scheduler support"
-	depends on X86_HT
-	---help---
-	  SMT scheduler support improves the CPU scheduler's decision making
-	  when dealing with Intel Pentium 4 chips with HyperThreading at a
-	  cost of slightly increased overhead in some places. If unsure say
-	  N here.
-
-config SCHED_MC
+config SCHED_POWERSAVE
 	def_bool y
-	prompt "Multi-core scheduler support"
+	prompt "Power save support in scheduler"
 	depends on X86_HT
 	---help---
-	  Multi-core scheduler support improves the CPU scheduler's decision
-	  making when dealing with multi-core CPU chips at a cost of slightly
-	  increased overhead in some places. If unsure say N here.
+	  The power saving feature in the scheduler optimizes task placement
+	  in multi-core and multi-threaded systems whenever possible.
+	  Default kernel settings would suit most applications, while
+	  sysfs tunables can be used to control this feature at runtime.
 
 config IRQ_TIME_ACCOUNTING
 	bool "Fine granularity task level IRQ time accounting"
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 66d250c..1d60cdd 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -414,7 +414,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
 	 * For perf, we return last level cache shared map.
 	 * And for power savings, we return cpu_core_map
 	 */
-	if ((sched_mc_power_savings || sched_smt_power_savings) &&
+	if (sched_power_savings &&
 	    !(cpu_has(c, X86_FEATURE_AMD_DCM)))
 		return cpu_core_mask(cpu);
 	else
diff --git a/block/blk.h b/block/blk.h
index 7efd772..1457107 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -167,14 +167,15 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
 static inline int blk_cpu_to_group(int cpu)
 {
 	int group = NR_CPUS;
-#ifdef CONFIG_SCHED_MC
-	const struct cpumask *mask = cpu_coregroup_mask(cpu);
-	group = cpumask_first(mask);
-#elif defined(CONFIG_SCHED_SMT)
-	group = cpumask_first(topology_thread_cpumask(cpu));
+#ifdef CONFIG_SCHED_POWERSAVE
+	if (smt_capable())
+		group = cpumask_first(topology_thread_cpumask(cpu));
+	else	
+		group = cpumask_first(cpu_coregroup_mask(cpu));
 #else
 	return cpu;
 #endif
+	/* Possible dead code?? */
 	if (likely(group < NR_CPUS))
 		return group;
 	return cpu;
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index db87e78..dbaa35f 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -299,7 +299,7 @@ void __init cpu_dev_init(void)
 
 	cpu_dev_register_generic();
 
-#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+#ifdef CONFIG_SCHED_POWERSAVE
 	sched_create_sysfs_power_savings_entries(cpu_subsys.dev_root);
 #endif
 }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4032ec1..5c33bbc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -850,33 +850,34 @@ enum cpu_idle_type {
 #define SD_PREFER_SIBLING	0x1000	/* Prefer to place tasks in a sibling domain */
 #define SD_OVERLAP		0x2000	/* sched_domains of this level overlap */
 
-enum powersavings_balance_level {
-	POWERSAVINGS_BALANCE_NONE = 0,  /* No power saving load balance */
-	POWERSAVINGS_BALANCE_BASIC,	/* Fill one thread/core/package
-					 * first for long running threads
-					 */
-	POWERSAVINGS_BALANCE_WAKEUP,	/* Also bias task wakeups to semi-idle
-					 * cpu package for power savings
-					 */
-	MAX_POWERSAVINGS_BALANCE_LEVELS
+enum powersavings_level {
+	POWERSAVINGS_DISABLED = 0,	/* Max performance */
+	POWERSAVINGS_DEFAULT,		/* Kernel default policy, automatic powersave */
+					/* vs performance tradeoff */
+	POWERSAVINGS_MAX		/* Favour power savings over performance */
 };
 
-extern int sched_mc_power_savings, sched_smt_power_savings;
+extern int sched_power_savings;
 
 static inline int sd_balance_for_mc_power(void)
 {
-	if (sched_smt_power_savings)
+	switch (sched_power_savings) {
+	case POWERSAVINGS_MAX:
 		return SD_POWERSAVINGS_BALANCE;
 
-	if (!sched_mc_power_savings)
+	case POWERSAVINGS_DISABLED:
 		return SD_PREFER_SIBLING;
 
+	default:
+		break;
+	}
+
 	return 0;
 }
 
 static inline int sd_balance_for_package_power(void)
 {
-	if (sched_mc_power_savings | sched_smt_power_savings)
+	if (sched_power_savings != POWERSAVINGS_DISABLED)
 		return SD_POWERSAVINGS_BALANCE;
 
 	return SD_PREFER_SIBLING;
@@ -892,7 +893,7 @@ extern int __weak arch_sd_sibiling_asym_packing(void);
 
 static inline int sd_power_saving_flags(void)
 {
-	if (sched_mc_power_savings | sched_smt_power_savings)
+	if (sched_power_savings != POWERSAVINGS_DISABLED)
 		return SD_BALANCE_NEWIDLE;
 
 	return 0;
diff --git a/include/linux/topology.h b/include/linux/topology.h
index e26db03..61f3659 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -79,10 +79,7 @@ int arch_update_cpu_topology(void);
  * (Only non-zero and non-null fields need be specified.)
  */
 
-#ifdef CONFIG_SCHED_SMT
-/* MCD - Do we really need this?  It is always on if CONFIG_SCHED_SMT is,
- * so can't we drop this in favor of CONFIG_SCHED_SMT?
- */
+#ifdef CONFIG_SCHED_POWERSAVE
 #define ARCH_HAS_SCHED_WAKE_IDLE
 /* Common values for SMT siblings */
 #ifndef SD_SIBLING_INIT
@@ -110,9 +107,7 @@ int arch_update_cpu_topology(void);
 	.smt_gain		= 1178,	/* 15% */			\
 }
 #endif
-#endif /* CONFIG_SCHED_SMT */
 
-#ifdef CONFIG_SCHED_MC
 /* Common values for MC siblings. for now mostly derived from SD_CPU_INIT */
 #ifndef SD_MC_INIT
 #define SD_MC_INIT (struct sched_domain) {				\
@@ -142,7 +137,7 @@ int arch_update_cpu_topology(void);
 	.balance_interval	= 1,					\
 }
 #endif
-#endif /* CONFIG_SCHED_MC */
+#endif /* CONFIG_SCHED_POWERSAVE */
 
 /* Common values for CPUs */
 #ifndef SD_CPU_INIT
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index df00cb0..f303db8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5924,7 +5924,7 @@ static const struct cpumask *cpu_cpu_mask(int cpu)
 	return cpumask_of_node(cpu_to_node(cpu));
 }
 
-int sched_smt_power_savings = 0, sched_mc_power_savings = 0;
+int sched_power_savings = POWERSAVINGS_DEFAULT;
 
 struct sd_data {
 	struct sched_domain **__percpu sd;
@@ -6150,10 +6150,8 @@ SD_INIT_FUNC(CPU)
  SD_INIT_FUNC(ALLNODES)
  SD_INIT_FUNC(NODE)
 #endif
-#ifdef CONFIG_SCHED_SMT
+#ifdef CONFIG_SCHED_POWERSAVE
  SD_INIT_FUNC(SIBLING)
-#endif
-#ifdef CONFIG_SCHED_MC
  SD_INIT_FUNC(MC)
 #endif
 #ifdef CONFIG_SCHED_BOOK
@@ -6250,7 +6248,7 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
 		*per_cpu_ptr(sdd->sgp, cpu) = NULL;
 }
 
-#ifdef CONFIG_SCHED_SMT
+#ifdef CONFIG_SCHED_POWERSAVE
 static const struct cpumask *cpu_smt_mask(int cpu)
 {
 	return topology_thread_cpumask(cpu);
@@ -6261,10 +6259,8 @@ static const struct cpumask *cpu_smt_mask(int cpu)
  * Topology list, bottom-up.
  */
 static struct sched_domain_topology_level default_topology[] = {
-#ifdef CONFIG_SCHED_SMT
+#ifdef CONFIG_SCHED_POWERSAVE
 	{ sd_init_SIBLING, cpu_smt_mask, },
-#endif
-#ifdef CONFIG_SCHED_MC
 	{ sd_init_MC, cpu_coregroup_mask, },
 #endif
 #ifdef CONFIG_SCHED_BOOK
@@ -6635,7 +6631,7 @@ match2:
 	mutex_unlock(&sched_domains_mutex);
 }
 
-#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+#if defined(CONFIG_SCHED_POWERSAVE)
 static void reinit_sched_domains(void)
 {
 	get_online_cpus();
@@ -6647,7 +6643,9 @@ static void reinit_sched_domains(void)
 	put_online_cpus();
 }
 
-static ssize_t sched_power_savings_store(const char *buf, size_t count, int smt)
+static ssize_t sched_power_savings_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
 {
 	unsigned int level = 0;
 
@@ -6656,75 +6654,40 @@ static ssize_t sched_power_savings_store(const char *buf, size_t count, int smt)
 
 	/*
 	 * level is always be positive so don't check for
-	 * level < POWERSAVINGS_BALANCE_NONE which is 0
+	 * level < POWERSAVINGS_DISABLED which is 0
 	 * What happens on 0 or 1 byte write,
 	 * need to check for count as well?
 	 */
 
-	if (level >= MAX_POWERSAVINGS_BALANCE_LEVELS)
+	if (level > POWERSAVINGS_MAX)
 		return -EINVAL;
 
-	if (smt)
-		sched_smt_power_savings = level;
-	else
-		sched_mc_power_savings = level;
+	sched_power_savings = level;
 
 	reinit_sched_domains();
 
 	return count;
 }
 
-#ifdef CONFIG_SCHED_MC
-static ssize_t sched_mc_power_savings_show(struct device *dev,
+static ssize_t sched_power_savings_show(struct device *dev,
 					   struct device_attribute *attr,
 					   char *buf)
 {
-	return sprintf(buf, "%u\n", sched_mc_power_savings);
-}
-static ssize_t sched_mc_power_savings_store(struct device *dev,
-					    struct device_attribute *attr,
-					    const char *buf, size_t count)
-{
-	return sched_power_savings_store(buf, count, 0);
-}
-static DEVICE_ATTR(sched_mc_power_savings, 0644,
-		   sched_mc_power_savings_show,
-		   sched_mc_power_savings_store);
-#endif
-
-#ifdef CONFIG_SCHED_SMT
-static ssize_t sched_smt_power_savings_show(struct device *dev,
-					    struct device_attribute *attr,
-					    char *buf)
-{
-	return sprintf(buf, "%u\n", sched_smt_power_savings);
-}
-static ssize_t sched_smt_power_savings_store(struct device *dev,
-					    struct device_attribute *attr,
-					     const char *buf, size_t count)
-{
-	return sched_power_savings_store(buf, count, 1);
+	return sprintf(buf, "%u\n", sched_power_savings);
 }
-static DEVICE_ATTR(sched_smt_power_savings, 0644,
-		   sched_smt_power_savings_show,
-		   sched_smt_power_savings_store);
-#endif
+static DEVICE_ATTR(sched_power_savings, 0644,
+		   sched_power_savings_show,
+		   sched_power_savings_store);
 
 int __init sched_create_sysfs_power_savings_entries(struct device *dev)
 {
 	int err = 0;
 
-#ifdef CONFIG_SCHED_SMT
-	if (smt_capable())
-		err = device_create_file(dev, &dev_attr_sched_smt_power_savings);
-#endif
-#ifdef CONFIG_SCHED_MC
-	if (!err && mc_capable())
-		err = device_create_file(dev, &dev_attr_sched_mc_power_savings);
-#endif
+	if (mc_capable() || smt_capable())
+		err = device_create_file(dev, &dev_attr_sched_power_savings);
 	return err;
 }
-#endif /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
+#endif /* CONFIG_SCHED_POWERSAVE */
 
 /*
  * Update cpusets according to cpu_active mask.  If cpusets are
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 84adb2d..bae6ec8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3497,7 +3497,7 @@ struct sd_lb_stats {
 	unsigned int  busiest_group_weight;
 
 	int group_imb; /* Is there imbalance in this sd */
-#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+#ifdef CONFIG_SCHED_POWERSAVE
 	int power_savings_balance; /* Is powersave balance needed for this sd */
 	struct sched_group *group_min; /* Least loaded group in sd */
 	struct sched_group *group_leader; /* Group which relieves group_min */
@@ -3549,7 +3549,7 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
 }
 
 
-#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+#ifdef CONFIG_SCHED_POWERSAVE
 /**
  * init_sd_power_savings_stats - Initialize power savings statistics for
  * the given sched_domain, during load balancing.
@@ -3669,7 +3669,7 @@ static inline int check_power_save_busiest_group(struct sd_lb_stats *sds,
 	return 1;
 
 }
-#else /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
+#else /* CONFIG_SCHED_POWERSAVE */
 static inline void init_sd_power_savings_stats(struct sched_domain *sd,
 	struct sd_lb_stats *sds, enum cpu_idle_type idle)
 {
@@ -3687,7 +3687,7 @@ static inline int check_power_save_busiest_group(struct sd_lb_stats *sds,
 {
 	return 0;
 }
-#endif /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
+#endif /* CONFIG_SCHED_POWERSAVE */
 
 
 unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu)
@@ -4422,9 +4422,10 @@ static int need_active_balance(struct sched_domain *sd, int idle,
 		 *
 		 * The package power saving logic comes from
 		 * find_busiest_group(). If there are no imbalance, then
-		 * f_b_g() will return NULL. However when sched_mc={1,2} then
-		 * f_b_g() will select a group from which a running task may be
-		 * pulled to this cpu in order to make the other package idle.
+		 * f_b_g() will return NULL. However when
+		 * sched_powersavings={1,2} then f_b_g() will select a group
+		 * from which a running task may be pulled to this cpu
+		 * in order to make the other package idle.
 		 * If there is no opportunity to make a package idle and if
 		 * there are no imbalance, then f_b_g() will return NULL and no
 		 * action will be taken in load_balance_newidle().
@@ -4434,7 +4435,7 @@ static int need_active_balance(struct sched_domain *sd, int idle,
 		 * move_tasks() will succeed.  ld_moved will be true and this
 		 * active balance code will not be triggered.
 		 */
-		if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
+		if (sched_power_savings < POWERSAVINGS_MAX)
 			return 0;
 	}
 
@@ -4739,7 +4740,7 @@ static struct {
 	unsigned long next_balance;     /* in jiffy units */
 } nohz ____cacheline_aligned;
 
-#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+#ifdef CONFIG_SCHED_POWERSAVE
 /**
  * lowest_flag_domain - Return lowest sched_domain containing flag.
  * @cpu:	The cpu whose lowest level of sched domain is to
@@ -4796,7 +4797,7 @@ static int find_new_ilb(int cpu)
 	 * Have idle load balancer selection from semi-idle packages only
 	 * when power-aware load balancing is enabled
 	 */
-	if (!(sched_smt_power_savings || sched_mc_power_savings))
+	if (!sched_power_savings)
 		goto out_done;
 
 	/*
@@ -4831,7 +4832,7 @@ out_done:
 
 	return nr_cpu_ids;
 }
-#else /*  (CONFIG_SCHED_MC || CONFIG_SCHED_SMT) */
+#else /*  (CONFIG_SCHED_POWERSAVE) */
 static inline int find_new_ilb(int call_cpu)
 {
 	return nr_cpu_ids;



* [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation
  2012-01-16 16:22 [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables Vaidyanathan Srinivasan
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
@ 2012-01-16 16:22 ` Vaidyanathan Srinivasan
  2012-01-25 15:38   ` Peter Zijlstra
  2012-01-17 18:44 ` [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables Vaidyanathan Srinivasan
  2 siblings, 1 reply; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-16 16:22 UTC (permalink / raw)
  To: Vincent Guittot, Peter Zijlstra, Indan Zupancic, Youquan Song,
	Ingo Molnar, Arjan van de Ven, Suresh Siddha
  Cc: Linux Kernel

sched_powersavings for threaded systems needs this fix for
consolidation onto sibling threads to work.  Since threads have
fractional capacity, group_capacity will always turn out to be one
and will not accommodate another task on the sibling thread.

This fix makes group_capacity a function of cpumask_weight, which
enables the power saving load balancer to pack tasks among sibling
threads and keep more cores idle.
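
To make the rounding problem concrete, a worked sketch (assuming the
usual SCHED_POWER_SCALE of 1024 and the smt_gain of 1178 that patch 1/2
keeps in SD_SIBLING_INIT; illustrative, not the patch itself):

	unsigned long capacity;

	/* A 2-thread core's sched group carries total cpu_power of
	 * roughly smt_gain = 1178, i.e. ~589 per sibling thread,
	 * so the rounded group capacity is: */
	capacity = DIV_ROUND_CLOSEST(1178, 1024);		/* == 1 */

	/* One task "fills" the group and the second sibling never gets
	 * packed.  With this fix, under SD_POWERSAVINGS_BALANCE: */
	capacity = cpumask_weight(sched_group_cpus(sg));	/* == 2 */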

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
---
 kernel/sched/fair.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bae6ec8..c94e768 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4012,6 +4012,21 @@ static inline void update_sd_lb_stats(struct sched_domain *sd, int this_cpu,
 		 */
 		if (prefer_sibling && !local_group && sds->this_has_capacity)
 			sgs.group_capacity = min(sgs.group_capacity, 1UL);
+		/*
+		 * If power savings balance is set at this domain, then
+		 * make capacity equal to number of hardware threads to
+		 * accommodate more tasks until capacity is reached.
+		 */
+		else if (sd->flags & SD_POWERSAVINGS_BALANCE)
+			sgs.group_capacity =
+				cpumask_weight(sched_group_cpus(sg));
+
+			/*
+			 * The default group_capacity is rounded from sum of
+			 * fractional cpu_powers of sibling hardware threads
+			 * in order to enable fair use of available hardware
+			 * resources.
+			 */
 
 		if (local_group) {
 			sds->this_load = sgs.avg_load;



* Re: [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables
  2012-01-16 16:22 [RFC PATCH v1 0/2] sched: unified sched_powersavings tunables Vaidyanathan Srinivasan
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
  2012-01-16 16:22 ` [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation Vaidyanathan Srinivasan
@ 2012-01-17 18:44 ` Vaidyanathan Srinivasan
  2 siblings, 0 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-17 18:44 UTC (permalink / raw)
  To: Vincent Guittot, Peter Zijlstra, Indan Zupancic, Youquan Song,
	Ingo Molnar, Arjan van de Ven, Suresh Siddha
  Cc: Linux Kernel

* Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [2012-01-16 21:52:26]:

> Hi,
> 
> I have created the following RFC patch series based on the recent
> discussions and consensus on simplifying the power-aware scheduler in
> the kernel.
> 
> Ref: LWN: Rethinking power-aware scheduling
>      http://lwn.net/Articles/474915/

       https://lkml.org/lkml/2012/1/10/61



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
@ 2012-01-25 14:53   ` Peter Zijlstra
  2012-01-26 10:42     ` Jens Axboe
  2012-01-27  9:35     ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
  2012-01-25 14:57   ` Peter Zijlstra
  2012-01-25 15:10   ` Peter Zijlstra
  2 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-25 14:53 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel, Jens Axboe

On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> +++ b/block/blk.h
> @@ -167,14 +167,15 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
>  static inline int blk_cpu_to_group(int cpu)
>  {
>         int group = NR_CPUS;
> -#ifdef CONFIG_SCHED_MC
> -       const struct cpumask *mask = cpu_coregroup_mask(cpu);
> -       group = cpumask_first(mask);
> -#elif defined(CONFIG_SCHED_SMT)
> -       group = cpumask_first(topology_thread_cpumask(cpu));
> +#ifdef CONFIG_SCHED_POWERSAVE
> +       if (smt_capable())
> +               group = cpumask_first(topology_thread_cpumask(cpu));
> +       else    
> +               group = cpumask_first(cpu_coregroup_mask(cpu));
>  #else
>         return cpu;
>  #endif
> +       /* Possible dead code?? */
>         if (likely(group < NR_CPUS))
>                 return group;
>         return cpu; 

After going, WTF is block doing! I had a closer look and this doesn't
seem right at all. The old code would use coregroup_mask when SCHED_MC
&& SCHED_SMT, the new code does something else.

Jens, what is this thing trying to do?



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
  2012-01-25 14:53   ` Peter Zijlstra
@ 2012-01-25 14:57   ` Peter Zijlstra
  2012-01-27  9:16     ` Vaidyanathan Srinivasan
  2012-01-25 15:10   ` Peter Zijlstra
  2 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-25 14:57 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> +enum powersavings_level {
> +       POWERSAVINGS_DISABLED = 0,      /* Max performance */
> +       POWERSAVINGS_DEFAULT,           /* Kernel default policy, automatic powersave */
> +                                       /* vs performance tradeoff */
> > +       POWERSAVINGS_MAX                /* Favour power savings over performance */
>  }; 

I don't like that. I can get OFF, AUTO, ON, but to overload that with
different policies for AUTO and ON just reeks.



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-16 16:22 ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
  2012-01-25 14:53   ` Peter Zijlstra
  2012-01-25 14:57   ` Peter Zijlstra
@ 2012-01-25 15:10   ` Peter Zijlstra
  2012-01-25 15:12     ` Arjan van de Ven
  2012-01-27  9:22     ` Vaidyanathan Srinivasan
  2 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-25 15:10 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> @@ -6150,10 +6150,8 @@ SD_INIT_FUNC(CPU)
>   SD_INIT_FUNC(ALLNODES)
>   SD_INIT_FUNC(NODE)
>  #endif
> -#ifdef CONFIG_SCHED_SMT
> +#ifdef CONFIG_SCHED_POWERSAVE
>   SD_INIT_FUNC(SIBLING)
> -#endif
> -#ifdef CONFIG_SCHED_MC
>   SD_INIT_FUNC(MC)
>  #endif
>  #ifdef CONFIG_SCHED_BOOK
> @@ -6250,7 +6248,7 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
>                 *per_cpu_ptr(sdd->sgp, cpu) = NULL;
>  }
>  
> -#ifdef CONFIG_SCHED_SMT
> +#ifdef CONFIG_SCHED_POWERSAVE
>  static const struct cpumask *cpu_smt_mask(int cpu)
>  {
>         return topology_thread_cpumask(cpu);
> @@ -6261,10 +6259,8 @@ static const struct cpumask *cpu_smt_mask(int cpu)
>   * Topology list, bottom-up.
>   */
>  static struct sched_domain_topology_level default_topology[] = {
> -#ifdef CONFIG_SCHED_SMT
> +#ifdef CONFIG_SCHED_POWERSAVE
>         { sd_init_SIBLING, cpu_smt_mask, },
> -#endif
> -#ifdef CONFIG_SCHED_MC
>         { sd_init_MC, cpu_coregroup_mask, },
>  #endif
>  #ifdef CONFIG_SCHED_BOOK 

I don't like this either, SCHED_{MC,SMT} here have nothing to do with
powersavings, its topology support.





* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 15:10   ` Peter Zijlstra
@ 2012-01-25 15:12     ` Arjan van de Ven
  2012-01-25 15:36       ` Peter Zijlstra
  2012-01-27  9:22     ` Vaidyanathan Srinivasan
  1 sibling, 1 reply; 23+ messages in thread
From: Arjan van de Ven @ 2012-01-25 15:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Suresh Siddha, Linux Kernel

On 1/25/2012 7:10 AM, Peter Zijlstra wrote:


btw... any reason why this feature is a config option with tons of
ifdefs... why not just have this available all the time?
it shouldn't be all that much code in the first place.



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 15:12     ` Arjan van de Ven
@ 2012-01-25 15:36       ` Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-25 15:36 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Suresh Siddha, Linux Kernel

On Wed, 2012-01-25 at 07:12 -0800, Arjan van de Ven wrote:
> On 1/25/2012 7:10 AM, Peter Zijlstra wrote:
> 
> 
> btw... any reason why this feature is a config option with tons of
> ifdefs... why not just have this available all the time?
> it shouldn't be all that much code in the first place.

I guess it's because SCHED_SMT and SCHED_MC also indicate the
availability of the topology functions like cpu_coregroup_mask() etc.

The whole powersave stuff got intermixed with all that.

But yes, I agree, we should kill all that and sort the topology stuff
differently.



* Re: [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation
  2012-01-16 16:22 ` [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation Vaidyanathan Srinivasan
@ 2012-01-25 15:38   ` Peter Zijlstra
  2012-01-27  9:10     ` Vaidyanathan Srinivasan
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-25 15:38 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> +               /*
> +                * If power savings balance is set at this domain, then
> +                * make capacity equal to number of hardware threads to
> +                * accommodate more tasks until capacity is reached.
> +                */
> +               else if (sd->flags & SD_POWERSAVINGS_BALANCE)
> +                       sgs.group_capacity =
> +                               cpumask_weight(sched_group_cpus(sg)); 

sg->group_weight perhaps?



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 14:53   ` Peter Zijlstra
@ 2012-01-26 10:42     ` Jens Axboe
  2012-01-26 11:08       ` Peter Zijlstra
  2012-01-27  9:35     ` [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable Vaidyanathan Srinivasan
  1 sibling, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2012-01-26 10:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On 01/25/2012 03:53 PM, Peter Zijlstra wrote:
> On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
>> +++ b/block/blk.h
>> @@ -167,14 +167,15 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
>>  static inline int blk_cpu_to_group(int cpu)
>>  {
>>         int group = NR_CPUS;
>> -#ifdef CONFIG_SCHED_MC
>> -       const struct cpumask *mask = cpu_coregroup_mask(cpu);
>> -       group = cpumask_first(mask);
>> -#elif defined(CONFIG_SCHED_SMT)
>> -       group = cpumask_first(topology_thread_cpumask(cpu));
>> +#ifdef CONFIG_SCHED_POWERSAVE
>> +       if (smt_capable())
>> +               group = cpumask_first(topology_thread_cpumask(cpu));
>> +       else    
>> +               group = cpumask_first(cpu_coregroup_mask(cpu));
>>  #else
>>         return cpu;
>>  #endif
>> +       /* Possible dead code?? */
>>         if (likely(group < NR_CPUS))
>>                 return group;
>>         return cpu; 
> 
> After going, WTF is block doing! I had a closer look and this doesn't
> seem right at all. The old code would use coregroup_mask when SCHED_MC
> && SCHED_SMT, the new code does something else.
> 
> Jens, what is this thing trying to do?

Not surprised that it's broken for some configs. The intent of the code
is to return the first CPU in the "group" that the passed in core/thread
belongs to. This is used to decide whether to perform a completion
locally, or to send it off to a different "group".
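
In other words, roughly (a sketch of the intent with illustrative
helper names, not the actual block code):

	if (blk_cpu_to_group(ccpu) == blk_cpu_to_group(cpu))
		run_completion_softirq_locally(req);	/* hypothetical */
	else
		raise_remote_completion_ipi(ccpu, req);	/* hypothetical */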

-- 
Jens Axboe



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 10:42     ` Jens Axboe
@ 2012-01-26 11:08       ` Peter Zijlstra
  2012-01-26 11:26         ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-26 11:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On Thu, 2012-01-26 at 11:42 +0100, Jens Axboe wrote:
> On 01/25/2012 03:53 PM, Peter Zijlstra wrote:

> > Jens, what is this thing trying to do?
> 
> The intent of the code
> is to return the first CPU in the "group" that the passed in core/thread
> belongs to. This is used to decide whether to perform a completion
> locally, or to send it off to a different "group".

Would you perhaps have meant to identify some shared cache domain?

In the scheduler core code we have (for CONFIG_SMP):

  static int ttwu_share_cache(int this_cpu, int that_cpu);

which returns true if this and that share a cache and false otherwise.
Would that suffice or do you need a slightly different form? That is, we
should provide you with some API and avoid you having to poke around
with CONFIG_SCHED* and topology bits methinks.
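
With such a predicate, the block-layer decision quoted above would
reduce to something like (a sketch only, not the final patch):

	bool shared = cpus_share_cache(cpu, ccpu);

	if (ccpu == cpu || shared)
		complete_locally(req);		/* illustrative name */
	else
		complete_on_remote(ccpu, req);	/* illustrative name */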



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 11:08       ` Peter Zijlstra
@ 2012-01-26 11:26         ` Jens Axboe
  2012-01-26 12:04           ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2012-01-26 11:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On 01/26/2012 12:08 PM, Peter Zijlstra wrote:
> On Thu, 2012-01-26 at 11:42 +0100, Jens Axboe wrote:
>> On 01/25/2012 03:53 PM, Peter Zijlstra wrote:
> 
>>> Jens, what is this thing trying to do?
>>
>> The intent of the code
>> is to return the first CPU in the "group" that the passed in core/thread
>> belongs to. This is used to decide whether to perform a completion
>> locally, or to send it off to a different "group".
> 
> Would you perhaps have meant to identify some shared cache domain?
> 
> In the scheduler core code we have (for CONFIG_SMP):
> 
>   static int ttwu_share_cache(int this_cpu, int that_cpu);
> 
> which returns true if this and that share a cache and false otherwise.
> Would that suffice or do you need a slightly different form? That is, we
> should provide you with some API and avoid you having to poke around
> with CONFIG_SCHED* and topology bits methinks.

Yeah, I think that would suit my purpose nicely, in fact. What level of
cache sharing is being used here? The block code wanted a per-socket
type operation, but since it's a heuristic, perhaps the above is even
better (or equivalent, perhaps).

-- 
Jens Axboe



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 11:26         ` Jens Axboe
@ 2012-01-26 12:04           ` Peter Zijlstra
  2012-01-26 12:13             ` Jens Axboe
  2012-01-28 12:06             ` [tip:sched/core] sched, block: Unify cache detection tip-bot for Peter Zijlstra
  0 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-26 12:04 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On Thu, 2012-01-26 at 12:26 +0100, Jens Axboe wrote:
> Yeah, I think that would suit my purpose nicely, in fact. What level of
> cache sharing is being used here? The block code wanted a per-socket
> type operation, but since it's a heuristic, perhaps the above is even
> better (or equivalent, perhaps).

It uses the biggest shared cache exposed in the topology information the
scheduler has (which is currently somewhat funny but is on the todo list
for improvements).

Effectively it ends up being the socket-wide LLC for modern Intel chips
though.

Would something like the below work for you (compile tested only)?

---
Subject: sched, block: Unify cache detection
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Thu Jan 26 12:44:34 CET 2012

The block layer has some code trying to determine if two CPUs share a
cache, the scheduler has a similar function. Expose the function used
by the scheduler and make the block layer use it, thereby removing the
block layers usage of CONFIG_SCHED* and topology bits.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 block/blk-softirq.c   |   16 ++++++++--------
 block/blk.h           |   16 ----------------
 include/linux/sched.h |    8 ++++++++
 kernel/sched/core.c   |    6 +++---
 4 files changed, 19 insertions(+), 27 deletions(-)

--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -8,6 +8,7 @@
 #include <linux/blkdev.h>
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
+#include <linux/sched.h>
 
 #include "blk.h"
 
@@ -103,9 +104,10 @@ static struct notifier_block __cpuinitda
 
 void __blk_complete_request(struct request *req)
 {
-	int ccpu, cpu, group_cpu = NR_CPUS;
+	int ccpu, cpu;
 	struct request_queue *q = req->q;
 	unsigned long flags;
+	bool shared = false;
 
 	BUG_ON(!q->softirq_done_fn);
 
@@ -117,22 +119,20 @@ void __blk_complete_request(struct reque
 	 */
 	if (req->cpu != -1) {
 		ccpu = req->cpu;
-		if (!test_bit(QUEUE_FLAG_SAME_FORCE, &q->queue_flags)) {
-			ccpu = blk_cpu_to_group(ccpu);
-			group_cpu = blk_cpu_to_group(cpu);
-		}
+		if (!test_bit(QUEUE_FLAG_SAME_FORCE, &q->queue_flags))
+			shared = cpus_share_cache(cpu, ccpu);
 	} else
 		ccpu = cpu;
 
 	/*
-	 * If current CPU and requested CPU are in the same group, running
-	 * softirq in current CPU. One might concern this is just like
+	 * If current CPU and requested CPU share a cache, run the softirq on
+	 * the current CPU. One might think this is just like
 	 * QUEUE_FLAG_SAME_FORCE, but actually not. blk_complete_request() is
 	 * running in interrupt handler, and currently I/O controller doesn't
 	 * support multiple interrupts, so current CPU is unique actually. This
 	 * avoids IPI sending from current CPU to the first CPU of a group.
 	 */
-	if (ccpu == cpu || ccpu == group_cpu) {
+	if (ccpu == cpu || shared) {
 		struct list_head *list;
 do_local:
 		list = &__get_cpu_var(blk_cpu_done);
--- a/block/blk.h
+++ b/block/blk.h
@@ -164,22 +164,6 @@ static inline int queue_congestion_off_t
 	return q->nr_congestion_off;
 }
 
-static inline int blk_cpu_to_group(int cpu)
-{
-	int group = NR_CPUS;
-#ifdef CONFIG_SCHED_MC
-	const struct cpumask *mask = cpu_coregroup_mask(cpu);
-	group = cpumask_first(mask);
-#elif defined(CONFIG_SCHED_SMT)
-	group = cpumask_first(topology_thread_cpumask(cpu));
-#else
-	return cpu;
-#endif
-	if (likely(group < NR_CPUS))
-		return group;
-	return cpu;
-}
-
 /*
  * Contribute to IO statistics IFF:
  *
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1052,6 +1052,8 @@ static inline int test_sd_parent(struct
 unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu);
 unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 
+bool cpus_share_cache(int this_cpu, int that_cpu);
+
 #else /* CONFIG_SMP */
 
 struct sched_domain_attr;
@@ -1061,6 +1063,12 @@ partition_sched_domains(int ndoms_new, c
 			struct sched_domain_attr *dattr_new)
 {
 }
+
+static inline bool cpus_share_cache(int this_cpu, int that_cpu)
+{
+	return true;
+}
+
 #endif	/* !CONFIG_SMP */
 
 
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1512,7 +1512,7 @@ static int ttwu_activate_remote(struct t
 }
 #endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */
 
-static inline int ttwu_share_cache(int this_cpu, int that_cpu)
+bool cpus_share_cache(int this_cpu, int that_cpu)
 {
 	return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
 }
@@ -1523,7 +1523,7 @@ static void ttwu_queue(struct task_struc
 	struct rq *rq = cpu_rq(cpu);
 
 #if defined(CONFIG_SMP)
-	if (sched_feat(TTWU_QUEUE) && !ttwu_share_cache(smp_processor_id(), cpu)) {
+	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
 		sched_clock_cpu(cpu); /* sync clocks x-cpu */
 		ttwu_queue_remote(p, cpu);
 		return;
@@ -5759,7 +5759,7 @@ static void destroy_sched_domains(struct
  *
  * Also keep a unique ID per domain (we use the first cpu number in
  * the cpumask of the domain), this allows us to quickly tell if
- * two cpus are in the same cache domain, see ttwu_share_cache().
+ * two cpus are in the same cache domain, see cpus_share_cache().
  */
 DEFINE_PER_CPU(struct sched_domain *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_id);



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 12:04           ` Peter Zijlstra
@ 2012-01-26 12:13             ` Jens Axboe
  2012-01-26 12:39               ` Peter Zijlstra
  2012-01-28 12:06             ` [tip:sched/core] sched, block: Unify cache detection tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2012-01-26 12:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On 01/26/2012 01:04 PM, Peter Zijlstra wrote:
> On Thu, 2012-01-26 at 12:26 +0100, Jens Axboe wrote:
>> Yeah, I think that would suit my purpose nicely, in fact. What level of
>> cache sharing is being used here? The block code wanted a per-socket
>> type operation, but since it's a heuristic, perhaps the above is even
>> better (or equivalent, perhaps).
> 
> It uses the biggest shared cache exposed in the topology information the
> scheduler has (which is currently somewhat funny but is on the todo list
> for improvements).
> 
> Effectively it ends up being the socket-wide LLC for modern Intel chips
> though.
> 
> Would something like the below work for you (compile tested only)?

Yep, looks good to me, and an improvement for me not to carry this code
that doesn't really belong there.

-- 
Jens Axboe



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 12:13             ` Jens Axboe
@ 2012-01-26 12:39               ` Peter Zijlstra
  2012-01-26 12:46                 ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-26 12:39 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On Thu, 2012-01-26 at 13:13 +0100, Jens Axboe wrote:
> Yep, looks good to me, and an improvement for me not to carry this
> code that doesn't really belong there.

I'll take this as an ACK from you and slam the patch in my 3.4 queue.
Yell if I misconstrued your intent.

Thanks!
 


* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-26 12:39               ` Peter Zijlstra
@ 2012-01-26 12:46                 ` Jens Axboe
  0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2012-01-26 12:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vaidyanathan Srinivasan, Vincent Guittot, Indan Zupancic,
	Youquan Song, Ingo Molnar, Arjan van de Ven, Suresh Siddha,
	Linux Kernel

On 01/26/2012 01:39 PM, Peter Zijlstra wrote:
> On Thu, 2012-01-26 at 13:13 +0100, Jens Axboe wrote:
>> Yep, looks good to me, and an improvement for me not to carry this
>> code that doesn't really belong there.
> 
> I'll take this as an ACK from you and slam the patch in my 3.4 queue.
> Yell if I misconstrued your intent.

Sorry, should have been clearer. Yes, you can add my acked-by. Thanks
Peter!

-- 
Jens Axboe



* Re: [RFC PATCH v1 2/2] sched: fix group_capacity for thread level consolidation
  2012-01-25 15:38   ` Peter Zijlstra
@ 2012-01-27  9:10     ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-27  9:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2012-01-25 16:38:06]:

> On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> > +               /*
> > +                * If power savings balance is set at this domain, then
> > +                * make capacity equal to number of hardware threads to
> > +                * accommodate more tasks until capacity is reached.
> > +                */
> > +               else if (sd->flags & SD_POWERSAVINGS_BALANCE)
> > +                       sgs.group_capacity =
> > +                               cpumask_weight(sched_group_cpus(sg)); 
> 
> sg->group_weight perhaps?

Yes, I will correct this.  sg->group_weight is updated to the correct
value on sched_domain rebuild in init_sched_groups_power().
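
The corrected branch would then read (a sketch of the agreed change):

	else if (sd->flags & SD_POWERSAVINGS_BALANCE)
		sgs.group_capacity = sg->group_weight;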

--Vaidy



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 14:57   ` Peter Zijlstra
@ 2012-01-27  9:16     ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-27  9:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2012-01-25 15:57:26]:

> On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> > +enum powersavings_level {
> > +       POWERSAVINGS_DISABLED = 0,      /* Max performance */
> > +       POWERSAVINGS_DEFAULT,           /* Kernel default policy, automatic powersave */
> > +                                       /* vs performance tradeoff */
> > +       POWERSAVINGS_MAX                /* Favour power savings over performance */
> >  }; 
> 
> I don't like that. I can get OFF, AUTO, ON, but to overload that with
> different policies for AUTO and ON just reeks.

How about this:

enum powersavings_level {
       POWERSAVINGS_OFF = 0,           /* Max performance */
       POWERSAVINGS_AUTO,              /* Kernel default policy, automatic powersave */
                                       /* vs performance tradeoff */
       POWERSAVINGS_ON                 /* Max power savings */
}; 
 
Basically AUTO is where we have a 'policy' or heuristics, but simple,
straightforward decisions for the OFF and ON cases.

--Vaidy



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 15:10   ` Peter Zijlstra
  2012-01-25 15:12     ` Arjan van de Ven
@ 2012-01-27  9:22     ` Vaidyanathan Srinivasan
  2012-01-27  9:40       ` Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-27  9:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2012-01-25 16:10:13]:

> On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> > @@ -6150,10 +6150,8 @@ SD_INIT_FUNC(CPU)
> >   SD_INIT_FUNC(ALLNODES)
> >   SD_INIT_FUNC(NODE)
> >  #endif
> > -#ifdef CONFIG_SCHED_SMT
> > +#ifdef CONFIG_SCHED_POWERSAVE
> >   SD_INIT_FUNC(SIBLING)
> > -#endif
> > -#ifdef CONFIG_SCHED_MC
> >   SD_INIT_FUNC(MC)
> >  #endif
> >  #ifdef CONFIG_SCHED_BOOK
> > @@ -6250,7 +6248,7 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
> >                 *per_cpu_ptr(sdd->sgp, cpu) = NULL;
> >  }
> >  
> > -#ifdef CONFIG_SCHED_SMT
> > +#ifdef CONFIG_SCHED_POWERSAVE
> >  static const struct cpumask *cpu_smt_mask(int cpu)
> >  {
> >         return topology_thread_cpumask(cpu);
> > @@ -6261,10 +6259,8 @@ static const struct cpumask *cpu_smt_mask(int cpu)
> >   * Topology list, bottom-up.
> >   */
> >  static struct sched_domain_topology_level default_topology[] = {
> > -#ifdef CONFIG_SCHED_SMT
> > +#ifdef CONFIG_SCHED_POWERSAVE
> >         { sd_init_SIBLING, cpu_smt_mask, },
> > -#endif
> > -#ifdef CONFIG_SCHED_MC
> >         { sd_init_MC, cpu_coregroup_mask, },
> >  #endif
> >  #ifdef CONFIG_SCHED_BOOK 
> 
> I don't like this either, SCHED_{MC,SMT} here have nothing to do with
> powersavings, its topology support.
 
Yes, but we don't need these domains for any purpose other than
powersave balance.  The code overhead is not high; I will remove
the config option and check.

--Vaidy



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-25 14:53   ` Peter Zijlstra
  2012-01-26 10:42     ` Jens Axboe
@ 2012-01-27  9:35     ` Vaidyanathan Srinivasan
  1 sibling, 0 replies; 23+ messages in thread
From: Vaidyanathan Srinivasan @ 2012-01-27  9:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel, Jens Axboe

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2012-01-25 15:53:01]:

> On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> > +++ b/block/blk.h
> > @@ -167,14 +167,15 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
> >  static inline int blk_cpu_to_group(int cpu)
> >  {
> >         int group = NR_CPUS;
> > -#ifdef CONFIG_SCHED_MC
> > -       const struct cpumask *mask = cpu_coregroup_mask(cpu);
> > -       group = cpumask_first(mask);
> > -#elif defined(CONFIG_SCHED_SMT)
> > -       group = cpumask_first(topology_thread_cpumask(cpu));
> > +#ifdef CONFIG_SCHED_POWERSAVE
> > +       if (smt_capable())
> > +               group = cpumask_first(topology_thread_cpumask(cpu));
> > +       else    
> > +               group = cpumask_first(cpu_coregroup_mask(cpu));
> >  #else
> >         return cpu;
> >  #endif
> > +       /* Possible dead code?? */
> >         if (likely(group < NR_CPUS))
> >                 return group;
> >         return cpu; 
> 
> After going, WTF is block doing! I had a closer look and this doesn't
> seem right at all. The old code would use coregroup_mask when SCHED_MC
> && SCHED_SMT, the new code does something else.
> 
> Jens, what is this thing trying to do?

I understood the requirement as: get the first cpu in the 'core' on
a hyper-threaded system and the first cpu in the 'socket' on
a non-threaded system, for best cache affinity.  Based on Jens'
explanation and Peter's patch, identifying the last-level shared cache
works best for this case.

--Vaidy



* Re: [RFC PATCH v1 1/2] sched: unified sched_powersavings sysfs tunable
  2012-01-27  9:22     ` Vaidyanathan Srinivasan
@ 2012-01-27  9:40       ` Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2012-01-27  9:40 UTC (permalink / raw)
  To: svaidy
  Cc: Vincent Guittot, Indan Zupancic, Youquan Song, Ingo Molnar,
	Arjan van de Ven, Suresh Siddha, Linux Kernel

On Fri, 2012-01-27 at 14:52 +0530, Vaidyanathan Srinivasan wrote:
> * Peter Zijlstra <a.p.zijlstra@chello.nl> [2012-01-25 16:10:13]:
> 
> > On Mon, 2012-01-16 at 21:52 +0530, Vaidyanathan Srinivasan wrote:
> > > @@ -6150,10 +6150,8 @@ SD_INIT_FUNC(CPU)
> > >   SD_INIT_FUNC(ALLNODES)
> > >   SD_INIT_FUNC(NODE)
> > >  #endif
> > > -#ifdef CONFIG_SCHED_SMT
> > > +#ifdef CONFIG_SCHED_POWERSAVE
> > >   SD_INIT_FUNC(SIBLING)
> > > -#endif
> > > -#ifdef CONFIG_SCHED_MC
> > >   SD_INIT_FUNC(MC)
> > >  #endif
> > >  #ifdef CONFIG_SCHED_BOOK
> > > @@ -6250,7 +6248,7 @@ static void claim_allocations(int cpu, struct sched_domain *sd)
> > >                 *per_cpu_ptr(sdd->sgp, cpu) = NULL;
> > >  }
> > >  
> > > -#ifdef CONFIG_SCHED_SMT
> > > +#ifdef CONFIG_SCHED_POWERSAVE
> > >  static const struct cpumask *cpu_smt_mask(int cpu)
> > >  {
> > >         return topology_thread_cpumask(cpu);
> > > @@ -6261,10 +6259,8 @@ static const struct cpumask *cpu_smt_mask(int cpu)
> > >   * Topology list, bottom-up.
> > >   */
> > >  static struct sched_domain_topology_level default_topology[] = {
> > > -#ifdef CONFIG_SCHED_SMT
> > > +#ifdef CONFIG_SCHED_POWERSAVE
> > >         { sd_init_SIBLING, cpu_smt_mask, },
> > > -#endif
> > > -#ifdef CONFIG_SCHED_MC
> > >         { sd_init_MC, cpu_coregroup_mask, },
> > >  #endif
> > >  #ifdef CONFIG_SCHED_BOOK 
> > 
> > I don't like this either, SCHED_{MC,SMT} here have nothing to do with
> > powersavings, its topology support.
>  
> Yes, but we don't need these domains for any other purpose other than
> powersave balance.  The code overheads are not high,  I will remove
> the config option and check.

Yes we do: SMT perf skips siblings and only loads them when there are
only siblings left idle. Similarly for MC, we prefer to run tasks on
separate sockets when possible.


* [tip:sched/core] sched, block: Unify cache detection
  2012-01-26 12:04           ` Peter Zijlstra
  2012-01-26 12:13             ` Jens Axboe
@ 2012-01-28 12:06             ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Peter Zijlstra @ 2012-01-28 12:06 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, axboe, a.p.zijlstra, tglx

Commit-ID:  39be350127ec60a078edffe5b4915dafba4ba514
Gitweb:     http://git.kernel.org/tip/39be350127ec60a078edffe5b4915dafba4ba514
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Thu, 26 Jan 2012 12:44:34 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 27 Jan 2012 13:28:48 +0100

sched, block: Unify cache detection

The block layer has some code trying to determine if two CPUs share a
cache, the scheduler has a similar function. Expose the function used
by the scheduler and make the block layer use it, thereby removing the
block layers usage of CONFIG_SCHED* and topology bits.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: http://lkml.kernel.org/r/1327579450.2446.95.camel@twins
---
 block/blk-softirq.c   |   16 ++++++++--------
 block/blk.h           |   16 ----------------
 include/linux/sched.h |    8 ++++++++
 kernel/sched/core.c   |    6 +++---
 4 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/block/blk-softirq.c b/block/blk-softirq.c
index 1366a89..467c8de 100644
--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -8,6 +8,7 @@
 #include <linux/blkdev.h>
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
+#include <linux/sched.h>
 
 #include "blk.h"
 
@@ -103,9 +104,10 @@ static struct notifier_block __cpuinitdata blk_cpu_notifier = {
 
 void __blk_complete_request(struct request *req)
 {
-	int ccpu, cpu, group_cpu = NR_CPUS;
+	int ccpu, cpu;
 	struct request_queue *q = req->q;
 	unsigned long flags;
+	bool shared = false;
 
 	BUG_ON(!q->softirq_done_fn);
 
@@ -117,22 +119,20 @@ void __blk_complete_request(struct request *req)
 	 */
 	if (req->cpu != -1) {
 		ccpu = req->cpu;
-		if (!test_bit(QUEUE_FLAG_SAME_FORCE, &q->queue_flags)) {
-			ccpu = blk_cpu_to_group(ccpu);
-			group_cpu = blk_cpu_to_group(cpu);
-		}
+		if (!test_bit(QUEUE_FLAG_SAME_FORCE, &q->queue_flags))
+			shared = cpus_share_cache(cpu, ccpu);
 	} else
 		ccpu = cpu;
 
 	/*
-	 * If current CPU and requested CPU are in the same group, running
-	 * softirq in current CPU. One might concern this is just like
+	 * If current CPU and requested CPU share a cache, run the softirq on
+	 * the current CPU. One might think this is just like
 	 * QUEUE_FLAG_SAME_FORCE, but actually not. blk_complete_request() is
 	 * running in interrupt handler, and currently I/O controller doesn't
 	 * support multiple interrupts, so current CPU is unique actually. This
 	 * avoids IPI sending from current CPU to the first CPU of a group.
 	 */
-	if (ccpu == cpu || ccpu == group_cpu) {
+	if (ccpu == cpu || shared) {
 		struct list_head *list;
 do_local:
 		list = &__get_cpu_var(blk_cpu_done);
diff --git a/block/blk.h b/block/blk.h
index 7efd772..df5b59a 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -164,22 +164,6 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
 	return q->nr_congestion_off;
 }
 
-static inline int blk_cpu_to_group(int cpu)
-{
-	int group = NR_CPUS;
-#ifdef CONFIG_SCHED_MC
-	const struct cpumask *mask = cpu_coregroup_mask(cpu);
-	group = cpumask_first(mask);
-#elif defined(CONFIG_SCHED_SMT)
-	group = cpumask_first(topology_thread_cpumask(cpu));
-#else
-	return cpu;
-#endif
-	if (likely(group < NR_CPUS))
-		return group;
-	return cpu;
-}
-
 /*
  * Contribute to IO statistics IFF:
  *
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 513f524..0e19595 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1052,6 +1052,8 @@ static inline int test_sd_parent(struct sched_domain *sd, int flag)
 unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu);
 unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 
+bool cpus_share_cache(int this_cpu, int that_cpu);
+
 #else /* CONFIG_SMP */
 
 struct sched_domain_attr;
@@ -1061,6 +1063,12 @@ partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 			struct sched_domain_attr *dattr_new)
 {
 }
+
+static inline bool cpus_share_cache(int this_cpu, int that_cpu)
+{
+	return true;
+}
+
 #endif	/* !CONFIG_SMP */
 
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5255c9d..d7c4322 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1507,7 +1507,7 @@ static int ttwu_activate_remote(struct task_struct *p, int wake_flags)
 }
 #endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */
 
-static inline int ttwu_share_cache(int this_cpu, int that_cpu)
+bool cpus_share_cache(int this_cpu, int that_cpu)
 {
 	return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
 }
@@ -1518,7 +1518,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
 	struct rq *rq = cpu_rq(cpu);
 
 #if defined(CONFIG_SMP)
-	if (sched_feat(TTWU_QUEUE) && !ttwu_share_cache(smp_processor_id(), cpu)) {
+	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
 		sched_clock_cpu(cpu); /* sync clocks x-cpu */
 		ttwu_queue_remote(p, cpu);
 		return;
@@ -5754,7 +5754,7 @@ static void destroy_sched_domains(struct sched_domain *sd, int cpu)
  *
  * Also keep a unique ID per domain (we use the first cpu number in
  * the cpumask of the domain), this allows us to quickly tell if
- * two cpus are in the same cache domain, see ttwu_share_cache().
+ * two cpus are in the same cache domain, see cpus_share_cache().
  */
 DEFINE_PER_CPU(struct sched_domain *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_id);


