linux-kernel.vger.kernel.org archive mirror
* [PATCH]cpuset: add new API to change cpuset top group's cpus
@ 2009-05-19  7:39 Shaohua Li
  2009-05-19  8:40 ` Peter Zijlstra
  2009-05-19 19:55 ` Paul Menage
  0 siblings, 2 replies; 24+ messages in thread
From: Shaohua Li @ 2009-05-19  7:39 UTC
  To: linux-kernel, linux-acpi; +Cc: lenb, menage

ACPI 4.0 defines a processor aggregator device. The device can notify the
OS to idle some CPUs to save power. This isn't hot-removing cpus; it just
makes cpus idle.

This patch adds an API to change the cpuset top group's cpus. If we want to
make one cpu idle, we simply remove the cpu from the cpuset top group's cpu
list; all tasks will then be migrated to other cpus, and no tasks will be
migrated to this cpu again. No functional changes.

We will use this API in a new ACPI processor aggregator device driver later.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
 include/linux/cpuset.h |    5 +++++
 kernel/cpuset.c        |   27 ++++++++++++++++++++++++---
 2 files changed, 29 insertions(+), 3 deletions(-)

Index: linux/kernel/cpuset.c
===================================================================
--- linux.orig/kernel/cpuset.c	2009-05-12 16:27:16.000000000 +0800
+++ linux/kernel/cpuset.c	2009-05-19 10:05:36.000000000 +0800
@@ -929,14 +929,14 @@ static void update_tasks_cpumask(struct 
  * @buf: buffer of cpu numbers written to this cpuset
  */
 static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
-			  const char *buf)
+			  const char *buf, bool top_ok)
 {
 	struct ptr_heap heap;
 	int retval;
 	int is_load_balanced;
 
 	/* top_cpuset.cpus_allowed tracks cpu_online_map; it's read-only */
-	if (cs == &top_cpuset)
+	if (cs == &top_cpuset && !top_ok)
 		return -EACCES;
 
 	/*
@@ -1496,7 +1496,7 @@ static int cpuset_write_resmask(struct c
 
 	switch (cft->private) {
 	case FILE_CPULIST:
-		retval = update_cpumask(cs, trialcs, buf);
+		retval = update_cpumask(cs, trialcs, buf, false);
 		break;
 	case FILE_MEMLIST:
 		retval = update_nodemask(cs, trialcs, buf);
@@ -1511,6 +1511,27 @@ static int cpuset_write_resmask(struct c
 	return retval;
 }
 
+int cpuset_change_top_cpumask(const char *buf)
+{
+	int retval = 0;
+	struct cpuset *cs = &top_cpuset;
+	struct cpuset *trialcs;
+
+	if (!cgroup_lock_live_group(cs->css.cgroup))
+		return -ENODEV;
+
+	trialcs = alloc_trial_cpuset(cs);
+	if (!trialcs)
+		return -ENOMEM;
+
+	retval = update_cpumask(cs, trialcs, buf, true);
+
+	free_trial_cpuset(trialcs);
+	cgroup_unlock();
+	return retval;
+}
+EXPORT_SYMBOL(cpuset_change_top_cpumask);
+
 /*
  * These ascii lists should be read in a single call, by using a user
  * buffer large enough to hold the entire map.  If read in smaller
Index: linux/include/linux/cpuset.h
===================================================================
--- linux.orig/include/linux/cpuset.h	2009-05-12 16:27:15.000000000 +0800
+++ linux/include/linux/cpuset.h	2009-05-19 10:05:36.000000000 +0800
@@ -92,6 +92,7 @@ extern void rebuild_sched_domains(void);
 
 extern void cpuset_print_task_mems_allowed(struct task_struct *p);
 
+extern int cpuset_change_top_cpumask(const char *buf);
 #else /* !CONFIG_CPUSETS */
 
 static inline int cpuset_init_early(void) { return 0; }
@@ -188,6 +189,10 @@ static inline void cpuset_print_task_mem
 {
 }
 
+static inline int cpuset_change_top_cpumask(const char *buf)
+{
+	return -ENODEV;
+}
 #endif /* !CONFIG_CPUSETS */
 
 #endif /* _LINUX_CPUSET_H */
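
As an illustration, a caller such as the planned ACPI driver could use the
new API roughly like this (a minimal sketch, not part of this patch; the
helper name acpi_pad_idle_cpu is hypothetical, and the buffer uses the same
cpulist syntax as the cpuset "cpus" file):

	#include <linux/cpumask.h>
	#include <linux/cpuset.h>
	#include <linux/gfp.h>

	/* Hypothetical helper: stop scheduling tasks on @cpu by writing a
	 * cpulist of all online cpus except @cpu into the top cpuset. */
	static int acpi_pad_idle_cpu(int cpu)
	{
		char buf[128];		/* cpulist, e.g. "0-2,4-7" */
		cpumask_var_t mask;
		int ret;

		if (!alloc_cpumask_var(&mask, GFP_KERNEL))
			return -ENOMEM;

		cpumask_copy(mask, cpu_online_mask);
		cpumask_clear_cpu(cpu, mask);
		cpulist_scnprintf(buf, sizeof(buf), mask);

		ret = cpuset_change_top_cpumask(buf);
		free_cpumask_var(mask);
		return ret;
	}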


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  7:39 [PATCH]cpuset: add new API to change cpuset top group's cpus Shaohua Li
@ 2009-05-19  8:40 ` Peter Zijlstra
  2009-05-19  8:48   ` Shaohua Li
  2009-05-19 11:27   ` Andi Kleen
  2009-05-19 19:55 ` Paul Menage
  1 sibling, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19  8:40 UTC
  To: Shaohua Li; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> idle.
> 
> This patch adds one API to change cpuset top group's cpus. If we want to
> make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> then all tasks will be migrate to other cpus, and other tasks will not be
> migrated to this cpu again. No functional changes.
> 
> We will use this API in new ACPI processor aggregator device driver later.

I don't think so. There really is a lot more to do than move processes
about.

Furthermore, I object to being able to remove online cpus from the top
cpuset, that just doesn't make sense.

I'd suggest using hotplug for this.

NAK

> Signed-off-by: Shaohua Li<shaohua.li@intel.com>
> ---
>  include/linux/cpuset.h |    5 +++++
>  kernel/cpuset.c        |   27 ++++++++++++++++++++++++---
>  2 files changed, 29 insertions(+), 3 deletions(-)
> 
> Index: linux/kernel/cpuset.c
> ===================================================================
> --- linux.orig/kernel/cpuset.c	2009-05-12 16:27:16.000000000 +0800
> +++ linux/kernel/cpuset.c	2009-05-19 10:05:36.000000000 +0800
> @@ -929,14 +929,14 @@ static void update_tasks_cpumask(struct 
>   * @buf: buffer of cpu numbers written to this cpuset
>   */
>  static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
> -			  const char *buf)
> +			  const char *buf, bool top_ok)
>  {
>  	struct ptr_heap heap;
>  	int retval;
>  	int is_load_balanced;
>  
>  	/* top_cpuset.cpus_allowed tracks cpu_online_map; it's read-only */
> -	if (cs == &top_cpuset)
> +	if (cs == &top_cpuset && !top_ok)
>  		return -EACCES;
>  
>  	/*
> @@ -1496,7 +1496,7 @@ static int cpuset_write_resmask(struct c
>  
>  	switch (cft->private) {
>  	case FILE_CPULIST:
> -		retval = update_cpumask(cs, trialcs, buf);
> +		retval = update_cpumask(cs, trialcs, buf, false);
>  		break;
>  	case FILE_MEMLIST:
>  		retval = update_nodemask(cs, trialcs, buf);
> @@ -1511,6 +1511,27 @@ static int cpuset_write_resmask(struct c
>  	return retval;
>  }
>  
> +int cpuset_change_top_cpumask(const char *buf)
> +{
> +	int retval = 0;
> +	struct cpuset *cs = &top_cpuset;
> +	struct cpuset *trialcs;
> +
> +	if (!cgroup_lock_live_group(cs->css.cgroup))
> +		return -ENODEV;
> +
> +	trialcs = alloc_trial_cpuset(cs);
> +	if (!trialcs)
> +		return -ENOMEM;
> +
> +	retval = update_cpumask(cs, trialcs, buf, true);
> +
> +	free_trial_cpuset(trialcs);
> +	cgroup_unlock();
> +	return retval;
> +}
> +EXPORT_SYMBOL(cpuset_change_top_cpumask);
> +
>  /*
>   * These ascii lists should be read in a single call, by using a user
>   * buffer large enough to hold the entire map.  If read in smaller
> Index: linux/include/linux/cpuset.h
> ===================================================================
> --- linux.orig/include/linux/cpuset.h	2009-05-12 16:27:15.000000000 +0800
> +++ linux/include/linux/cpuset.h	2009-05-19 10:05:36.000000000 +0800
> @@ -92,6 +92,7 @@ extern void rebuild_sched_domains(void);
>  
>  extern void cpuset_print_task_mems_allowed(struct task_struct *p);
>  
> +extern int cpuset_change_top_cpumask(const char *buf);
>  #else /* !CONFIG_CPUSETS */
>  
>  static inline int cpuset_init_early(void) { return 0; }
> @@ -188,6 +189,10 @@ static inline void cpuset_print_task_mem
>  {
>  }
>  
> +static inline int cpuset_change_top_cpumask(const char *buf)
> +{
> +	return -ENODEV;
> +}
>  #endif /* !CONFIG_CPUSETS */
>  
>  #endif /* _LINUX_CPUSET_H */


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  8:40 ` Peter Zijlstra
@ 2009-05-19  8:48   ` Shaohua Li
  2009-05-19  8:56     ` Peter Zijlstra
  2009-05-19 11:27   ` Andi Kleen
  1 sibling, 1 reply; 24+ messages in thread
From: Shaohua Li @ 2009-05-19  8:48 UTC
  To: Peter Zijlstra; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > idle.
> > 
> > This patch adds one API to change cpuset top group's cpus. If we want to
> > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > then all tasks will be migrate to other cpus, and other tasks will not be
> > migrated to this cpu again. No functional changes.
> > 
> > We will use this API in new ACPI processor aggregator device driver later.
> 
> I don't think so. There really is a lot more to do than move processes
> about.
No processes running on the cpu is good enough for us; we don't care about
interrupts/softirqs/timers so far.

> Furthermore, I object to being able to remove online cpus from the top
> cpuset, that just doesn't make sense.
> 
> I'd suggest using hotplug for this.
cpu hotplug involves too many things, and we are afraid it's not reliable.
Besides, a hot-removed cpu just spins in a halt loop, which isn't efficient
for power saving. Making a hot-removed cpu enter a deep C-state has been on
the wish list for a long time, but is still not available. acpi_processor_idle
is a module, and the cpuidle governor potentially can't handle an offline cpu.
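
For reference, that halt loop is the offline play_dead() path; a paraphrased
sketch of the x86 behaviour of this era (not the exact source):

	/* An offlined cpu ends up spinning in hlt with interrupts off:
	 * that is only C1, never the deeper C-states that cpuidle or
	 * acpi_processor_idle could request. */
	static void play_dead_sketch(void)
	{
		idle_task_exit();	/* drop the borrowed mm */
		local_irq_disable();
		for (;;)
			halt();
	}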

Thanks,
Shaohua


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  8:48   ` Shaohua Li
@ 2009-05-19  8:56     ` Peter Zijlstra
  2009-05-19  9:06       ` Shaohua Li
  2009-05-19 10:38       ` Peter Zijlstra
  0 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19  8:56 UTC
  To: Shaohua Li; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, 2009-05-19 at 16:48 +0800, Shaohua Li wrote:
> On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> > On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > > idle.
> > > 
> > > This patch adds one API to change cpuset top group's cpus. If we want to
> > > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > > then all tasks will be migrate to other cpus, and other tasks will not be
> > > migrated to this cpu again. No functional changes.
> > > 
> > > We will use this API in new ACPI processor aggregator device driver later.
> > 
> > I don't think so. There really is a lot more to do than move processes
> > about.
> no processor running is good enough for us, we don't care about interrupts/softirq/
> timers so far.

Well, I don't care for this interface.

> > Furthermore, I object to being able to remove online cpus from the top
> > cpuset, that just doesn't make sense.
> > 
> > I'd suggest using hotplug for this.

> cpu hotplug involves too much things, and we are afraid it's not reliable.

Then make it more reliable instead of providing ugly ass shit like this.

> Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> efficient. To make hot removed cpu enters deep C-state is in whish list for a
> long time, but still not available. The acpi_processor_idle is a module, and
> cpuidle governor potentially can't handle offline cpu.

Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
and I've no idea why it's still there; that seems like a much better
candidate for your efforts than this.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  8:56     ` Peter Zijlstra
@ 2009-05-19  9:06       ` Shaohua Li
  2009-05-19  9:31         ` Peter Zijlstra
  2009-05-19 10:38       ` Peter Zijlstra
  1 sibling, 1 reply; 24+ messages in thread
From: Shaohua Li @ 2009-05-19  9:06 UTC
  To: Peter Zijlstra; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, May 19, 2009 at 04:56:04PM +0800, Peter Zijlstra wrote:
> On Tue, 2009-05-19 at 16:48 +0800, Shaohua Li wrote:
> > On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> > > On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > > > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > > > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > > > idle.
> > > > 
> > > > This patch adds one API to change cpuset top group's cpus. If we want to
> > > > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > > > then all tasks will be migrate to other cpus, and other tasks will not be
> > > > migrated to this cpu again. No functional changes.
> > > > 
> > > > We will use this API in new ACPI processor aggregator device driver later.
> > > 
> > > I don't think so. There really is a lot more to do than move processes
> > > about.
> > no processor running is good enough for us, we don't care about interrupts/softirq/
> > timers so far.
> 
> Well, I don't care for this interface.
> 
> > > Furthermore, I object to being able to remove online cpus from the top
> > > cpuset, that just doesn't make sense.
> > > 
> > > I'd suggest using hotplug for this.
> 
> > cpu hotplug involves too much things, and we are afraid it's not reliable.
> 
> Then make it more reliable instead of providing ugly ass shit like this.
I wonder why this is so ugly. We have cpu_isolated_map, which is just like
this.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  9:06       ` Shaohua Li
@ 2009-05-19  9:31         ` Peter Zijlstra
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19  9:31 UTC
  To: Shaohua Li; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, 2009-05-19 at 17:06 +0800, Shaohua Li wrote:
> On Tue, May 19, 2009 at 04:56:04PM +0800, Peter Zijlstra wrote:
> > On Tue, 2009-05-19 at 16:48 +0800, Shaohua Li wrote:
> > > On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> > > > On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > > > > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > > > > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > > > > idle.
> > > > > 
> > > > > This patch adds one API to change cpuset top group's cpus. If we want to
> > > > > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > > > > then all tasks will be migrate to other cpus, and other tasks will not be
> > > > > migrated to this cpu again. No functional changes.
> > > > > 
> > > > > We will use this API in new ACPI processor aggregator device driver later.
> > > > 
> > > > I don't think so. There really is a lot more to do than move processes
> > > > about.
> > > no processor running is good enough for us, we don't care about interrupts/softirq/
> > > timers so far.
> > 
> > Well, I don't care for this interface.
> > 
> > > > Furthermore, I object to being able to remove online cpus from the top
> > > > cpuset, that just doesn't make sense.
> > > > 
> > > > I'd suggest using hotplug for this.
> > 
> > > cpu hotplug involves too much things, and we are afraid it's not reliable.
> > 
> > Then make it more reliable instead of providing ugly ass shit like this.

> I wonder why this is that ugly. We have a cpu_isolated_map, which is just like
> this.

And just as ugly -- it should die too.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  8:56     ` Peter Zijlstra
  2009-05-19  9:06       ` Shaohua Li
@ 2009-05-19 10:38       ` Peter Zijlstra
  2009-05-19 13:37         ` Vaidyanathan Srinivasan
  2009-05-19 19:01         ` Len Brown
  1 sibling, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19 10:38 UTC
  To: Shaohua Li; +Cc: linux-kernel, linux-acpi, lenb, menage

On Tue, 2009-05-19 at 10:56 +0200, Peter Zijlstra wrote:
> On Tue, 2009-05-19 at 16:48 +0800, Shaohua Li wrote:
> > On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> > > On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > > > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > > > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > > > idle.
> > > > 
> > > > This patch adds one API to change cpuset top group's cpus. If we want to
> > > > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > > > then all tasks will be migrate to other cpus, and other tasks will not be
> > > > migrated to this cpu again. No functional changes.
> > > > 
> > > > We will use this API in new ACPI processor aggregator device driver later.
> > > 
> > > I don't think so. There really is a lot more to do than move processes
> > > about.
> > no processor running is good enough for us, we don't care about interrupts/softirq/
> > timers so far.
> 
> Well, I don't care for this interface.
> 
> > > Furthermore, I object to being able to remove online cpus from the top
> > > cpuset, that just doesn't make sense.
> > > 
> > > I'd suggest using hotplug for this.
> 
> > cpu hotplug involves too much things, and we are afraid it's not reliable.
> 
> Then make it more reliable instead of providing ugly ass shit like this.

OK, so perhaps I should have used different words. But the point is, we
don't need a new interface to force a cpu idle. Hotplug does that.

Furthermore, we should not want anything outside of that: either the cpu
is there, available for work, or it's not -- halfway measures don't make
sense.

Furthermore, we already have power-aware scheduling, which tries to
aggregate idle time on cpus/cores/packages so as to maximize the idle-time
power savings. Use it there.
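
For reference, the mainline knob for this kind of aggregation is the
sched_mc_power_savings sysfs file; for example, on a multi-core x86 box:

	# 0 = performance; 1 and 2 = increasingly aggressive consolidation
	echo 2 > /sys/devices/system/cpu/sched_mc_power_savings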

> > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > efficient. To make hot removed cpu enters deep C-state is in whish list for a
> > long time, but still not available. The acpi_processor_idle is a module, and
> > cpuidle governor potentially can't handle offline cpu.
> 
> Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> and I've no idea why its still there, seems like a much better candidate
> for your efforts than this.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  8:40 ` Peter Zijlstra
  2009-05-19  8:48   ` Shaohua Li
@ 2009-05-19 11:27   ` Andi Kleen
  2009-05-19 12:01     ` Peter Zijlstra
  1 sibling, 1 reply; 24+ messages in thread
From: Andi Kleen @ 2009-05-19 11:27 UTC
  To: Peter Zijlstra; +Cc: Shaohua Li, linux-kernel, linux-acpi, lenb, menage

Peter Zijlstra <peterz@infradead.org> writes:
>
> Furthermore, I object to being able to remove online cpus from the top
> cpuset, that just doesn't make sense.

Note you can already do it at boot time with isolcpus=...
So your objection seems to be a few years too late.

Shaohua's patch just makes it work at runtime too.
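
For background, a paraphrased sketch of that boot-time path (from
kernel/sched.c of this era; allocation details elided):

	/* cpus excluded from the scheduler's load-balance domains */
	static cpumask_var_t cpu_isolated_map;

	/* e.g. booting with isolcpus=2,3 */
	static int __init isolated_cpu_setup(char *str)
	{
		cpulist_parse(str, cpu_isolated_map);
		return 1;
	}
	__setup("isolcpus=", isolated_cpu_setup);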

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 11:27   ` Andi Kleen
@ 2009-05-19 12:01     ` Peter Zijlstra
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19 12:01 UTC
  To: Andi Kleen; +Cc: Shaohua Li, linux-kernel, linux-acpi, lenb, menage

On Tue, 2009-05-19 at 13:27 +0200, Andi Kleen wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> >
> > Furthermore, I object to being able to remove online cpus from the top
> > cpuset, that just doesn't make sense.
> 
> Note you can already do it at boot time with isolated_cpus=...
> So your objection seems to be a few years too late.
> 
> Shaohua's patch just makes it work at runtime too.

No it doesn't. And isolcpus should die.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 10:38       ` Peter Zijlstra
@ 2009-05-19 13:37         ` Vaidyanathan Srinivasan
  2009-05-28  2:34           ` Len Brown
  2009-05-19 19:01         ` Len Brown
  1 sibling, 1 reply; 24+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-05-19 13:37 UTC
  To: Peter Zijlstra; +Cc: Shaohua Li, linux-kernel, linux-acpi, lenb, menage

* Peter Zijlstra <peterz@infradead.org> [2009-05-19 12:38:58]:

> On Tue, 2009-05-19 at 10:56 +0200, Peter Zijlstra wrote:
> > On Tue, 2009-05-19 at 16:48 +0800, Shaohua Li wrote:
> > > On Tue, May 19, 2009 at 04:40:54PM +0800, Peter Zijlstra wrote:
> > > > On Tue, 2009-05-19 at 15:39 +0800, Shaohua Li wrote:
> > > > > ACPI 4.0 defines processor aggregator device. The device can notify OS to idle
> > > > > some CPUs to save power. This isn't to hot remove cpus, but just makes cpus
> > > > > idle.
> > > > > 
> > > > > This patch adds one API to change cpuset top group's cpus. If we want to
> > > > > make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> > > > > then all tasks will be migrate to other cpus, and other tasks will not be
> > > > > migrated to this cpu again. No functional changes.
> > > > > 
> > > > > We will use this API in new ACPI processor aggregator device driver later.
> > > > 
> > > > I don't think so. There really is a lot more to do than move processes
> > > > about.
> > > no processor running is good enough for us, we don't care about interrupts/softirq/
> > > timers so far.
> > 
> > Well, I don't care for this interface.
> > 
> > > > Furthermore, I object to being able to remove online cpus from the top
> > > > cpuset, that just doesn't make sense.
> > > > 
> > > > I'd suggest using hotplug for this.
> > 
> > > cpu hotplug involves too much things, and we are afraid it's not reliable.
> > 
> > Then make it more reliable instead of providing ugly ass shit like this.
> 
> OK, so perhaps I should have use different words. But the point is, we
> don't need a new interface to force a cpu idle. Hotplug does that.

We tried similar approaches to create idle time for power savings, but
the cpu hotplug interface seems to be the cleaner choice.  If there are
issues with the interface, we should fix them.  Is there any other
reason why cpu hotplug is 'ugly' other than its performance (speed)?

I have tried a few load balancer hacks to evacuate cores, but there is no
solid design yet.  It has its advantages but still needs more work.

http://lkml.org/lkml/2009/5/13/173

> Furthermore, we should not want anything outside of that, either the cpu
> is there available for work, or its not -- halfway measures don't make
> sense.
> 
> Furthermore, we already have power aware scheduling which tries to
> aggregate idle time on cpu/core/packages so as to maximize the idle time
> power savings. Use it there.

Power-aware scheduling can optimally accumulate idle time.  A framework
that creates idle time in order to force-idle cores is good and useful
for power savings.  Other than the speed of online/offline, I do not know
of any other major issue with using cpu hotplug for this purpose.
 
> > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > efficient. To make hot removed cpu enters deep C-state is in whish list for a
> > > long time, but still not available. The acpi_processor_idle is a module, and
> > > cpuidle governor potentially can't handle offline cpu.
> > 
> > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > and I've no idea why its still there, seems like a much better candidate
> > for your efforts than this.

I agree with Peter.  We need to make cpu hotplug save power first and
later improve upon its performance.

--Vaidy


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 10:38       ` Peter Zijlstra
  2009-05-19 13:37         ` Vaidyanathan Srinivasan
@ 2009-05-19 19:01         ` Len Brown
  2009-05-19 22:36           ` Peter Zijlstra
  2009-05-20 17:21           ` Vaidyanathan Srinivasan
  1 sibling, 2 replies; 24+ messages in thread
From: Len Brown @ 2009-05-19 19:01 UTC
  To: Peter Zijlstra; +Cc: Shaohua Li, linux-kernel, linux-acpi, menage

> ... the point is, we
> don't need a new interface to force a cpu idle. Hotplug does that.
>
> Furthermore, we should not want anything outside of that, either the cpu
> is there available for work, or its not -- halfway measures don't make
> sense.
> 
> Furthermore, we already have power aware scheduling which tries to
> aggregate idle time on cpu/core/packages so as to maximize the idle time
> power savings. Use it there.

Some context...

In the past, server room power and thermal issues were handled
either by spending too much money to provision power and
thermals for the theoretical worst case, or by abruptly shutting off
servers when hard limits were reached.

Going forward, platforms are getting smarter, measuring how
much power is drawn from the power supply, measuring the room
thermals, etc., so that real dollars can be saved by deploying
systems beyond the theoretical worst case, provided the power
and thermal limits are enforced.

So if a server approaches a budget, the platform
will notify the OS to limit its P-states and its T-states
in order to draw less power.

If that is not sufficient, the platform will ask us to take
processors off-line.  These are not processors that are otherwise idle
-- those are already saving as much power as they can --
these are processors that are fully utilized.

So power-aware scheduling is moot here; this isn't the
partially idle case, this is the fully utilized case.

If power draw continues to be too high, the platform
will simply ask us to take more processors off line.

If this dance doesn't reduce power below that required,
the platform will be shut off.

So it is sufficient to simply not schedule cpu burners
on the 'idled' processor.  Interrupts should generally
not matter -- and if they do, we'll end up simply idling
an additional processor.

> > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > efficient. To make hot removed cpu enters deep C-state is in whish list for a
> > > long time, but still not available. The acpi_processor_idle is a module, and
> > > cpuidle governor potentially can't handle offline cpu.
> > 
> > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > and I've no idea why its still there, seems like a much better candidate
> > for your efforts than this.

CONFIG_HOTPLUG_CPU has been problematic in the past.
It does more than what we need here, so we thought
a lighter-weight and lower-latency method that simply
didn't schedule to the idled cpu would suffice.
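
For comparison, that heavier hotplug path is driven from user space through
the documented sysfs interface, e.g.:

	# take cpu1 out of service, then bring it back
	echo 0 > /sys/devices/system/cpu/cpu1/online
	echo 1 > /sys/devices/system/cpu/cpu1/online

which walks the full hotplug notifier chains rather than merely emptying
a cpu of tasks.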

Personally, I don't think that CONFIG_HOTPLUG_CPU should exist,
taking processors on and off-line should be part of CONFIG_SMP.

A while back, when I selected CONFIG_HOTPLUG_CPU from ACPI && SMP,
there was a torrent of outrage that it infringed on users' rights
to save the additional 18KB of memory that CONFIG_HOTPLUG_CPU
includes and SMP does not...

We are fixing the hotplug-unplug idle loop, but there
turn out to be some issues with it related to idle
processors with interrupts disabled that don't actually
get down into the deep C-states we request :-(

So this is why you see a patch for a "halfway measure":
it does what is necessary, and does nothing more.

-Len



* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19  7:39 [PATCH]cpuset: add new API to change cpuset top group's cpus Shaohua Li
  2009-05-19  8:40 ` Peter Zijlstra
@ 2009-05-19 19:55 ` Paul Menage
  1 sibling, 0 replies; 24+ messages in thread
From: Paul Menage @ 2009-05-19 19:55 UTC
  To: Shaohua Li; +Cc: linux-kernel, linux-acpi, lenb

On Tue, May 19, 2009 at 12:39 AM, Shaohua Li <shaohua.li@intel.com> wrote:
>
> This patch adds one API to change cpuset top group's cpus. If we want to
> make one cpu idle, simply remove the cpu from cpuset top group's cpu list,
> then all tasks will be migrate to other cpus, and other tasks will not be
> migrated to this cpu again. No functional changes.

>
> +int cpuset_change_top_cpumask(const char *buf)
> +{
> +       int retval = 0;
> +       struct cpuset *cs = &top_cpuset;
> +       struct cpuset *trialcs;
> +
> +       if (!cgroup_lock_live_group(cs->css.cgroup))
> +               return -ENODEV;

top_cpuset can't possibly be dead, so a plain cgroup_lock() would be fine here.

> +
> +       trialcs = alloc_trial_cpuset(cs);
> +       if (!trialcs)
> +               return -ENOMEM;

You returned without doing a cgroup_unlock()
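
A fixed error path would unlock before returning; a minimal sketch:

	trialcs = alloc_trial_cpuset(cs);
	if (!trialcs) {
		cgroup_unlock();	/* pairs with cgroup_lock_live_group() */
		return -ENOMEM;
	}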

> +
> +       retval = update_cpumask(cs, trialcs, buf, true);

This will fail if any child cpuset is using any cpu not in the new
cpumask, since a child's cpumask must be a subset of its parent's.

So this can't work without co-ordination with userspace regarding
child cpusets. Given that, it seems simpler to do the whole thing in
userspace, or just use the existing hotplug infrastructure.
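
The subset rule lives in validate_change() in kernel/cpuset.c; paraphrased
(not the exact source), it walks the children and rejects any change that
would leave a child outside its parent:

	/* Inside validate_change(cur, trial): each of our child cpusets
	 * must remain a subset of us. */
	list_for_each_entry(cont, &cur->css.cgroup->children, sibling) {
		if (!is_cpuset_subset(cgroup_cs(cont), trial))
			return -EBUSY;
	}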

Paul


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 19:01         ` Len Brown
@ 2009-05-19 22:36           ` Peter Zijlstra
  2009-05-20 11:58             ` Andi Kleen
  2009-05-20 17:21           ` Vaidyanathan Srinivasan
  1 sibling, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-19 22:36 UTC
  To: Len Brown
  Cc: Shaohua Li, linux-kernel, linux-acpi, menage, Vaidyanathan Srinivasan

On Tue, 2009-05-19 at 15:01 -0400, Len Brown wrote:
> > ... the point is, we
> > don't need a new interface to force a cpu idle. Hotplug does that.
> >
> > Furthermore, we should not want anything outside of that, either the cpu
> > is there available for work, or its not -- halfway measures don't make
> > sense.
> > 
> > Furthermore, we already have power aware scheduling which tries to
> > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > power savings. Use it there.
> 
> Some context...

<snip default story of thermal overcommit>

> > > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > > efficient. To make hot removed cpu enters deep C-state is in whish list for a
> > > > long time, but still not available. The acpi_processor_idle is a module, and
> > > > cpuidle governor potentially can't handle offline cpu.
> > > 
> > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > and I've no idea why its still there, seems like a much better candidate
> > > for your efforts than this.
> 
> CONFIG_HOTPLUG_CPU has been problematic in the past.
> It does more than what we need here, so we thought
> a lighter-weight and lower-latency method that simply
> didn't schedule to the idled cpu would suffice.

> We are fixing the hotplug-unplug idle loop, but there
> turns out to be some issues with it related to idle
> processors with interrupts disabled that don't actually
> get down into the deep C-states we request:-(
> 
> So this is why you see a patch for a "halfway measure",
> it does what is necessary, and does nothing more.

It's broken, it's ill-defined, and it's not going to happen.

Ripping cpus out of the top cpuset might upset the cpuset configuration
and has no regard for any realtime processes. And I must take back my
earlier suggestion: hotplug is a bad solution too.

There's just too much user policy (cpuset configuration) to upset.

The IBM folks are working on a scheduler-based solution; please talk to
them.




* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 22:36           ` Peter Zijlstra
@ 2009-05-20 11:58             ` Andi Kleen
  2009-05-20 12:17               ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Andi Kleen @ 2009-05-20 11:58 UTC
  To: Peter Zijlstra
  Cc: Len Brown, Shaohua Li, linux-kernel, linux-acpi, menage,
	Vaidyanathan Srinivasan

Peter Zijlstra <peterz@infradead.org> writes:

Peter, in general the discussion would be much more fruitful
if you explained your reasoning more verbosely. I can only guess
what your rationales are from your half-sentence
pronouncements.

> and has no regards for any realtime processes.

You're saying this should not be done if any realtime processes are
currently bound to a CPU that is to be temporarily removed?

That sounds reasonable, and I'm sure it could be implemented
with the original patch.

> And I must take back my
> earlier suggestion, hotplug is a bad solution too.
>
> There's just too much user policy (cpuset configuration) to upset.

Could you explain that, please? How does changing the top-level
cpuset affect other cpusets?

> The IBM folks are working on a scheduler based solution, please talk to
> them.

I don't claim to fully understand the scheduler, but naively, since
cpusets can already do this, adding another mechanism for it that
needs to be checked in fast paths would seem somewhat redundant?

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 11:58             ` Andi Kleen
@ 2009-05-20 12:17               ` Peter Zijlstra
  2009-05-20 13:13                 ` Andi Kleen
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-20 12:17 UTC
  To: Andi Kleen
  Cc: Len Brown, Shaohua Li, linux-kernel, linux-acpi, menage,
	Vaidyanathan Srinivasan

On Wed, 2009-05-20 at 13:58 +0200, Andi Kleen wrote:

> Could you explain that please? How does changing the top level
> cpuset affect other cpu sets?


Suppose you have 8 cpus and created 3 cpusets:

 A: cpu0 - system administration stuff
 B: cpu1-5 - generic computational stuff
 C: cpu6-7 - latency critical stuff

Each such set is made a load-balance domain (in other words, load
balancing in the top-level set is disabled).

Now, suppose someone thinks it's a good idea to remove cpu0 because the
machine is running against some thermal limit -- what will all the
administration stuff (including sshd) do?
Same goes for the latency critical stuff.

You really want to start shrinking the generic computational capacity
first.

The thing is, you cannot simply rip cpus out from under a system; people
might rely on them being there and have policy attached to them -- esp.
people touching cpusets should know that a machine isn't configured
homogeneously and that not just any odd cpu will do.
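
Such a partition is typically built through the cpuset filesystem, roughly
like this (illustrative; file names follow the cpuset documentation of
this era):

	mount -t cgroup -o cpuset cpuset /dev/cpuset
	cd /dev/cpuset
	echo 0 > sched_load_balance	# no balancing across the whole box
	mkdir A B C
	echo 0   > A/cpus; echo 0 > A/mems
	echo 1-5 > B/cpus; echo 0 > B/mems
	echo 6-7 > C/cpus; echo 0 > C/mems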




* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 12:17               ` Peter Zijlstra
@ 2009-05-20 13:13                 ` Andi Kleen
  2009-05-20 13:41                   ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Andi Kleen @ 2009-05-20 13:13 UTC
  To: Peter Zijlstra
  Cc: Andi Kleen, Len Brown, Shaohua Li, linux-kernel, linux-acpi,
	menage, Vaidyanathan Srinivasan

Thanks for the explanation.

My naive reaction would be to fail if the socket to be taken out
is the only member of some cpuset. Or maybe break affinities in this case.

> You really want to start shrinking the generic computational capacity
> first.

One general issue to remember is that if you don't react to the platform
hint, the platform will likely force a lower P-state on you to not exceed
the thermal limits, making everyone slower.

(This will likely also not make your real-time processes happy.)

So it's a bit more than a hint; it's more like a command, "or else".

So it's a good idea to react, or at least make a reasonable attempt
to react.

> The thing is, you cannot simply rip cpus out from under a system, people
> might rely on them being there and have policy attached to them -- esp.
> people touching cpusets should know that a machine isn't configured
> homogeneous and any odd cpu will do.

Ok, so do you think it's possible to figure out, based on the cpuset
graph / realtime runqueue, whether a socket can be taken out?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 13:13                 ` Andi Kleen
@ 2009-05-20 13:41                   ` Peter Zijlstra
  2009-05-20 14:45                     ` Andi Kleen
  2009-05-20 17:36                     ` Vaidyanathan Srinivasan
  0 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2009-05-20 13:41 UTC
  To: Andi Kleen
  Cc: Len Brown, Shaohua Li, linux-kernel, linux-acpi, menage,
	Vaidyanathan Srinivasan

On Wed, 2009-05-20 at 15:13 +0200, Andi Kleen wrote:
> Thanks for the explanation.
> 
> My naive reaction would be to fail if the socket to be taken out
> is the only member of some cpuset. Or maybe break affinities in this case.

Right, breaking affinities would go against the policy of the admin; I'm
not sure we'd want to go there. We could start generating messages about
how we're in thermal trouble and the given configuration is obstructing
countermeasures, etc.

Currently hot-unplug does break affinities, but that's an explicit
action by the admin himself, so he gets what he asks for (and we do
generate complaints in syslog about it).

[ Same scenario for the HPC guys who affinity fix all their threads to
  specific cpus, there's really nothing you can do there. Then again
  such folks generally run their machines at 100% so they'd better
  be able to deal with their thermal peak capacity anyway. ]

> > You really want to start shrinking the generic computational capacity
> > first.
> 
> One general issue to remember that if you don't react to the platform hint 
> the platform will likely force a lower p-state on you to not exceed
> the thermal limits, making everyone slower. 
> 
> (this will likely also not make your real time process happy)

Quite.

> So it's a bit more than a hint; it's more like a command "or else"
> 
> So it's a good idea to react or at least make at least a reasonable attempt 
> to react.

Sure, does the thing give more than a 'react now, or else' impulse?
That is, can we see it coming, or will we have to deal with it when
we're there?

The latter also has the problem that you have to react very quickly.

> > The thing is, you cannot simply rip cpus out from under a system, people
> > might rely on them being there and have policy attached to them -- esp.
> > people touching cpusets should know that a machine isn't configured
> > homogeneous and any odd cpu will do.
> 
> Ok, so do you think it's possible to figure out based on the cpuset
> graph / real time runqueue if a socket can be taken out? 

Right, so all of this depends on a number of things: how frequently and
how fast would these situations occur?

I would think they'd be rare events; otherwise you really messed up your
infrastructure. I also think reaction times should be in the seconds;
otherwise you're cutting it way too close.


The work IBM has been doing is centered around overloading neighbouring
packages in order to keep some idle. The overload is exposed as a
percentage.

This works within scheduling domains, so if you carve your machine up into
tiny (<= 1 package) domains it's impossible to do anything (corner case;
we could send cries for help syslog's way).

I was hoping we could control the situation with that. But for that to
work we need some gradual information in order to make that
thermal<->overload feedback work.


A single "idle a core now (< 'n' sec) or die" isn't really helpful.

[ figuring out how to deal with RT tasks and the like is still open,
  the problem with SCHED_FIFO/RR is that such tasks don't give
  utilization numbers, so we'll have to guesstimate them based on
  historic behaviour.  SCHED_EDF or similar future realtime bits
  would be much easier to deal with in this case ]


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 13:41                   ` Peter Zijlstra
@ 2009-05-20 14:45                     ` Andi Kleen
  2009-05-20 17:36                     ` Vaidyanathan Srinivasan
  1 sibling, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2009-05-20 14:45 UTC
  To: Peter Zijlstra
  Cc: Andi Kleen, Len Brown, Shaohua Li, linux-kernel, linux-acpi,
	menage, Vaidyanathan Srinivasan

On Wed, May 20, 2009 at 03:41:55PM +0200, Peter Zijlstra wrote:
> On Wed, 2009-05-20 at 15:13 +0200, Andi Kleen wrote:
> > Thanks for the explanation.
> > 
> > My naive reaction would be to fail if the socket to be taken out
> > is the only member of some cpuset. Or maybe break affinities in this case.
> 
> Right, breaking affinities would go against the policy of the admin, I'm
> not sure we'd want to go there. 

> We could start generating msgs about how
> we're in thermal trouble and the given configuration is obstructing
> counter measures etc..

Makes sense.

> 
> Currently hot-unplug does break affinities, but that's an explicit
> action by the admin himself, so he gets what he asks for (and we do

I have some code which can do it implicitly too, in mcelog (not yet out).
Basically the CPU can detect when its caches have a problem, and the reaction
is then to offline the affected CPUs. But that's a very obscure case,
and the alternative is to die.

> generate complaints in syslog about it).

One possible alternative would also be "weak breaking", as in remembering
the old affinities and reinstating them once the CPU comes back online.
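
A rough sketch of that idea (hypothetical; the saved_cpus_allowed field
and both helpers are assumptions, not existing kernel code):

	/* Hypothetical "weak break": remember what the task asked for so
	 * the affinity can be reinstated when the cpu comes back. */
	static int weak_break_affinity(struct task_struct *p, int dying_cpu)
	{
		cpumask_var_t new;

		if (!cpumask_test_cpu(dying_cpu, &p->cpus_allowed))
			return 0;
		if (!alloc_cpumask_var(&new, GFP_KERNEL))
			return -ENOMEM;

		/* saved_cpus_allowed would be a new task_struct field */
		cpumask_copy(&p->saved_cpus_allowed, &p->cpus_allowed);

		cpumask_copy(new, &p->cpus_allowed);
		cpumask_clear_cpu(dying_cpu, new);
		set_cpus_allowed_ptr(p, new);	/* migrates p off the cpu */
		free_cpumask_var(new);
		return 0;
	}

	static void weak_restore_affinity(struct task_struct *p, int cpu)
	{
		if (cpumask_test_cpu(cpu, &p->saved_cpus_allowed))
			set_cpus_allowed_ptr(p, &p->saved_cpus_allowed);
	}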

> [ Same scenario for the HPC guys who affinity fix all their threads to
>   specific cpus, there's really nothing you can do there. Then again
>   such folks generally run their machines at 100% so they'd better
>   be able to deal with their thermal peak capacity anyway. ]

Yes. Same for real time. These guys are really not expected to use
these advanced power management features. 

> > So it's a bit more than a hint; it's more like a command "or else"
> > 
> > So it's a good idea to react or at least make at least a reasonable attempt 
> > to react.
> 
> Sure, does the thing give more than a: 'react now, or else' impulse?
> That is, can we see it coming, or will we have to deal with it when
> we're there?
> 
> The latter also has the problem that you have to react very quickly.

My understanding is that it is a quite strong hint: "do the best you can".
So yes, doing it quickly would be good.

> 
> > > The thing is, you cannot simply rip cpus out from under a system, people
> > > might rely on them being there and have policy attached to them -- esp.
> > > people touching cpusets should know that a machine isn't configured
> > > homogeneous and any odd cpu will do.
> > 
> > Ok, so do you think it's possible to figure out based on the cpuset
> > graph / real time runqueue if a socket can be taken out? 
> 
> Right, so all of this depends on a number of things, how frequent and
> how fast would these situations occur?
> 
> I would think they'd be rare events, otherwise you really messed up your

My assumption too.

> infrastructure. I also think reaction times should be in the seconds,
> otherwise you're cutting it way to close.

Yep.

> I was hoping we could control the situation with that. But for that to
> work we need some gradual information in order to make that
> thermal<->overload feedback work.
> 
> 
> A single: idle a core now (< 'n' sec) or die, isn't really helpful.

That's what you get, unfortunately.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 19:01         ` Len Brown
  2009-05-19 22:36           ` Peter Zijlstra
@ 2009-05-20 17:21           ` Vaidyanathan Srinivasan
  1 sibling, 0 replies; 24+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-05-20 17:21 UTC
  To: Len Brown; +Cc: Peter Zijlstra, Shaohua Li, linux-kernel, linux-acpi, menage

* Len Brown <lenb@kernel.org> [2009-05-19 15:01:46]:

> > ... the point is, we
> > don't need a new interface to force a cpu idle. Hotplug does that.
> >
> > Furthermore, we should not want anything outside of that, either the cpu
> > is there available for work, or its not -- halfway measures don't make
> > sense.
> > 
> > Furthermore, we already have power aware scheduling which tries to
> > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > power savings. Use it there.
> 
> Some context...
> 
> In the past, server room power and thermal issues were handled
> either by spending too much money to provision power and
> thermals for theoretical worst case, or by abruptly shutting off
> servers when hard limits were reached.
> 
> Going forward, platforms are getting smarter, measuring how
> much power is drawn from the power supply, measuring the room
> thermals etc. so that real dollars can be saved by deploying
> systems that exceed the theoretical worst case if the power
> and thermal limits are enforced.
> 
> So if server approaches a budget, the platform
> will notify the OS to limit its P-states, and limit its T-states
> in order to draw less power.
> 
> If that is not sufficient, the platform will ask us to take
> processors off-line.  These are not processors that are otherwise idle
> -- those are already saving as much power as they can --
> these are processors that are fully utilized.
> 
> So power-aware scheduling is moot here, this isn't the
> partially idle case, this is the fully utilized case.

Hi Len,

Over and above power-aware scheduling, we have been exploring the
possibility of forcefully idling cpus for power savings.  This is mostly
useful in the thermal case that you have mentioned, and also to provide
fine-grained power vs. performance trade-offs.  Creating idle time and
consolidating it efficiently in order to evacuate cores and packages
provides a framework to exploit C-states, apart from the P-states and
T-states that you have mentioned above.  Adding C-state control to save
power and heat may let the system execute more instructions within
a given power/thermal constraint.

Reference: http://lkml.org/lkml/2009/5/13/173
 
> If power draw continues to be too high, the platform
> will simply ask us to take more processors off line.
> 
> If this dance doesn't reduce power below that required,
> the platform will be shut off.
> 
> So it is sufficient to simply not schedule cpu burners
> on the 'idled' processor.  Interrupts should generally
> not matter -- and if they do, we'll end up simply idling
> an additional processor.

The requirements and use cases are clear.

> > > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > > efficient. To make hot removed cpu enters deep C-state is in whish list for a
> > > > long time, but still not available. The acpi_processor_idle is a module, and
> > > > cpuidle governor potentially can't handle offline cpu.
> > > 
> > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > and I've no idea why its still there, seems like a much better candidate
> > > for your efforts than this.
> 
> CONFIG_HOTPLUG_CPU has been problematic in the past.
> It does more than what we need here, so we thought
> a lighter-weight and lower-latency method that simply
> didn't schedule to the idled cpu would suffice.
> 
> Personally, I don't think that CONFIG_HOTPLUG_CPU should exist,
> taking processors on and off-line should be part of CONFIG_SMP.
> 
> A while back when I selected CONFIG_HOTPLUG_CPU from ACPI && SMP,
> there was a torrent of outrage that it infringed on user's right's
> to save that additional 18KB of memory that CONFIG_HOTPLUG_CPU
> includes that SMP does not...
> 
> We are fixing the hotplug-unplug idle loop, but there
> turns out to be some issues with it related to idle
> processors with interrupts disabled that don't actually
> get down into the deep C-states we request:-(

Fixing the hot-unplug idle loop will help us use the cpu-hotplug
infrastructure for many other purposes, like power/thermal management.
Do you think there could be some workaround/solution for this in
the short term?

> So this is why you see a patch for a "halfway measure",
> it does what is necessary, and does nothing more.

Peter had detailed comments on this aspect.

--Vaidy



* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 13:41                   ` Peter Zijlstra
  2009-05-20 14:45                     ` Andi Kleen
@ 2009-05-20 17:36                     ` Vaidyanathan Srinivasan
  2009-05-21  1:22                       ` Shaohua Li
  1 sibling, 1 reply; 24+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-05-20 17:36 UTC
  To: Peter Zijlstra
  Cc: Andi Kleen, Len Brown, Shaohua Li, linux-kernel, linux-acpi, menage

* Peter Zijlstra <peterz@infradead.org> [2009-05-20 15:41:55]:

> On Wed, 2009-05-20 at 15:13 +0200, Andi Kleen wrote:
> > Thanks for the explanation.
> > 
> > My naive reaction would be to fail if the socket to be taken out
> > is the only member of some cpuset. Or maybe break affinities in this case.
> 
> Right, breaking affinities would go against the policy of the admin, I'm
> not sure we'd want to go there. We could start generating msgs about how
> we're in thermal trouble and the given configuration is obstructing
> counter measures etc..
> 
> Currently hot-unplug does break affinities, but that's an explicit
> action by the admin himself, so he gets what he asks for (and we do
> generate complaints in syslog about it).
> 
> [ Same scenario for the HPC guys who affinity fix all their threads to
>   specific cpus, there's really nothing you can do there. Then again
>   such folks generally run their machines at 100% so they'd better
>   be able to deal with their thermal peak capacity anyway. ]
> 
> > > You really want to start shrinking the generic computational capacity
> > > first.
> > 
> > One general issue to remember that if you don't react to the platform hint 
> > the platform will likely force a lower p-state on you to not exceed
> > the thermal limits, making everyone slower. 
> > 
> > (this will likely also not make your real time process happy)
> 
> Quite.
> 
> > So it's a bit more than a hint; it's more like a command "or else"
> > 
> > So it's a good idea to react or at least make at least a reasonable attempt 
> > to react.
> 
> Sure, does the thing give more than a: 'react now, or else' impulse?
> That is, can we see it coming, or will we have to deal with it when
> we're there?
> 
> The latter also has the problem that you have to react very quickly.
> 
> > > The thing is, you cannot simply rip cpus out from under a system, people
> > > might rely on them being there and have policy attached to them -- esp.
> > > people touching cpusets should know that a machine isn't configured
> > > homogeneous and any odd cpu will do.
> > 
> > Ok, so do you think it's possible to figure out based on the cpuset
> > graph / real time runqueue if a socket can be taken out? 
> 
> Right, so all of this depends on a number of things, how frequent and
> how fast would these situations occur?
> 
> I would think they'd be rare events, otherwise you really messed up your
> infrastructure. I also think reaction times should be in the seconds,
> otherwise you're cutting it way to close.
> 
> 
> The work IBM has been doing is centered around overloading neighbouring
> packages in order to keep some idle. The overload is exposed as a
> percentage.
> 
> This works within scheduling domains, so if you carve your machine up in
> tiny (<= 1 package) domains its impossible to do anything (corner case,
> we could send cries for help syslog's way).
> 
> I was hoping we could control the situation with that. But for that to
> work we need some gradual information in order to make that
> thermal<->overload feedback work.

The advantage of this method is that it reduces load on one package
without targeting a particular CPU.  This is less restrictive and allows
the load balancer to work out the details.  Keeping a core idle on
average (over a time interval) is good enough to reduce power and
heat.

Here we need not touch the RT jobs or break user space policies.  We
effectively reduce capacity and let the load balancer have the
flexibility of figuring out which CPU should not be scheduled now.

That said, this is not useful for the 'cpu cache error' case, in which
case you will have to cpu-hot-unplug anyway.  You don't want any
interrupts/timers to land on an unreliable CPU.

Overloading the powersave load balancer to assume reduced capacity on
some packages while overloading others is the core idea.  The RFC
patches still need a lot of work to meet the required functionality.

--Vaidy


* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-20 17:36                     ` Vaidyanathan Srinivasan
@ 2009-05-21  1:22                       ` Shaohua Li
  2009-05-21  3:20                         ` Vaidyanathan Srinivasan
  0 siblings, 1 reply; 24+ messages in thread
From: Shaohua Li @ 2009-05-21  1:22 UTC
  To: Vaidyanathan Srinivasan
  Cc: Peter Zijlstra, Andi Kleen, Len Brown, linux-kernel, linux-acpi, menage

On Thu, May 21, 2009 at 01:36:35AM +0800, Vaidyanathan Srinivasan wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-05-20 15:41:55]:
> 
> > On Wed, 2009-05-20 at 15:13 +0200, Andi Kleen wrote:
> > > Thanks for the explanation.
> > > 
> > > My naive reaction would be to fail if the socket to be taken out
> > > is the only member of some cpuset. Or maybe break affinities in this case.
> > 
> > Right, breaking affinities would go against the policy of the admin, I'm
> > not sure we'd want to go there. We could start generating msgs about how
> > we're in thermal trouble and the given configuration is obstructing
> > counter measures etc..
> > 
> > Currently hot-unplug does break affinities, but that's an explicit
> > action by the admin himself, so he gets what he asks for (and we do
> > generate complaints in syslog about it).
> > 
> > [ Same scenario for the HPC guys who affinity fix all their threads to
> >   specific cpus, there's really nothing you can do there. Then again
> >   such folks generally run their machines at 100% so they'd better
> >   be able to deal with their thermal peak capacity anyway. ]
> > 
> > > > You really want to start shrinking the generic computational capacity
> > > > first.
> > > 
> > > One general issue to remember that if you don't react to the platform hint 
> > > the platform will likely force a lower p-state on you to not exceed
> > > the thermal limits, making everyone slower. 
> > > 
> > > (this will likely also not make your real time process happy)
> > 
> > Quite.
> > 
> > > So it's a bit more than a hint; it's more like a command "or else"
> > > 
> > > So it's a good idea to react or at least make at least a reasonable attempt 
> > > to react.
> > 
> > Sure, does the thing give more than a: 'react now, or else' impulse?
> > That is, can we see it coming, or will we have to deal with it when
> > we're there?
> > 
> > The latter also has the problem that you have to react very quickly.
> > 
> > > > The thing is, you cannot simply rip cpus out from under a system, people
> > > > might rely on them being there and have policy attached to them -- esp.
> > > > people touching cpusets should know that a machine isn't configured
> > > > homogeneous and any odd cpu will do.
> > > 
> > > Ok, so do you think it's possible to figure out based on the cpuset
> > > graph / real time runqueue if a socket can be taken out? 
> > 
> > Right, so all of this depends on a number of things, how frequent and
> > how fast would these situations occur?
> > 
> > I would think they'd be rare events, otherwise you really messed up your
> > infrastructure. I also think reaction times should be in the seconds,
> > otherwise you're cutting it way too close.
> > 
> > 
> > The work IBM has been doing is centered around overloading neighbouring
> > packages in order to keep some idle. The overload is exposed as a
> > percentage.
> > 
> > This works within scheduling domains, so if you carve your machine up in
> > tiny (<= 1 package) domains it's impossible to do anything (corner case,
> > we could send cries for help syslog's way).
> > 
> > I was hoping we could control the situation with that. But for that to
> > work we need some gradual information in order to make that
> > thermal<->overload feedback work.
> 
> The advantage of this method is to reduce load on one package and not
> target a particular CPU.  This is less restrictive and can allow the
> load balancer to work out the details.  Keeping a core idle on an
> average (over a time interval) is good enough to reduce the power and
> heat.  
> 
> Here we need not touch the RT jobs or break user space policies.  We
> effectively reduce capacity and let the loadbalancer have the
> flexibility of figuring out which CPU should not be scheduled now.
> 
> That said, this is not useful for a 'cpu cache error' case, in which
> case you will have to cpu-hot-unplug anyway.  You don't want any
> interrupts/timers to land on an unreliable CPU.
> 
> Overloading the powersave load balancer to assume reduced capacity on
> some of the packages while overloading some other packages is the
> core idea.  The RFC patches still need a lot of work to provide the
> required functionality.
So the main concern is breaking user policy, but it appears any
approach (cpu hotplug/cpuset) will break user policy (affinity).  With
my limited scheduler knowledge, I wonder how the scheduler approach
can avoid this.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-21  1:22                       ` Shaohua Li
@ 2009-05-21  3:20                         ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 24+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-05-21  3:20 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Peter Zijlstra, Andi Kleen, Len Brown, linux-kernel, linux-acpi, menage

* Shaohua Li <shaohua.li@intel.com> [2009-05-21 09:22:13]:

> On Thu, May 21, 2009 at 01:36:35AM +0800, Vaidyanathan Srinivasan wrote:
> > * Peter Zijlstra <peterz@infradead.org> [2009-05-20 15:41:55]:
> > 
> > > On Wed, 2009-05-20 at 15:13 +0200, Andi Kleen wrote:
> > > > Thanks for the explanation.
> > > > 
> > > > My naive reaction would be to fail if the socket to be taken out
> > > > is the only member of some cpuset. Or maybe break affinities in this case.
> > > 
> > > Right, breaking affinities would go against the policy of the admin, I'm
> > > not sure we'd want to go there. We could start generating msgs about how
> > > we're in thermal trouble and the given configuration is obstructing
> > > counter measures etc..
> > > 
> > > Currently hot-unplug does break affinities, but that's an explicit
> > > action by the admin himself, so he gets what he asks for (and we do
> > > generate complaints in syslog about it).
> > > 
> > > [ Same scenario for the HPC guys who affinity fix all their threads to
> > >   specific cpus, there's really nothing you can do there. Then again
> > >   such folks generally run their machines at 100% so they'd better
> > >   be able to deal with their thermal peak capacity anyway. ]
> > > 
> > > > > You really want to start shrinking the generic computational capacity
> > > > > first.
> > > > 
> > > > One general issue to remember is that if you don't react to the platform hint
> > > > the platform will likely force a lower p-state on you to not exceed
> > > > the thermal limits, making everyone slower. 
> > > > 
> > > > (this will likely also not make your real time process happy)
> > > 
> > > Quite.
> > > 
> > > > So it's a bit more than a hint; it's more like a command "or else"
> > > > 
> > > > So it's a good idea to react, or at least make a reasonable attempt
> > > > to react.
> > > 
> > > Sure, does the thing give more than a: 'react now, or else' impulse?
> > > That is, can we see it coming, or will we have to deal with it when
> > > we're there?
> > > 
> > > The latter also has the problem that you have to react very quickly.
> > > 
> > > > > The thing is, you cannot simply rip cpus out from under a system, people
> > > > > might rely on them being there and have policy attached to them -- esp.
> > > > > people touching cpusets should know that a machine isn't configured
> > > > > homogeneous and any odd cpu will do.
> > > > 
> > > > Ok, so do you think it's possible to figure out based on the cpuset
> > > > graph / real time runqueue if a socket can be taken out? 
> > > 
> > > Right, so all of this depends on a number of things, how frequent and
> > > how fast would these situations occur?
> > > 
> > > I would think they'd be rare events, otherwise you really messed up your
> > > infrastructure. I also think reaction times should be in the seconds,
> > > otherwise you're cutting it way too close.
> > > 
> > > 
> > > The work IBM has been doing is centered around overloading neighbouring
> > > packages in order to keep some idle. The overload is exposed as a
> > > percentage.
> > > 
> > > This works within scheduling domains, so if you carve your machine up in
> > > tiny (<= 1 package) domains it's impossible to do anything (corner case,
> > > we could send cries for help syslog's way).
> > > 
> > > I was hoping we could control the situation with that. But for that to
> > > work we need some gradual information in order to make that
> > > thermal<->overload feedback work.
> > 
> > The advantage of this method is to reduce load on one package and not
> > target a particular CPU.  This is less restrictive and can allow the
> > load balancer to work out the details.  Keeping a core idle on an
> > average (over a time interval) is good enough to reduce the power and
> > heat.  
> > 
> > Here we need not touch the RT jobs or break user space policies.  We
> > effectively reduce capacity and let the loadbalancer have the
> > flexibility of figuring out which CPU should not be scheduled now.
> > 
> > That said, this is not useful for a 'cpu cache error' case, in which
> > case you will have to cpu-hot-unplug anyway.  You don't want any
> > interrupts/timers to land on an unreliable CPU.
> > 
> > Overloading the powersave load balancer to assume reduced capacity on
> > some of the packages while overloading some other packages is the
> > core idea.  The RFC patches still need a lot of work to provide the
> > required functionality.
> So the main concern is breaking user policy, but it appears any
> approach (cpu hotplug/cpuset) will break user policy (affinity).  With
> my limited scheduler knowledge, I wonder how the scheduler approach
> can avoid this.

In the scheduler load balancer approach we have a notion like "run
3 tasks on a quad core" without specifying which cpu to evacuate.  So
it is possible to respect task affinity by throttling tasks so that
not all cores run simultaneously.  Even if the system is completely
loaded, we can use all CPUs but avoid one core at any given time.

The input knob is a system-wide capacity percentage that can be
reduced, and this reduced capacity, in multiples of cores, can be
uniformly spread across the system.
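
To make the knob concrete, here is a minimal sketch in C (hypothetical
helper and names, not code from the RFC patches) of how a system-wide
capacity percentage could be mapped to a number of cores to keep idle:

	#include <linux/cpumask.h>

	/*
	 * Hypothetical helper (not in the RFC patches): map a
	 * system-wide capacity percentage to the number of whole
	 * cores that should be kept idle.
	 */
	static unsigned int cores_to_evacuate(unsigned int capacity_pct)
	{
		unsigned int online = num_online_cpus();

		if (capacity_pct >= 100)
			return 0;
		/* Round down so only whole cores are idled. */
		return online - (online * capacity_pct) / 100;
	}

With capacity_pct = 75 on a quad core this yields one core to
evacuate, matching the "run 3 tasks on a quad core" notion above.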

This is a possibility with the scheduler approach, but the current set
of RFC patches is not yet there and we do have implementation
challenges.

By artificially creating overload (or under-capacity) situations, the
load balancer can avoid filling up a sched domain completely.  This
works at the CPU and NODE level sched domains and allows the
MC/SIBLING level domains to balance work among the cores/threads.
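
To illustrate the under-capacity idea (again only a sketch with
assumed names, not the RFC implementation): if the balancer scales
down the capacity it reports for a CPU or NODE level group, the group
appears "full" earlier and further load spills over to other groups:

	/*
	 * Hypothetical sketch: report an artificially reduced
	 * capacity for a group so the load balancer stops filling
	 * that group up completely.
	 */
	static unsigned long effective_capacity(unsigned long real_capacity,
						unsigned int capacity_pct)
	{
		return (real_capacity * capacity_pct) / 100;
	}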

This is only a possibility and we do have implementation challenges
that need lots of work.

--Vaidy


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-19 13:37         ` Vaidyanathan Srinivasan
@ 2009-05-28  2:34           ` Len Brown
  2009-05-28  7:44             ` Vaidyanathan Srinivasan
  0 siblings, 1 reply; 24+ messages in thread
From: Len Brown @ 2009-05-28  2:34 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Peter Zijlstra, Shaohua Li, linux-kernel, linux-acpi, menage

On Tue, 19 May 2009, Vaidyanathan Srinivasan wrote:

> We tried similar approaches to create idle time for power savings, but
> cpu hotplug interface seems to be a clean choice.  There could be
> issues with the interface, we should fix it.  Is there any other
> reason why cpuhotplug is 'ugly' other than its performance (speed)?
> 
> I have tried a few load balancer hacks to evacuate cores but not a solid
> design yet.  It has its advantages but still needs more work.
> 
> http://lkml.org/lkml/2009/5/13/173

Thanks for the pointer.
I agree with Andi; please avoid the term "throttling", since
it has been used for ages to refer to processor clock throttling --
which is actually significantly less effective at saving
energy than what you are trying to do.  (Note the word "energy"
here, where the word "power" is incorrectly used in the thread above.)

"core evacuation" is a better description, I agree, though I wonder
why you don't simply call it "forced idling", since that is what
you are trying to do.

> > Furthermore, we should not want anything outside of that, either the cpu
> > is there available for work, or it's not -- halfway measures don't make
> > sense.
> > 
> > Furthermore, we already have power aware scheduling which tries to
> > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > power savings. Use it there.
> 
> Power aware scheduling can optimally accumulate idle times.  Framework
> to create idle time to force idle cores is good and useful for power
> savings.  Other than the speed of online/offline I do not know of any
> other major issue for using cpu hotplug for this purpose.

It sounds like you want to use this technique more often
than I had in mind.  You are thinking of a warm rack, which
may stay warm all day long.  I am thinking of a rack which
has a theoretical power draw higher than the provisioned
electrical supply.  As there is a huge difference between
actual and theoretical power draw, this saves many dollars.

So what you're looking at is more frequent use than we need,
and that is fine -- as long as you exhaust P-states first --
since forcing cores to be idle has a more severe performance
impact than running at a deeper P-state.

I didn't see P-states addressed in your thread.

> > > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > > efficient. To make hot removed cpu enter deep C-state is in wish list for a
> > > > long time, but still not available. The acpi_processor_idle is a module, and
> > > > cpuidle governor potentially can't handle offline cpu.
> > > 
> > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > and I've no idea why its still there, seems like a much better candidate
> > > for your efforts than this.
> 
> I agree with Peter.  We need to make cpu hotplug save power first and
> later improve upon its performance.

We do have a patch to fix the offline idle loop to save power.
We can use hotplug in the short term until something better comes along.
Yes, it will break cpusets, just like Shaohua's original patch broke them 
-- and that will make using it inappropriate for some customers.

While I think this mechanism is important, I don't think that a large %
of customers will deploy it.  I think the ones that deploy it will do so
to save money on electrical provisioning, not on pushing the limits
of their air conditioner.  So I don't expect its performance requirement
to be extremely severe.  I don't think it will justify tuning the
performance of cpu-hotplug, which I don't think was ever intended
to be in the performance path.

-Len

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH]cpuset: add new API to change cpuset top group's cpus
  2009-05-28  2:34           ` Len Brown
@ 2009-05-28  7:44             ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 24+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-05-28  7:44 UTC (permalink / raw)
  To: Len Brown; +Cc: Peter Zijlstra, Shaohua Li, linux-kernel, linux-acpi, menage

* Len Brown <lenb@kernel.org> [2009-05-27 22:34:38]:

> On Tue, 19 May 2009, Vaidyanathan Srinivasan wrote:
> 
> > We tried similar approaches to create idle time for power savings, but
> > cpu hotplug interface seems to be a clean choice.  There could be
> > issues with the interface, we should fix it.  Is there any other
> > reason why cpuhotplug is 'ugly' other than its performance (speed)?
> > 
> > I have tried a few load balancer hacks to evacuate cores but not a solid
> > design yet.  It has its advantages but still needs more work.
> > 
> > http://lkml.org/lkml/2009/5/13/173
> 
> Thanks for the pointer.
> I agree with Andi, please avoid the term "throttling", since
> it has been used for ages to refer processor clock throttling --
> which is actually significantly less effective at saving
> energy than what you are trying to do.  (not the word "energy"
> here, where the word "power" is incorrectly used in the thread above)

Yes, you are right.  "Throttling" refers to hardware methods that
slow things down, and it is less effective at saving energy: it
reduces average power but makes the workload run much longer and
consume more energy.
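
A back-of-the-envelope example with invented numbers: a job drawing
100 W for 100 s consumes 10 kJ; throttled to 60 W it might need 250 s
and thus consume 15 kJ.  Average power drops, but total energy
(power x time) rises.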
 
> "core evacuation" is a better description, I agree, though I wonder
> why you don't simply call it "forced idling", since that is what
> you are trying to do.

Yes, core evacuation is what I propose; to make the description
clear, what we are actually doing is starving or throttling tasks in
software to create idle time.
 
> > > Furthermore, we should not want anything outside of that, either the cpu
> > > is there available for work, or it's not -- halfway measures don't make
> > > sense.
> > > 
> > > Furthermore, we already have power aware scheduling which tries to
> > > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > > power savings. Use it there.
> > 
> > Power aware scheduling can optimally accumulate idle times.  Framework
> > to create idle time to force idle cores is good and useful for power
> > savings.  Other than the speed of online/offline I do not know of any
> > other major issue for using cpu hotplug for this purpose.
> 
> It sounds like you want to use this technique more often
> that I had in mind.  You are thinking of a warm rack, which
> may stay warm all day long.  I am thinking of a rack which
> has a theoretical power draw higher than the providioned
> electrical supply.  As there is a huge difference between
> actual and theoretical power draw, this saves many dollars.

Yes, this framework can be used more often to balance average power
consumption in systems.  Exploiting the margin between theoretical
limits and practical usage will definitely save money in a data
center.  Present generation power capping techniques and related
infrastructure are available to exploit this margin.

Core evacuation can complement this safety-limit mechanism by
providing finer grained control.

> So what you're looking at is more frequent use than we need,
> and that is fine -- as long as you exhaust P-states first --
> since forcing cores to be idle has a more severe performance
> impact than running at a deeper P-state.

Yes, that is the idea.  After getting all cores to the lowest
P-State, we can further cut power by forcing idle.  Even when not at
the lowest P-State, forced idling of complete packages may save more
power than running all cores in a large system at the lowest P-State.
This is generally not the case, but the framework can be more
flexible and provide more degrees of control.

> I didn't see P-states addressed in your thread.

P-States can be flexibly managed using the present cpufreq governors.
Ondemand, conservative or userspace can provide the required level of
control from userspace.  With the ondemand governor, idle cores will
be at the lowest P-State and in a C-State.  Independent of the
P-State, idle cores save power from the C-State, so the cpufreq
governor does not make an impact there.

In the case of busy cores, end users can decide to pick the
conservative or userspace governor before invoking core evacuation.
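
(As a usage sketch with the standard cpufreq sysfs interface: writing
"userspace" to a busy core's scaling_governor and a frequency to its
scaling_setspeed would pin that core to a chosen P-State before core
evacuation is invoked.)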

The main motivation for the core evacuation framework is to provide
another degree of control that exploits C-State based power savings,
apart from P-State manipulation (for which a good framework already
exists).

> > > > > Besides, a hot removed cpu will do a dead loop halt, which isn't power saving
> > > > > efficient. To make hot removed cpu enter deep C-state is in wish list for a
> > > > > long time, but still not available. The acpi_processor_idle is a module, and
> > > > > cpuidle governor potentially can't handle offline cpu.
> > > > 
> > > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > > and I've no idea why its still there, seems like a much better candidate
> > > > for your efforts than this.
> > 
> > I agree with Peter.  We need to make cpu hotplug save power first and
> > later improve upon its performance.
> 
> We do have a patch to fix the offline idle loop to save power.

This will definitely help the objective.  I have looked at Venki's
patch.  We certainly need that feature even outside of the current
context, where we want to hotplug faulty CPUs or set up special
system configurations in which not all cores in a package are to be
used.

> We can use hotplug in the short term until something better comes along.
> Yes, it will break cpusets, just like Shaohua's original patch broke them 
> -- and that will make using it inappropriate for some customers.

It would be good to have a solution that does not affect user policy;
otherwise that will discourage its adoption and usability.  But the
cpu-hotplug solution will work in the short term.

> While I think this mechanism is important, I don't think that a large %
> of customers will deploy it.  I think the ones that deploy it will do so
> to save money on electrical provisioning, not on pushing the limits
> of their air conditioner.  So I don't expect its performance requirement
> to be extremely severe.  I don't think it will justify tuning the
> performance of cpu-hotplug, which I don't think was ever intended
> to be in the performance path.

The motivation to improve cpu-hotplug is that we have begun to find
more uses for the framework, and if there are issues, this is a good
time to fix them.  Opportunities to improve performance should be
explored because we will have to hotplug multiple CPUs to have an
impact.  The number of cores in a system will become quite large, and
we will always have to hotplug multiple cpus to isolate a package for
hardware faults or power saving purposes.

On a system with 4096 CPUs, perhaps 128 cores may form a package or
entity that needs to go off in bulk.  We will certainly not be
dealing with onlining/offlining one or two cpus in such a system.
This is an extreme and weird example, but I hope you get the idea of
why we should try to improve the cpu-hotplug path.
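
To show why the serialized hotplug path matters at that scale, here
is a rough sketch (hypothetical helper; cpu_down() is the existing
hotplug entry point in kernel/cpu.c) of evacuating a whole package:

	#include <linux/cpu.h>
	#include <linux/cpumask.h>

	/*
	 * Hypothetical sketch: offline every cpu in a package in one
	 * pass.  Each cpu_down() call runs the full serialized hotplug
	 * sequence, which is where the performance concern comes from.
	 */
	static int evacuate_package(const struct cpumask *pkg_mask)
	{
		int cpu, ret;

		for_each_cpu(cpu, pkg_mask) {
			ret = cpu_down(cpu);
			if (ret)
				return ret;
		}
		return 0;
	}

With 128 cpus per package, anything that batches or parallelizes that
sequence pays off directly.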

Thanks for the detailed comments and suggestions.

--Vaidy

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2009-05-28  7:46 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-19  7:39 [PATCH]cpuset: add new API to change cpuset top group's cpus Shaohua Li
2009-05-19  8:40 ` Peter Zijlstra
2009-05-19  8:48   ` Shaohua Li
2009-05-19  8:56     ` Peter Zijlstra
2009-05-19  9:06       ` Shaohua Li
2009-05-19  9:31         ` Peter Zijlstra
2009-05-19 10:38       ` Peter Zijlstra
2009-05-19 13:37         ` Vaidyanathan Srinivasan
2009-05-28  2:34           ` Len Brown
2009-05-28  7:44             ` Vaidyanathan Srinivasan
2009-05-19 19:01         ` Len Brown
2009-05-19 22:36           ` Peter Zijlstra
2009-05-20 11:58             ` Andi Kleen
2009-05-20 12:17               ` Peter Zijlstra
2009-05-20 13:13                 ` Andi Kleen
2009-05-20 13:41                   ` Peter Zijlstra
2009-05-20 14:45                     ` Andi Kleen
2009-05-20 17:36                     ` Vaidyanathan Srinivasan
2009-05-21  1:22                       ` Shaohua Li
2009-05-21  3:20                         ` Vaidyanathan Srinivasan
2009-05-20 17:21           ` Vaidyanathan Srinivasan
2009-05-19 11:27   ` Andi Kleen
2009-05-19 12:01     ` Peter Zijlstra
2009-05-19 19:55 ` Paul Menage
