linux-kernel.vger.kernel.org archive mirror
* add lowpower_idle sysctl
@ 2004-03-18  0:31 Kenneth Chen
  2004-03-18  1:04 ` Andrew Morton
  2004-03-23  9:56 ` Pavel Machek
  0 siblings, 2 replies; 18+ messages in thread
From: Kenneth Chen @ 2004-03-18  0:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-ia64

On ia64, we need runtime control to manage the CPU power state in the idle
loop.  Logically, that means a sysctl entry in /proc/sys/kernel.  Even
though this sysctl entry doesn't exist today, many architectures already
have some sort of API to dynamically enable/disable the low power idle state.
Looking at linux-2.6.4, arm, arm26, cris, i386, parisc, sh, um, and x86-64
all have very much the same code.  So instead of replicating yet another
copy under arch/ia64, we propose abstracting this API into generic code
and adding a sysctl entry under /proc/sys/kernel.

Would this be useful to all architectures that want this feature?  It
would mean a lot less code duplication.

- Ken


diff -Nur linux-2.6.4/include/linux/sysctl.h linux-2.6.4.halt/include/linux/sysctl.h
--- linux-2.6.4/include/linux/sysctl.h	2004-03-10 18:55:28.000000000 -0800
+++ linux-2.6.4.halt/include/linux/sysctl.h	2004-03-17 15:33:30.000000000 -0800
@@ -131,6 +131,7 @@
 	KERN_PRINTK_RATELIMIT_BURST=61,	/* int: tune printk ratelimiting */
 	KERN_PTY=62,		/* dir: pty driver */
 	KERN_NGROUPS_MAX=63,	/* int: NGROUPS_MAX */
+	KERN_LOWPOWER_IDLE=64,	/* int: low power idle */
 };


diff -Nur linux-2.6.4/kernel/cpu.c linux-2.6.4.halt/kernel/cpu.c
--- linux-2.6.4/kernel/cpu.c	2004-03-10 18:55:44.000000000 -0800
+++ linux-2.6.4.halt/kernel/cpu.c	2004-03-17 15:36:32.000000000 -0800
@@ -64,3 +64,15 @@
 	up(&cpucontrol);
 	return ret;
 }
+
+atomic_t halt_counter;
+void enable_halt(void)
+{
+	atomic_dec(&halt_counter);
+}
+void disable_halt(void)
+{
+	atomic_inc(&halt_counter);
+}
+EXPORT_SYMBOL(enable_halt);
+EXPORT_SYMBOL(disable_halt);
diff -Nur linux-2.6.4/kernel/sysctl.c linux-2.6.4.halt/kernel/sysctl.c
--- linux-2.6.4/kernel/sysctl.c	2004-03-10 18:55:22.000000000 -0800
+++ linux-2.6.4.halt/kernel/sysctl.c	2004-03-17 15:34:52.000000000 -0800
@@ -64,6 +64,7 @@
 extern int min_free_kbytes;
 extern int printk_ratelimit_jiffies;
 extern int printk_ratelimit_burst;
+extern atomic_t halt_counter;

 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
@@ -615,6 +616,14 @@
 		.mode		= 0444,
 		.proc_handler	= &proc_dointvec,
 	},
+	{
+		.ctl_name	= KERN_LOWPOWER_IDLE,
+		.procname	= "lowpower_idle",
+		.data		= &halt_counter,
+		.maxlen		= sizeof (int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
 	{ .ctl_name = 0 }
 };





* Re: add lowpower_idle sysctl
  2004-03-18  0:31 add lowpower_idle sysctl Kenneth Chen
@ 2004-03-18  1:04 ` Andrew Morton
  2004-03-18  3:18   ` Kenneth Chen
  2004-03-18  9:42   ` Pasi Savolainen
  2004-03-23  9:56 ` Pavel Machek
  1 sibling, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2004-03-18  1:04 UTC (permalink / raw)
  To: Kenneth Chen; +Cc: linux-kernel, linux-ia64

"Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
>
> On ia64, we need runtime control to manage CPU power state in the idle
> loop. 

Can you expand on this?  Does this mean that the admin can select different
idle-loop algorithms?  If so, what alternative algorithms exist?

On x86 I'd like to be able to turn idle=poll on when performing an oprofile
run, so the numbers come out right.  Will this let me do that?


> Logically it means a sysctl entry in /proc/sys/kernel.

Yes, but the *meanings* of the different values of that sysctl need to be
defined, and documented.  If lowpower_idle=42 has a totally different
meaning on different architectures then that's unfortunate but
understandable.  But we should at least enumerate the different values and
try to get different architectures to honour `42' in the same way.

> +atomic_t halt_counter;

Needs to be initialised - atomic_t's may have spinlocks inside them or
anything else.

> +extern atomic_t halt_counter;

Needs to be in a header, not in .c

>  /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
>  static int maxolduid = 65535;
> @@ -615,6 +616,14 @@
>  		.mode		= 0444,
>  		.proc_handler	= &proc_dointvec,
>  	},
> +	{
> +		.ctl_name	= KERN_LOWPOWER_IDLE,
> +		.procname	= "lowpower_idle",
> +		.data		= &halt_counter,
> +		.maxlen		= sizeof (int),
> +		.mode		= 0644,
> +		.proc_handler	= &proc_dointvec,
> +	},
>  	{ .ctl_name = 0 }
>  };

You cannot treat an int* as an atomic_t*!

Why do we want to inc and dec a user-specified tunable anyway?  I think I
don't understand what you're trying to do with this patch. 
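
The two atomic_t objections above can be illustrated with a small userspace
sketch.  It is not kernel code: C11 atomic_int stands in for the kernel's
atomic_t, and halt_disabled_count() is a hypothetical accessor introduced
only for this example.  The point is that the counter is explicitly
initialised and that readers get a plain int copy rather than a cast pointer.

```c
#include <stdatomic.h>

/*
 * Userspace stand-in for the kernel's atomic_t (assumption: C11 atomics).
 * In the kernel the equivalent initialiser would be ATOMIC_INIT(0).
 */
static atomic_int halt_counter = 0;

/* Each caller that wants low power idle off takes a reference. */
void disable_halt(void)
{
	atomic_fetch_add(&halt_counter, 1);
}

/* ...and drops it again when done. */
void enable_halt(void)
{
	atomic_fetch_sub(&halt_counter, 1);
}

/*
 * Hypothetical accessor: hand out a plain int copy of the count (the
 * kernel analog is atomic_read()) instead of casting the atomic's
 * address to int *.
 */
int halt_disabled_count(void)
{
	return atomic_load(&halt_counter);
}
```

A handler that reports the value would call the accessor and copy the
resulting int out, so the atomic type never leaks through an int pointer.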


* RE: add lowpower_idle sysctl
  2004-03-18  1:04 ` Andrew Morton
@ 2004-03-18  3:18   ` Kenneth Chen
  2004-03-18  3:28     ` Andrew Morton
  2004-03-18  9:42   ` Pasi Savolainen
  1 sibling, 1 reply; 18+ messages in thread
From: Kenneth Chen @ 2004-03-18  3:18 UTC (permalink / raw)
  To: 'Andrew Morton'; +Cc: linux-kernel, linux-ia64

>>>>> Andrew Morton on Wednesday, March 17, 2004 5:05 PM
> >
> > On ia64, we need runtime control to manage CPU power state in the
> > idle loop.
>
> Can you expand on this?

If the architecture provides a facility for a low power state, we would
like to turn it on in the idle loop to conserve power.  However, in some
specific situations, such as performance runs, it is desirable to have it
off at least for that period of time.  A runtime control would allow the
power state to be managed dynamically to accommodate both.

That's what we are trying to do: have a sysctl control whether the CPU
goes into a low power state in the default_idle() loop.  In the generic
code, the kernel provides a mechanism to set/clear a flag, and each arch
can then test the flag before entering the low power state.
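
A minimal sketch of that split, written as plain userspace C so it stands
alone.  The names lowpower_idle_disabled, arch_cpu_halt(), arch_cpu_relax()
and idle_once() are hypothetical and not taken from any arch; the arch hooks
are stubbed with counters so the control flow can be observed.

```c
/* Generic side: one flag that a sysctl handler would set or clear. */
static int lowpower_idle_disabled;

/* Hypothetical arch hooks, stubbed with counters for illustration. */
static int halts;
static int polls;

static void arch_cpu_halt(void)
{
	halts++;	/* a real arch would enter its low power state here */
}

static void arch_cpu_relax(void)
{
	polls++;	/* a real arch would spin or pause briefly here */
}

/* Arch side: one pass of the idle loop tests the flag before halting. */
void idle_once(void)
{
	if (lowpower_idle_disabled)
		arch_cpu_relax();
	else
		arch_cpu_halt();
}
```

With this shape only the test in idle_once() is per-arch; the flag and the
code that flips it can live in generic code.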


> Does this mean that the admin can select
> different idle-loop algorithms?  If so, what alternative algorithms exist?

This patch isn't that fancy.  It would be a nice feature, but maybe as a
next step :-)


> > Logically it means a sysctl entry in /proc/sys/kernel.
> Yes, but the *meanings* of the different values of that sysctl need
> to be defined, and documented.  If lowpower_idle=42 has a totally
> different meaning on different architectures then that's unfortunate
> but understandable.  But we should at least enumerate the different
> values and try to get different architectures to honour `42' in the
> same way.

Writing to the sysctl should be a bool; reading the value can return the
number of modules that currently have low power idle disabled.  I think
the original intent was to use a ref count for enabling/disabling
(granted, we copied the code from other archs).


> Needs to be initialised - atomic_t's may have spinlocks inside them or
> anything else.
>
> Needs to be in a header, not in .c
>
> You cannot treat an int* as an atomic_t*!

My monkey work; I must not have had enough coffee today :-P

- Ken




* Re: add lowpower_idle sysctl
  2004-03-18  3:18   ` Kenneth Chen
@ 2004-03-18  3:28     ` Andrew Morton
  2004-03-18  3:40       ` Zwane Mwaikambo
  2004-03-18 21:59       ` Kenneth Chen
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2004-03-18  3:28 UTC (permalink / raw)
  To: Kenneth Chen; +Cc: linux-kernel, linux-ia64

"Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
>
>  > > Logically it means a sysctl entry in /proc/sys/kernel.
>  > Yes, but the *meanings* of the different values of that sysctl need
>  > to be defined, and documented.  If lowpower_idle=42 has a totally
>  > different meaning on different architectures then that's unfortunate
>  > but understandable.  But we should at least enumerate the different
>  > values and try to get different architectures to honour `42' in the
>  > same way.
> 
>  Writing to sysctl should be a bool, reading the value can be number of
>  module currently disabled low power idle.  I think the original intent
>  is to use ref count for enabling/disabling.  (granted, we copied the
>  code from other arch).

OK, so why not give us:

#define IDLE_HALT			0
#define IDLE_POLL			1
#define IDLE_SUPER_LOW_POWER_HALT	2

and so forth (are there any others?).

Set some system-wide integer via a sysctl and let the particular
architecture decide how best to implement the currently-selected idle mode?
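
Roughly, that could look like the sketch below.  The three IDLE_* values are
the ones listed above; everything else (enum arch_idle_action,
arch_pick_idle(), the has_deep_halt parameter) is hypothetical and only
shows one way an architecture might map the system-wide integer onto what
it can actually do, falling back to a plain halt for anything it does not
support.

```c
#define IDLE_HALT			0
#define IDLE_POLL			1
#define IDLE_SUPER_LOW_POWER_HALT	2

/* System-wide selection; a sysctl would write this integer. */
int idle_mode = IDLE_HALT;

/* Hypothetical per-arch actions, for illustration only. */
enum arch_idle_action {
	ARCH_IDLE_DO_HALT,
	ARCH_IDLE_DO_POLL,
	ARCH_IDLE_DO_DEEP_HALT,
};

/*
 * The arch decides how to honour the requested mode.  Unsupported or
 * unknown values degrade to an ordinary halt.
 */
enum arch_idle_action arch_pick_idle(int mode, int has_deep_halt)
{
	switch (mode) {
	case IDLE_POLL:
		return ARCH_IDLE_DO_POLL;
	case IDLE_SUPER_LOW_POWER_HALT:
		return has_deep_halt ? ARCH_IDLE_DO_DEEP_HALT
				     : ARCH_IDLE_DO_HALT;
	case IDLE_HALT:
	default:
		return ARCH_IDLE_DO_HALT;
	}
}
```

Whether an unknown value should fall back silently or be rejected at write
time is a policy choice this sketch does not settle.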



* Re: add lowpower_idle sysctl
  2004-03-18  3:28     ` Andrew Morton
@ 2004-03-18  3:40       ` Zwane Mwaikambo
  2004-03-18  9:05         ` Dominik Brodowski
  2004-03-18 22:59         ` Todd Poynor
  2004-03-18 21:59       ` Kenneth Chen
  1 sibling, 2 replies; 18+ messages in thread
From: Zwane Mwaikambo @ 2004-03-18  3:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Kenneth Chen, Linux Kernel, linux-ia64, CPU Freq ML

On Wed, 17 Mar 2004, Andrew Morton wrote:

> "Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
> >
> >  > > Logically it means a sysctl entry in /proc/sys/kernel.
> >  > Yes, but the *meanings* of the different values of that sysctl need
> >  > to be defined, and documented.  If lowpower_idle=42 has a totally
> >  > different meaning on different architectures then that's unfortunate
> >  > but understandable.  But we should at least enumerate the different
> >  > values and try to get different architectures to honour `42' in the
> >  > same way.
> >
> >  Writing to sysctl should be a bool, reading the value can be number of
> >  module currently disabled low power idle.  I think the original intent
> >  is to use ref count for enabling/disabling.  (granted, we copied the
> >  code from other arch).
>
> OK, so why not give us:
>
> #define IDLE_HALT			0
> #define IDLE_POLL			1
> #define IDLE_SUPER_LOW_POWER_HALT	2
>
> and so forth (are there any others?).
>
> Set some system-wide integer via a sysctl and let the particular
> architecture decide how best to implement the currently-selected idle mode?

I'm wondering whether the setting of these magic numbers can't be done
using cpufreq infrastructure.



* Re: add lowpower_idle sysctl
  2004-03-18  3:40       ` Zwane Mwaikambo
@ 2004-03-18  9:05         ` Dominik Brodowski
  2004-03-18 18:29           ` Kenneth Chen
  2004-03-18 22:59         ` Todd Poynor
  1 sibling, 1 reply; 18+ messages in thread
From: Dominik Brodowski @ 2004-03-18  9:05 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Andrew Morton, Kenneth Chen, linux-ia64, Linux Kernel, CPU Freq ML


On Wed, Mar 17, 2004 at 10:40:31PM -0500, Zwane Mwaikambo wrote:
> On Wed, 17 Mar 2004, Andrew Morton wrote:
> 
> > "Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
> > >
> > >  > > Logically it means a sysctl entry in /proc/sys/kernel.
> > >  > Yes, but the *meanings* of the different values of that sysctl need
> > >  > to be defined, and documented.  If lowpower_idle=42 has a totally
> > >  > different meaning on different architectures then that's unfortunate
> > >  > but understandable.  But we should at least enumerate the different
> > >  > values and try to get different architectures to honour `42' in the
> > >  > same way.
> > >
> > >  Writing to sysctl should be a bool, reading the value can be number of
> > >  module currently disabled low power idle.  I think the original intent
> > >  is to use ref count for enabling/disabling.  (granted, we copied the
> > >  code from other arch).

I assume ia64 does idling using the ACPI processor.c driver? If so, couldn't
writing to /proc/acpi/processor/./power be an option?

> > OK, so why not give us:
> >
> > #define IDLE_HALT			0
> > #define IDLE_POLL			1
> > #define IDLE_SUPER_LOW_POWER_HALT	2
> >
> > and so forth (are there any others?).
> >
> > Set some system-wide integer via a sysctl and let the particular
> > architecture decide how best to implement the currently-selected idle mode?
> 
> I'm wondering whether the setting of these magic numbers can't be done
> using cpufreq infrastructure.

I doubt it -- there's no ia64 cpufreq driver anyway, and cpufreq is about
frequency scaling and (sometimes) throttling, but not "idling". And
"idling" is too different an implementation anyway.

	Dominik



* Re: add lowpower_idle sysctl
  2004-03-18  1:04 ` Andrew Morton
  2004-03-18  3:18   ` Kenneth Chen
@ 2004-03-18  9:42   ` Pasi Savolainen
  1 sibling, 0 replies; 18+ messages in thread
From: Pasi Savolainen @ 2004-03-18  9:42 UTC (permalink / raw)
  To: linux-kernel

* Andrew Morton <akpm@osdl.org>:
> "Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
>>
>> On ia64, we need runtime control to manage CPU power state in the idle
>> loop. 
>
> Can you expand on this?  Does this mean that the admin can select different
> idle-loop algorithms?  If so, what alternative algorithms exist?

At least on the 760MPX chipset, when the Athlon is pushed into powersaving
mode (disconnected from the PCI bus), a great deal of graphical distortion
is introduced into a bttv-based card's picture.
I've dealt with this by rmmod'ing the amd76x_pm module (which disables
powersaving mode), but some sane API for avoiding these disturbances
would be much better.


-- 
   Psi -- <http://www.iki.fi/pasi.savolainen>



* RE: add lowpower_idle sysctl
  2004-03-18  9:05         ` Dominik Brodowski
@ 2004-03-18 18:29           ` Kenneth Chen
  0 siblings, 0 replies; 18+ messages in thread
From: Kenneth Chen @ 2004-03-18 18:29 UTC (permalink / raw)
  To: 'Dominik Brodowski'
  Cc: Andrew Morton, linux-ia64, Linux Kernel, CPU Freq ML

>>>>> Dominik Brodowski wrote on Thu, March 18, 2004 1:05 AM
> I assume ia64 does idling using the ACPI processor.c driver?

No, not really.

> If so, couldn't writing to /proc/acpi/processor/./power be
> an option?

Not all platforms have ACPI support, so going through ACPI isn't
generic enough.




* RE: add lowpower_idle sysctl
  2004-03-18  3:28     ` Andrew Morton
  2004-03-18  3:40       ` Zwane Mwaikambo
@ 2004-03-18 21:59       ` Kenneth Chen
  2004-03-18 22:35         ` Andrew Morton
  2004-03-24  9:54         ` Pavel Machek
  1 sibling, 2 replies; 18+ messages in thread
From: Kenneth Chen @ 2004-03-18 21:59 UTC (permalink / raw)
  To: 'Andrew Morton'; +Cc: linux-kernel, linux-ia64

>>>>> Andrew Morton wrote on Wed, March 17, 2004 7:28 PM
> > "Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
> >
> >  Writing to sysctl should be a bool, reading the value can be number of
> >  module currently disabled low power idle.  I think the original intent
> >  is to use ref count for enabling/disabling.  (granted, we copied the
> >  code from other arch).
>
> OK, so why not give us:
>
> #define IDLE_HALT			0
> #define IDLE_POLL			1
> #define IDLE_SUPER_LOW_POWER_HALT	2
>
> and so forth (are there any others?).
>
> Set some system-wide integer via a sysctl and let the particular
> architecture decide how best to implement the currently-selected
> idle mode?


Sounds good, thanks for the suggestion.  I just coded it up:


diff -Nur linux-2.6.4/include/linux/cpu.h linux-2.6.4.halt/include/linux/cpu.h
--- linux-2.6.4/include/linux/cpu.h	2004-03-10 18:55:23.000000000 -0800
+++ linux-2.6.4.halt/include/linux/cpu.h	2004-03-18 13:47:43.000000000 -0800
@@ -52,6 +52,12 @@

 #endif /* CONFIG_SMP */
 extern struct sysdev_class cpu_sysdev_class;
+extern int idle_mode;
+
+#define IDLE_NOOP	0
+#define IDLE_HALT	1
+#define IDLE_POLL	2
+#define IDLE_ACPI	3

 #ifdef CONFIG_HOTPLUG_CPU
 /* Stop CPUs going up and down. */
diff -Nur linux-2.6.4/include/linux/sysctl.h linux-2.6.4.halt/include/linux/sysctl.h
--- linux-2.6.4/include/linux/sysctl.h	2004-03-10 18:55:28.000000000 -0800
+++ linux-2.6.4.halt/include/linux/sysctl.h	2004-03-18 12:00:40.000000000 -0800
@@ -131,6 +131,7 @@
 	KERN_PRINTK_RATELIMIT_BURST=61,	/* int: tune printk ratelimiting */
 	KERN_PTY=62,		/* dir: pty driver */
 	KERN_NGROUPS_MAX=63,	/* int: NGROUPS_MAX */
+	KERN_IDLE_MODE=64,	/* int: arch specific cpu idle mode */
 };


diff -Nur linux-2.6.4/kernel/cpu.c linux-2.6.4.halt/kernel/cpu.c
--- linux-2.6.4/kernel/cpu.c	2004-03-10 18:55:44.000000000 -0800
+++ linux-2.6.4.halt/kernel/cpu.c	2004-03-18 13:29:28.000000000 -0800
@@ -64,3 +64,5 @@
 	up(&cpucontrol);
 	return ret;
 }
+
+int idle_mode = IDLE_HALT;
diff -Nur linux-2.6.4/kernel/sysctl.c linux-2.6.4.halt/kernel/sysctl.c
--- linux-2.6.4/kernel/sysctl.c	2004-03-10 18:55:22.000000000 -0800
+++ linux-2.6.4.halt/kernel/sysctl.c	2004-03-18 13:52:27.000000000 -0800
@@ -39,6 +39,7 @@
 #include <linux/initrd.h>
 #include <linux/times.h>
 #include <linux/limits.h>
+#include <linux/cpu.h>
 #include <asm/uaccess.h>

 #ifdef CONFIG_ROOT_NFS
@@ -615,6 +616,14 @@
 		.mode		= 0444,
 		.proc_handler	= &proc_dointvec,
 	},
+	{
+		.ctl_name	= KERN_IDLE_MODE,
+		.procname	= "idle_mode",
+		.data		= &idle_mode,
+		.maxlen		= sizeof (int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
 	{ .ctl_name = 0 }
 };
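
One thing the table entry above leaves open is range checking: a handler
that accepts any integer will also accept values outside
IDLE_NOOP..IDLE_ACPI.  In the kernel, proc_dointvec_minmax with
.extra1/.extra2 bounds is one existing way to restrict the range.  The
userspace sketch below only illustrates the check itself; set_idle_mode()
and get_idle_mode() are hypothetical helpers, not part of the patch.

```c
#include <errno.h>

#define IDLE_NOOP	0
#define IDLE_HALT	1
#define IDLE_POLL	2
#define IDLE_ACPI	3

static int idle_mode = IDLE_HALT;

/*
 * Hypothetical setter: reject values outside the defined modes and
 * leave the current mode untouched on error.
 */
int set_idle_mode(int value)
{
	if (value < IDLE_NOOP || value > IDLE_ACPI)
		return -EINVAL;
	idle_mode = value;
	return 0;
}

/* Hypothetical getter, so callers never touch the variable directly. */
int get_idle_mode(void)
{
	return idle_mode;
}
```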





* Re: add lowpower_idle sysctl
  2004-03-18 21:59       ` Kenneth Chen
@ 2004-03-18 22:35         ` Andrew Morton
  2004-03-24  9:54         ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2004-03-18 22:35 UTC (permalink / raw)
  To: Kenneth Chen; +Cc: linux-kernel, linux-ia64

"Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
>
> >>>>> Andrew Morton wrote on Wed, March 17, 2004 7:28 PM
> > > "Kenneth Chen" <kenneth.w.chen@intel.com> wrote:
> > >
> > >  Writing to sysctl should be a bool, reading the value can be number of
> > >  module currently disabled low power idle.  I think the original intent
> > >  is to use ref count for enabling/disabling.  (granted, we copied the
> > >  code from other arch).
> >
> > OK, so why not give us:
> >
> > #define IDLE_HALT			0
> > #define IDLE_POLL			1
> > #define IDLE_SUPER_LOW_POWER_HALT	2
> >
> > and so forth (are there any others?).
> >
> > Set some system-wide integer via a sysctl and let the particular
> > architecture decide how best to implement the currently-selected
> > idle mode?
> 
> 
> Sounds good, Thanks for the suggestion. I just coded it up:
> 

Looks fine, thanks.  I'll queue that up pending some code which actually
uses it.  And the obligatory update to Documentation/kernel-parameters.txt ;)


* Re: add lowpower_idle sysctl
  2004-03-18  3:40       ` Zwane Mwaikambo
  2004-03-18  9:05         ` Dominik Brodowski
@ 2004-03-18 22:59         ` Todd Poynor
  2004-03-19  0:09           ` Andrew Morton
                             ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Todd Poynor @ 2004-03-18 22:59 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Andrew Morton, Kenneth Chen, linux-ia64, Linux Kernel, CPU Freq ML

Zwane Mwaikambo wrote:

>>Set some system-wide integer via a sysctl and let the particular
>>architecture decide how best to implement the currently-selected idle mode?
> 
> I'm wondering whether the setting of these magic numbers can't be done
> using cpufreq infrastructure.

I'd vote for using Patrick Mochel's PM subsystem and use a standard set 
of identifiers that are mapped to a platform-specific idle behavior, in 
much the same way as platform suspend modes are handled today.  For 
example, strings echoed to /sys/power/idle could be an interface.  If 
folks are amenable to this I'd be happy to supply a (generic) patch for it.
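
To make the idea concrete, here is a rough userspace sketch of the
string-to-behavior mapping.  The identifiers "halt", "poll" and "deep", the
enum idle_behavior, and parse_idle_name() are all made up for illustration;
no such interface exists yet, and the real set of names would need to be
agreed on.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical platform-independent idle behaviors. */
enum idle_behavior {
	IDLE_BEHAVIOR_HALT,
	IDLE_BEHAVIOR_POLL,
	IDLE_BEHAVIOR_DEEP,
};

struct idle_name {
	const char *name;
	enum idle_behavior behavior;
};

/* Standard identifiers; each platform maps the behavior as it sees fit. */
static const struct idle_name idle_names[] = {
	{ "halt", IDLE_BEHAVIOR_HALT },
	{ "poll", IDLE_BEHAVIOR_POLL },
	{ "deep", IDLE_BEHAVIOR_DEEP },
};

/*
 * Parse a string as written by "echo name > file".  A single trailing
 * newline is ignored.  Returns 0 and fills *out on success, -1 if the
 * name is unknown.
 */
int parse_idle_name(const char *buf, enum idle_behavior *out)
{
	size_t len = strlen(buf);
	size_t i;

	if (len > 0 && buf[len - 1] == '\n')
		len--;

	for (i = 0; i < sizeof(idle_names) / sizeof(idle_names[0]); i++) {
		if (strlen(idle_names[i].name) == len &&
		    strncmp(buf, idle_names[i].name, len) == 0) {
			*out = idle_names[i].behavior;
			return 0;
		}
	}
	return -1;
}
```

Strings avoid the "what does 42 mean" problem raised earlier, at the cost
of a small lookup table.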

-- 
Todd Poynor
MontaVista Software



* Re: add lowpower_idle sysctl
  2004-03-18 22:59         ` Todd Poynor
@ 2004-03-19  0:09           ` Andrew Morton
  2004-03-19  0:43           ` Zwane Mwaikambo
  2004-03-25 19:20           ` Chen, Kenneth W
  2 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2004-03-19  0:09 UTC (permalink / raw)
  To: Todd Poynor; +Cc: zwane, kenneth.w.chen, linux-ia64, linux-kernel, cpufreq

Todd Poynor <tpoynor@mvista.com> wrote:
>
> Zwane Mwaikambo wrote:
> 
> >>Set some system-wide integer via a sysctl and let the particular
> >>architecture decide how best to implement the currently-selected idle mode?
> > 
> > I'm wondering whether the setting of these magic numbers can't be done
> > using cpufreq infrastructure.
> 
> I'd vote for using Patrick Mochel's PM subsystem and use a standard set 
> of identifiers that are mapped to a platform-specific idle behavior, in 
> much the same way as platform suspend modes are handled today.  For 
> example, strings echoed to /sys/power/idle could be an interface.  If 
> folks are amenable to this I'd be happy to supply a (generic) patch for it.

That sounds suitable, thanks.


* Re: add lowpower_idle sysctl
  2004-03-18 22:59         ` Todd Poynor
  2004-03-19  0:09           ` Andrew Morton
@ 2004-03-19  0:43           ` Zwane Mwaikambo
  2004-03-25 19:20           ` Chen, Kenneth W
  2 siblings, 0 replies; 18+ messages in thread
From: Zwane Mwaikambo @ 2004-03-19  0:43 UTC (permalink / raw)
  To: Todd Poynor
  Cc: Andrew Morton, Kenneth Chen, linux-ia64, Linux Kernel, CPU Freq ML

On Thu, 18 Mar 2004, Todd Poynor wrote:

> Zwane Mwaikambo wrote:
>
> >>Set some system-wide integer via a sysctl and let the particular
> >>architecture decide how best to implement the currently-selected idle mode?
> >
> > I'm wondering whether the setting of these magic numbers can't be done
> > using cpufreq infrastructure.
>
> I'd vote for using Patrick Mochel's PM subsystem and use a standard set
> of identifiers that are mapped to a platform-specific idle behavior, in
> much the same way as platform suspend modes are handled today.  For
> example, strings echoed to /sys/power/idle could be an interface.  If
> folks are amenable to this I'd be happy to supply a (generic) patch for it.

Ta, sounds good, give us a look.



* Re: add lowpower_idle sysctl
  2004-03-18  0:31 add lowpower_idle sysctl Kenneth Chen
  2004-03-18  1:04 ` Andrew Morton
@ 2004-03-23  9:56 ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Pavel Machek @ 2004-03-23  9:56 UTC (permalink / raw)
  To: Kenneth Chen; +Cc: linux-kernel, linux-ia64

Hi!

> On ia64, we need runtime control to manage CPU power state in the idle
> loop.  Logically it means a sysctl entry in /proc/sys/kernel.  Even
> though this sysctl entry doesn't exist today, lots of arch already has
> some sort of API to dynamically enable/disable low power idle state.

If you make it "max Cx state to allow", it will be useful for ACPI people, too...
				Pavel
-- 
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms         



* Re: add lowpower_idle sysctl
  2004-03-18 21:59       ` Kenneth Chen
  2004-03-18 22:35         ` Andrew Morton
@ 2004-03-24  9:54         ` Pavel Machek
  2004-03-25 19:04           ` Chen, Kenneth W
  1 sibling, 1 reply; 18+ messages in thread
From: Pavel Machek @ 2004-03-24  9:54 UTC (permalink / raw)
  To: Kenneth Chen; +Cc: 'Andrew Morton', linux-kernel, linux-ia64

Hi!

> Sounds good, Thanks for the suggestion. I just coded it up:
> 
> 
> diff -Nur linux-2.6.4/include/linux/cpu.h linux-2.6.4.halt/include/linux/cpu.h
> --- linux-2.6.4/include/linux/cpu.h	2004-03-10 18:55:23.000000000 -0800
> +++ linux-2.6.4.halt/include/linux/cpu.h	2004-03-18 13:47:43.000000000 -0800
> @@ -52,6 +52,12 @@
> 
>  #endif /* CONFIG_SMP */
>  extern struct sysdev_class cpu_sysdev_class;
> +extern int idle_mode;
> +
> +#define IDLE_NOOP	0
> +#define IDLE_HALT	1
> +#define IDLE_POLL	2
> +#define IDLE_ACPI	3
> 

How is idle_noop different from idle_poll?

idle_halt is equivalent to idle_acpi_C1. But acpi supports also C2
(deeper sleep), and C3 (sleep without coherent caches) and newer
machines support even more. You might want to talk to Len Brown.

[And yes, limiting to C2 (for example) *is* useful; some machines
(nforce2 iirc) have bugs, and die if you do C3 at wrong time].
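
A "max Cx state to allow" limit could be as simple as the clamp sketched
below.  This is only an illustration in userspace C: max_cstate and
pick_cstate() are hypothetical names, and C-states are represented by their
numbers (1 for C1, 2 for C2, 3 for C3).

```c
/* Hypothetical tunable: deepest C-state the admin allows. */
static int max_cstate = 3;

/*
 * Clamp the C-state the idle code would like to enter to what both the
 * hardware supports and the admin allows.  C1 is the shallowest state,
 * so the result is never below 1.
 */
int pick_cstate(int deepest_supported, int wanted)
{
	int limit = deepest_supported < max_cstate ? deepest_supported
						   : max_cstate;

	if (wanted > limit)
		wanted = limit;
	if (wanted < 1)
		wanted = 1;
	return wanted;
}
```

Setting max_cstate to 2 on a machine with the C3 bug would then keep the
idle code out of C3 without touching anything else.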

								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* RE: add lowpower_idle sysctl
  2004-03-24  9:54         ` Pavel Machek
@ 2004-03-25 19:04           ` Chen, Kenneth W
  0 siblings, 0 replies; 18+ messages in thread
From: Chen, Kenneth W @ 2004-03-25 19:04 UTC (permalink / raw)
  To: 'Pavel Machek'; +Cc: 'Andrew Morton', linux-kernel, linux-ia64

>>>>> Pavel Machek wrote on Wed, March 24, 2004 1:54 AM
> > +extern int idle_mode;
> > +
> > +#define IDLE_NOOP	0
> > +#define IDLE_HALT	1
> > +#define IDLE_POLL	2
> > +#define IDLE_ACPI	3
> >
>
> How is idle_noop different from idle_poll?

I was thinking idle_noop truly does nothing at all, versus idle_poll,
which optimizes cross-CPU wakeup.




* RE: add lowpower_idle sysctl
  2004-03-18 22:59         ` Todd Poynor
  2004-03-19  0:09           ` Andrew Morton
  2004-03-19  0:43           ` Zwane Mwaikambo
@ 2004-03-25 19:20           ` Chen, Kenneth W
  2 siblings, 0 replies; 18+ messages in thread
From: Chen, Kenneth W @ 2004-03-25 19:20 UTC (permalink / raw)
  To: 'Todd Poynor'; +Cc: linux-ia64, Linux Kernel, CPU Freq ML

>>>>> Todd Poynor wrote on Thu, March 18, 2004 3:00 PM
> I'd vote for using Patrick Mochel's PM subsystem and use a standard
> set of identifiers that are mapped to a platform-specific idle behavior,
> in  much the same way as platform suspend modes are handled today.  For
> example, strings echoed to /sys/power/idle could be an interface.  If
> folks are amenable to this I'd be happy to supply a (generic) patch for it.

Just wondering what the state of development is for this new PM scheme.
I don't check LKML that frequently, so sorry if I missed any posting there.

- Ken




* Re: add lowpower_idle sysctl
@ 2004-03-18  7:49 Ross Dickson
  0 siblings, 0 replies; 18+ messages in thread
From: Ross Dickson @ 2004-03-18  7:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, kenneth.w.chen

<snip>
> OK, so why not give us:
>
> #define IDLE_HALT 0
> #define IDLE_POLL 1
> #define IDLE_SUPER_LOW_POWER_HALT 2
>
> and so forth (are there any others?).

#define IDLE_C1HALT ?
 
I created another one for Athlon Nforce2 to prevent lockups in APIC mode.
It is proving more useful than I thought.
It has also been used on SIS740 to prevent the same problem.
I know such a workaround should not be required, but it works well.

It modifies C1 state idle behaviour by being a little more intelligent about
when it is worthwhile to go into disconnect, and it has a crude but effective
delay in case of back-to-back disconnect/reconnect cycles.
Recent post.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-03/4278.html
Patch is here.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-02/6520.html 

Currently it is kernel arg activated by  "idle=C1halt".
 
>
> Set some system-wide integer via a sysctl and let the particular
> architecture decide how best to implement the currently-selected idle mode?
 
A lockup detector on Athlon systems could conceivably invoke the above idle
state after a manual reboot, as not all systems with the same chipset have
the problem.

Ross.


