All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] CPUFreq PowerOP integration, Intro 0/3
@ 2006-08-24  1:23 Eugeny S. Mints
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
  0 siblings, 1 reply; 84+ messages in thread
From: Eugeny S. Mints @ 2006-08-24  1:23 UTC (permalink / raw)
  To: pm list

Integrating CPUFreq and PowerOP was discussed at the Linux PM summit
and in recent emails exchanges.  Some say keep them separate and some
say they must be integrated.  There is actually a very natural point
where integration makes sense - cpufreq_driver. This patchset presents
that integration point and is submitted for discussion.

The patches do not change the functionality of the cpufreq core.
Instead the idea is to redesign the tightly coupled interfaces of
cpufreq to clearly separate the arch dependent and independent pieces
layers.  This enables cpufreq to become arch independent and can start
to use the named operating points in all its layers.

PowerOP  replaces cpufreq driver as the h/w independent interface for
operating points.  PM core handles the h/w specific details for
defining the power parameters and setting the power parameters in h/w
registers.  Operating point definition/registration is now independent
of cpufreq.

Please note, that all userspace/kernel governor concepts, legacy sysfs cpufreq
interface remain untouched and SMP case is accounted in the resulting code as
well.

Highlights:

cpufreq.c
- get rid of cpufreq driver calls. the calls are replaced be calls to arch
independent freq_helpers (freq_helpers.c)
- available frequences sysfs interface now can be handled in arch independent
way
- cpufreq_sysdev_driver now serves only cpufreq core internad needs upon cpu
add/remove events (since all hw related is handled by PM Core)
- cpufreq_powerop_call() is added to handle operting point registration in the
kernel by an independent module at arbitrary moment

freq_table.c (now freq_helpers.c)
- get rid of cpufreq_frequency_table structures as input parameter and made the
code arch independent by leveraging PowerOP interface
- routine remain the same but are no longer used by arch dependent code but by
cpufreq core arch independent code instead
- all routines are arch independent; the only shared knowledge is platform
power pameter names (string)
- target() method expects power parameter names "freqN", "vN" are supported by
PM Core for cpu N
- setpolisy() method expects power parameter anmes "hfreqN", "lfreqN", "vN" are
supported by PM Core for cpu N

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-08-24  1:23 [RFC] CPUFreq PowerOP integration, Intro 0/3 Eugeny S. Mints
@ 2006-10-07  2:36 ` Dominik Brodowski
  2006-10-07  3:15   ` Dominik Brodowski
                     ` (4 more replies)
  0 siblings, 5 replies; 84+ messages in thread
From: Dominik Brodowski @ 2006-10-07  2:36 UTC (permalink / raw)
  To: Eugeny S. Mints; +Cc: pm list

Hi!

As you know, I never looked too friendly upon PowerOP and the "operating
points" concept. My latest messages may have illustrated this point even
further -- but the reason for that is that I more and more get the feeling
that PowerOP and "operating points" and the so-called new "PM core" is
trying to do too many things at once, and therefore mixes up differnt
levels. Here is a rough sketch of what I'd like to discuss[1] as an
alternative:


A) The lowest level: lots of knobs.

Somewhere in a "computer system"[2] there are very many "knobs" which may
be turned to influence various voltages, clock levels, or operating modes
("turbo", "performance" or "powersave", for example).

Also, there might be many dependencies on how these "knobs" may be
changed.

Let's assume the system is in a well-defined, working state right now.


B) I want to change one such knob!!!

Now, let's say that we want to change one value controlled by such a knob.
What must we do? We need to check that changing it
	a) does not violate any dependency ["verification"]
	b) all dependencies are handled in correct order ["notification"]


C) Notification

Let's look at the "notification" stage first -- that's what current cpufreq
notifiers do in a very basic way. However, this is also what the new clock
and voltage frameworks are trying to do, right? So that's the lesser problem
now.


D) Verification

So, how to do this verification? Basically, there are two approaches:

1) ask every other subsystem whether the new value is OK with it.
	This is what cpufreq currently suggests to do. It is evident
	that this gets overly complicated with lots of dependencies
	and dependencies within the dependencies -- both in terms
	of concept and in terms of time the verification code takes
	to execute.
	Advantages:
	- easy to expand, also in runtime (e.g. USB system is
		modprobed and telling you of a new minimum voltage
		requirement on certain circumstances)
	- does not limit choices for each knob
	Disadvantages:
	- might get very complex

2) look up all valid states in a table
	This is basically what PowerOP and the "operating points"
	concept suggests: if you want to change one value, you check
	what operating points a) contain the new value and b) is
	most suitable to you.
	Advantages:
	- fast
	- pre-defined set of operating points which the system
	  designer is comfortable with
	Disadvantages:
	- needs to be limited to "core" of the system as else
	  the tables may get overly large
	- limits the choices


E) So, why not combine the best of both worlds?


If you want to change a knob, the "PM core" looks both at every other
subsystem adding dependencies, and at a "operating points" table _ifff_ it
exists.



F) So, how would this work for OMAP1?

Let's limit it, to keep it somewhat simple, to the values contained in your
"struct pm_core_point" for OMAP:

	int cpu_vltg; /* voltage in mV */
	int dpll;     /* in KHz */
	int cpu;      /* CPU frequency in KHz */
	int tc;       /* in KHz */
	int per;      /* in KHz */
	int dsp;      /* in KHz */
	int dspmmu;   /* in KHz */
	int lcd;      /* in KHz */

and let's also add a

	int i_am_special;

Let's assume that there is an OMAP1 PM module which implements a ->set and
->get function for all of them. A yet-to-be-defined interface then tells
this PM module

"I want to increase the CPU frequency from C1 MHz to C2 MHz!"

->set(CPU_VLTG, C2);

The ->set function would then ask whether it is allowed to switch to
frequency B. How would it ask for that? It would both call the "operating
points" layer to check whether such a table is registered. Now, let's assume
there are no external subsystems affected by this change, and the system
engineer has defined such a table:

Nr.	CPU_VLTG	CPU	TC	... 	i_am_special
1	A1		C1	D1		1
2	A2		C1	D1		2
3	A1		C2	D2		3
4	A2		C2	D3		4

The core would determine that the latter two states are now allwed, and
using some sensible algorithm (e.g. "where do I not have to switch too many
knobs", or minimize the costs of switching) decide between those two.
Basically, it would recignize now that it is OK to proceed from state Nr. 1
to Nr. 3, but that this means that "tc" also needs to be changed. After
notifing relevant subsystems using the clock and voltage frameworks, it
would then proceed to set the hardware accordingly.

Now, some might argue "I want to tell the interface to enter mp3-mode, and
not enter some CPU_VLTG and hope that it selects the right table entry then
in the verifcation stage!" Well, you can do that. Using the i_am_special
pseudo-knob. You just tell the yet-to-be-defined interface "I want to switch
knob I_AM_SPECIAL to 4". The process is the same.


G) So, what does this get us?

It may look as "Operating Points" turned on its head now. And yes, it is.
But you can do the following now:
- let cpufreq call ->set(CPU_FREQ, <value>), if you want dynamic frequency
  scaling,
- use pre-defined operating points if it's suitable to do so,
- handles all dependencies either way.

Oh, and as the operating point concept is only introduced as an element
between the low-level setting and the "high-level policy decision", it does
not need to be squeezed into current cpufreq drivers or even the current
cpufreq core in any way. cpufreq may call it, but that should be relatively
easy to implement.


I think that this might be much easier to implement than your PowerOP /
operating points / PM core / PowerOP - cpufreq interaction patches. As a
matter of fact, some parts of your operating points table infrastructure
may be usable for the concept outlined above. So, what do you think? What
does everyone else involved think about this alternative approach?


Thanks,
	Dominik


[1] As many here are aware, I will have very limited time to actually
    implement it.
[2] embedded device, notebook, cluster, desktop with lots of USB devices
    connected, and so on

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
@ 2006-10-07  3:15   ` Dominik Brodowski
  2006-10-08  7:16   ` Pavel Machek
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 84+ messages in thread
From: Dominik Brodowski @ 2006-10-07  3:15 UTC (permalink / raw)
  To: Eugeny S. Mints; +Cc: pm list

Hi,

Some very basic incomplete sketch of how some of my ideas might look in
code:

 /*
 * OMAP PM Core implementation by Eugeny S. Mints <eugeny.mints@gmail.com>,
 * modified by Dominik Brodowski.
 *
 * Copyright (C) 2006, Nomad Global Solutions, Inc.
 * Copyright (C) 2006  Dominik Brodowski
 *
 * Based on code by Todd Poynor, Matthew Locke, Dmitry Chigirev, and
 * Bishop Brock.
 *
 * Copyright (C) 2002, 2004 MontaVista Software <source@mvista.com>.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 *
 */

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

/* Number of colums in operating points table */
#define OP_VALUES  4


/* GENERIC PART BEGINS HERE */

struct op_entry {
	unsigned int value[OP_VALUES];
	struct op_entry *next;
};

int pm_core_verify_value(unsigned int what, unsigned int value,
			 struct op_entry *table,
			 int set_other(unsigned int, unsigned int))
{
	int i, colum = -1;
	struct op_entry *table_entry = table->next;

	for (i=0; i<OP_VALUES; i++) {
		if (table->value[i] == what)
			colum = i;
	}
	if (colum == -1)
		return -EINVAL;

	while (table_entry) {
		/* very easy algorithm at first: use the first
		   operating point which matches what we are looking
		   for
		*/
		if (table_entry->value[colum] == value)
			goto found_it;

		table_entry = table_entry->next;
	};
	return -EINVAL;

 found_it:
	for (i=0; i<OP_VALUES; i++) {
		if (i==colum)
			continue;
		/* FIXME: error handling */
		set_other(table->value[i], table_entry->value[i]);
	}
	return 0;
}

/* GENERIC PART ENDS HERE */



/* dummy */
static int set_cpu_vltg(unsigned int value)
{
	return 0;
}

static int set_dpll(unsigned int value)
{
	return 0;
}

static int set_cpu_freq(unsigned int value)
{
	return 0;
}

enum {
	CPU_VLTG = 0,
	DPLL = 1,
	CPU_FREQ = 2,
	OPPOINT_NR = 3,
};

static int set_pm(unsigned int what, unsigned int value)
{
	int ret = -ENODEV;

	/* FIXME
	pm_core_notify_prechange(what, value);
	*/

	switch (what) {
	case CPU_VLTG:
		/* FIXME: only set it if it is really different
		 * from current level */
		ret = set_cpu_vltg(value);
		break;
	case DPLL:
		ret = set_dpll(value);
		break;
	case CPU_FREQ:
		ret = set_cpu_freq(value);
		break;
	case OPPOINT_NR:
		/* ignore */
		break;
	}

	/* FIXME
	if (!ret)
		pm_core_notify_postchange(what, value);
	else
		pm_core_notify_failed_change(what, value);
	*/

	return ret;
}

/* THE TABLE CONTAING PRE-DEFINED OPERATING POINTS */


/* FIXME: think of a better way to set up a static table */
struct op_entry omap_op_static_entry4 = {
	.value = {2, 3, 2, 4},
};

struct op_entry omap_op_static_entry3 = {
	.value = {1, 2, 2, 3},
	.next = &omap_op_static_entry4,
};

struct op_entry omap_op_static_entry2 = {
	.value = {2, 1, 1, 2},
	.next = &omap_op_static_entry3,
};

struct op_entry omap_op_static_entry1 = {
	.value = {1, 1, 1, 1},
	.next = &omap_op_static_entry2,
};

struct op_entry omap_op_table = {
	.value = {CPU_VLTG, DPLL, CPU_FREQ, OPPOINT_NR},
	.next = &omap_op_static_entry1,
};


/* function being called by external interface. You can add
   whatever userspace API on top of it you like -- configfs,
   sysfs, module parameter, syscall, ... */

int omap_set(unsigned int what, unsigned int value)
{
	int ret;

	ret = pm_core_verify_value(what, value, &omap_op_table, set_pm);
	if (ret)
		return ret;

	return set_pm(what, value);
}


/*
To set a specific operating point, let's say, oppoint 3, the
yet-to-be-defined interface needs to call:

omap_set(OPPOINT_NR, 3);


Setting up a cpufreq interaction is quite easy, as it basically only
needs to call

omap_set(CPU_FREQ, validated_target_freq);

You only need to make sure that you're not using in-kernel CPU
frequency scaling parallel to setting oppoints. But such a validation
should be trivial to add.
*/



IT COMPILES!!! SHIP IT! ;)

Thanks,
	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
  2006-10-07  3:15   ` Dominik Brodowski
@ 2006-10-08  7:16   ` Pavel Machek
  2006-10-12 15:38     ` Mark Gross
  2006-10-09 18:21   ` Mark Gross
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2006-10-08  7:16 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: pm list

Hi!

> F) So, how would this work for OMAP1?
> 
> Let's limit it, to keep it somewhat simple, to the values contained in your
> "struct pm_core_point" for OMAP:
> 
> 	int cpu_vltg; /* voltage in mV */
> 	int dpll;     /* in KHz */
> 	int cpu;      /* CPU frequency in KHz */
> 	int tc;       /* in KHz */
> 	int per;      /* in KHz */
> 	int dsp;      /* in KHz */
> 	int dspmmu;   /* in KHz */
> 	int lcd;      /* in KHz */
> 
> and let's also add a
> 
> 	int i_am_special;

Hehe, nice idea.

> I think that this might be much easier to implement than your PowerOP /
> operating points / PM core / PowerOP - cpufreq interaction patches. As a
> matter of fact, some parts of your operating points table infrastructure
> may be usable for the concept outlined above. So, what do you think? What
> does everyone else involved think about this alternative approach?

Looks okay to me. Unlike powerop design, this actually works for
everyone.
-- 
Thanks for all the (sleeping) penguins.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
  2006-10-07  3:15   ` Dominik Brodowski
  2006-10-08  7:16   ` Pavel Machek
@ 2006-10-09 18:21   ` Mark Gross
  2006-10-26  3:06     ` Dominik Brodowski
  2006-10-12 22:43   ` Eugeny S. Mints
  2007-03-13  0:57   ` Alternative Concept Matthew Locke
  4 siblings, 1 reply; 84+ messages in thread
From: Mark Gross @ 2006-10-09 18:21 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: pm list

On Fri, Oct 06, 2006 at 10:36:20PM -0400, Dominik Brodowski wrote:
> Hi!
> 
> As you know, I never looked too friendly upon PowerOP and the "operating
> points" concept. My latest messages may have illustrated this point even
> further -- but the reason for that is that I more and more get the feeling
> that PowerOP and "operating points" and the so-called new "PM core" is
> trying to do too many things at once, and therefore mixes up differnt
> levels. Here is a rough sketch of what I'd like to discuss[1] as an
> alternative:
> 
> 
> A) The lowest level: lots of knobs.
> 
> Somewhere in a "computer system"[2] there are very many "knobs" which may
> be turned to influence various voltages, clock levels, or operating modes
> ("turbo", "performance" or "powersave", for example).
> 
> Also, there might be many dependencies on how these "knobs" may be
> changed.
> 
> Let's assume the system is in a well-defined, working state right now.
> 

Smells like PowerOP to me.

> 
> B) I want to change one such knob!!!
> 
> Now, let's say that we want to change one value controlled by such a knob.
> What must we do? We need to check that changing it
> 	a) does not violate any dependency ["verification"]
> 	b) all dependencies are handled in correct order ["notification"]
> 

Constraints and notifications are the next big problems to address after
we get the interface for the knobs working.

> 
> C) Notification
> 
> Let's look at the "notification" stage first -- that's what current cpufreq
> notifiers do in a very basic way. However, this is also what the new clock
> and voltage frameworks are trying to do, right? So that's the lesser problem
> now.
> 
> 
> D) Verification
i.e. constraint checking and enforcement

> 
> So, how to do this verification? Basically, there are two approaches:
> 
> 1) ask every other subsystem whether the new value is OK with it.
> 	This is what cpufreq currently suggests to do. It is evident
> 	that this gets overly complicated with lots of dependencies
> 	and dependencies within the dependencies -- both in terms
> 	of concept and in terms of time the verification code takes
> 	to execute.
> 	Advantages:
> 	- easy to expand, also in runtime (e.g. USB system is
> 		modprobed and telling you of a new minimum voltage
> 		requirement on certain circumstances)
> 	- does not limit choices for each knob
> 	Disadvantages:
> 	- might get very complex
> 
> 2) look up all valid states in a table
> 	This is basically what PowerOP and the "operating points"
> 	concept suggests: if you want to change one value, you check
> 	what operating points a) contain the new value and b) is
> 	most suitable to you.
> 	Advantages:
> 	- fast
> 	- pre-defined set of operating points which the system
> 	  designer is comfortable with
> 	Disadvantages:
> 	- needs to be limited to "core" of the system as else
> 	  the tables may get overly large
> 	- limits the choices
> 
> 
> E) So, why not combine the best of both worlds?
> 
> 
> If you want to change a knob, the "PM core" looks both at every other
> subsystem adding dependencies, and at a "operating points" table _ifff_ it
> exists.
> 
> 
> 
> F) So, how would this work for OMAP1?
> 
> Let's limit it, to keep it somewhat simple, to the values contained in your
> "struct pm_core_point" for OMAP:
> 
> 	int cpu_vltg; /* voltage in mV */
> 	int dpll;     /* in KHz */
> 	int cpu;      /* CPU frequency in KHz */
> 	int tc;       /* in KHz */
> 	int per;      /* in KHz */
> 	int dsp;      /* in KHz */
> 	int dspmmu;   /* in KHz */
> 	int lcd;      /* in KHz */
> 
> and let's also add a
> 
> 	int i_am_special;
> 
> Let's assume that there is an OMAP1 PM module which implements a ->set and
> ->get function for all of them. A yet-to-be-defined interface then tells
> this PM module
> 
> "I want to increase the CPU frequency from C1 MHz to C2 MHz!"
> 
> ->set(CPU_VLTG, C2);
did you mean ->set(CPU, C2) ?

> 
> The ->set function would then ask whether it is allowed to switch to
> frequency B. How would it ask for that? It would both call the "operating
> points" layer to check whether such a table is registered. Now, let's assume
> there are no external subsystems affected by this change, and the system
> engineer has defined such a table:
> 
> Nr.	CPU_VLTG	CPU	TC	... 	i_am_special
> 1	A1		C1	D1		1
> 2	A2		C1	D1		2
> 3	A1		C2	D2		3
> 4	A2		C2	D3		4
> 
> The core would determine that the latter two states are now allwed, and
> using some sensible algorithm (e.g. "where do I not have to switch too many
> knobs", or minimize the costs of switching) decide between those two.
> Basically, it would recignize now that it is OK to proceed from state Nr. 1
> to Nr. 3, but that this means that "tc" also needs to be changed. After
> notifing relevant subsystems using the clock and voltage frameworks, it
> would then proceed to set the hardware accordingly.

This adds a sort of tree search defining a power state path from a
current state to one of the possible target stats with C2.  In this case
the only way to get to CPU==C2 is to change TC to D2 and deal with all
the ripples that will cause.

One question is how do we know that changing TC is a better way to go
than changing CPU_VLTG?  We'll need to figure out an ordering in the
phase space of power states.  Thinking out loud, I would try to pick the
target state based on latency if there are more than one targets to
satisfy the ->set() request.

> 
> Now, some might argue "I want to tell the interface to enter mp3-mode, and
> not enter some CPU_VLTG and hope that it selects the right table entry then
> in the verifcation stage!" Well, you can do that. Using the i_am_special
> pseudo-knob. You just tell the yet-to-be-defined interface "I want to switch
> knob I_AM_SPECIAL to 4". The process is the same.

MP3 mode effectively becomes a constraint in the system.  yes, I have
looked at cpufreq governors from this perspective and I think it could
work.

The trick is to make it easy to define register, activate / deactivate
constraints.  I try to make them modules that register with a global
constraint / notification thing.

> 
> 
> G) So, what does this get us?
> 
> It may look as "Operating Points" turned on its head now. And yes, it is.
> But you can do the following now:
> - let cpufreq call ->set(CPU_FREQ, <value>), if you want dynamic frequency
>   scaling,
> - use pre-defined operating points if it's suitable to do so,
> - handles all dependencies either way.
> 

I like the concept.

> Oh, and as the operating point concept is only introduced as an element
> between the low-level setting and the "high-level policy decision", it does
> not need to be squeezed into current cpufreq drivers or even the current
> cpufreq core in any way. cpufreq may call it, but that should be relatively
> easy to implement.
> 
> 
> I think that this might be much easier to implement than your PowerOP /
> operating points / PM core / PowerOP - cpufreq interaction patches. As a
> matter of fact, some parts of your operating points table infrastructure
> may be usable for the concept outlined above. So, what do you think? What
> does everyone else involved think about this alternative approach?
> 

I still see a need to take the first step of enabling "lots of knobs".
This is the primary goal of the PowerOp patch set.  The stuff with the
sysfs is just an interface to set/get the operating points while a
more complete solution like what you are talking about evolves.

> 
> Thanks,
> 	Dominik
> 
> 
> [1] As many here are aware, I will have very limited time to actually
>     implement it.
> [2] embedded device, notebook, cluster, desktop with lots of USB devices
>     connected, and so on
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-08  7:16   ` Pavel Machek
@ 2006-10-12 15:38     ` Mark Gross
  2006-10-12 16:02       ` Dominik Brodowski
  2006-10-12 16:48       ` Pavel Machek
  0 siblings, 2 replies; 84+ messages in thread
From: Mark Gross @ 2006-10-12 15:38 UTC (permalink / raw)
  To: Pavel Machek; +Cc: pm list, Dominik Brodowski

On Sun, Oct 08, 2006 at 07:16:49AM +0000, Pavel Machek wrote:
> Hi!
> 
> > F) So, how would this work for OMAP1?
> > 
> > Let's limit it, to keep it somewhat simple, to the values contained in your
> > "struct pm_core_point" for OMAP:
> > 
> > 	int cpu_vltg; /* voltage in mV */
> > 	int dpll;     /* in KHz */
> > 	int cpu;      /* CPU frequency in KHz */
> > 	int tc;       /* in KHz */
> > 	int per;      /* in KHz */
> > 	int dsp;      /* in KHz */
> > 	int dspmmu;   /* in KHz */
> > 	int lcd;      /* in KHz */
> > 
> > and let's also add a
> > 
> > 	int i_am_special;
> 
> Hehe, nice idea.
> 
> > I think that this might be much easier to implement than your PowerOP /
> > operating points / PM core / PowerOP - cpufreq interaction patches. As a
> > matter of fact, some parts of your operating points table infrastructure
> > may be usable for the concept outlined above. So, what do you think? What
> > does everyone else involved think about this alternative approach?
> 
> Looks okay to me. Unlike powerop design, this actually works for
> everyone.

Pavel, if you would pay attention better you would notice that at the
underneath of what Dominic is talking about is a concept of *more knobs*
for controlling platform power states.  This is what PowerOP is trying
to bring to the table.  

PowerOP is not a policy engine like what Dominic is talking about.  And
what Dominic is talking about will need to build on something that will
end up looking so much like power op that it wont be funny.

I don't know how to make it more clear to you, and I hope there isn't
something personal going on between you and the PowerOP guys that is
holding back moving forward on the evolution of a PM framework such as
what Dominic is starting to talk about.

Thanks,

--mgross

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 15:38     ` Mark Gross
@ 2006-10-12 16:02       ` Dominik Brodowski
  2006-10-16 21:56         ` Mark Gross
  2006-10-12 16:48       ` Pavel Machek
  1 sibling, 1 reply; 84+ messages in thread
From: Dominik Brodowski @ 2006-10-12 16:02 UTC (permalink / raw)
  To: Mark Gross; +Cc: pm list

Hi Mark,

On Thu, Oct 12, 2006 at 08:38:21AM -0700, Mark Gross wrote:
> > > I think that this might be much easier to implement than your PowerOP /
> > > operating points / PM core / PowerOP - cpufreq interaction patches. As a
> > > matter of fact, some parts of your operating points table infrastructure
> > > may be usable for the concept outlined above. So, what do you think? What
> > > does everyone else involved think about this alternative approach?
> > 
> > Looks okay to me. Unlike powerop design, this actually works for
> > everyone.
> 
> Pavel, if you would pay attention better you would notice that at the
> underneath of what Dominic is talking about is a concept of *more knobs*
> for controlling platform power states.  This is what PowerOP is trying
> to bring to the table.  

Oh no. PowerOP does it top->bottom; I try to do it bototm->top. That's the
difference, and it is a _fundamental_ difference. Yes, both will lead to a
concept of "operating points" on systems which may need it. But still the
way you get there (which is important if you want to keep it flexible, and
you do want to keep it flexible to allow for cpufreq) is different.

> PowerOP is not a policy engine like what Dominic is talking about.  And
> what Dominic is talking about will need to build on something that will
> end up looking so much like power op that it wont be funny.

This I dare to doubt.

Thanks,
	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 15:38     ` Mark Gross
  2006-10-12 16:02       ` Dominik Brodowski
@ 2006-10-12 16:48       ` Pavel Machek
  2006-10-12 17:12         ` Vitaly Wool
  1 sibling, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2006-10-12 16:48 UTC (permalink / raw)
  To: Mark Gross; +Cc: pm list, Dominik Brodowski

On Thu 2006-10-12 08:38:21, Mark Gross wrote:
> On Sun, Oct 08, 2006 at 07:16:49AM +0000, Pavel Machek wrote:
> > Hi!
> > 
> > > F) So, how would this work for OMAP1?
> > > 
> > > Let's limit it, to keep it somewhat simple, to the values contained in your
> > > "struct pm_core_point" for OMAP:
> > > 
> > > 	int cpu_vltg; /* voltage in mV */
> > > 	int dpll;     /* in KHz */
> > > 	int cpu;      /* CPU frequency in KHz */
> > > 	int tc;       /* in KHz */
> > > 	int per;      /* in KHz */
> > > 	int dsp;      /* in KHz */
> > > 	int dspmmu;   /* in KHz */
> > > 	int lcd;      /* in KHz */
> > > 
> > > and let's also add a
> > > 
> > > 	int i_am_special;
> > 
> > Hehe, nice idea.
> > 
> > > I think that this might be much easier to implement than your PowerOP /
> > > operating points / PM core / PowerOP - cpufreq interaction patches. As a
> > > matter of fact, some parts of your operating points table infrastructure
> > > may be usable for the concept outlined above. So, what do you think? What
> > > does everyone else involved think about this alternative approach?
> > 
> > Looks okay to me. Unlike powerop design, this actually works for
> > everyone.
> 
> Pavel, if you would pay attention better you would notice that at the
> underneath of what Dominic is talking about is a concept of *more knobs*
> for controlling platform power states.  This is what PowerOP is trying
> to bring to the table.  

I believe I did pay attention. What PowerOP has is crappy patches,
crappy changelogs, and when you don't like it, you are obviously
biased (or not paying attention).

And no, what Dominik describes is very reasonable, unlike powerop.

> PowerOP is not a policy engine like what Dominic is talking about.  And
> what Dominic is talking about will need to build on something that will
> end up looking so much like power op that it wont be funny.

Great, so you think you do not have that much work to do. Go ahead and
do it.

(Unlike powerop, this has actually chance of working with 256
cpus. Unlike powerop people, Dominik did his homework.)
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 16:48       ` Pavel Machek
@ 2006-10-12 17:12         ` Vitaly Wool
  2006-10-12 17:23           ` Pavel Machek
  0 siblings, 1 reply; 84+ messages in thread
From: Vitaly Wool @ 2006-10-12 17:12 UTC (permalink / raw)
  To: Pavel Machek; +Cc: pm list, Dominik Brodowski


[-- Attachment #1.1: Type: text/plain, Size: 374 bytes --]

On 10/12/06, Pavel Machek <pavel@ucw.cz> wrote:
>
> I believe I did pay attention. What PowerOP has is crappy patches,
> crappy changelogs, and when you don't like it, you are obviously
> biased (or not paying attention).


I would say that by these words and by what you say later you confirm you
just haven't got any valid points you can prove your position with.

Vitaly

[-- Attachment #1.2: Type: text/html, Size: 662 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 17:12         ` Vitaly Wool
@ 2006-10-12 17:23           ` Pavel Machek
  0 siblings, 0 replies; 84+ messages in thread
From: Pavel Machek @ 2006-10-12 17:23 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: pm list, Dominik Brodowski

Hi!

On Thu 2006-10-12 21:12:20, Vitaly Wool wrote:
> 
>    On 10/12/06, Pavel Machek <[1]pavel@ucw.cz> wrote:
> 
>      I believe I did pay attention. What PowerOP has is crappy patches,
>      crappy changelogs, and when you don't like it, you are obviously
>      biased (or not paying attention).
> 
>    I would say that by these words and by what you say later you confirm
>    you just haven't got any valid points you can prove your position
>    with.

I do not need to prove my position. What you need is to send
acceptable patches... and you need to have reasonable interface to the
userland, Dominik proposed one.

(And you should learn not to post html to the lists).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
                     ` (2 preceding siblings ...)
  2006-10-09 18:21   ` Mark Gross
@ 2006-10-12 22:43   ` Eugeny S. Mints
  2006-10-13 10:55     ` Pavel Machek
  2006-10-26  3:05     ` Dominik Brodowski
  2007-03-13  0:57   ` Alternative Concept Matthew Locke
  4 siblings, 2 replies; 84+ messages in thread
From: Eugeny S. Mints @ 2006-10-12 22:43 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: pm list

Dominik Brodowski wrote:
> Hi!
> 
> As you know, I never looked too friendly upon PowerOP and the "operating
> points" concept. My latest messages may have illustrated this point even
> further -- but the reason for that is that I more and more get the feeling
> that PowerOP and "operating points" and the so-called new "PM core" is
> trying to do too many things at once, and therefore mixes up differnt
> levels. Here is a rough sketch of what I'd like to discuss[1] as an
> alternative:
> 
> 
> A) The lowest level: lots of knobs.
> 
> Somewhere in a "computer system"[2] there are very many "knobs" which may
> be turned to influence various voltages, clock levels, or operating modes
> ("turbo", "performance" or "powersave", for example).
> 
> Also, there might be many dependencies on how these "knobs" may be
> changed.
> 
> Let's assume the system is in a well-defined, working state right now.
In terms which we use to describe PowerOP a "kbob" is "power parameter"
and "operating point" is an entity which corresponds to "well-defined, _working_
[system power] state".

So, what PowerOP Core does: it just maintains a collection of operating point,
i.e. collection of known-to-be-working system power states. On many platforms
(especially embedded) not all combinations of power parameters are valid.  Some
[invalid] combinations of the power parameters may crash or damage the system.

PowerOP Core operates with operating points and thus provides capability to
switch ONLY between _known-to-be-working_ system power states and bypass any
invalid.

I feel like you are basically talking about similar things. Lets see.

Each time you call ->set(SOME_PLATFORM_POWER_PARAMETER, value1) you want the
system to switch to the set of power parameter values where value of
SOME_PLATFORM_POWER_PARAMETER is equal to 'value1'. Further you
are saying there are two options here:

1) you have a table which tells you that there are some  combinations of power
parameter values which are
a) _known-to-be-working_
and
b) contains SOME_PLATFORM_POWER_PARAMETER=value1.
Then you chose one of these operating points and switch to it.

The table creation is simply registration of operating points with POwerOP Core.

Now selection and switch. Obviously the functionality of selection between
operating points based on some algo (which btw varies even not across platforms
but even across different profiles of the same platform) has nothing to do with
the code which actually switches operating points. So having such
functionalities coupled within the ->set() method is just invalid design - they
have to be separated. That's exactly what PowerOP approach does: an upper layer
can implement selection logic leveraging PowerOP Core interface and then request
POwerOP Core to switch system to the selected operating point.

2) table does not exist. There are two options here:

Either,
a)an entity which calls ->set() for a particular power parameter IS
RESPONSIBLE for that resulting combination of power parameter values (once the
set has been executed) IS valid one

OR
b) the system executes complex logic you described under D) 1) (in fact,
cpufreq policy notifiers) to get a valid combination of power parameter values
with a predefined value of a certain power parameter.

Let me illustrate why 2)a) is just particular case in contrast to POwerOP which
is general case in this situation.

i) PowerOP Core provides interface to get/set value of a particular power parameter

ii)  Let's assume we limit the set of operating points for a platform to one 
point.  This one operating point is always the current operating point.  All 
operations occur on the the current operating point.

iii)in the assumptions above your  ->set() is nothing else than:
set(param, value)
{
  struct powerop_pwr_param p;

  p.attr = param;
  p.value = value;
  powerop_set_pwr_params(CURRENT_POINT, &p, 1);
  powerop_set_point(CURRENT_POINT);
}

where CURRENT_POINT may be NULL for example (since in current PowerOP Core NULL 
identifier corresponds exactly to "current" operating point).

THe 2)b)(complex logic as the approach to get a valid combination of power
parameter values). This might be point for discussion.
IMO definition of operating points approach as a way to determine a valid
combination of power parameter values is much simple.

That's it. Bottom line is: what you are talking about is NOT an Alternative
Concept but a particular case instead. While PowerOP design is generic case.

I'm not talking about notification (transition notifiers in cpufreq terms)and
constraints because here we basically on the same page.

The last remark about 256 CPU case. Leveraging POwerOP such systems will be
built using just one (current) operating point approach as described above.
> 
> 
> B) I want to change one such knob!!!
> 
> Now, let's say that we want to change one value controlled by such a knob.
> What must we do? We need to check that changing it
> 	a) does not violate any dependency ["verification"]
> 	b) all dependencies are handled in correct order ["notification"]
> 
> 
> C) Notification
> 
> Let's look at the "notification" stage first -- that's what current cpufreq
> notifiers do in a very basic way. However, this is also what the new clock
> and voltage frameworks are trying to do, right? So that's the lesser problem
> now.
> 
> 
> D) Verification
> 
> So, how to do this verification? Basically, there are two approaches:
> 
> 1) ask every other subsystem whether the new value is OK with it.
> 	This is what cpufreq currently suggests to do. It is evident
> 	that this gets overly complicated with lots of dependencies
> 	and dependencies within the dependencies -- both in terms
> 	of concept and in terms of time the verification code takes
> 	to execute.
> 	Advantages:
> 	- easy to expand, also in runtime (e.g. USB system is
> 		modprobed and telling you of a new minimum voltage
> 		requirement on certain circumstances)
> 	- does not limit choices for each knob
> 	Disadvantages:
> 	- might get very complex
> 
> 2) look up all valid states in a table
> 	This is basically what PowerOP and the "operating points"
> 	concept suggests: if you want to change one value, you check
> 	what operating points a) contain the new value and b) is
> 	most suitable to you.
> 	Advantages:
> 	- fast
> 	- pre-defined set of operating points which the system
> 	  designer is comfortable with
> 	Disadvantages:
> 	- needs to be limited to "core" of the system as else
> 	  the tables may get overly large
> 	- limits the choices
> 
> 
> E) So, why not combine the best of both worlds?
> 
> 
> If you want to change a knob, the "PM core" looks both at every other
> subsystem adding dependencies, and at a "operating points" table _ifff_ it
> exists.
> 
> 
> 
> F) So, how would this work for OMAP1?
> 
> Let's limit it, to keep it somewhat simple, to the values contained in your
> "struct pm_core_point" for OMAP:
> 
> 	int cpu_vltg; /* voltage in mV */
> 	int dpll;     /* in KHz */
> 	int cpu;      /* CPU frequency in KHz */
> 	int tc;       /* in KHz */
> 	int per;      /* in KHz */
> 	int dsp;      /* in KHz */
> 	int dspmmu;   /* in KHz */
> 	int lcd;      /* in KHz */
> 
> and let's also add a
> 
> 	int i_am_special;
> 
> Let's assume that there is an OMAP1 PM module which implements a ->set and
> ->get function for all of them. A yet-to-be-defined interface then tells
> this PM module
> 
> "I want to increase the CPU frequency from C1 MHz to C2 MHz!"
> 
> ->set(CPU_VLTG, C2);
> 
> The ->set function would then ask whether it is allowed to switch to
> frequency B. How would it ask for that? It would both call the "operating
> points" layer to check whether such a table is registered. Now, let's assume
> there are no external subsystems affected by this change, and the system
> engineer has defined such a table:
> 
> Nr.	CPU_VLTG	CPU	TC	... 	i_am_special
> 1	A1		C1	D1		1
> 2	A2		C1	D1		2
> 3	A1		C2	D2		3
> 4	A2		C2	D3		4
> 
> The core would determine that the latter two states are now allwed, and
> using some sensible algorithm (e.g. "where do I not have to switch too many
> knobs", or minimize the costs of switching) decide between those two.
> Basically, it would recignize now that it is OK to proceed from state Nr. 1
> to Nr. 3, but that this means that "tc" also needs to be changed. After
> notifing relevant subsystems using the clock and voltage frameworks, it
> would then proceed to set the hardware accordingly.
> 
> Now, some might argue "I want to tell the interface to enter mp3-mode, and
> not enter some CPU_VLTG and hope that it selects the right table entry then
> in the verifcation stage!" Well, you can do that. Using the i_am_special
> pseudo-knob. You just tell the yet-to-be-defined interface "I want to switch
> knob I_AM_SPECIAL to 4". The process is the same.
> 
> 
> G) So, what does this get us?
> 
> It may look as "Operating Points" turned on its head now. And yes, it is.
> But you can do the following now:
> - let cpufreq call ->set(CPU_FREQ, <value>), if you want dynamic frequency
>   scaling,
> - use pre-defined operating points if it's suitable to do so,
> - handles all dependencies either way.
> 
> Oh, and as the operating point concept is only introduced as an element
> between the low-level setting and the "high-level policy decision", it does
> not need to be squeezed into current cpufreq drivers or even the current
> cpufreq core in any way. cpufreq may call it, but that should be relatively
> easy to implement.
> 
> 
> I think that this might be much easier to implement than your PowerOP /
> operating points / PM core / PowerOP - cpufreq interaction patches. As a
> matter of fact, some parts of your operating points table infrastructure
> may be usable for the concept outlined above. So, what do you think? What
> does everyone else involved think about this alternative approach?
> 
> 
> Thanks,
> 	Dominik
> 
> 
> [1] As many here are aware, I will have very limited time to actually
>     implement it.
> [2] embedded device, notebook, cluster, desktop with lots of USB devices
>     connected, and so on
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 22:43   ` Eugeny S. Mints
@ 2006-10-13 10:55     ` Pavel Machek
  2006-10-16 21:44       ` Mark Gross
  2006-10-26  3:05     ` Dominik Brodowski
  1 sibling, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2006-10-13 10:55 UTC (permalink / raw)
  To: Eugeny S. Mints; +Cc: pm list, Dominik Brodowski

Hi!

> That's it. Bottom line is: what you are talking about is NOT an Alternative
> Concept but a particular case instead. While PowerOP design is
> generic case.

Fine then; submit powerOP with interface Dominik suggested. Notice
that his solution exposes all the knobs to userspace directly, so his
interface _is_ different to yours. "i_am_special" is just one of knobs.

> The last remark about 256 CPU case. Leveraging POwerOP such systems will be
> built using just one (current) operating point approach as described
> above.

Notice that Dominik's solution still allows to have more than one
operating point for each of 256 CPUs without explosion of number of
states.

								Pavel

> > F) So, how would this work for OMAP1?
> > 
> > Let's limit it, to keep it somewhat simple, to the values contained in your
> > "struct pm_core_point" for OMAP:
> > 
> > 	int cpu_vltg; /* voltage in mV */
> > 	int dpll;     /* in KHz */
> > 	int cpu;      /* CPU frequency in KHz */
> > 	int tc;       /* in KHz */
> > 	int per;      /* in KHz */
> > 	int dsp;      /* in KHz */
> > 	int dspmmu;   /* in KHz */
> > 	int lcd;      /* in KHz */
> > 
> > and let's also add a
> > 
> > 	int i_am_special;
> > 
> > Let's assume that there is an OMAP1 PM module which implements a ->set and
> > ->get function for all of them. A yet-to-be-defined interface then tells
> > this PM module

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-13 10:55     ` Pavel Machek
@ 2006-10-16 21:44       ` Mark Gross
  2006-10-17  8:26         ` Pavel Machek
  0 siblings, 1 reply; 84+ messages in thread
From: Mark Gross @ 2006-10-16 21:44 UTC (permalink / raw)
  To: Pavel Machek; +Cc: pm list, Dominik Brodowski

On Fri, Oct 13, 2006 at 12:55:04PM +0200, Pavel Machek wrote:
> Hi!
> 
> > That's it. Bottom line is: what you are talking about is NOT an Alternative
> > Concept but a particular case instead. While PowerOP design is
> > generic case.
> 
> Fine then; submit powerOP with interface Dominik suggested. Notice
> that his solution exposes all the knobs to userspace directly, so his
> interface _is_ different to yours. "i_am_special" is just one of knobs.

I am very keen on Dominic's design concept, however; I missed his sysfs
interface that exposes the knobs to userspace.  

I've search the archives and still don't see any interface to user space
users. 

--mgross
> 
> > The last remark about 256 CPU case. Leveraging POwerOP such systems will be
> > built using just one (current) operating point approach as described
> > above.
> 
> Notice that Dominik's solution still allows to have more than one
> operating point for each of 256 CPUs without explosion of number of
> states.
> 
> 								Pavel
> 
> > > F) So, how would this work for OMAP1?
> > > 
> > > Let's limit it, to keep it somewhat simple, to the values contained in your
> > > "struct pm_core_point" for OMAP:
> > > 
> > > 	int cpu_vltg; /* voltage in mV */
> > > 	int dpll;     /* in KHz */
> > > 	int cpu;      /* CPU frequency in KHz */
> > > 	int tc;       /* in KHz */
> > > 	int per;      /* in KHz */
> > > 	int dsp;      /* in KHz */
> > > 	int dspmmu;   /* in KHz */
> > > 	int lcd;      /* in KHz */
> > > 
> > > and let's also add a
> > > 
> > > 	int i_am_special;
> > > 
> > > Let's assume that there is an OMAP1 PM module which implements a ->set and
> > > ->get function for all of them. A yet-to-be-defined interface then tells
> > > this PM module
> 
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 16:02       ` Dominik Brodowski
@ 2006-10-16 21:56         ` Mark Gross
  2006-10-17 21:40           ` Matthew Locke
  0 siblings, 1 reply; 84+ messages in thread
From: Mark Gross @ 2006-10-16 21:56 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: pm list

On Thu, Oct 12, 2006 at 06:02:03PM +0200, Dominik Brodowski wrote:
> Hi Mark,
> 
> On Thu, Oct 12, 2006 at 08:38:21AM -0700, Mark Gross wrote:
> > > > I think that this might be much easier to implement than your PowerOP /
> > > > operating points / PM core / PowerOP - cpufreq interaction patches. As a
> > > > matter of fact, some parts of your operating points table infrastructure
> > > > may be usable for the concept outlined above. So, what do you think? What
> > > > does everyone else involved think about this alternative approach?
> > > 
> > > Looks okay to me. Unlike powerop design, this actually works for
> > > everyone.
> > 
> > Pavel, if you would pay attention better you would notice that at the
> > underneath of what Dominic is talking about is a concept of *more knobs*
> > for controlling platform power states.  This is what PowerOP is trying
> > to bring to the table.  
> 
> Oh no. PowerOP does it top->bottom; I try to do it bototm->top. That's the
> difference, and it is a _fundamental_ difference. Yes, both will lead to a
> concept of "operating points" on systems which may need it. But still the
> way you get there (which is important if you want to keep it flexible, and
> you do want to keep it flexible to allow for cpufreq) is different.

I'll take a closer look at both.  It really looks to me that folks are in
violent agreement more than anything else.  I also prefer a bottom->top
approach.

--mgross

> 
> > PowerOP is not a policy engine like what Dominic is talking about.  And
> > what Dominic is talking about will need to build on something that will
> > end up looking so much like power op that it wont be funny.
> 
> This I dare to doubt.
> 
> Thanks,
> 	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-16 21:44       ` Mark Gross
@ 2006-10-17  8:26         ` Pavel Machek
  0 siblings, 0 replies; 84+ messages in thread
From: Pavel Machek @ 2006-10-17  8:26 UTC (permalink / raw)
  To: Mark Gross; +Cc: pm list, Dominik Brodowski

Hi!

> > > That's it. Bottom line is: what you are talking about is NOT an Alternative
> > > Concept but a particular case instead. While PowerOP design is
> > > generic case.
> > 
> > Fine then; submit powerOP with interface Dominik suggested. Notice
> > that his solution exposes all the knobs to userspace directly, so his
> > interface _is_ different to yours. "i_am_special" is just one of knobs.
> 
> I am very keen on Dominic's design concept, however; I missed his sysfs
> interface that exposes the knobs to userspace.  

He did not specify it in detail. Important point is he has
one-file-per-knob and one-value-per-knob.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-16 21:56         ` Mark Gross
@ 2006-10-17 21:40           ` Matthew Locke
  0 siblings, 0 replies; 84+ messages in thread
From: Matthew Locke @ 2006-10-17 21:40 UTC (permalink / raw)
  To: mgross; +Cc: pm list, Dominik Brodowski


On Oct 16, 2006, at 2:56 PM, Mark Gross wrote:

> On Thu, Oct 12, 2006 at 06:02:03PM +0200, Dominik Brodowski wrote:
>> Hi Mark,
>>
>> On Thu, Oct 12, 2006 at 08:38:21AM -0700, Mark Gross wrote:
>>>>> I think that this might be much easier to implement than your  
>>>>> PowerOP /
>>>>> operating points / PM core / PowerOP - cpufreq interaction  
>>>>> patches. As a
>>>>> matter of fact, some parts of your operating points table  
>>>>> infrastructure
>>>>> may be usable for the concept outlined above. So, what do you  
>>>>> think? What
>>>>> does everyone else involved think about this alternative approach?
>>>>
>>>> Looks okay to me. Unlike powerop design, this actually works for
>>>> everyone.
>>>
>>> Pavel, if you would pay attention better you would notice that at  
>>> the
>>> underneath of what Dominic is talking about is a concept of *more  
>>> knobs*
>>> for controlling platform power states.  This is what PowerOP is  
>>> trying
>>> to bring to the table.
>>
>> Oh no. PowerOP does it top->bottom; I try to do it bototm->top.  
>> That's the
>> difference, and it is a _fundamental_ difference. Yes, both will  
>> lead to a
>> concept of "operating points" on systems which may need it. But  
>> still the
>> way you get there (which is important if you want to keep it  
>> flexible, and
>> you do want to keep it flexible to allow for cpufreq) is different.
>
> I'll take a closer look at both.  It really looks to me that folks  
> are in
> violent agreement more than anything else.  I also prefer a bottom- 
> >top
> approach.

PowerOP has always exposed the power parameters to both kernel and  
userspace.  I think we can make some minor tweaks to the API as  
Eugeny described in his email and we solve for both use cases  
(embedded devices and x86 laptop/desktops).

>
> --mgross
>
>>
>>> PowerOP is not a policy engine like what Dominic is talking  
>>> about.  And
>>> what Dominic is talking about will need to build on something  
>>> that will
>>> end up looking so much like power op that it wont be funny.
>>
>> This I dare to doubt.
>>
>> Thanks,
>> 	Dominik
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-12 22:43   ` Eugeny S. Mints
  2006-10-13 10:55     ` Pavel Machek
@ 2006-10-26  3:05     ` Dominik Brodowski
  1 sibling, 0 replies; 84+ messages in thread
From: Dominik Brodowski @ 2006-10-26  3:05 UTC (permalink / raw)
  To: Eugeny S. Mints; +Cc: pm list

Hi,

On Fri, Oct 13, 2006 at 02:43:48AM +0400, Eugeny S. Mints wrote:
> >A) The lowest level: lots of knobs.
> >
> >Somewhere in a "computer system"[2] there are very many "knobs" which may
> >be turned to influence various voltages, clock levels, or operating modes
> >("turbo", "performance" or "powersave", for example).
> >
> >Also, there might be many dependencies on how these "knobs" may be
> >changed.
> >
> >Let's assume the system is in a well-defined, working state right now.
> In terms which we use to describe PowerOP a "kbob" is "power parameter"
> and "operating point" is an entity which corresponds to "well-defined, 
> _working_ [system power] state".
> 
> So, what PowerOP Core does: it just maintains a collection of operating 
> point,
> i.e. collection of known-to-be-working system power states. On many 
> platforms
> (especially embedded) not all combinations of power parameters are valid.  
> Some
> [invalid] combinations of the power parameters may crash or damage the 
> system.

Exactly. And that is what I suggest you may want to do in the level _above_
the knobs. Not all platform need or even want this limitations; some require
it. So let's not force it upon everyone, but let's just use it where it
makes sense?


<operating points table library>
              |
	<knob layer>

> 1) you have a table which tells you that there are some  combinations of 
> power
> parameter values which are
> a) _known-to-be-working_
> and
> b) contains SOME_PLATFORM_POWER_PARAMETER=value1.
> Then you chose one of these operating points and switch to it.
> 
> The table creation is simply registration of operating points with POwerOP 
> Core.

Sort of.

> Now selection and switch. Obviously the functionality of selection between
> operating points based on some algo (which btw varies even not across 
> platforms
> but even across different profiles of the same platform) has nothing to do 
> with
> the code which actually switches operating points. So having such
> functionalities coupled within the ->set() method is just invalid design - 
> they
> have to be separated. That's exactly what PowerOP approach does:

Looking at the code, not really -- there my approach was able to use the
same "separation", or "coupling", which PowerOP uses.


> an upper 
> layer
> can implement selection logic leveraging PowerOP Core interface and then 
> request
> POwerOP Core to switch system to the selected operating point.

That's also possible using my approach.

> 2) table does not exist. There are two options here:
> 
> Either,
> a)an entity which calls ->set() for a particular power parameter IS
> RESPONSIBLE for that resulting combination of power parameter values (once 
> the
> set has been executed) IS valid one
> 
> OR
> b) the system executes complex logic you described under D) 1) (in fact,
> cpufreq policy notifiers) to get a valid combination of power parameter 
> values
> with a predefined value of a certain power parameter.
> 
> Let me illustrate why 2)a) is just particular case in contrast to POwerOP 
> which
> is general case in this situation.
> 
> i) PowerOP Core provides interface to get/set value of a particular power 
> parameter
> 
> ii)  Let's assume we limit the set of operating points for a platform to 
> one point.  This one operating point is always the current operating point. 
> All operations occur on the the current operating point.
> 
> iii)in the assumptions above your  ->set() is nothing else than:
> set(param, value)
> {
>  struct powerop_pwr_param p;
> 
>  p.attr = param;
>  p.value = value;
>  powerop_set_pwr_params(CURRENT_POINT, &p, 1);
>  powerop_set_point(CURRENT_POINT);
> }

Here you turn PowerOP on its head -- you modify one Operating Point runtime?
Isn't that two Operating Points? That sounds strange. Also, it again looks
at it from a top->bottom view, which I dislike...

> That's it. Bottom line is: what you are talking about is NOT an Alternative
> Concept but a particular case instead. While PowerOP design is generic case.

As I said -- the differences might be subtle. But they're important. And I
don't care how anything is called -- I care that both the concept and the
code is good.

Thanks,
	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3]
  2006-10-09 18:21   ` Mark Gross
@ 2006-10-26  3:06     ` Dominik Brodowski
  0 siblings, 0 replies; 84+ messages in thread
From: Dominik Brodowski @ 2006-10-26  3:06 UTC (permalink / raw)
  To: Mark Gross; +Cc: pm list

Hi,

On Mon, Oct 09, 2006 at 11:21:23AM -0700, Mark Gross wrote:
> > D) Verification
> i.e. constraint checking and enforcement

Exactly.

> > Let's assume that there is an OMAP1 PM module which implements a ->set and
> > ->get function for all of them. A yet-to-be-defined interface then tells
> > this PM module
> > 
> > "I want to increase the CPU frequency from C1 MHz to C2 MHz!"
> > 
> > ->set(CPU_VLTG, C2);
> did you mean ->set(CPU, C2) ?

Yes, sorry about the confusion.

> > The core would determine that the latter two states are now allwed, and
> > using some sensible algorithm (e.g. "where do I not have to switch too many
> > knobs", or minimize the costs of switching) decide between those two.
> > Basically, it would recignize now that it is OK to proceed from state Nr. 1
> > to Nr. 3, but that this means that "tc" also needs to be changed. After
> > notifing relevant subsystems using the clock and voltage frameworks, it
> > would then proceed to set the hardware accordingly.
> 
> This adds a sort of tree search defining a power state path from a
> current state to one of the possible target stats with C2.  In this case
> the only way to get to CPU==C2 is to change TC to D2 and deal with all
> the ripples that will cause.
> 
> One question is how do we know that changing TC is a better way to go
> than changing CPU_VLTG?  We'll need to figure out an ordering in the
> phase space of power states.  Thinking out loud, I would try to pick the
> target state based on latency if there are more than one targets to
> satisfy the ->set() request.

That sounds like an interesting approach.

> > I think that this might be much easier to implement than your PowerOP /
> > operating points / PM core / PowerOP - cpufreq interaction patches. As a
> > matter of fact, some parts of your operating points table infrastructure
> > may be usable for the concept outlined above. So, what do you think? What
> > does everyone else involved think about this alternative approach?
> > 
> 
> I still see a need to take the first step of enabling "lots of knobs".
> This is the primary goal of the PowerOp patch set.  The stuff with the
> sysfs is just an interface to set/get the operating points while a
> more complete solution like what you are talking about evolves.

Unfortunately, PowerOp gets the "lots of knobs" on its head, in my
opinion...

Thanks,
	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
                     ` (3 preceding siblings ...)
  2006-10-12 22:43   ` Eugeny S. Mints
@ 2007-03-13  0:57   ` Matthew Locke
  2007-03-13 11:08     ` Pavel Machek
  4 siblings, 1 reply; 84+ messages in thread
From: Matthew Locke @ 2007-03-13  0:57 UTC (permalink / raw)
  To: pm list; +Cc: Eugeny Mints, Dominik Brodowski

So, Its time to restart this discussion:)  After all the discussion  
last year, Eugeny and I went back to the drawing board to review the  
requirements and possible solutions.  I thought it would be best to  
respond to this email to remind everyone where we left off.    David  
Brownell's latest email on this topic (subject has something with  
cpufreq in it) is also a good one to read.

Basically, we finally agree that the operating point concept won't  
work for every platform and it is actually too limited to be the base  
abstraction.  Please hold applause until the end:)

We dove into A) in Dominik's email and started looking at what a knob  
layer would require in more detail.  For the moment let's put aside  
operating points.  We believe that a knob type layer makes sense to  
be the lowest level as Dominik proposed.  This layer is responsible  
for controlling hardware resources that affect power management and  
capturing the relationships between resources.  Power management  
resources include components such as clock dividers, pll's, voltage  
regulators, and power domains.  These resources are not always  
independent and often have a dependency relationship between them.    
Knob isn't quite the right word for this layer - pm resources are  
knobs, switches, dials:)   We suggest calling this layer a Power  
Parameter Framework.


The goal of this parameter framework is to expose the resources in  a  
way that allows other s/w (governors, policy mangers, etc) to control  
the resources while keeping the system operational.  One of the main  
requirements in our thinking is that we want this layer to represent  
the h/w and not include policy or decision making.  Meaning the  
software using the parameter framework would be responsible for  
deciding the appropriate value for the parameters.   The framework  
breaks down into 4 parts:

- PM resource representation
Similar to the device abstraction available today.  Platforms need to  
define which resources will be controlled by the parameter  
framework.  We need to take into account that resources will be from  
SoCs and boards.

- PM resource control
Architecture independent API for enable/disable get/set of  
parameters.  Also provide information such as valid ranges or values  
for the parameter based on hardware limitations.
	- The API would work in terms of parameter values such as  
frequencies and voltage not register or divider values.
	- Each parameter is referenced by a id/handle to maintain  
architecture independence.
	- The set function accepts a list of parameter value pairs as well  
as a single parameter value pair.

- Dependency relationships
We believe 3 types of dependencies need to be addressed.
	- Parent/Child.   This relationship would be for parameters of the  
same type such as clocks that depend on each other.  Mostly likely a  
tree structure similar (or exactly the same as) the clock framework  
except generic for any type of parameter.

	- Domain.  This relationship is for parameters of different types.   
For example some platforms provide a  gate for the voltage supply to  
a set of clocks.  The framework would capture the relationship of the  
voltage gate to the clocks so that information can be used when  
setting parameters.

	- Functional -  Often there are platform specific dependency  
relationships that need to be captured and addressed in some way.    
Some examples: A single register may be used to control several  
independent clocks requiring some coordination when setting a new  
value for one or more of the clocks; One parameter may need be  
changed before another due to some platform specific peculiarities.

- Resource reference counting
Its important to keep track of when a resource is being used.  If no  
one is using a resource, then a higher level s/w component (governor/ 
policy manager) can decide to turn off the resource.  The framework  
would provide a claim/release set of APIs for other subsystems/ 
drivers to use.


What is and is not included in the parameter framework?
  - Resources that affect more than one component would go into the  
framework.  For example, a clock that is used by two or more I/O  
devices would need coordination to change.  Therefore it goes into  
the parameter framework.  A resource used by only one device driver  
and doesn't affect other devices/parameters should be controlled  
directly in the device driver and not exposed in the parameter  
framework.

  - The platform designer (or the guy doing the board port) decides  
which resources makes sense to expose on their platform.  Not all  
resources are required to be included.  In fact, it may make sense to  
expose multiple resources as a single parameter.

  - Use case and value based parameter relationships would not be  
included in the parameter framework.  These relationships are not  
required to keep the system operational and not every platform will  
have them. This is where operating points start to make sense.  An  
optional layer on top of the parameter framework would provide the  
ability to group parameters together in a similar manner to operating  
points.  If a platform has a set of optimal parameter values for  
specific use cases, then it would define parameter groups and assign  
a group id for the set of values.


Notifications
The framework needs to provide the ability to subscribe to  
notifications for individual parameter changes.  Device drivers would  
be able to subscribe for pre and post change events and act accordingly.

Verification
The framework API provides the range or the valid set of values for a  
parameter so a potential value can be verified.  Also, the parameter  
dependencies relationships are followed when a parameter or set of  
parameters are set.

If we can agree to and get a basic framework as described above in  
place,  we have a good building block for solving some of the other  
issues such as constraints and policy decisions.  Also, we have a  
framework in the kernel for clocks today.  This framework would  
incorporate the clock framework ideas making them generic for any  
time of power resources and easier to define/use.

I believe this power parameter framework should solve many (if not  
all) of the issues raised by using operating points as the base  
abstraction and provide a common layer across architectures.  Eugeny  
and I have the beginnings of an API proposal for this framework, but  
we wanted to get some high level feedback on the concepts so we can  
adjust the API if necessary.  So, comments?


Matt

On Oct 6, 2006, at 7:36 PM, Dominik Brodowski wrote:

> Hi!
>
> As you know, I never looked too friendly upon PowerOP and the  
> "operating
> points" concept. My latest messages may have illustrated this point  
> even
> further -- but the reason for that is that I more and more get the  
> feeling
> that PowerOP and "operating points" and the so-called new "PM core" is
> trying to do too many things at once, and therefore mixes up differnt
> levels. Here is a rough sketch of what I'd like to discuss[1] as an
> alternative:
>
>
> A) The lowest level: lots of knobs.
>
> Somewhere in a "computer system"[2] there are very many "knobs"  
> which may
> be turned to influence various voltages, clock levels, or operating  
> modes
> ("turbo", "performance" or "powersave", for example).
>
> Also, there might be many dependencies on how these "knobs" may be
> changed.
>
> Let's assume the system is in a well-defined, working state right now.
>
>
> B) I want to change one such knob!!!
>
> Now, let's say that we want to change one value controlled by such  
> a knob.
> What must we do? We need to check that changing it
> 	a) does not violate any dependency ["verification"]
> 	b) all dependencies are handled in correct order ["notification"]
>
>
> C) Notification
>
> Let's look at the "notification" stage first -- that's what current  
> cpufreq
> notifiers do in a very basic way. However, this is also what the  
> new clock
> and voltage frameworks are trying to do, right? So that's the  
> lesser problem
> now.
>
>
> D) Verification
>
> So, how to do this verification? Basically, there are two approaches:
>
> 1) ask every other subsystem whether the new value is OK with it.
> 	This is what cpufreq currently suggests to do. It is evident
> 	that this gets overly complicated with lots of dependencies
> 	and dependencies within the dependencies -- both in terms
> 	of concept and in terms of time the verification code takes
> 	to execute.
> 	Advantages:
> 	- easy to expand, also in runtime (e.g. USB system is
> 		modprobed and telling you of a new minimum voltage
> 		requirement on certain circumstances)
> 	- does not limit choices for each knob
> 	Disadvantages:
> 	- might get very complex
>
> 2) look up all valid states in a table
> 	This is basically what PowerOP and the "operating points"
> 	concept suggests: if you want to change one value, you check
> 	what operating points a) contain the new value and b) is
> 	most suitable to you.
> 	Advantages:
> 	- fast
> 	- pre-defined set of operating points which the system
> 	  designer is comfortable with
> 	Disadvantages:
> 	- needs to be limited to "core" of the system as else
> 	  the tables may get overly large
> 	- limits the choices
>
>
> E) So, why not combine the best of both worlds?
>
>
> If you want to change a knob, the "PM core" looks both at every other
> subsystem adding dependencies, and at a "operating points" table  
> _ifff_ it
> exists.
>
>
>
> F) So, how would this work for OMAP1?
>
> Let's limit it, to keep it somewhat simple, to the values contained  
> in your
> "struct pm_core_point" for OMAP:
>
> 	int cpu_vltg; /* voltage in mV */
> 	int dpll;     /* in KHz */
> 	int cpu;      /* CPU frequency in KHz */
> 	int tc;       /* in KHz */
> 	int per;      /* in KHz */
> 	int dsp;      /* in KHz */
> 	int dspmmu;   /* in KHz */
> 	int lcd;      /* in KHz */
>
> and let's also add a
>
> 	int i_am_special;
>
> Let's assume that there is an OMAP1 PM module which implements a - 
> >set and
> ->get function for all of them. A yet-to-be-defined interface then  
> tells
> this PM module
>
> "I want to increase the CPU frequency from C1 MHz to C2 MHz!"
>
> ->set(CPU_VLTG, C2);
>
> The ->set function would then ask whether it is allowed to switch to
> frequency B. How would it ask for that? It would both call the  
> "operating
> points" layer to check whether such a table is registered. Now,  
> let's assume
> there are no external subsystems affected by this change, and the  
> system
> engineer has defined such a table:
>
> Nr.	CPU_VLTG	CPU	TC	... 	i_am_special
> 1	A1		C1	D1		1
> 2	A2		C1	D1		2
> 3	A1		C2	D2		3
> 4	A2		C2	D3		4
>
> The core would determine that the latter two states are now allwed,  
> and
> using some sensible algorithm (e.g. "where do I not have to switch  
> too many
> knobs", or minimize the costs of switching) decide between those two.
> Basically, it would recignize now that it is OK to proceed from  
> state Nr. 1
> to Nr. 3, but that this means that "tc" also needs to be changed.  
> After
> notifing relevant subsystems using the clock and voltage  
> frameworks, it
> would then proceed to set the hardware accordingly.
>
> Now, some might argue "I want to tell the interface to enter mp3- 
> mode, and
> not enter some CPU_VLTG and hope that it selects the right table  
> entry then
> in the verifcation stage!" Well, you can do that. Using the  
> i_am_special
> pseudo-knob. You just tell the yet-to-be-defined interface "I want  
> to switch
> knob I_AM_SPECIAL to 4". The process is the same.
>
>
> G) So, what does this get us?
>
> It may look as "Operating Points" turned on its head now. And yes,  
> it is.
> But you can do the following now:
> - let cpufreq call ->set(CPU_FREQ, <value>), if you want dynamic  
> frequency
>   scaling,
> - use pre-defined operating points if it's suitable to do so,
> - handles all dependencies either way.
>
> Oh, and as the operating point concept is only introduced as an  
> element
> between the low-level setting and the "high-level policy decision",  
> it does
> not need to be squeezed into current cpufreq drivers or even the  
> current
> cpufreq core in any way. cpufreq may call it, but that should be  
> relatively
> easy to implement.
>
>
> I think that this might be much easier to implement than your  
> PowerOP /
> operating points / PM core / PowerOP - cpufreq interaction patches.  
> As a
> matter of fact, some parts of your operating points table  
> infrastructure
> may be usable for the concept outlined above. So, what do you  
> think? What
> does everyone else involved think about this alternative approach?
>
>
> Thanks,
> 	Dominik
>
>
> [1] As many here are aware, I will have very limited time to actually
>     implement it.
> [2] embedded device, notebook, cluster, desktop with lots of USB  
> devices
>     connected, and so on
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-13  0:57   ` Alternative Concept Matthew Locke
@ 2007-03-13 11:08     ` Pavel Machek
  2007-03-13 20:34       ` Mark Gross
  2007-03-14  3:19       ` Alternative Concept Dominik Brodowski
  0 siblings, 2 replies; 84+ messages in thread
From: Pavel Machek @ 2007-03-13 11:08 UTC (permalink / raw)
  To: Matthew Locke; +Cc: Eugeny Mints, pm list, Dominik Brodowski

Hi!

> I believe this power parameter framework should solve many (if not  
> all) of the issues raised by using operating points as the base  
> abstraction and provide a common layer across architectures.  Eugeny  
> and I have the beginnings of an API proposal for this framework, but  
> we wanted to get some high level feedback on the concepts so we can  
> adjust the API if necessary.  So, comments?

Looks better than powerop certainly.

Perhaps first step would be to convert cpufreq to this new framework?

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-13 11:08     ` Pavel Machek
@ 2007-03-13 20:34       ` Mark Gross
  2007-03-14  2:30         ` Ikhwan Lee
  2007-03-14  3:19       ` Alternative Concept Dominik Brodowski
  1 sibling, 1 reply; 84+ messages in thread
From: Mark Gross @ 2007-03-13 20:34 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Eugeny Mints, pm list, Dominik Brodowski

On Tue, Mar 13, 2007 at 12:08:51PM +0100, Pavel Machek wrote:
> Hi!
> 
> > I believe this power parameter framework should solve many (if not  
> > all) of the issues raised by using operating points as the base  
> > abstraction and provide a common layer across architectures.  Eugeny  
> > and I have the beginnings of an API proposal for this framework, but  
> > we wanted to get some high level feedback on the concepts so we can  
> > adjust the API if necessary.  So, comments?
> 
> Looks better than powerop certainly.
> 
> Perhaps first step would be to convert cpufreq to this new framework?

The first step is to get a parameter framework in upstream.  

It will take some time for the applications of this proposed framework
to materialize and drive the maturing of the implementation.  These
won't get written unless a framework is upstream. 

I don't know if having cpufreq plug into this framework will ever make a
lot of sense.  However; it would be simple to create a cpufreq driver
that access the parameter layer for some selected platforms.  (N800?)

--mgross

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-13 20:34       ` Mark Gross
@ 2007-03-14  2:30         ` Ikhwan Lee
  2007-03-14 10:43           ` Eugeny S. Mints
  0 siblings, 1 reply; 84+ messages in thread
From: Ikhwan Lee @ 2007-03-14  2:30 UTC (permalink / raw)
  To: mgross; +Cc: Eugeny Mints, pm list, Dominik Brodowski, Pavel Machek

Hi,

On 3/14/07, Mark Gross <mgross@linux.intel.com> wrote:
> On Tue, Mar 13, 2007 at 12:08:51PM +0100, Pavel Machek wrote:
> > Hi!
> >
> > > I believe this power parameter framework should solve many (if not
> > > all) of the issues raised by using operating points as the base
> > > abstraction and provide a common layer across architectures.  Eugeny
> > > and I have the beginnings of an API proposal for this framework, but
> > > we wanted to get some high level feedback on the concepts so we can
> > > adjust the API if necessary.  So, comments?
> >
> > Looks better than powerop certainly.

I also think this power parameter framework is a lot easier to adopt.
PowerOp ideas can be built on top of this framework later.

> > Perhaps first step would be to convert cpufreq to this new framework?
>
> The first step is to get a parameter framework in upstream.

Would this involve replacing the clock framework, or are they going to coexist?

> It will take some time for the applications of this proposed framework
> to materialize and drive the maturing of the implementation.  These
> won't get written unless a framework is upstream.
>
> I don't know if having cpufreq plug into this framework will ever make a
> lot of sense.  However; it would be simple to create a cpufreq driver
> that access the parameter layer for some selected platforms.  (N800?)
>
> --mgross
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
>

It seems that this power parameter framework is more toward dynamic
(or runtime) power management of various devices on a platform. We
should make sure it does not break (and is not broken by) system
suspend/resume operations.

ikhwan

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-13 11:08     ` Pavel Machek
  2007-03-13 20:34       ` Mark Gross
@ 2007-03-14  3:19       ` Dominik Brodowski
  1 sibling, 0 replies; 84+ messages in thread
From: Dominik Brodowski @ 2007-03-14  3:19 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Eugeny Mints, pm list

Hi,

On Tue, Mar 13, 2007 at 12:08:51PM +0100, Pavel Machek wrote:
> Perhaps first step would be to convert cpufreq to this new framework?

Actually, cpufreq doesn't need to be modified (at least at first). This
new framework can coexist nicely with cpufreq and this framework can export
its capabilities to cpufreq governors (and, possibly, cpuidle governors) if
it so wishes. Only in the mid- to long-term, changes to cpufreq might make
sense. But that's something I cannot predict accurately ;)

Thanks,
	Dominik

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14  2:30         ` Ikhwan Lee
@ 2007-03-14 10:43           ` Eugeny S. Mints
  2007-03-14 17:19             ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-14 10:43 UTC (permalink / raw)
  To: Ikhwan Lee; +Cc: Pavel Machek, Dominik Brodowski, pm list

Ikhwan Lee wrote:
> Hi,
> 
> On 3/14/07, Mark Gross <mgross@linux.intel.com> wrote:
>> On Tue, Mar 13, 2007 at 12:08:51PM +0100, Pavel Machek wrote:
>>> Hi!
>>>
>>>> I believe this power parameter framework should solve many (if not
>>>> all) of the issues raised by using operating points as the base
>>>> abstraction and provide a common layer across architectures.  Eugeny
>>>> and I have the beginnings of an API proposal for this framework, but
>>>> we wanted to get some high level feedback on the concepts so we can
>>>> adjust the API if necessary.  So, comments?
>>> Looks better than powerop certainly.
> 
> I also think this power parameter framework is a lot easier to adopt.
> PowerOp ideas can be built on top of this framework later.
> 
>>> Perhaps first step would be to convert cpufreq to this new framework?
>> The first step is to get a parameter framework in upstream.
> 
> Would this involve replacing the clock framework, or are they going to coexist?

parameter framework would eventually replace clock framework. Separate clock and 
voltage frameworks lead to code and functionality duplication and do not address 
such things as relationship between clocks and voltages, clock/voltage/power 
domains, etc needed for aggressive power management.

Basically a good way of thinking about parameter framework is that parameter 
framework would start from existed clock framework and gradually evolve by 
addressing voltages, relationship between clocks and voltages, domains. 
Eventually clock framework functionality would be a part of power parameter 
framework.

Thanks,
Eugeny

> 
>> It will take some time for the applications of this proposed framework
>> to materialize and drive the maturing of the implementation.  These
>> won't get written unless a framework is upstream.
>>
>> I don't know if having cpufreq plug into this framework will ever make a
>> lot of sense.  However; it would be simple to create a cpufreq driver
>> that access the parameter layer for some selected platforms.  (N800?)
>>
>> --mgross
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.osdl.org
>> https://lists.osdl.org/mailman/listinfo/linux-pm
>>
> 
> It seems that this power parameter framework is more toward dynamic
> (or runtime) power management of various devices on a platform. We
> should make sure it does not break (and is not broken by) system
> suspend/resume operations.
> 
> ikhwan
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 10:43           ` Eugeny S. Mints
@ 2007-03-14 17:19             ` David Brownell
  2007-03-14 18:12               ` Igor Stoppa
  2007-03-15  9:53               ` Eugeny S. Mints
  0 siblings, 2 replies; 84+ messages in thread
From: David Brownell @ 2007-03-14 17:19 UTC (permalink / raw)
  To: linux-pm; +Cc: Dominik Brodowski, Pavel Machek

This alternative "concept" would seem to be missing a few essential
aspects.  Like proposed interfaces, for starters ...


On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> > 
> > Would this involve replacing the clock framework, or are they going to coexist?
> 
> parameter framework would eventually replace clock framework.

That seems to be the wrong answer.  Especially since nothing has
been shown to be wrong with the clock interface; much less to be
unfixably wrong (hence justifying replacement).


> Separate clock and  
> voltage frameworks lead to code and functionality duplication and do not address 
> such things as relationship between clocks and voltages, clock/voltage/power 
> domains, etc needed for aggressive power management.

Most clocks don't have those issues.  Why penalize all clocks for
issues which only relate to a few?  Better to only do that for the
few cocks which have such additional constraints.

Plus, remember that the clock framework is an interface ... so by
definition, it has no code associated with it.  Hence no duplication
of code is possible... at least at this hand-wavey "concept" level.
Possibly a given implementation has scope for code sharing; but I
doubt it.  Code behind a given implementation of the clock interface
is invariably quite slim.

If a clock being enabled implies a power or voltage domain being active,
there's no reason that constraint shouldn't be enforced by whatever
implementation a given platform uses.

And having a generic -- basically untyped -- notion of "parameter"
seems significantly less good than having a typed notion, with
type-specific operations.  Typed notions are easier to understand,
read, and maintain.


> Basically a good way of thinking about parameter framework is that parameter 
> framework would start from existed clock framework and gradually evolve by 
> addressing voltages, relationship between clocks and voltages, domains. 
> Eventually clock framework functionality would be a part of power parameter 
> framework.

A better way would be to say that implementions of the clock interface
on a given platform can build on whatever they need to build.  That might
include a "parameter" framework, if such a thing were defined in such
a way that it became useful to such implementations.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 17:19             ` David Brownell
@ 2007-03-14 18:12               ` Igor Stoppa
  2007-03-14 18:45                 ` David Brownell
  2007-03-15  9:53               ` Eugeny S. Mints
  1 sibling, 1 reply; 84+ messages in thread
From: Igor Stoppa @ 2007-03-14 18:12 UTC (permalink / raw)
  To: ext David Brownell; +Cc: linux-pm, Pavel Machek, Dominik Brodowski

On Wed, 2007-03-14 at 10:19 -0700, ext David Brownell wrote:
> This alternative "concept" would seem to be missing a few essential
> aspects.  Like proposed interfaces, for starters ...
> 
> 
> On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> > > 
> > > Would this involve replacing the clock framework, or are they going to coexist?
> > 
> > parameter framework would eventually replace clock framework.
> 
> That seems to be the wrong answer.  Especially since nothing has
> been shown to be wrong with the clock interface; much less to be
> unfixably wrong (hence justifying replacement).
I think the rationale for choosing to abstract a clock/voltage should be
clarified more.

> > Separate clock and  
> > voltage frameworks lead to code and functionality duplication and do not address 
> > such things as relationship between clocks and voltages, clock/voltage/power 
> > domains, etc needed for aggressive power management.
> 
> Most clocks don't have those issues.  Why penalize all clocks for
> issues which only relate to a few?  Better to only do that for the
> few cocks which have such additional constraints.
Those that have such constraints tend to be very architecture dependant,
so that not much can be generalised or ported easily without having to add
too many levels of indirection.

> Plus, remember that the clock framework is an interface ... so by
> definition, it has no code associated with it.  Hence no duplication
> of code is possible... at least at this hand-wavey "concept" level.
> Possibly a given implementation has scope for code sharing; but I
> doubt it.  Code behind a given implementation of the clock interface
> is invariably quite slim.
> 
> If a clock being enabled implies a power or voltage domain being active,
> there's no reason that constraint shouldn't be enforced by whatever
> implementation a given platform uses.
And that implementation could be highly optimised since it wouldn't care
too much about being portable.

> And having a generic -- basically untyped -- notion of "parameter"
> seems significantly less good than having a typed notion, with
> type-specific operations.  Typed notions are easier to understand,
> read, and maintain.
That sounds like being on the same lines of C vs C++ comments :) or why
not to use typedef struct foo {...} bar

> 
> > Basically a good way of thinking about parameter framework is that parameter 
> > framework would start from existed clock framework and gradually evolve by 
> > addressing voltages, relationship between clocks and voltages, domains. 
> > Eventually clock framework functionality would be a part of power parameter 
> > framework.
> 
> A better way would be to say that implementions of the clock interface
> on a given platform can build on whatever they need to build.  That might
> include a "parameter" framework, if such a thing were defined in such
> a way that it became useful to such implementations.
> 
But shouldn't it be useful on every platform? As a sort of resource
manager (because that's what it would become if it would start adressing
interdependencies between clocks and voltages).
-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 18:12               ` Igor Stoppa
@ 2007-03-14 18:45                 ` David Brownell
  0 siblings, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-14 18:45 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: linux-pm, Pavel Machek, Dominik Brodowski

On Wednesday 14 March 2007 11:12 am, Igor Stoppa wrote:
> On Wed, 2007-03-14 at 10:19 -0700, ext David Brownell wrote:
> > This alternative "concept" would seem to be missing a few essential
> > aspects.  Like proposed interfaces, for starters ...
> > 
> > 
> > On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> > > > 
> > > > Would this involve replacing the clock framework, or are they going to coexist?
> > > 
> > > parameter framework would eventually replace clock framework.
> > 
> > That seems to be the wrong answer.  Especially since nothing has
> > been shown to be wrong with the clock interface; much less to be
> > unfixably wrong (hence justifying replacement).
>
> I think the rationale for choosing to abstract a clock/voltage should be
> clarified more.

That is, a rationale for abstracting both into "power resource"
so that the difference is not inherent in variable typing?


> > > Separate clock and  
> > > voltage frameworks lead to code and functionality duplication and do not address 
> > > such things as relationship between clocks and voltages, clock/voltage/power 
> > > domains, etc needed for aggressive power management.
> > 
> > Most clocks don't have those issues.  Why penalize all clocks for
> > issues which only relate to a few?  Better to only do that for the
> > few clocks which have such additional constraints.
>
> Those that have such constraints tend to be very architecture dependant,
> so that not much can be generalised or ported easily without having to add
> too many levels of indirection.

Right.

 
> > Plus, remember that the clock framework is an interface ... so by
> > definition, it has no code associated with it.  Hence no duplication
> > of code is possible... at least at this hand-wavey "concept" level.
> > Possibly a given implementation has scope for code sharing; but I
> > doubt it.  Code behind a given implementation of the clock interface
> > is invariably quite slim.
> > 
> > If a clock being enabled implies a power or voltage domain being active,
> > there's no reason that constraint shouldn't be enforced by whatever
> > implementation a given platform uses.
>
> And that implementation could be highly optimised since it wouldn't care
> too much about being portable.

True, but I'm not sure optimization counts as much here as the
basic fact that these things are highly platform-specific even
in terms of basic structure and concepts.  To me, that means
the difference between a relatively small amount code that's
platform-specific ... or a large quantity of very generic code
trying to be all-things-for-all-platforms.  The former sounds much
more practical.


> > And having a generic -- basically untyped -- notion of "parameter"
> > seems significantly less good than having a typed notion, with
> > type-specific operations.  Typed notions are easier to understand,
> > read, and maintain.
>
> That sounds like being on the same lines of C vs C++ comments :) or why
> not to use typedef struct foo {...} bar

Well, why not "typedef struct {...}" is simple:  that's not
the standard for how Linux does things.

As for comment style ... no, not at all comparable.  In one case
the compiler will report typing errors (passing a voltage where
a clock is needed).  In the other, such errors will show up as
runtime errors; with luck, testing will trigger them before they
cause problems in customer/user hands, and they can be fixed
without rewriting code.

 
> > > Basically a good way of thinking about parameter framework is that parameter 
> > > framework would start from existed clock framework and gradually evolve by 
> > > addressing voltages, relationship between clocks and voltages, domains. 
> > > Eventually clock framework functionality would be a part of power parameter 
> > > framework.
> > 
> > A better way would be to say that implementions of the clock interface
> > on a given platform can build on whatever they need to build.  That might
> > include a "parameter" framework, if such a thing were defined in such
> > a way that it became useful to such implementations.
> > 
> But shouldn't it be useful on every platform? As a sort of resource
> manager (because that's what it would become if it would start adressing
> interdependencies between clocks and voltages).

I couldn't know.  This "alternative concept" hasn't gotten very far
into the hand-waving stage, much less beyond it into proposed interface
or (gasp!) implementations.  Platforms that don't *have* those particular
interdependencies should not of course incur costs to implement them...

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 17:19             ` David Brownell
  2007-03-14 18:12               ` Igor Stoppa
@ 2007-03-15  9:53               ` Eugeny S. Mints
  2007-03-15 13:04                 ` Igor Stoppa
  1 sibling, 1 reply; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-15  9:53 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

David Brownell wrote:
> This alternative "concept" would seem to be missing a few essential
> aspects.  Like proposed interfaces, for starters ...

stay tuned - it follows ;)

> 
> 
> On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
>>> Would this involve replacing the clock framework, or are they going to coexist?
>> parameter framework would eventually replace clock framework.
> 
> That seems to be the wrong answer.  Especially since nothing has
> been shown to be wrong with the clock interface; much less to be
> unfixably wrong (hence justifying replacement).

a cherry-picking on clk fw API:

- clk_set_rate() sticks to an individual clock - no way to set rates for number 
of clocks at once instead of having series of clk_set_rate() calls. The former 
for example is required for clocks which have required predefined ratio 
dependency though: two clocks always have to be n:2n and any other ratio leads 
to hardware misbehavior. So, you can't go from 100:200 to 300:600 via series of 
clk_set_rate() for such clocks, i.e. 100:200 -> 300:200 -> 300:600 or 100:200 -> 
100:600 -> 300:600 (see SH7722 hw for reference)

- only a rate can be set up via clk_set_rate() while for a PLL I want to set up 
a desired idle state as well (btw there can be more than one idle state)

- current API does not provide support for clock domains

a cherry-picking on clk fw implementation:

- clock tree structure and traversing clock tree code are duplicated in every 
architecture while can be done in arch independent way and just once (hint: what 
indeed is an arch specific thing it's a clock tree configuration)

> 
> 
>> Separate clock and  
>> voltage frameworks lead to code and functionality duplication and do not address 
>> such things as relationship between clocks and voltages, clock/voltage/power 
>> domains, etc needed for aggressive power management.
> 
> Most clocks don't have those issues.  Why penalize all clocks for
> issues which only relate to a few?  Better to only do that for the
> few cocks which have such additional constraints.

parameter fw would give exactly this behavior: relationship between clocks and 
voltages, clock/voltage/power domains implementation would be just additional 
arcs between nodes for clock and voltages. Nowadays clk fw has only one type of 
arcs - parent-child type. parameter fw would bring additional types. But clock 
nodes would be linked just with required arcs (of required type; both are arch 
specific things) so for an arch without any additional dependencies between 
clocks (and voltages) parameter fw would end up with exactly equivalent clock 
tree as clk fw for the arch has today.

> 
> Plus, remember that the clock framework is an interface ... so by
> definition, it has no code associated with it.  Hence no duplication
> of code is possible... at least at this hand-wavey "concept" level.
> Possibly a given implementation has scope for code sharing; but I
> doubt it.  Code behind a given implementation of the clock interface
> is invariably quite slim.

and invariably looks like a hack and still duplicate clock tree building and 
traversing code. Dependencies which exist in modern hw between clocks, clocks 
and voltages require a more straight and standard technique to be set up for 
implementing clk/vltg/name_it framework. Moving most of the code to be arch 
independent and setting a clear rules on how a clock/vltg tree configuration 
would look like while just looking at a hw manual would help.

> 
> If a clock being enabled implies a power or voltage domain being active,
> there's no reason that constraint shouldn't be enforced by whatever
> implementation a given platform uses.

true. but it's the same functionality required on different arches. and it can 
be done in arch independent way without penalties for arches which do not 
require the functionality. What's the rationale to move it down to arch specific 
code then?

> 
> And having a generic -- basically untyped -- notion of "parameter"
> seems significantly less good than having a typed notion, with
> type-specific operations.  

there is no type-specific operations. clk_set_rate() and vltg_set() basically do 
the same thing - set a value for a pm resource provided by clock or voltage 
device (regulator). The difference is that output of clock regulator is measured 
in Hz and named 'frequesncy' and output of voltage regulator os measured in mV 
and named 'voltage' which actually does not matter from API POV. So, i see more 
sense in having param_set(), parame_set_state() (set_state primitive for idle 
states) rather than in

clk_set_rate(), clk_set_state()
vltg_set(), vltg_set_state()
and, our analysis shows that you would end up with a separate type for domains, 
so it would be at least
domain_set_state() as well.

> Typed notions are easier to understand,
> read, and maintain.

common, after tricks with spinlocks in [merged!] RT patches? ;)

Eugeny
> 
> 
>> Basically a good way of thinking about parameter framework is that parameter 
>> framework would start from existed clock framework and gradually evolve by 
>> addressing voltages, relationship between clocks and voltages, domains. 
>> Eventually clock framework functionality would be a part of power parameter 
>> framework.
> 
> A better way would be to say that implementions of the clock interface
> on a given platform can build on whatever they need to build.  That might
> include a "parameter" framework, if such a thing were defined in such
> a way that it became useful to such implementations.
> 
> - Dave
> 
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15  9:53               ` Eugeny S. Mints
@ 2007-03-15 13:04                 ` Igor Stoppa
  2007-03-16  2:21                   ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Igor Stoppa @ 2007-03-15 13:04 UTC (permalink / raw)
  To: ext Eugeny S. Mints; +Cc: linux-pm, Pavel Machek, Dominik Brodowski

On Thu, 2007-03-15 at 12:53 +0300, ext Eugeny S. Mints wrote:
> David Brownell wrote:
> > This alternative "concept" would seem to be missing a few essential
> > aspects.  Like proposed interfaces, for starters ...
> 
> stay tuned - it follows ;)
> 
> > 
> > 
> > On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> >>> Would this involve replacing the clock framework, or are they going to coexist?
> >> parameter framework would eventually replace clock framework.
> > 
> > That seems to be the wrong answer.  Especially since nothing has
> > been shown to be wrong with the clock interface; much less to be
> > unfixably wrong (hence justifying replacement).
> 
> a cherry-picking on clk fw API:
> 
> - clk_set_rate() sticks to an individual clock - no way to set rates for number 
> of clocks at once instead of having series of clk_set_rate() calls. The former 
> for example is required for clocks which have required predefined ratio 
> dependency though: two clocks always have to be n:2n and any other ratio leads 
> to hardware misbehavior. So, you can't go from 100:200 to 300:600 via series of 
> clk_set_rate() for such clocks, i.e. 100:200 -> 300:200 -> 300:600 or 100:200 -> 
> 100:600 -> 300:600 (see SH7722 hw for reference)
What's wrong with expanding the clk_fw?
All is needed is:
clk_set_rate_buffered(clk1, 300);
clk_set_rate_buffered(clk2, 600);
clk_rate_flush(); /* can include validation of the set */

Which is, incidentally, what OMAP2 does in hw for all the relevant clk
dividers and it also provides validation for the new set of values.

Furthermore if the original assumption that complex transitions are
allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is
mandatory, otherwise the transition is impossible, no matter what fancy
sw fw is performing it.

> - only a rate can be set up via clk_set_rate() while for a PLL I want to set up 
> a desired idle state as well (btw there can be more than one idle state)
Is there any PLL which provide automatic switching between running/idle state?
Do you have any HW architecture on your mind?

> - current API does not provide support for clock domains
So?

> a cherry-picking on clk fw implementation:
> 
> - clock tree structure and traversing clock tree code are duplicated in every 
> architecture while can be done in arch independent way and just once (hint: what 
> indeed is an arch specific thing it's a clock tree configuration)
that looks more like an optimisation to the clk fw, not a call for a brand new one

> > 
> > 
> >> Separate clock and  
> >> voltage frameworks lead to code and functionality duplication and do not address 
> >> such things as relationship between clocks and voltages, clock/voltage/power 
> >> domains, etc needed for aggressive power management.
relationships are very arch specific so having or not the fw there
wouldn't probably make any real difference.
About power domains ... well, what's the deal?
Aren't they controlled like voltages, in the end?

 
> > Most clocks don't have those issues.  Why penalize all clocks for
> > issues which only relate to a few?  Better to only do that for the
> > few cocks which have such additional constraints.
> 
> parameter fw would give exactly this behavior: relationship between clocks and 
> voltages, clock/voltage/power domains implementation would be just additional 
> arcs between nodes for clock and voltages. Nowadays clk fw has only one type of 
> arcs - parent-child type. parameter fw would bring additional types. But clock 
> nodes would be linked just with required arcs (of required type; both are arch 
> specific things) so for an arch without any additional dependencies between 
> clocks (and voltages) parameter fw would end up with exactly equivalent clock 
> tree as clk fw for the arch has today.

I think what has to be clarified, is wether there is a real need for new
features or not. 

> > 
> > Plus, remember that the clock framework is an interface ... so by
> > definition, it has no code associated with it.  Hence no duplication
> > of code is possible... at least at this hand-wavey "concept" level.
> > Possibly a given implementation has scope for code sharing; but I
> > doubt it.  Code behind a given implementation of the clock interface
> > is invariably quite slim.
> 
> and invariably looks like a hack and still duplicate clock tree building and 
> traversing code. Dependencies which exist in modern hw between clocks, clocks 
> and voltages require a more straight and standard technique to be set up for 
> implementing clk/vltg/name_it framework. Moving most of the code to be arch 
> independent and setting a clear rules on how a clock/vltg tree configuration 
> would look like while just looking at a hw manual would help.
I think clocks and voltages providers have very different behaviours: for example
a voltage src can sometimes be put in quiescent state, where the voltage value is
 preserved, but the max current is significantly reduced, therefore minimizing the
leakage. I wouldn't welcome such functionality to be merged with clock handling.


> > 
> > If a clock being enabled implies a power or voltage domain being active,
> > there's no reason that constraint shouldn't be enforced by whatever
> > implementation a given platform uses.
> 
> true. but it's the same functionality required on different arches. and it can 
> be done in arch independent way without penalties for arches which do not 
> require the functionality. What's the rationale to move it down to arch specific 
> code then?

I'd rather ask: what's the benefit of merging apples and oranges?

> > 
> > And having a generic -- basically untyped -- notion of "parameter"
> > seems significantly less good than having a typed notion, with
> > type-specific operations.  
> 
> there is no type-specific operations. clk_set_rate() and vltg_set() basically do 
> the same thing - set a value for a pm resource provided by clock or voltage 
> device (regulator). The difference is that output of clock regulator is measured 
> in Hz and named 'frequesncy' and output of voltage regulator os measured in mV 
> and named 'voltage' which actually does not matter from API POV. So, i see more 
> sense in having param_set(), parame_set_state() (set_state primitive for idle 
> states) rather than in
See my comment above about different, peculiar, behaviour of voltages.
And that's just an example.

> clk_set_rate(), clk_set_state()
> vltg_set(), vltg_set_state()
> and, our analysis shows that you would end up with a separate type for domains, 
> so it would be at least
> domain_set_state() as well.
> 
> > Typed notions are easier to understand,
> > read, and maintain.
> 
> common, after tricks with spinlocks in [merged!] RT patches? ;)
> 
> Eugeny
> > 
> > 
> >> Basically a good way of thinking about parameter framework is that parameter 
> >> framework would start from existed clock framework and gradually evolve by 
> >> addressing voltages, relationship between clocks and voltages, domains. 
> >> Eventually clock framework functionality would be a part of power parameter 
> >> framework.
> > 
> > A better way would be to say that implementions of the clock interface
> > on a given platform can build on whatever they need to build.  That might
> > include a "parameter" framework, if such a thing were defined in such
> > a way that it became useful to such implementations.
> > 
> > - Dave
> > 
> > 
> 
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15 13:04                 ` Igor Stoppa
@ 2007-03-16  2:21                   ` David Brownell
  2007-03-16  3:56                     ` Ikhwan Lee
  2007-03-16 13:06                     ` Dmitry Krivoschekov
  0 siblings, 2 replies; 84+ messages in thread
From: David Brownell @ 2007-03-16  2:21 UTC (permalink / raw)
  To: Eugeny S. Mints, Igor Stoppa; +Cc: linux-pm, Pavel Machek, Dominik Brodowski

On Thursday 15 March 2007 6:04 am, Igor Stoppa wrote:
> On Thu, 2007-03-15 at 12:53 +0300, ext Eugeny S. Mints wrote:
> > > On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> > >>> Would this involve replacing the clock framework, or are they going to coexist?
> > >> parameter framework would eventually replace clock framework.
> > > 
> > > That seems to be the wrong answer.  Especially since nothing has
> > > been shown to be wrong with the clock interface; much less to be
> > > unfixably wrong (hence justifying replacement).
> > 
> > a cherry-picking on clk fw API:
> > 
> > - clk_set_rate() sticks to an individual clock - no way to set rates for number 
> >   of clocks at once instead of having series of clk_set_rate() calls.

This isn't "wrong", it's a "lack of feature".  The normal process for
addressing such things is to improve working/deployed software interfaces,
not throw them out and try to create and deploy something new.  :)

Plus, that's not actually true.  When you change the rate for a parent
clock, all its derived clocks necessarily change ... however many they
may be.  And there's nothing to prevent updating non-derived clocks.

Of course, such a change *does* touch on something I'd agree is a notable
omission:  requests to change such a clock rate (call it a "base" rate
for purposes of discussion here) are currently safe only if the related
clocks have no drivers using them, since there's no way for those driver
to either veto the change ("I can't support 115200 baud at that base rate"),
or reclock to handle the changed environment ("update clock divisor to
maintain current serial port baud rate").

Which is why currently most clock rates aren't currently changeable; and
those which can be changed are mostly leaves on the clock tree.

On some platforms that significantly impacts what cpufreq (or whatever
DVFS scheme is used) could achieve.


> > The former  
> > for example is required for clocks which have required predefined ratio 
> > dependency though: two clocks always have to be n:2n and any other ratio leads 
> > to hardware misbehavior.

The programming interface doesn't preclude enforcement of that rule
though.  In fact, I'd call any *implementation* buggy if the hardware
has such a rule about base rates, but the software doesn't enforce it.
(That is:  it's not an interface issue.)


> > So, you can't go from 100:200 to 300:600 via series of  
> > clk_set_rate() for such clocks, i.e. 100:200 -> 300:200 -> 300:600 or 100:200 -> 
> > 100:600 -> 300:600 (see SH7722 hw for reference)

AT91 (e.g. at91sam9263) processors have MCKR with similar restrictions,
but the only time this could ever matter seems to be for cpufreq.

Does this matter to you for any other reason than trying to use the
clock framework to implement cpufreq?  If not, there's no need for
the clock framework to define how to do this ... since cpufreq is
so highly platform-specific.

If you want to say that this makes it hard to implement cpufreq in terms
of the clock API, I'd agree ... but then, the *complete* lack of driver
clock change notifications is a far bigger issue.  Drivers like MMC/SD,
UART, SPI, and I2C commonly need to update internal clock settings when
rates change, and may need to plug their lowlevel I/O queues while those
base clock rates change. 


> What's wrong with expanding the clk_fw?
> All is needed is:
> clk_set_rate_buffered(clk1, 300);
> clk_set_rate_buffered(clk2, 600);
> clk_rate_flush(); /* can include validation of the set */
> 
> Which is, incidentally, what OMAP2 does in hw for all the relevant clk
> dividers and it also provides validation for the new set of values.
> 
> Furthermore if the original assumption that complex transitions are
> allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is
> mandatory, otherwise the transition is impossible, no matter what fancy
> sw fw is performing it.

I think Scott's "power transaction" notion is a bit better, since
it should among other things provide a way to cope with cases like
two concurrent tasks trying issue conflicting changes.  An SMP
system ought to be able to reclock two or more CPU cores at once;
and maybe even do it with one "power transaction".

 
> > - only a rate can be set up via clk_set_rate() while for a PLL I want to set up 
> > a desired idle state as well (btw there can be more than one idle state)
>
> Is there any PLL which provide automatic switching between running/idle state?
> Do you have any HW architecture on your mind?

And why should such a hardware-specific mechanism show up in an
all-platforms API?  There's quite a lot of hardware-specific stuff
going on, and that seems like just one more instance of it.


> > - current API does not provide support for clock domains
>
> So?

Exactly.  :)

And again, that's not strictly true either; the clock tree itself
defines one kind of domain, also known as "subtrees".

A point I'll make below with respect to power domains:  unless
those domains need to be exposed for some reason to drivers,
there is no need for these to show up the programming interface
except as implementation artifacts.  As in:  activate this clock,
and platform-specific stuff happens.  It always does, of course,
but in some cases it will be more extensive than in others.

(Activating a PLL normally takes more than setting a bit to
open a clock gate; activating some clocks might invove kicking
some clock or power domain logic.)

 
> > a cherry-picking on clk fw implementation:

Which platform's implementation?  I've seen several...


> > - clock tree structure and traversing clock tree code are duplicated in every 
> > architecture while can be done in arch independent way and just once (hint: what 
> > indeed is an arch specific thing it's a clock tree configuration)
>
> that looks more like an optimisation to the clk fw, not a call for a brand new one

And I'm not even sure calling it an "optimization" is correct.

Given how arch-specific the tree structure is, it's not clear
that traversing it shouldn't be arch-specific too.  And in any
case ... the traversal code I've seen is so trivial I'd have a
hard time worring about sharing it.  Implementation flexibility
seems to be widely useful in handling arch-specific differences.


> > >> Separate clock and voltage frameworks

Not that we _have_ a voltage framework yet, e.g. "turn on that
that power rail, configured at 3.3V".

Plus rememember that a voltage framework is not synonymous with
a DVFS kit ... and that a DVFS kit may need to interact with a
lot more than just voltage and frequency controls.  Example,
reclocking CPU and memory frequency may need to run out of SRAM
and without accessing page tables stored in DRAM.


> > >> lead to code and functionality duplication

But since a clock is not a voltage, the primitives must differ.
There's no analogue of "allocate 80 mA" for a clock, as there
would be with a voltage framework's power budget primitives.


> > >> and do not address   
> > >> such things as relationship between clocks and voltages, clock/voltage/power 
> > >> domains, etc needed for aggressive power management.
>
> relationships are very arch specific so having or not the fw there
> wouldn't probably make any real difference.
> About power domains ... well, what's the deal?
> Aren't they controlled like voltages, in the end?

The last time I looked at a power domain, it was more like a
prerequisite to using various device clocks.  It wasn't something
that needed to be exposed in an API ... it would suffice to
have it be activated/deactivated as a side effect of accessing
those clocks.

That is, the existing hooks suffice.  The platform-specific code
to enable a given clock subtree would just have a few lines of
code teaching it how to enable the relevant power domain.  They
already rely on platform-specific logic to sort out which register
bank to touch, which bits there, and various other things.


Although that recent work for memory partitioning is relevant
here too:  being able to power down banks of DRAM (ensuring they
hold no data or code that's currently needed) can certainly cut
down one category of power drain.

In the same vein, some utilities to manage on-chip SRAM would
likely be helpful.  It's common that certain low power modes
disable all DRAM, leaving only (say) 16 KB of SRAM available
for running code ... or that DRAM be unavailable while CPU
and/or memory are being reclocked (maybe by a DVFS scheme,
using cpufreq or whatever).


> > > Most clocks don't have those issues.  Why penalize all clocks for
> > > issues which only relate to a few?  Better to only do that for the
> > > few cocks which have such additional constraints.
> > 
> > parameter fw would give exactly this behavior: relationship between clocks and 
> > voltages, clock/voltage/power domains implementation would be just additional 
> > arcs between nodes for clock and voltages. Nowadays clk fw has only one type of 
> > arcs - parent-child type. parameter fw would bring additional types. But clock 
> > nodes would be linked just with required arcs (of required type; both are arch 
> > specific things) so for an arch without any additional dependencies between 
> > clocks (and voltages) parameter fw would end up with exactly equivalent clock 
> > tree as clk fw for the arch has today.
> 
> I think what has to be clarified, is wether there is a real need for new
> features or not. 

Not just "new features" but "new cross-platform features".  This PM stuff
is really down'n'dirty code ... who will be working with such code?  The less
we demand of those people, the more likely it is that such features would
actually be delivered in working form.  A lot of platform vendors work
with older kernel releases (2.6.10 etc), which means that new APIs will
be obstacles to development and deployment.

Going back to clock and power domain examples above:  it's *easy* to see how
those could be delivered using the current clock framework.  There are a LOT of
folk who are familiar with those interfaces -- today.  So unless there really
is no way to address the issue with the current APIs, it's best to avoid
creating new interfaces for those issues.
 

> > > Plus, remember that the clock framework is an interface ... so by
> > > definition, it has no code associated with it.  Hence no duplication
> > > of code is possible... at least at this hand-wavey "concept" level.
> > > Possibly a given implementation has scope for code sharing; but I
> > > doubt it.  Code behind a given implementation of the clock interface
> > > is invariably quite slim.
> > 
> > and invariably looks like a hack and still duplicate clock tree building and 
> > traversing code. 

Most cases I've seen build the clock tree with a bunch of static struct
declarations, so the building is all at compile time ... and could hardly
duplicate something on another platform, since it's all platform-specific.
And as noted above, traversal code is trivial ... 


> > Dependencies which exist in modern hw between clocks, clocks  
> > and voltages require a more straight and standard technique to be set up for 
> > implementing clk/vltg/name_it framework.

I think Linus has pointed out that cut'n'paste is a very straightforward
and standard technique.  :)

The clock-to-clock dependencies are handled by the clock framework
already.  As are at least some cases of power domains.

DVFS -- cpufreq or otherwise -- is the primary example of a case where
coupling is needed between clocks and voltages.  These PM discussions
lately seem to dance around DVFS without saying so explicitly ... seemingly,
to avoid the "so why not just fix cpufreq" discussions that would (should!)
naturally follow.

My own two cents on the cpufreq thing:  (a) it should support the driver
model better; (b) a generic "run from SRAM" solution would be an aid to
that and other PM code; (c) without clock framework support for drivers
being able to reject changes or else re-clock, changing base clocks is
often impractical; (d) I don't recall seeing anyone really answer the
"so why not ust fix cpufreq" question.


> > Moving most of the code to be arch  
> > independent and setting a clear rules on how a clock/vltg tree configuration 
> > would look like while just looking at a hw manual would help.
>
> I think clocks and voltages providers have very different behaviours: for example
> a voltage src can sometimes be put in quiescent state, where the voltage value is
> preserved, but the max current is significantly reduced, therefore minimizing the
> leakage. I wouldn't welcome such functionality to be merged with clock handling.

Another good example of these APIs being domain-specific ... though
in a way it's a special case of the one I gave above.  In a similar
vein, lower current (but not quiescent) modes may be able to use
alternate schemes for voltage regulation.  The regulation may need
software inputs to switch schemes.

 
> > > If a clock being enabled implies a power or voltage domain being active,
> > > there's no reason that constraint shouldn't be enforced by whatever
> > > implementation a given platform uses.
> > 
> > true. but it's the same functionality required on different arches. and it can 
> > be done in arch independent way without penalties for arches which do not 
> > require the functionality. What's the rationale to move it down to arch specific 
> > code then?
> 
> I'd rather ask: what's the benefit of merging apples and oranges?

Someone may be tiring of a steady diet, and wants to become a chef?  :)

I'll repeat that power and clock domain examples above:  the current
clock framework is **ALREADY** arch-independant in exactly the way
Eugeny sketched, for several platforms.


> > > And having a generic -- basically untyped -- notion of "parameter"
> > > seems significantly less good than having a typed notion, with
> > > type-specific operations.  
> > 
> > there is no type-specific operations. clk_set_rate() and vltg_set() basically do 
> > the same thing - set a value for a pm resource provided by clock or voltage 
> > device (regulator). The difference is that output of clock regulator is measured 
> > in Hz and named 'frequesncy' and output of voltage regulator os measured in mV 
> > and named 'voltage' which actually does not matter from API POV. So, i see more 
> > sense in having param_set(), parame_set_state() (set_state primitive for idle 
> > states) rather than in
>
> See my comment above about different, peculiar, behaviour of voltages.
> And that's just an example.

One of several, I added more ...


> > clk_set_rate(), clk_set_state()
> > vltg_set(), vltg_set_state()

vtg_allocate_budget() would have no clock domain analogue,
nor would vtg_free_budget() ...


> > and, our analysis shows that you would end up with a separate type for domains, 
> > so it would be at least
> > domain_set_state() as well.

See my comments above re clock and voltage domains, applicable to
at least half a dozen platforms.

Until you share your analysis, so we can see what we agree and disagree
with, it's not helpful for you to claim that it supports your argument.


> > > Typed notions are easier to understand,
> > > read, and maintain.
> > 
> > common, after tricks with spinlocks in [merged!] RT patches? ;)

I don't follow this comment.  I'm still waiting to see the NO_HZ stuff
become as stable and fast as the previous NO_IDLE_HZ stuff ... but
none of that seems to relate to type-unsafe programming interfaces.

- Dave


> > 
> > Eugeny
> > 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16  2:21                   ` David Brownell
@ 2007-03-16  3:56                     ` Ikhwan Lee
  2007-03-16  6:17                       ` David Brownell
  2007-03-16 13:06                     ` Dmitry Krivoschekov
  1 sibling, 1 reply; 84+ messages in thread
From: Ikhwan Lee @ 2007-03-16  3:56 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

Hi,

Although I agree that the current clock framework can handle power or
voltage domains in many platforms, having something like (struct clk
powerdomain1, powerdomain2;) does not seem like a good implementation,
a struct for clocks representing a power domain.

If a new framework is more straighforward and introduces a negligible
overhead to the current kernel, I think it is worthwhile to have a
look at it. Plus this new framework might be able to take care of
those platforms that are not nicely supported by the current clock
framework.

Ikhwan

On 3/16/07, David Brownell <david-b@pacbell.net> wrote:
> On Thursday 15 March 2007 6:04 am, Igor Stoppa wrote:
> > On Thu, 2007-03-15 at 12:53 +0300, ext Eugeny S. Mints wrote:
> > > > On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
> > > >>> Would this involve replacing the clock framework, or are they going to coexist?
> > > >> parameter framework would eventually replace clock framework.
> > > >
> > > > That seems to be the wrong answer.  Especially since nothing has
> > > > been shown to be wrong with the clock interface; much less to be
> > > > unfixably wrong (hence justifying replacement).
> > >
> > > a cherry-picking on clk fw API:
> > >
> > > - clk_set_rate() sticks to an individual clock - no way to set rates for number
> > >   of clocks at once instead of having series of clk_set_rate() calls.
>
> This isn't "wrong", it's a "lack of feature".  The normal process for
> addressing such things is to improve working/deployed software interfaces,
> not throw them out and try to create and deploy something new.  :)
>
> Plus, that's not actually true.  When you change the rate for a parent
> clock, all its derived clocks necessarily change ... however many they
> may be.  And there's nothing to prevent updating non-derived clocks.
>
> Of course, such a change *does* touch on something I'd agree is a notable
> omission:  requests to change such a clock rate (call it a "base" rate
> for purposes of discussion here) are currently safe only if the related
> clocks have no drivers using them, since there's no way for those driver
> to either veto the change ("I can't support 115200 baud at that base rate"),
> or reclock to handle the changed environment ("update clock divisor to
> maintain current serial port baud rate").
>
> Which is why currently most clock rates aren't currently changeable; and
> those which can be changed are mostly leaves on the clock tree.
>
> On some platforms that significantly impacts what cpufreq (or whatever
> DVFS scheme is used) could achieve.
>
>
> > > The former
> > > for example is required for clocks which have required predefined ratio
> > > dependency though: two clocks always have to be n:2n and any other ratio leads
> > > to hardware misbehavior.
>
> The programming interface doesn't preclude enforcement of that rule
> though.  In fact, I'd call any *implementation* buggy if the hardware
> has such a rule about base rates, but the software doesn't enforce it.
> (That is:  it's not an interface issue.)
>
>
> > > So, you can't go from 100:200 to 300:600 via series of
> > > clk_set_rate() for such clocks, i.e. 100:200 -> 300:200 -> 300:600 or 100:200 ->
> > > 100:600 -> 300:600 (see SH7722 hw for reference)
>
> AT91 (e.g. at91sam9263) processors have MCKR with similar restrictions,
> but the only time this could ever matter seems to be for cpufreq.
>
> Does this matter to you for any other reason than trying to use the
> clock framework to implement cpufreq?  If not, there's no need for
> the clock framework to define how to do this ... since cpufreq is
> so highly platform-specific.
>
> If you want to say that this makes it hard to implement cpufreq in terms
> of the clock API, I'd agree ... but then, the *complete* lack of driver
> clock change notifications is a far bigger issue.  Drivers like MMC/SD,
> UART, SPI, and I2C commonly need to update internal clock settings when
> rates change, and may need to plug their lowlevel I/O queues while those
> base clock rates change.
>
>
> > What's wrong with expanding the clk_fw?
> > All is needed is:
> > clk_set_rate_buffered(clk1, 300);
> > clk_set_rate_buffered(clk2, 600);
> > clk_rate_flush(); /* can include validation of the set */
> >
> > Which is, incidentally, what OMAP2 does in hw for all the relevant clk
> > dividers and it also provides validation for the new set of values.
> >
> > Furthermore if the original assumption that complex transitions are
> > allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is
> > mandatory, otherwise the transition is impossible, no matter what fancy
> > sw fw is performing it.
>
> I think Scott's "power transaction" notion is a bit better, since
> it should among other things provide a way to cope with cases like
> two concurrent tasks trying issue conflicting changes.  An SMP
> system ought to be able to reclock two or more CPU cores at once;
> and maybe even do it with one "power transaction".
>
>
> > > - only a rate can be set up via clk_set_rate() while for a PLL I want to set up
> > > a desired idle state as well (btw there can be more than one idle state)
> >
> > Is there any PLL which provide automatic switching between running/idle state?
> > Do you have any HW architecture on your mind?
>
> And why should such a hardware-specific mechanism show up in an
> all-platforms API?  There's quite a lot of hardware-specific stuff
> going on, and that seems like just one more instance of it.
>
>
> > > - current API does not provide support for clock domains
> >
> > So?
>
> Exactly.  :)
>
> And again, that's not strictly true either; the clock tree itself
> defines one kind of domain, also known as "subtrees".
>
> A point I'll make below with respect to power domains:  unless
> those domains need to be exposed for some reason to drivers,
> there is no need for these to show up the programming interface
> except as implementation artifacts.  As in:  activate this clock,
> and platform-specific stuff happens.  It always does, of course,
> but in some cases it will be more extensive than in others.
>
> (Activating a PLL normally takes more than setting a bit to
> open a clock gate; activating some clocks might invove kicking
> some clock or power domain logic.)
>
>
> > > a cherry-picking on clk fw implementation:
>
> Which platform's implementation?  I've seen several...
>
>
> > > - clock tree structure and traversing clock tree code are duplicated in every
> > > architecture while can be done in arch independent way and just once (hint: what
> > > indeed is an arch specific thing it's a clock tree configuration)
> >
> > that looks more like an optimisation to the clk fw, not a call for a brand new one
>
> And I'm not even sure calling it an "optimization" is correct.
>
> Given how arch-specific the tree structure is, it's not clear
> that traversing it shouldn't be arch-specific too.  And in any
> case ... the traversal code I've seen is so trivial I'd have a
> hard time worring about sharing it.  Implementation flexibility
> seems to be widely useful in handling arch-specific differences.
>
>
> > > >> Separate clock and voltage frameworks
>
> Not that we _have_ a voltage framework yet, e.g. "turn on that
> that power rail, configured at 3.3V".
>
> Plus rememember that a voltage framework is not synonymous with
> a DVFS kit ... and that a DVFS kit may need to interact with a
> lot more than just voltage and frequency controls.  Example,
> reclocking CPU and memory frequency may need to run out of SRAM
> and without accessing page tables stored in DRAM.
>
>
> > > >> lead to code and functionality duplication
>
> But since a clock is not a voltage, the primitives must differ.
> There's no analogue of "allocate 80 mA" for a clock, as there
> would be with a voltage framework's power budget primitives.
>
>
> > > >> and do not address
> > > >> such things as relationship between clocks and voltages, clock/voltage/power
> > > >> domains, etc needed for aggressive power management.
> >
> > relationships are very arch specific so having or not the fw there
> > wouldn't probably make any real difference.
> > About power domains ... well, what's the deal?
> > Aren't they controlled like voltages, in the end?
>
> The last time I looked at a power domain, it was more like a
> prerequisite to using various device clocks.  It wasn't something
> that needed to be exposed in an API ... it would suffice to
> have it be activated/deactivated as a side effect of accessing
> those clocks.
>
> That is, the existing hooks suffice.  The platform-specific code
> to enable a given clock subtree would just have a few lines of
> code teaching it how to enable the relevant power domain.  They
> already rely on platform-specific logic to sort out which register
> bank to touch, which bits there, and various other things.
>
>
> Although that recent work for memory partitioning is relevant
> here too:  being able to power down banks of DRAM (ensuring they
> hold no data or code that's currently needed) can certainly cut
> down one category of power drain.
>
> In the same vein, some utilities to manage on-chip SRAM would
> likely be helpful.  It's common that certain low power modes
> disable all DRAM, leaving only (say) 16 KB of SRAM available
> for running code ... or that DRAM be unavailable while CPU
> and/or memory are being reclocked (maybe by a DVFS scheme,
> using cpufreq or whatever).
>
>
> > > > Most clocks don't have those issues.  Why penalize all clocks for
> > > > issues which only relate to a few?  Better to only do that for the
> > > > few cocks which have such additional constraints.
> > >
> > > parameter fw would give exactly this behavior: relationship between clocks and
> > > voltages, clock/voltage/power domains implementation would be just additional
> > > arcs between nodes for clock and voltages. Nowadays clk fw has only one type of
> > > arcs - parent-child type. parameter fw would bring additional types. But clock
> > > nodes would be linked just with required arcs (of required type; both are arch
> > > specific things) so for an arch without any additional dependencies between
> > > clocks (and voltages) parameter fw would end up with exactly equivalent clock
> > > tree as clk fw for the arch has today.
> >
> > I think what has to be clarified, is wether there is a real need for new
> > features or not.
>
> Not just "new features" but "new cross-platform features".  This PM stuff
> is really down'n'dirty code ... who will be working with such code?  The less
> we demand of those people, the more likely it is that such features would
> actually be delivered in working form.  A lot of platform vendors work
> with older kernel releases (2.6.10 etc), which means that new APIs will
> be obstacles to development and deployment.
>
> Going back to clock and power domain examples above:  it's *easy* to see how
> those could be delivered using the current clock framework.  There are a LOT of
> folk who are familiar with those interfaces -- today.  So unless there really
> is no way to address the issue with the current APIs, it's best to avoid
> creating new interfaces for those issues.
>
>
> > > > Plus, remember that the clock framework is an interface ... so by
> > > > definition, it has no code associated with it.  Hence no duplication
> > > > of code is possible... at least at this hand-wavey "concept" level.
> > > > Possibly a given implementation has scope for code sharing; but I
> > > > doubt it.  Code behind a given implementation of the clock interface
> > > > is invariably quite slim.
> > >
> > > and invariably looks like a hack and still duplicate clock tree building and
> > > traversing code.
>
> Most cases I've seen build the clock tree with a bunch of static struct
> declarations, so the building is all at compile time ... and could hardly
> duplicate something on another platform, since it's all platform-specific.
> And as noted above, traversal code is trivial ...
>
>
> > > Dependencies which exist in modern hw between clocks, clocks
> > > and voltages require a more straight and standard technique to be set up for
> > > implementing clk/vltg/name_it framework.
>
> I think Linus has pointed out that cut'n'paste is a very straightforward
> and standard technique.  :)
>
> The clock-to-clock dependencies are handled by the clock framework
> already.  As are at least some cases of power domains.
>
> DVFS -- cpufreq or otherwise -- is the primary example of a case where
> coupling is needed between clocks and voltages.  These PM discussions
> lately seem to dance around DVFS without saying so explicitly ... seemingly,
> to avoid the "so why not just fix cpufreq" discussions that would (should!)
> naturally follow.
>
> My own two cents on the cpufreq thing:  (a) it should support the driver
> model better; (b) a generic "run from SRAM" solution would be an aid to
> that and other PM code; (c) without clock framework support for drivers
> being able to reject changes or else re-clock, changing base clocks is
> often impractical; (d) I don't recall seeing anyone really answer the
> "so why not ust fix cpufreq" question.
>
>
> > > Moving most of the code to be arch
> > > independent and setting a clear rules on how a clock/vltg tree configuration
> > > would look like while just looking at a hw manual would help.
> >
> > I think clocks and voltages providers have very different behaviours: for example
> > a voltage src can sometimes be put in quiescent state, where the voltage value is
> > preserved, but the max current is significantly reduced, therefore minimizing the
> > leakage. I wouldn't welcome such functionality to be merged with clock handling.
>
> Another good example of these APIs being domain-specific ... though
> in a way it's a special case of the one I gave above.  In a similar
> vein, lower current (but not quiescent) modes may be able to use
> alternate schemes for voltage regulation.  The regulation may need
> software inputs to switch schemes.
>
>
> > > > If a clock being enabled implies a power or voltage domain being active,
> > > > there's no reason that constraint shouldn't be enforced by whatever
> > > > implementation a given platform uses.
> > >
> > > true. but it's the same functionality required on different arches. and it can
> > > be done in arch independent way without penalties for arches which do not
> > > require the functionality. What's the rationale to move it down to arch specific
> > > code then?
> >
> > I'd rather ask: what's the benefit of merging apples and oranges?
>
> Someone may be tiring of a steady diet, and wants to become a chef?  :)
>
> I'll repeat that power and clock domain examples above:  the current
> clock framework is **ALREADY** arch-independant in exactly the way
> Eugeny sketched, for several platforms.
>
>
> > > > And having a generic -- basically untyped -- notion of "parameter"
> > > > seems significantly less good than having a typed notion, with
> > > > type-specific operations.
> > >
> > > there is no type-specific operations. clk_set_rate() and vltg_set() basically do
> > > the same thing - set a value for a pm resource provided by clock or voltage
> > > device (regulator). The difference is that output of clock regulator is measured
> > > in Hz and named 'frequesncy' and output of voltage regulator os measured in mV
> > > and named 'voltage' which actually does not matter from API POV. So, i see more
> > > sense in having param_set(), parame_set_state() (set_state primitive for idle
> > > states) rather than in
> >
> > See my comment above about different, peculiar, behaviour of voltages.
> > And that's just an example.
>
> One of several, I added more ...
>
>
> > > clk_set_rate(), clk_set_state()
> > > vltg_set(), vltg_set_state()
>
> vtg_allocate_budget() would have no clock domain analogue,
> nor would vtg_free_budget() ...
>
>
> > > and, our analysis shows that you would end up with a separate type for domains,
> > > so it would be at least
> > > domain_set_state() as well.
>
> See my comments above re clock and voltage domains, applicable to
> at least half a dozen platforms.
>
> Until you share your analysis, so we can see what we agree and disagree
> with, it's not helpful for you to claim that it supports your argument.
>
>
> > > > Typed notions are easier to understand,
> > > > read, and maintain.
> > >
> > > common, after tricks with spinlocks in [merged!] RT patches? ;)
>
> I don't follow this comment.  I'm still waiting to see the NO_HZ stuff
> become as stable and fast as the previous NO_IDLE_HZ stuff ... but
> none of that seems to relate to type-unsafe programming interfaces.
>
> - Dave
>
>
> > >
> > > Eugeny
> > >
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16  3:56                     ` Ikhwan Lee
@ 2007-03-16  6:17                       ` David Brownell
  2007-03-19  2:27                         ` Ikhwan Lee
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-16  6:17 UTC (permalink / raw)
  To: Ikhwan Lee; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Thursday 15 March 2007 8:56 pm, Ikhwan Lee wrote:
> Hi,
> 
> Although I agree that the current clock framework can handle power or
> voltage domains in many platforms, having something like (struct clk
> powerdomain1, powerdomain2;) does not seem like a good implementation,
> a struct for clocks representing a power domain.

Good thing that's not what I suggested then, right?  :)

The point was that in the examples I've seen, the power domains
are associated with clock domains, so that each clock is tied
to one power domain.  And since you can't use the power domain
without having a clock ... the implementation can tell if it's
got to activate a power domain by looking at the clock.

There may be other models of power domain, but that's the one
I've had reason to look at (which isn't synonymous with a straight
voltage/current supply).


> If a new framework is more straighforward and introduces a negligible
> overhead to the current kernel, I think it is worthwhile to have a
> look at it. Plus this new framework might be able to take care of
> those platforms that are not nicely supported by the current clock
> framework.

Perhaps when we see one, we could discuss that as somethong other
than pure handwaving.  But that still won't address the basic point
that it's wrong to assume the clock framework should be written out
of the picture.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16  2:21                   ` David Brownell
  2007-03-16  3:56                     ` Ikhwan Lee
@ 2007-03-16 13:06                     ` Dmitry Krivoschekov
  2007-03-16 18:03                       ` David Brownell
  1 sibling, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-16 13:06 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
> On Thursday 15 March 2007 6:04 am, Igor Stoppa wrote:
>> On Thu, 2007-03-15 at 12:53 +0300, ext Eugeny S. Mints wrote:
>>>> On Wednesday 14 March 2007 3:43 am, Eugeny S. Mints wrote:
>>>>>> Would this involve replacing the clock framework, or are they going to coexist?
>>>>> parameter framework would eventually replace clock framework.
>>>> That seems to be the wrong answer.  Especially since nothing has
>>>> been shown to be wrong with the clock interface; much less to be
>>>> unfixably wrong (hence justifying replacement).
>>> a cherry-picking on clk fw API:
>>>
>>> - clk_set_rate() sticks to an individual clock - no way to set rates for number 
>>>   of clocks at once instead of having series of clk_set_rate() calls.
>
> This isn't "wrong", it's a "lack of feature".  The normal process for
> addressing such things is to improve working/deployed software interfaces,
> not throw them out and try to create and deploy something new.  :)
David,

Eugeny didn't say he suggest deploying something new just to address
only this issue. Also, the key word in his sentence is "eventually" that
assumes
"evolutionary", if you read all Eugeny's responses you've seen he suggest
starting with existing clock framework.

The actual reason to deploy something new is to organize a mess around
Power Management in Linux kernel... and not only in kernel. I bet you
are also
thinking on how to make order in this area, and I surprised you didn't
comment
on original Matthew's post since it contains so many points to discuss.

For example,

> - PM resource representation
>   Similar to the device abstraction available today. 

IIUC, the framework will use some platform-independent entities to
represent
clock and voltage subsystems of a platform, entities similar for 'struct
device'
which is used to represent normal devices in the system. I.e. one entity
(node) corresponds
to one real clock(voltage) device. What's the worth of
platform-independent representing?
Perhaps it will help to organize clock and voltage subsystems of various
platforms.
The elementary entity (a node),
may store information of clock/voltage device state (active, idle,
off...), a value of
provided frequency/voltage(MHz/mV), a latency needed for transitioning
from one state
for another(ms), also the node may store lists of dependent devices,
info of dependencies
the node is involved to, also the node may contain platform-specific info.
Considering clock framework, it means 'struct clk' becomes
arch-independent but having
a pointer to platform-specific data.
Also, the node may have a dedicated driver which performs
hardware-specific operations
(programms PLL or just sets dividers), some of
this operations may be the same among some platforms or even among
different archs.


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16 13:06                     ` Dmitry Krivoschekov
@ 2007-03-16 18:03                       ` David Brownell
  2007-03-18 20:25                         ` Dmitry Krivoschekov
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-16 18:03 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Friday 16 March 2007 6:06 am, Dmitry Krivoschekov wrote:

> 	 I surprised you didn't comment
> on original Matthew's post since it contains so many points to discuss.

As a direction, it sounded better than many notions I've seen here.

But there was a certain lack of ... detail ... making discussion
impractical, except as speculation.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16 18:03                       ` David Brownell
@ 2007-03-18 20:25                         ` Dmitry Krivoschekov
  2007-03-19  4:04                           ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-18 20:25 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
> On Friday 16 March 2007 6:06 am, Dmitry Krivoschekov wrote:
>
>> 	 I surprised you didn't comment
>> on original Matthew's post since it contains so many points to discuss.
>
> As a direction, it sounded better than many notions I've seen here.
>
> But there was a certain lack of ... detail ... making discussion
> impractical, except as speculation.
I believe  details  was intentionally omitted to receive some feedback
regarding
concept basics only, this may help to adjust (or change) implementation
details
from the beginning.

But ok, as you aren't arguing against those basics and intentions then
seems you
agree to them.

For me, there is a point that seems debatable already at the starting stage:

> The goal of this parameter framework is to expose the resources in  a  
> way that allows other s/w (governors, policy mangers, etc) to control  
> the resources while keeping the system operational.  One of the main  
> requirements in our thinking is that we want this layer to represent  
> the h/w and not include policy or decision making.  Meaning the 
> software using the parameter framework would be responsible for  
> deciding the appropriate value for the parameters.   


Sometimes it's quite reasonable to make decisions (or policy)
at the low level, without exposing events to higher layers,
e.g. turning a clock off when reference counter gets zero, this is
what OMAP's clock framework currently does.


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-16  6:17                       ` David Brownell
@ 2007-03-19  2:27                         ` Ikhwan Lee
  2007-03-19  6:07                           ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Ikhwan Lee @ 2007-03-19  2:27 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

Hi,

On 3/16/07, David Brownell <david-b@pacbell.net> wrote:
> On Thursday 15 March 2007 8:56 pm, Ikhwan Lee wrote:
> > Hi,
> >
> > Although I agree that the current clock framework can handle power or
> > voltage domains in many platforms, having something like (struct clk
> > powerdomain1, powerdomain2;) does not seem like a good implementation,
> > a struct for clocks representing a power domain.
>
> Good thing that's not what I suggested then, right?  :)
>
> The point was that in the examples I've seen, the power domains
> are associated with clock domains, so that each clock is tied
> to one power domain.  And since you can't use the power domain
> without having a clock ... the implementation can tell if it's
> got to activate a power domain by looking at the clock.

True, although sometimes it gets dirty because multiple clock sources
are associated with one power domain at the same time multiple power
domains are associated with one clock source. Simple parent and child
relationship provided by the clock framework is not always enough.

> There may be other models of power domain, but that's the one
> I've had reason to look at (which isn't synonymous with a straight
> voltage/current supply).
>
>
> > If a new framework is more straighforward and introduces a negligible
> > overhead to the current kernel, I think it is worthwhile to have a
> > look at it. Plus this new framework might be able to take care of
> > those platforms that are not nicely supported by the current clock
> > framework.
>
> Perhaps when we see one, we could discuss that as somethong other
> than pure handwaving.  But that still won't address the basic point
> that it's wrong to assume the clock framework should be written out
> of the picture.

I think we can reach an agreement. The clock framework does not need
to be replaced with a new one since it is serving its purpose well
enough. If extra functionalities are needed for clocks, we can extend
the existing clock framework. Such extensions will include functions
like clk_set_rate_pending() and power_transaction_commit(). However,
since clocks and voltages (or power domains) have different
characteristics, it is desirable to have a separate framework for
power domains and associate that framework with the existing clock
framework.

I am not sure if this is the direction that the original PowerOp
people suggested. If we can agree on this, however, I think we can
proceed to look at the code.

Ikhwan.

>
> - Dave
>
>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-18 20:25                         ` Dmitry Krivoschekov
@ 2007-03-19  4:04                           ` David Brownell
  2007-03-20  0:03                             ` Dmitry Krivoschekov
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-19  4:04 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> 
> For me, there is a point that seems debatable already at the starting stage:
> 
> > The goal of this parameter framework is to expose the resources in  a  
> > way that allows other s/w (governors, policy mangers, etc) to control  
> > the resources while keeping the system operational.  One of the main  
> > requirements in our thinking is that we want this layer to represent  
> > the h/w and not include policy or decision making.  Meaning the 
> > software using the parameter framework would be responsible for  
> > deciding the appropriate value for the parameters.   
> 
> 
> Sometimes it's quite reasonable to make decisions (or policy)
> at the low level, without exposing events to higher layers,

Of course.  Any layer can incorporate a degree of policy.

It's only when that's badly done -- or the problem is so complex
that multiple policies need to be supported -- that you need to
pull out that old "mechanism not policy" chestnut, and support
some kind of policy switching mechanism (governors, userspace
agents, etc) for different application domains.


> e.g. turning a clock off when reference counter gets zero, this is
> what OMAP's clock framework currently does.

There are no choices to be made in that layer; it's no more "policy"
than following the laws of arithmetic is "policy".  Software clock
gating is what the clock framework is defined as doing; there's
nothing OMAP-specific about that.

The interesting bit for OMAP is that clock gating will often be done
in hardware, not just in software.  There are other low-power SOC
designs that do such things, but the ones I'm most aware of are for
microcontrollers (MSP430, picoAVR, etc) that can't run Linux.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-19  2:27                         ` Ikhwan Lee
@ 2007-03-19  6:07                           ` David Brownell
  0 siblings, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-19  6:07 UTC (permalink / raw)
  To: Ikhwan Lee; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Sunday 18 March 2007 7:27 pm, Ikhwan Lee wrote:
> Hi,
> 
> On 3/16/07, David Brownell <david-b@pacbell.net> wrote:
> > On Thursday 15 March 2007 8:56 pm, Ikhwan Lee wrote:
> > > Hi,
> > >
> > > Although I agree that the current clock framework can handle power or
> > > voltage domains in many platforms, having something like (struct clk
> > > powerdomain1, powerdomain2;) does not seem like a good implementation,
> > > a struct for clocks representing a power domain.
> >
> > Good thing that's not what I suggested then, right?  :)
> >
> > The point was that in the examples I've seen, the power domains
> > are associated with clock domains, so that each clock is tied
> > to one power domain.  And since you can't use the power domain
> > without having a clock ... the implementation can tell if it's
> > got to activate a power domain by looking at the clock.
> 
> True, although sometimes it gets dirty because multiple clock sources
> are associated with one power domain

As clearly allowed for in what I wrote.  clock->power_domain.

> at the same time multiple power 
> domains are associated with one clock source.

As also allowed for in what I wrote originally.  clock->power_domains[].

> Simple parent and child 
> relationship provided by the clock framework is not always enough.

Not implied in what I wrote.


> > There may be other models of power domain, but that's the one
> > I've had reason to look at (which isn't synonymous with a straight
> > voltage/current supply).
> >
> >
> > > If a new framework is more straighforward and introduces a negligible
> > > overhead to the current kernel, I think it is worthwhile to have a
> > > look at it. Plus this new framework might be able to take care of
> > > those platforms that are not nicely supported by the current clock
> > > framework.
> >
> > Perhaps when we see one, we could discuss that as somethong other
> > than pure handwaving.  But that still won't address the basic point
> > that it's wrong to assume the clock framework should be written out
> > of the picture.
> 
> I think we can reach an agreement. The clock framework does not need
> to be replaced with a new one since it is serving its purpose well
> enough. If extra functionalities are needed for clocks, we can extend
> the existing clock framework. Such extensions will include functions
> like clk_set_rate_pending() and power_transaction_commit(). However,
> since clocks and voltages (or power domains) have different
> characteristics, it is desirable to have a separate framework for
> power domains and associate that framework with the existing clock
> framework.

If the platform needs power domains to be exposed, yes.  But I gave
examples where it does NOT need to be exposed, since each clock was
in a single power domain.


> I am not sure if this is the direction that the original PowerOp
> people suggested. If we can agree on this, however, I think we can
> proceed to look at the code.

I'm not sure why such agreement should be necessary before showing
interface definitions.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-19  4:04                           ` David Brownell
@ 2007-03-20  0:03                             ` Dmitry Krivoschekov
  2007-03-20  8:07                               ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20  0:03 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>
>> Sometimes it's quite reasonable to make decisions (or policy)
>> at the low level, without exposing events to higher layers,
>
> Of course.  Any layer can incorporate a degree of policy.
But users should be able to choose to use or do not use the incorporated
policy, shouldn't they?
>
> It's only when that's badly done -- or the problem is so complex
> that multiple policies need to be supported -- that you need to
> pull out that old "mechanism not policy" chestnut, and support
> some kind of policy switching mechanism (governors, userspace
> agents, etc) for different application domains.
>
>
>> e.g. turning a clock off when reference counter gets zero, this is
>> what OMAP's clock framework currently does.
>
> There are no choices to be made in that layer; it's no more "policy"
> than following the laws of arithmetic is "policy".  Software clock
there is some principle: "turn the clock off when use counter reaches
zero",
so it is a policy, and a choice is to disable or not to disable an
output clock,
it is the simplest case but it's certainly a policy.  And there is no API
to enable/disable the policy that could be useful in some cases.
> gating is what the clock framework is defined as doing; there's
> nothing OMAP-specific about that.
Yes, OMAP's code just a pioneer with that.
>
> The interesting bit for OMAP is that clock gating will often be done
> in hardware, not just in software.  
Yes, but you are free to disable the feature.
> There are other low-power SOC
> designs that do such things, 
Recent PXAs, i.MX31 are capable for  that. There are also more exotic
examples of hardware PM awareness,  e.g.  i.MX31 chip supports
DVFS that can totally be performed  in hardware i.e.
processor adjusts its clock and voltage basing on current processor
workload.


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20  0:03                             ` Dmitry Krivoschekov
@ 2007-03-20  8:07                               ` David Brownell
  2007-03-20  9:45                                 ` Dmitry Krivoschekov
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-20  8:07 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
> David Brownell wrote:
> > On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> >>
> >> Sometimes it's quite reasonable to make decisions (or policy)
> >> at the low level, without exposing events to higher layers,
> >
> > Of course.  Any layer can incorporate a degree of policy.
>
> But users should be able to choose to use or do not use the incorporated
> policy, shouldn't they?

Sometimes.  What's a user?  Do you really expect every single
algorithm choice to be packaged as a pluggable policy?  Any
time I've seen systems designed that way, those pluggabilty
hooks have been a major drag on performance and maintainability.

Most components don't actually _need_ a choice of policies.


> > It's only when that's badly done -- or the problem is so complex
> > that multiple policies need to be supported -- that you need to
> > pull out that old "mechanism not policy" chestnut, and support
> > some kind of policy switching mechanism (governors, userspace
> > agents, etc) for different application domains.
> >
> >
> >> e.g. turning a clock off when reference counter gets zero, this is
> >> what OMAP's clock framework currently does.
> >
> > There are no choices to be made in that layer; it's no more "policy"
> > than following the laws of arithmetic is "policy".  Software clock
>
> there is some principle: "turn the clock off when use counter reaches
> zero", so it is a policy, and a choice is to disable or not to disable
> an output clock, it is the simplest case but it's certainly a policy. 

That's not a choice; it's how the API is defined.  It's not "policy".

Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
allowing it to == 5 instead?  Or should we just accept that as how
things are defined, and move on?

At some point, decisions are just ground rules, and not policy.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20  8:07                               ` David Brownell
@ 2007-03-20  9:45                                 ` Dmitry Krivoschekov
  2007-03-20 10:30                                   ` Igor Stoppa
  2007-03-20 19:58                                   ` David Brownell
  0 siblings, 2 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20  9:45 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
>> David Brownell wrote:
>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>>> Sometimes it's quite reasonable to make decisions (or policy)
>>>> at the low level, without exposing events to higher layers,
>>> Of course.  Any layer can incorporate a degree of policy.
>> But users should be able to choose to use or do not use the incorporated
>> policy, shouldn't they?
>
> Sometimes.  What's a user? 
It's a user of a kernel subsystem that (subsystem ) keeps a policy,
i.e. a user it's another kernel subsystem or userspace application,
it depends on implementation of a system.
>  Do you really expect every single
> algorithm choice to be packaged as a pluggable policy?  
I didn't say pluggable policy, I just said there are must be
an alternative - "use" or "do not use" a predefined policy.
For example, in USB you are able to enable/disable autosuspend rule,
don't know if it's possible to disable it at runtime though.
> Any
> time I've seen systems designed that way, those pluggabilty
> hooks have been a major drag on performance and maintainability.
>
> Most components don't actually _need_ a choice of policies.
>
>
Yes, but they at least need a mechanism to disable an associated policy,
upper layers should be able to decide where the policy well be kept,
they may delegate the keeping to lower layers but also may want to
keep the policy themselves for some reason.

Also, in some cases it is reasonable to adjust rules of a policy
(without changing the policy). For example if you define a policy
"keep an output frequency always for 33 MHz (an input frequency may vary)",
you may want to change the base frequency to 66 MHz sometimes.


>>> It's only when that's badly done -- or the problem is so complex
>>> that multiple policies need to be supported -- that you need to
>>> pull out that old "mechanism not policy" chestnut, and support
>>> some kind of policy switching mechanism (governors, userspace
>>> agents, etc) for different application domains.
>>>
>>>
>>>> e.g. turning a clock off when reference counter gets zero, this is
>>>> what OMAP's clock framework currently does.
>>> There are no choices to be made in that layer; it's no more "policy"
>>> than following the laws of arithmetic is "policy".  Software clock
>> there is some principle: "turn the clock off when use counter reaches
>> zero", so it is a policy, and a choice is to disable or not to disable
>> an output clock, it is the simplest case but it's certainly a policy. 
>
> That's not a choice; it's how the API is defined.  It's not "policy".
>
> Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
> allowing it to == 5 instead?  Or should we just accept that as how
> things are defined, and move on?
We should accept this if we agree that benefit of using the rule always
exist,
but if the rule constrain some functionality we may want to disable the
rule.
Considering the case with clocks,  we may want  to leave the clock running
even if there is no users of the clock, but there is a timing constraint
for readiness  of  a clock device  (PLLs can't be started immediately).


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20  9:45                                 ` Dmitry Krivoschekov
@ 2007-03-20 10:30                                   ` Igor Stoppa
  2007-03-20 12:13                                     ` Eugeny S. Mints
  2007-03-20 13:07                                     ` Dmitry Krivoschekov
  2007-03-20 19:58                                   ` David Brownell
  1 sibling, 2 replies; 84+ messages in thread
From: Igor Stoppa @ 2007-03-20 10:30 UTC (permalink / raw)
  To: ext Dmitry Krivoschekov; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
> David Brownell wrote:
> > On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
> >> David Brownell wrote:
> >>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> >>>> Sometimes it's quite reasonable to make decisions (or policy)
> >>>> at the low level, without exposing events to higher layers,
> >>> Of course.  Any layer can incorporate a degree of policy.
> >> But users should be able to choose to use or do not use the incorporated
> >> policy, shouldn't they?
> >
> > Sometimes.  What's a user? 
> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
> i.e. a user it's another kernel subsystem or userspace application,
> it depends on implementation of a system.
> >  Do you really expect every single
> > algorithm choice to be packaged as a pluggable policy?  
> I didn't say pluggable policy, I just said there are must be
> an alternative - "use" or "do not use" a predefined policy.
> For example, in USB you are able to enable/disable autosuspend rule,
> don't know if it's possible to disable it at runtime though.
> > Any
> > time I've seen systems designed that way, those pluggabilty
> > hooks have been a major drag on performance and maintainability.
> >
> > Most components don't actually _need_ a choice of policies.
> >
> >
> Yes, but they at least need a mechanism to disable an associated policy,
> upper layers should be able to decide where the policy well be kept,
> they may delegate the keeping to lower layers but also may want to
> keep the policy themselves for some reason.
That sounds more like it's driven by IP than by technical reasons.
Anyway in the USB example mentioned the rule is very high level, while
here the proposal is to meddle with the internals of a driver.
It seems more logical to implement policies/rules at driver level,
rather than going straight for the resources of the driver.

Why can't the driver itself be able to translate whatever high-level
command/hint it receives into the platform/arch/board specific actions?


> Also, in some cases it is reasonable to adjust rules of a policy
> (without changing the policy). For example if you define a policy
> "keep an output frequency always for 33 MHz (an input frequency may vary)",
> you may want to change the base frequency to 66 MHz sometimes.
> 
> 
> >>> It's only when that's badly done -- or the problem is so complex
> >>> that multiple policies need to be supported -- that you need to
> >>> pull out that old "mechanism not policy" chestnut, and support
> >>> some kind of policy switching mechanism (governors, userspace
> >>> agents, etc) for different application domains.
> >>>
> >>>
> >>>> e.g. turning a clock off when reference counter gets zero, this is
> >>>> what OMAP's clock framework currently does.
> >>> There are no choices to be made in that layer; it's no more "policy"
> >>> than following the laws of arithmetic is "policy".  Software clock
> >> there is some principle: "turn the clock off when use counter reaches
> >> zero", so it is a policy, and a choice is to disable or not to disable
> >> an output clock, it is the simplest case but it's certainly a policy. 
> >
> > That's not a choice; it's how the API is defined.  It's not "policy".
> >
> > Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
> > allowing it to == 5 instead?  Or should we just accept that as how
> > things are defined, and move on?
> We should accept this if we agree that benefit of using the rule always
> exist,
> but if the rule constrain some functionality we may want to disable the
> rule.
> Considering the case with clocks,  we may want  to leave the clock running
> even if there is no users of the clock, but there is a timing constraint
> for readiness  of  a clock device  (PLLs can't be started immediately).

I find that a bogus example.
It seems like you are generalising clock handling based on PLLs.
The PLL is actually the exception, having a penalty in commuting between
states, while all the children can be toggled on/off without any delay.
And that's easy to deal with: if a driver is going to do something that
could be affected by the PLL automatically going off, the driver can
avoid releasing its clock. That will effectively keep the PLL on.

The PLL per se is not really significant, apart from the fact that it
tatkes power and it's desirable to keep it off for as long as possible,
but the important bit is that the drivers must have the clock ready and
available when needed.

A similar approach can be used for frequencies: if a driver periodically
needs a certain high frequency, it might be impacted by the system
automatically scaling voltage/frequence.

Possible solutions?
-keep the request for high frequency
  if a pll relock is involved in the scaling

-keep the request for high voltage
  if there is no significant delay from a possible pll relock or no
relock at all, but significant ramp-up time for the voltage regulator

These actions could be performed by the driver either autonomously or
based on hints/commands receivied from upper layers, in the form of
driver specific commands.
-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 10:30                                   ` Igor Stoppa
@ 2007-03-20 12:13                                     ` Eugeny S. Mints
  2007-03-20 12:39                                       ` Igor Stoppa
  2007-03-20 13:07                                     ` Dmitry Krivoschekov
  1 sibling, 1 reply; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-20 12:13 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: Dominik Brodowski, linux-pm, Pavel Machek

Igor Stoppa wrote:
> On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
>> David Brownell wrote:
>>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
>>>> David Brownell wrote:
>>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>>>>> Sometimes it's quite reasonable to make decisions (or policy)
>>>>>> at the low level, without exposing events to higher layers,
>>>>> Of course.  Any layer can incorporate a degree of policy.
>>>> But users should be able to choose to use or do not use the incorporated
>>>> policy, shouldn't they?
>>> Sometimes.  What's a user? 
>> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
>> i.e. a user it's another kernel subsystem or userspace application,
>> it depends on implementation of a system.
>>>  Do you really expect every single
>>> algorithm choice to be packaged as a pluggable policy?  
>> I didn't say pluggable policy, I just said there are must be
>> an alternative - "use" or "do not use" a predefined policy.
>> For example, in USB you are able to enable/disable autosuspend rule,
>> don't know if it's possible to disable it at runtime though.
>>> Any
>>> time I've seen systems designed that way, those pluggabilty
>>> hooks have been a major drag on performance and maintainability.
>>>
>>> Most components don't actually _need_ a choice of policies.
>>>
>>>
>> Yes, but they at least need a mechanism to disable an associated policy,
>> upper layers should be able to decide where the policy well be kept,
>> they may delegate the keeping to lower layers but also may want to
>> keep the policy themselves for some reason.
> That sounds more like it's driven by IP than by technical reasons.
> Anyway in the USB example mentioned the rule is very high level, while
> here the proposal is to meddle with the internals of a driver.
> It seems more logical to implement policies/rules at driver level,
> rather than going straight for the resources of the driver.
> 
> Why can't the driver itself be able to translate whatever high-level
> command/hint it receives into the platform/arch/board specific actions?

I guess you are confusing two things here.

There are pm resources which are solely used by one linux device driver only. 
Such pm resources are not supposed to be under control of power parameter 
framework we are talking about. Indeed, a linux device driver controls such pm 
resources
a) by itself
and
b) based on knowledge internal to the driver or translating whatever high-level 
command/hint it receives.

However there is another type of pm resources - pm resources shared by several 
drivers. Such drivers of course do not know about each other and can't know 
which state of the pm resource in question is the best at the moment from the 
_whole_ _system_ point of view. In this case implementation of control of such a 
pm resource via hinting in some way both drivers seems to me neither 
reasonable/practical nor feasible.

IMO autoidle (a state(s) hw puts a resource to without any sw interaction) as a 
particular control case for such pm resources is out of scope of a particular 
driver as well.

So, in my thinking a devoted entity should exist to deal with such questions in 
regard to such shared pm resources. This entity is policy manager. Thus I see a 
need in an API for policy managers to be presented.
> 
> 
>> Also, in some cases it is reasonable to adjust rules of a policy
>> (without changing the policy). For example if you define a policy
>> "keep an output frequency always for 33 MHz (an input frequency may vary)",
>> you may want to change the base frequency to 66 MHz sometimes.
>>
>>
>>>>> It's only when that's badly done -- or the problem is so complex
>>>>> that multiple policies need to be supported -- that you need to
>>>>> pull out that old "mechanism not policy" chestnut, and support
>>>>> some kind of policy switching mechanism (governors, userspace
>>>>> agents, etc) for different application domains.
>>>>>
>>>>>
>>>>>> e.g. turning a clock off when reference counter gets zero, this is
>>>>>> what OMAP's clock framework currently does.
>>>>> There are no choices to be made in that layer; it's no more "policy"
>>>>> than following the laws of arithmetic is "policy".  Software clock
>>>> there is some principle: "turn the clock off when use counter reaches
>>>> zero", so it is a policy, and a choice is to disable or not to disable
>>>> an output clock, it is the simplest case but it's certainly a policy. 
>>> That's not a choice; it's how the API is defined.  It's not "policy".
>>>
>>> Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
>>> allowing it to == 5 instead?  Or should we just accept that as how
>>> things are defined, and move on?
>> We should accept this if we agree that benefit of using the rule always
>> exist,
>> but if the rule constrain some functionality we may want to disable the
>> rule.
>> Considering the case with clocks,  we may want  to leave the clock running
>> even if there is no users of the clock, but there is a timing constraint
>> for readiness  of  a clock device  (PLLs can't be started immediately).
> 
> I find that a bogus example.
> It seems like you are generalising clock handling based on PLLs.

No. We are designing generic framework for various types of pm resources 
provided by currently available hw. Already today we know about such cases as 
PLLs. Please remember that we are approaching voltages as well and I hope you 
agree that voltage regulators also fit in this example as switching a voltage 
incurs latencies.

> The PLL is actually the exception, having a penalty in commuting between
> states, while all the children can be toggled on/off without any delay.
> And that's easy to deal with: if a driver is going to do something that
> could be affected by the PLL automatically going off, the driver can
> avoid releasing its clock. That will effectively keep the PLL on.

It's just about preventing effective power management. A particular driver does 
not have all necessary knowledge/information on what is the most efficient at 
the moment in regard to _system_ power management. Not releasing a clock/voltage 
which is not in use by the driver at the moment (and what more important in 
generic case the driver has no clue on when it will start to use it again) just 
prevents an upper layer (policy manager) from capability to perform effective 
power management.

> 
> The PLL per se is not really significant, apart from the fact that it
> tatkes power and it's desirable to keep it off for as long as possible,
> but the important bit is that the drivers must have the clock ready and
> available when needed.

Indeed we may want to think about that a driver may want to export requirements 
on what expected latencies for pm_resource_get() are. But again, a particular 
driver can;t perform such a decision internally for shared pm resources.

Eugeny

> 
> A similar approach can be used for frequencies: if a driver periodically
> needs a certain high frequency, it might be impacted by the system
> automatically scaling voltage/frequence.
> 
> Possible solutions?
> -keep the request for high frequency
>   if a pll relock is involved in the scaling
> 
> -keep the request for high voltage
>   if there is no significant delay from a possible pll relock or no
> relock at all, but significant ramp-up time for the voltage regulator
> 
> These actions could be performed by the driver either autonomously or
> based on hints/commands receivied from upper layers, in the form of
> driver specific commands.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 12:13                                     ` Eugeny S. Mints
@ 2007-03-20 12:39                                       ` Igor Stoppa
  2007-03-20 13:44                                         ` Dmitry Krivoschekov
  2007-03-20 21:03                                         ` David Brownell
  0 siblings, 2 replies; 84+ messages in thread
From: Igor Stoppa @ 2007-03-20 12:39 UTC (permalink / raw)
  To: ext Eugeny S. Mints; +Cc: Dominik Brodowski, linux-pm, Pavel Machek

On Tue, 2007-03-20 at 15:13 +0300, ext Eugeny S. Mints wrote:
> Igor Stoppa wrote:
> > On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
> >> David Brownell wrote:
> >>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
> >>>> David Brownell wrote:
> >>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> >>>>>> Sometimes it's quite reasonable to make decisions (or policy)
> >>>>>> at the low level, without exposing events to higher layers,
> >>>>> Of course.  Any layer can incorporate a degree of policy.
> >>>> But users should be able to choose to use or do not use the incorporated
> >>>> policy, shouldn't they?
> >>> Sometimes.  What's a user? 
> >> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
> >> i.e. a user it's another kernel subsystem or userspace application,
> >> it depends on implementation of a system.
> >>>  Do you really expect every single
> >>> algorithm choice to be packaged as a pluggable policy?  
> >> I didn't say pluggable policy, I just said there are must be
> >> an alternative - "use" or "do not use" a predefined policy.
> >> For example, in USB you are able to enable/disable autosuspend rule,
> >> don't know if it's possible to disable it at runtime though.
> >>> Any
> >>> time I've seen systems designed that way, those pluggabilty
> >>> hooks have been a major drag on performance and maintainability.
> >>>
> >>> Most components don't actually _need_ a choice of policies.
> >>>
> >>>
> >> Yes, but they at least need a mechanism to disable an associated policy,
> >> upper layers should be able to decide where the policy well be kept,
> >> they may delegate the keeping to lower layers but also may want to
> >> keep the policy themselves for some reason.
> > That sounds more like it's driven by IP than by technical reasons.
> > Anyway in the USB example mentioned the rule is very high level, while
> > here the proposal is to meddle with the internals of a driver.
> > It seems more logical to implement policies/rules at driver level,
> > rather than going straight for the resources of the driver.
> > 
> > Why can't the driver itself be able to translate whatever high-level
> > command/hint it receives into the platform/arch/board specific actions?
> 
> I guess you are confusing two things here.
Not really.

> There are pm resources which are solely used by one linux device driver only. 
> Such pm resources are not supposed to be under control of power parameter 
> framework we are talking about. Indeed, a linux device driver controls such pm 
> resources
> a) by itself
> and
> b) based on knowledge internal to the driver or translating whatever high-level 
> command/hint it receives.
> 
> However there is another type of pm resources - pm resources shared by several 
> drivers. Such drivers of course do not know about each other and can't know 
> which state of the pm resource in question is the best at the moment from the 
> _whole_ _system_ point of view.
That's why we have, for example, such thing as the clk fw to manage
shared clocks.

>  In this case implementation of control of such a 
> pm resource via hinting in some way both drivers seems to me neither 
> reasonable/practical nor feasible.
Really? It's just abstracting what you are proposing to do by directly touching
the resource underneath the driver.

Takiing for example the MMC driver, it would be as simple as having a
High/low responsiveness/troughput toggle which would prevent deep power
saving states when there is a high volume of data to be transferred.
> IMO autoidle (a state(s) hw puts a resource to without any sw interaction) as a 
> particular control case for such pm resources is out of scope of a particular 
> driver as well.

> So, in my thinking a devoted entity should exist to deal with such questions in 
> regard to such shared pm resources. This entity is policy manager. Thus I see a 
> need in an API for policy managers to be presented.

Your policy manager becomes too invasive and even _mandatory_ because it
becomes the only way to perform resource management.

With a decentralised approach, instead, only people who _want_ to use
the policy manager can include it, while otherwise they rely on the
automatic handling.

> > 
> > 
> >> Also, in some cases it is reasonable to adjust rules of a policy
> >> (without changing the policy). For example if you define a policy
> >> "keep an output frequency always for 33 MHz (an input frequency may vary)",
> >> you may want to change the base frequency to 66 MHz sometimes.
> >>
> >>
> >>>>> It's only when that's badly done -- or the problem is so complex
> >>>>> that multiple policies need to be supported -- that you need to
> >>>>> pull out that old "mechanism not policy" chestnut, and support
> >>>>> some kind of policy switching mechanism (governors, userspace
> >>>>> agents, etc) for different application domains.
> >>>>>
> >>>>>
> >>>>>> e.g. turning a clock off when reference counter gets zero, this is
> >>>>>> what OMAP's clock framework currently does.
> >>>>> There are no choices to be made in that layer; it's no more "policy"
> >>>>> than following the laws of arithmetic is "policy".  Software clock
> >>>> there is some principle: "turn the clock off when use counter reaches
> >>>> zero", so it is a policy, and a choice is to disable or not to disable
> >>>> an output clock, it is the simplest case but it's certainly a policy. 
> >>> That's not a choice; it's how the API is defined.  It's not "policy".
> >>>
> >>> Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
> >>> allowing it to == 5 instead?  Or should we just accept that as how
> >>> things are defined, and move on?
> >> We should accept this if we agree that benefit of using the rule always
> >> exist,
> >> but if the rule constrain some functionality we may want to disable the
> >> rule.
> >> Considering the case with clocks,  we may want  to leave the clock running
> >> even if there is no users of the clock, but there is a timing constraint
> >> for readiness  of  a clock device  (PLLs can't be started immediately).
> > 
> > I find that a bogus example.
> > It seems like you are generalising clock handling based on PLLs.
> 
> No. We are designing generic framework for various types of pm resources 
> provided by currently available hw. Already today we know about such cases as 
> PLLs. Please remember that we are approaching voltages as well and I hope you 
> agree that voltage regulators also fit in this example as switching a voltage 
> incurs latencies.
Not only that: switching a voltage may require also context
saving/restoring; of course there are latencies involved in voltage
scaling but that doesn't makes it automatic to merge voltage and
frequencies handling. Because as pointed out in other branches of this
discussion, there are aspects of voltages and clocks handling that set
them apart from each other.
> 
> > The PLL is actually the exception, having a penalty in commuting between
> > states, while all the children can be toggled on/off without any delay.
> > And that's easy to deal with: if a driver is going to do something that
> > could be affected by the PLL automatically going off, the driver can
> > avoid releasing its clock. That will effectively keep the PLL on.
> 
> It's just about preventing effective power management. A particular driver does 
> not have all necessary knowledge/information on what is the most efficient at 
> the moment in regard to _system_ power management. Not releasing a clock/voltage 
> which is not in use by the driver at the moment (and what more important in 
> generic case the driver has no clue on when it will start to use it again) just 
> prevents an upper layer (policy manager) from capability to perform effective 
> power management.
> 
No, the point is that by default every driver would try to go to the
lowest power state autonomously. The policy layer (or whatever would
take care of the system point of view, in case the decentralised
approach is not enough) would _prevent_ temporarily certain deeper power
saving states.
> > 
> > The PLL per se is not really significant, apart from the fact that it
> > tatkes power and it's desirable to keep it off for as long as possible,
> > but the important bit is that the drivers must have the clock ready and
> > available when needed.
> 
> Indeed we may want to think about that a driver may want to export requirements 
> on what expected latencies for pm_resource_get() are. But again, a particular 
> driver can;t perform such a decision internally for shared pm resources.

As i wrote in he last 3 lines below, that's why upper layers can
interact with drivers, _if needed_

> 
> > 
> > A similar approach can be used for frequencies: if a driver periodically
> > needs a certain high frequency, it might be impacted by the system
> > automatically scaling voltage/frequence.
> > 
> > Possible solutions?
> > -keep the request for high frequency
> >   if a pll relock is involved in the scaling
> > 
> > -keep the request for high voltage
> >   if there is no significant delay from a possible pll relock or no
> > relock at all, but significant ramp-up time for the voltage regulator
> > 
> > These actions could be performed by the driver either autonomously or
> > based on hints/commands receivied from upper layers, in the form of
> > driver specific commands.
> 
-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 10:30                                   ` Igor Stoppa
  2007-03-20 12:13                                     ` Eugeny S. Mints
@ 2007-03-20 13:07                                     ` Dmitry Krivoschekov
  2007-03-20 13:52                                       ` Igor Stoppa
  2007-03-20 20:21                                       ` David Brownell
  1 sibling, 2 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 13:07 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

Igor Stoppa wrote:
> On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
>> David Brownell wrote:
>>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
>>>> David Brownell wrote:
>>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>>>>> Sometimes it's quite reasonable to make decisions (or policy)
>>>>>> at the low level, without exposing events to higher layers,
>>>>> Of course.  Any layer can incorporate a degree of policy.
>>>> But users should be able to choose to use or do not use the incorporated
>>>> policy, shouldn't they?
>>> Sometimes.  What's a user? 
>> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
>> i.e. a user it's another kernel subsystem or userspace application,
>> it depends on implementation of a system.
>>>  Do you really expect every single
>>> algorithm choice to be packaged as a pluggable policy?  
>> I didn't say pluggable policy, I just said there are must be
>> an alternative - "use" or "do not use" a predefined policy.
>> For example, in USB you are able to enable/disable autosuspend rule,
>> don't know if it's possible to disable it at runtime though.
>>> Any
>>> time I've seen systems designed that way, those pluggabilty
>>> hooks have been a major drag on performance and maintainability.
>>>
>>> Most components don't actually _need_ a choice of policies.
>>>
>>>
>> Yes, but they at least need a mechanism to disable an associated policy,
>> upper layers should be able to decide where the policy well be kept,
>> they may delegate the keeping to lower layers but also may want to
>> keep the policy themselves for some reason.
> That sounds more like it's driven by IP than by technical reasons.
Linux - Operating system, the key word is "system", doing things on
one layer you have to think how it is refleceted to other layers, so
here are not bare technical questions.
> Anyway in the USB example mentioned the rule is very high level, 
What level can be lower than driver level?

I agree, USB stack consist of a number of drivers but all of them
belongs to one, USB, subsystem, more precisely USB host subsystem
(considering the case with AUTOSUSPEND)

Driver level is serving the lowest level - interacting with particular
system device, while layer serving system clocks is higher by default
since clocks are usually distributed for several devices  in the system,
and you need a common knowledge then. Putting the knowledge to driver
layer means duplicating of the knowledge for each driver.
> while
> here the proposal is to meddle with the internals of a driver.
> It seems more logical to implement policies/rules at driver level,
> rather than going straight for the resources of the driver.
>
> Why can't the driver itself be able to translate whatever high-level
> command/hint it receives into the platform/arch/board specific actions?
Assuming we consider normal device driver, because it needs to coordinate
actions with a number of other drivers in this case, and also for the reason
I mentioned  in previous comment.

And you should definitely read the first chapter of the "Linux Device
drivers" book,
see "The Role of the Device Driver" section.
>
>
>> Also, in some cases it is reasonable to adjust rules of a policy
>> (without changing the policy). For example if you define a policy
>> "keep an output frequency always for 33 MHz (an input frequency may vary)",
>> you may want to change the base frequency to 66 MHz sometimes.
>>
>>
>>>>> It's only when that's badly done -- or the problem is so complex
>>>>> that multiple policies need to be supported -- that you need to
>>>>> pull out that old "mechanism not policy" chestnut, and support
>>>>> some kind of policy switching mechanism (governors, userspace
>>>>> agents, etc) for different application domains.
>>>>>
>>>>>
>>>>>> e.g. turning a clock off when reference counter gets zero, this is
>>>>>> what OMAP's clock framework currently does.
>>>>> There are no choices to be made in that layer; it's no more "policy"
>>>>> than following the laws of arithmetic is "policy".  Software clock
>>>> there is some principle: "turn the clock off when use counter reaches
>>>> zero", so it is a policy, and a choice is to disable or not to disable
>>>> an output clock, it is the simplest case but it's certainly a policy. 
>>> That's not a choice; it's how the API is defined.  It's not "policy".
>>>
>>> Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
>>> allowing it to == 5 instead?  Or should we just accept that as how
>>> things are defined, and move on?
>> We should accept this if we agree that benefit of using the rule always
>> exist,
>> but if the rule constrain some functionality we may want to disable the
>> rule.
>> Considering the case with clocks,  we may want  to leave the clock running
>> even if there is no users of the clock, but there is a timing constraint
>> for readiness  of  a clock device  (PLLs can't be started immediately).
>
> I find that a bogus example.
It's just a technical example.
> It seems like you are generalising clock handling based on PLLs.
Nop. I just found it does not provide a mechanism to control clock devices
themselves. Gates, multipliers, dividers don't need for this, but PLL's
do need.

> The PLL is actually the exception, having a penalty in commuting between
> states, while all the children can be toggled on/off without any delay.
> And that's easy to deal with: if a driver is going to do something that
> could be affected by the PLL automatically going off, the driver can
> avoid releasing its clock. That will effectively keep the PLL on.
> The PLL per se is not really significant, apart from the fact that it
> tatkes power and it's desirable to keep it off for as long as possible,
> but the important bit is that the drivers must have the clock ready and
> available when needed.
>
> A similar approach can be used for frequencies: if a driver periodically
> needs a certain high frequency, it might be impacted by the system
> automatically scaling voltage/frequence.
>
> Possible solutions?
> -keep the request for high frequency
>   if a pll relock is involved in the scaling
Request to whom? Requesting for something assumes a subsystem that
serves the requests.
>
> -keep the request for high voltage
>   if there is no significant delay from a possible pll relock or no
> relock at all, but significant ramp-up time for the voltage regulator
>
> These actions could be performed by the driver either autonomously or
> based on hints/commands receivied from upper layers, in the form of
> driver specific commands.


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 12:39                                       ` Igor Stoppa
@ 2007-03-20 13:44                                         ` Dmitry Krivoschekov
  2007-03-20 21:03                                         ` David Brownell
  1 sibling, 0 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 13:44 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

Igor Stoppa wrote:
>
>> So, in my thinking a devoted entity should exist to deal with such questions in 
>> regard to such shared pm resources. This entity is policy manager. Thus I see a 
>> need in an API for policy managers to be presented.
>
> Your policy manager becomes too invasive and even _mandatory_ because it
> becomes the only way to perform resource management.
>
> With a decentralised approach, instead, only people who _want_ to use
> the policy manager can include it, while otherwise they rely on the
> automatic handling.
>
Seems you both are considering  two opposite extremities. Why don't
consider
things as its nature? Every computer system  has  a subsystem 
delivering resources
to its elements, i.e subsystem delivering clocks - clock subsystem,
subsystem delivering
power - power subsystem. So you need to corresponding kernel  subsystems
to control that couple.
That is, there is a driver layer, there is clock/voltage subsytems and
eventually there is a policy
manager, i.e clock/volatge subsystem becomes a dealer for different
devices and also it becomes
a dealer for user space applications (i.e. policy managers).

As a thought, we already use an irq subsystem that provides resource -
interrupt, some ideas might be taken
there.
Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 13:07                                     ` Dmitry Krivoschekov
@ 2007-03-20 13:52                                       ` Igor Stoppa
  2007-03-20 14:58                                         ` Dmitry Krivoschekov
  2007-03-20 20:21                                       ` David Brownell
  1 sibling, 1 reply; 84+ messages in thread
From: Igor Stoppa @ 2007-03-20 13:52 UTC (permalink / raw)
  To: ext Dmitry Krivoschekov; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

On Tue, 2007-03-20 at 16:07 +0300, ext Dmitry Krivoschekov wrote:
> Igor Stoppa wrote:
> > On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
> >> David Brownell wrote:
> >>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
> >>>> David Brownell wrote:
> >>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> >>>>>> Sometimes it's quite reasonable to make decisions (or policy)
> >>>>>> at the low level, without exposing events to higher layers,
> >>>>> Of course.  Any layer can incorporate a degree of policy.
> >>>> But users should be able to choose to use or do not use the incorporated
> >>>> policy, shouldn't they?
> >>> Sometimes.  What's a user? 
> >> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
> >> i.e. a user it's another kernel subsystem or userspace application,
> >> it depends on implementation of a system.
> >>>  Do you really expect every single
> >>> algorithm choice to be packaged as a pluggable policy?  
> >> I didn't say pluggable policy, I just said there are must be
> >> an alternative - "use" or "do not use" a predefined policy.
> >> For example, in USB you are able to enable/disable autosuspend rule,
> >> don't know if it's possible to disable it at runtime though.
> >>> Any
> >>> time I've seen systems designed that way, those pluggabilty
> >>> hooks have been a major drag on performance and maintainability.
> >>>
> >>> Most components don't actually _need_ a choice of policies.
> >>>
> >>>
> >> Yes, but they at least need a mechanism to disable an associated policy,
> >> upper layers should be able to decide where the policy well be kept,
> >> they may delegate the keeping to lower layers but also may want to
> >> keep the policy themselves for some reason.
> > That sounds more like it's driven by IP than by technical reasons.
> Linux - Operating system, the key word is "system", doing things on
> one layer you have to think how it is refleceted to other layers, so
> here are not bare technical questions.
> > Anyway in the USB example mentioned the rule is very high level, 
> What level can be lower than driver level?
Enable/disable autosuspend is certainly higher level command than
forcing a clock that a driver is using.
> 
> I agree, USB stack consist of a number of drivers but all of them
> belongs to one, USB, subsystem, more precisely USB host subsystem
> (considering the case with AUTOSUSPEND)
> 
> Driver level is serving the lowest level - interacting with particular
> system device, while layer serving system clocks is higher by default
> since clocks are usually distributed for several devices  in the system,
> and you need a common knowledge then. Putting the knowledge to driver
> layer means duplicating of the knowledge for each driver.
I really can't immagine how you have read this into what i wrote.

> > while
> > here the proposal is to meddle with the internals of a driver.
> > It seems more logical to implement policies/rules at driver level,
> > rather than going straight for the resources of the driver.
> >
> > Why can't the driver itself be able to translate whatever high-level
> > command/hint it receives into the platform/arch/board specific actions?
> Assuming we consider normal device driver, because it needs to coordinate
> actions with a number of other drivers in this case, and also for the reason
> I mentioned  in previous comment.
> 
> And you should definitely read the first chapter of the "Linux Device
> drivers" book,
> see "The Role of the Device Driver" section.
I'll try to reformulate, so maybe this time you won't jump to such funny
conclusions:
case 1-you are speaking in favour of a framework where policies can
orchestrate system level decisions by having a strict control on every
shared resource
case 2-i'm saying that it is simpler to rely on drivers doing their job,
as they are doing right now, on OMAP, at least, of releasing resources
when not needed. Based on this default behavior, there might be certain
cases when a system-level manager can interact with them by making the
_localised_ power saving less aggressive, thus optimiising the global
saves/performance. So if others than you are happy with just the
distributed approach, they can avoid using the framework, while you can
just put it on top and achive the same power saving that you would get
in case 1

> >
> >
> >> Also, in some cases it is reasonable to adjust rules of a policy
> >> (without changing the policy). For example if you define a policy
> >> "keep an output frequency always for 33 MHz (an input frequency may vary)",
> >> you may want to change the base frequency to 66 MHz sometimes.
> >>
> >>
> >>>>> It's only when that's badly done -- or the problem is so complex
> >>>>> that multiple policies need to be supported -- that you need to
> >>>>> pull out that old "mechanism not policy" chestnut, and support
> >>>>> some kind of policy switching mechanism (governors, userspace
> >>>>> agents, etc) for different application domains.
> >>>>>
> >>>>>
> >>>>>> e.g. turning a clock off when reference counter gets zero, this is
> >>>>>> what OMAP's clock framework currently does.
> >>>>> There are no choices to be made in that layer; it's no more "policy"
> >>>>> than following the laws of arithmetic is "policy".  Software clock
> >>>> there is some principle: "turn the clock off when use counter reaches
> >>>> zero", so it is a policy, and a choice is to disable or not to disable
> >>>> an output clock, it is the simplest case but it's certainly a policy. 
> >>> That's not a choice; it's how the API is defined.  It's not "policy".
> >>>
> >>> Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
> >>> allowing it to == 5 instead?  Or should we just accept that as how
> >>> things are defined, and move on?
> >> We should accept this if we agree that benefit of using the rule always
> >> exist,
> >> but if the rule constrain some functionality we may want to disable the
> >> rule.
> >> Considering the case with clocks,  we may want  to leave the clock running
> >> even if there is no users of the clock, but there is a timing constraint
> >> for readiness  of  a clock device  (PLLs can't be started immediately).
> >
> > I find that a bogus example.
> It's just a technical example.
> > It seems like you are generalising clock handling based on PLLs.
> Nop. I just found it does not provide a mechanism to control clock devices
> themselves. Gates, multipliers, dividers don't need for this, but PLL's
> do need.
nothing prevents implementing that in the clk fw
> > The PLL is actually the exception, having a penalty in commuting between
> > states, while all the children can be toggled on/off without any delay.
> > And that's easy to deal with: if a driver is going to do something that
> > could be affected by the PLL automatically going off, the driver can
> > avoid releasing its clock. That will effectively keep the PLL on.
> > The PLL per se is not really significant, apart from the fact that it
> > tatkes power and it's desirable to keep it off for as long as possible,
> > but the important bit is that the drivers must have the clock ready and
> > available when needed.
> >
> > A similar approach can be used for frequencies: if a driver periodically
> > needs a certain high frequency, it might be impacted by the system
> > automatically scaling voltage/frequence.
> >
> > Possible solutions?
> > -keep the request for high frequency
> >   if a pll relock is involved in the scaling
> Request to whom? Requesting for something assumes a subsystem that
> serves the requests.
> >
> > -keep the request for high voltage
> >   if there is no significant delay from a possible pll relock or no
> > relock at all, but significant ramp-up time for the voltage regulator
> >
> > These actions could be performed by the driver either autonomously or
> > based on hints/commands receivied from upper layers, in the form of
> > driver specific commands.

-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 13:52                                       ` Igor Stoppa
@ 2007-03-20 14:58                                         ` Dmitry Krivoschekov
  2007-03-20 15:36                                           ` Pavel Machek
  2007-03-20 15:36                                           ` Igor Stoppa
  0 siblings, 2 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 14:58 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

Igor Stoppa wrote:
> On Tue, 2007-03-20 at 16:07 +0300, ext Dmitry Krivoschekov wrote:
>> Igor Stoppa wrote:
>>> On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
>>>> David Brownell wrote:
>>>>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
>>>>>> David Brownell wrote:
>>>>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>>>>>>> Sometimes it's quite reasonable to make decisions (or policy)
>>>>>>>> at the low level, without exposing events to higher layers,
>>>>>>> Of course.  Any layer can incorporate a degree of policy.
>>>>>> But users should be able to choose to use or do not use the incorporated
>>>>>> policy, shouldn't they?
>>>>> Sometimes.  What's a user? 
>>>> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
>>>> i.e. a user it's another kernel subsystem or userspace application,
>>>> it depends on implementation of a system.
>>>>>  Do you really expect every single
>>>>> algorithm choice to be packaged as a pluggable policy?  
>>>> I didn't say pluggable policy, I just said there are must be
>>>> an alternative - "use" or "do not use" a predefined policy.
>>>> For example, in USB you are able to enable/disable autosuspend rule,
>>>> don't know if it's possible to disable it at runtime though.
>>>>> Any
>>>>> time I've seen systems designed that way, those pluggabilty
>>>>> hooks have been a major drag on performance and maintainability.
>>>>>
>>>>> Most components don't actually _need_ a choice of policies.
>>>>>
>>>>>
>>>> Yes, but they at least need a mechanism to disable an associated policy,
>>>> upper layers should be able to decide where the policy well be kept,
>>>> they may delegate the keeping to lower layers but also may want to
>>>> keep the policy themselves for some reason.
>>> That sounds more like it's driven by IP than by technical reasons.
>> Linux - Operating system, the key word is "system", doing things on
>> one layer you have to think how it is refleceted to other layers, so
>> here are not bare technical questions.
>>> Anyway in the USB example mentioned the rule is very high level, 
>> What level can be lower than driver level?
> Enable/disable autosuspend is certainly higher level command than
> forcing a clock that a driver is using.
Please re-read what you wrote, "Anyway in the USB example mentioned the
rule is very high level",
by the rule I assume "suspend a device if it doesn't need for USB
interface for now",  I didn't consider
enabling/disabling the rule itself here.
>> I agree, USB stack consist of a number of drivers but all of them
>> belongs to one, USB, subsystem, more precisely USB host subsystem
>> (considering the case with AUTOSUSPEND)
>>
>> Driver level is serving the lowest level - interacting with particular
>> system device, while layer serving system clocks is higher by default
>> since clocks are usually distributed for several devices  in the system,
>> and you need a common knowledge then. Putting the knowledge to driver
>> layer means duplicating of the knowledge for each driver.
> I really can't immagine how you have read this into what i wrote.
Just think of that then, I don't insist though :)
>
>>> while
>>> here the proposal is to meddle with the internals of a driver.
>>> It seems more logical to implement policies/rules at driver level,
>>> rather than going straight for the resources of the driver.
>>>
>>> Why can't the driver itself be able to translate whatever high-level
>>> command/hint it receives into the platform/arch/board specific actions?
>> Assuming we consider normal device driver, because it needs to coordinate
>> actions with a number of other drivers in this case, and also for the reason
>> I mentioned  in previous comment.
>>
>> And you should definitely read the first chapter of the "Linux Device
>> drivers" book,
>> see "The Role of the Device Driver" section.
> I'll try to reformulate, so maybe this time you won't jump to such funny
> conclusions:
It's not a conclusion it's a suggestion, there you may find explanations
what is wrong
with delegating policies to device drivers.
> case 1-you are speaking in favour of a framework where policies can
> orchestrate system level decisions by having a strict control on every
> shared resource
I personally don't favor any approach I just want to sort out advantages
and disadvantages
of every approach.
> case 2-i'm saying that it is simpler to rely on drivers doing their job,
What criteria do you use to accept one approach is simpler than another?
> as they are doing right now, on OMAP, at least, of releasing resources
Did you think of other archs?
> when not needed. Based on this default behavior, there might be certain
> cases when a system-level manager can interact with them by making the
the system-level manager has to be arch-dependent in this case
> _localised_ power saving less aggressive, thus optimiising the global
> saves/performance. So if others than you are happy with just the
> distributed approach, they can avoid using the framework, while you can
> just put it on top and achive the same power saving that you would get
> in case 1
Ok, seems you are happy with current clock framework and advocating it
to be as is.
Are you against addition some features to it, such as enable/disable 
"turn the unused clock off"
rule, set/get state of  associated  clock device  (to control low power
modes of the clock device itself)?
Besides that it is meaningless for you, do you have any technical
objections for that?


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 14:58                                         ` Dmitry Krivoschekov
@ 2007-03-20 15:36                                           ` Pavel Machek
  2007-03-20 19:16                                             ` Dmitry Krivoschekov
  2007-03-20 15:36                                           ` Igor Stoppa
  1 sibling, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2007-03-20 15:36 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, linux-pm

Hi!

> > when not needed. Based on this default behavior, there might be certain
> > cases when a system-level manager can interact with them by making the
> the system-level manager has to be arch-dependent in this case
> > _localised_ power saving less aggressive, thus optimiising the global
> > saves/performance. So if others than you are happy with just the
> > distributed approach, they can avoid using the framework, while you can
> > just put it on top and achive the same power saving that you would get
> > in case 1
> Ok, seems you are happy with current clock framework and advocating it
> to be as is.
> Are you against addition some features to it, such as enable/disable 
> "turn the unused clock off"

What kind of debate is this?!

Of course I and everyone else is against adding features without
_really good_ explanation why this is needed. And no, you should not
even ask unless you have patch ready, because it all depends on how
the patch looks like.

And now, can we let this silly thread die?

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 14:58                                         ` Dmitry Krivoschekov
  2007-03-20 15:36                                           ` Pavel Machek
@ 2007-03-20 15:36                                           ` Igor Stoppa
  2007-03-20 19:17                                             ` Dmitry Krivoschekov
  2007-03-20 20:17                                             ` David Brownell
  1 sibling, 2 replies; 84+ messages in thread
From: Igor Stoppa @ 2007-03-20 15:36 UTC (permalink / raw)
  To: ext Dmitry Krivoschekov; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

On Tue, 2007-03-20 at 17:58 +0300, ext Dmitry Krivoschekov wrote:
> Igor Stoppa wrote:
> > On Tue, 2007-03-20 at 16:07 +0300, ext Dmitry Krivoschekov wrote:
> >> Igor Stoppa wrote:
> >>> On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
> >>>> David Brownell wrote:
> >>>>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
> >>>>>> David Brownell wrote:
> >>>>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
> >>>>>>>> Sometimes it's quite reasonable to make decisions (or policy)
> >>>>>>>> at the low level, without exposing events to higher layers,
> >>>>>>> Of course.  Any layer can incorporate a degree of policy.
> >>>>>> But users should be able to choose to use or do not use the incorporated
> >>>>>> policy, shouldn't they?
> >>>>> Sometimes.  What's a user? 
> >>>> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
> >>>> i.e. a user it's another kernel subsystem or userspace application,
> >>>> it depends on implementation of a system.
> >>>>>  Do you really expect every single
> >>>>> algorithm choice to be packaged as a pluggable policy?  
> >>>> I didn't say pluggable policy, I just said there are must be
> >>>> an alternative - "use" or "do not use" a predefined policy.
> >>>> For example, in USB you are able to enable/disable autosuspend rule,
> >>>> don't know if it's possible to disable it at runtime though.
> >>>>> Any
> >>>>> time I've seen systems designed that way, those pluggabilty
> >>>>> hooks have been a major drag on performance and maintainability.
> >>>>>
> >>>>> Most components don't actually _need_ a choice of policies.
> >>>>>
> >>>>>
> >>>> Yes, but they at least need a mechanism to disable an associated policy,
> >>>> upper layers should be able to decide where the policy well be kept,
> >>>> they may delegate the keeping to lower layers but also may want to
> >>>> keep the policy themselves for some reason.
> >>> That sounds more like it's driven by IP than by technical reasons.
> >> Linux - Operating system, the key word is "system", doing things on
> >> one layer you have to think how it is refleceted to other layers, so
> >> here are not bare technical questions.
> >>> Anyway in the USB example mentioned the rule is very high level, 
> >> What level can be lower than driver level?
> > Enable/disable autosuspend is certainly higher level command than
> > forcing a clock that a driver is using.
> Please re-read what you wrote, "Anyway in the USB example mentioned the
> rule is very high level",
> by the rule I assume "suspend a device if it doesn't need for USB
> interface for now",  I didn't consider
> enabling/disabling the rule itself here.
And you were correct. It _is_ high level if it refers to the whole
device rather than directly to a resource used by the device, like a
clock.
> >> I agree, USB stack consist of a number of drivers but all of them
> >> belongs to one, USB, subsystem, more precisely USB host subsystem
> >> (considering the case with AUTOSUSPEND)
> >>
> >> Driver level is serving the lowest level - interacting with particular
> >> system device, while layer serving system clocks is higher by default
> >> since clocks are usually distributed for several devices  in the system,
> >> and you need a common knowledge then. Putting the knowledge to driver
> >> layer means duplicating of the knowledge for each driver.
> > I really can't immagine how you have read this into what i wrote.
> Just think of that then, I don't insist though :)
> >
> >>> while
> >>> here the proposal is to meddle with the internals of a driver.
> >>> It seems more logical to implement policies/rules at driver level,
> >>> rather than going straight for the resources of the driver.
> >>>
> >>> Why can't the driver itself be able to translate whatever high-level
> >>> command/hint it receives into the platform/arch/board specific actions?
> >> Assuming we consider normal device driver, because it needs to coordinate
> >> actions with a number of other drivers in this case, and also for the reason
> >> I mentioned  in previous comment.
> >>
> >> And you should definitely read the first chapter of the "Linux Device
> >> drivers" book,
> >> see "The Role of the Device Driver" section.
> > I'll try to reformulate, so maybe this time you won't jump to such funny
> > conclusions:
> It's not a conclusion it's a suggestion, there you may find explanations
> what is wrong
> with delegating policies to device drivers.
Your approach is to just label policy what you want to kick out of the
driver =) Nice one!

acquire_resource_before_proceeding/release_resource_when_done

that's a mechanism; the _policy_ is to tell the driver what "when done"
means. At it can be done dynamically.
> > case 1-you are speaking in favour of a framework where policies can
> > orchestrate system level decisions by having a strict control on every
> > shared resource
> I personally don't favor any approach I just want to sort out advantages
> and disadvantages
> of every approach.
> > case 2-i'm saying that it is simpler to rely on drivers doing their job,
> What criteria do you use to accept one approach is simpler than another?
case 2 is already in place for OMAP2, to prevent retention in certain
cases, you can check the code. It works and has very little impact in
terms of code. Furthermore only the drivers that need it are affected.
> > as they are doing right now, on OMAP, at least, of releasing resources
> Did you think of other archs?
well, this is linux-pm ml, so i felt that it was correct to delimit the
current playground; it's to 
> > when not needed. Based on this default behavior, there might be certain
> > cases when a system-level manager can interact with them by making the
> the system-level manager has to be arch-dependent in this case
but very trivial; there's no point in having something
cross-architecture if it initroduces unnecessary complexity just for the
sake of it. Better stick to a cross platform interface and let the
details to arch-specific code.
> > _localised_ power saving less aggressive, thus optimiising the global
> > saves/performance. So if others than you are happy with just the
> > distributed approach, they can avoid using the framework, while you can
> > just put it on top and achive the same power saving that you would get
> > in case 1
> Ok, seems you are happy with current clock framework and advocating it
> to be as is.
As I wrote several times i haven't seen yet a reason to replace it;
certainly there is space for improvement but so far this proposal has
not been on the lines of: how to improve the clk fw
> Are you against addition some features to it, such as enable/disable 
> "turn the unused clock off"
> rule, set/get state of  associated  clock device  (to control low power
> modes of the clock device itself)?
> Besides that it is meaningless for you, do you have any technical
> objections for that?
Do you mean apart from the fact that it means hijacking the fw?
Noooo, not at all.
But why is it meaningful to you? Why can't you interact with the entity
that is actually controlling a certain clock, rather than with the clock
itself?
> 
> 
> Thanks,
> Dmitry
> 
-- 
Cheers, Igor

Igor Stoppa <igor.stoppa@nokia.com>
(Nokia Multimedia - CP - OSSO / Helsinki, Finland)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 15:36                                           ` Pavel Machek
@ 2007-03-20 19:16                                             ` Dmitry Krivoschekov
  2007-03-20 20:45                                               ` Pavel Machek
  0 siblings, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 19:16 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Dominik Brodowski, linux-pm

Hi Pavel,

Pavel Machek wrote:
> Hi!
>
>>> when not needed. Based on this default behavior, there might be certain
>>> cases when a system-level manager can interact with them by making the
>> the system-level manager has to be arch-dependent in this case
>>> _localised_ power saving less aggressive, thus optimiising the global
>>> saves/performance. So if others than you are happy with just the
>>> distributed approach, they can avoid using the framework, while you can
>>> just put it on top and achive the same power saving that you would get
>>> in case 1
>> Ok, seems you are happy with current clock framework and advocating it
>> to be as is.
>> Are you against addition some features to it, such as enable/disable 
>> "turn the unused clock off"
>
> What kind of debate is this?!
>
> Of course I and everyone else is against adding features without
> _really good_ explanation why this is needed. And no, you should not
> even ask unless you have patch ready, 

If I understand you correctly, you are against big debates unless a
patch is ready,
I agree to that if someone tries to speculate on adding some feature
which is easily
can be shown as a patch, but before doing something more complex, say
new subsystem,
it is worth to make sure whether your intention makes sense at all, and
if it does
make sense, to discuss possible design details. I think it's better then
reviewing
of a patch set, then get an agreement the patches is ok and eventually
decide
the thing the patches are adding is unneeded.

If you feel people ask and discuss the same things again and again
perhaps it is time
to create some FAQ list.


the first question would be:

Q: Does Linux Power management take care about personal computers only?
A: ????
 


Thanks,
Dmitry


P.S. Are there special Etiquette notes for the list?

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 15:36                                           ` Igor Stoppa
@ 2007-03-20 19:17                                             ` Dmitry Krivoschekov
  2007-03-20 20:17                                             ` David Brownell
  1 sibling, 0 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 19:17 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: linux-pm, Dominik Brodowski, Pavel Machek

Igor Stoppa wrote:
> On Tue, 2007-03-20 at 17:58 +0300, ext Dmitry Krivoschekov wrote:
>> Igor Stoppa wrote:
>>> On Tue, 2007-03-20 at 16:07 +0300, ext Dmitry Krivoschekov wrote:
>>>> Igor Stoppa wrote:
>>>>> On Tue, 2007-03-20 at 12:45 +0300, ext Dmitry Krivoschekov wrote:
>>>>>> David Brownell wrote:
>>>>>>> On Monday 19 March 2007 5:03 pm, Dmitry Krivoschekov wrote:
>>>>>>>> David Brownell wrote:
>>>>>>>>> On Sunday 18 March 2007 1:25 pm, Dmitry Krivoschekov wrote:
>>>>>>>>>> Sometimes it's quite reasonable to make decisions (or policy)
>>>>>>>>>> at the low level, without exposing events to higher layers,
>>>>>>>>> Of course.  Any layer can incorporate a degree of policy.
>>>>>>>> But users should be able to choose to use or do not use the incorporated
>>>>>>>> policy, shouldn't they?
>>>>>>> Sometimes.  What's a user? 
>>>>>> It's a user of a kernel subsystem that (subsystem ) keeps a policy,
>>>>>> i.e. a user it's another kernel subsystem or userspace application,
>>>>>> it depends on implementation of a system.
>>>>>>>  Do you really expect every single
>>>>>>> algorithm choice to be packaged as a pluggable policy?  
>>>>>> I didn't say pluggable policy, I just said there are must be
>>>>>> an alternative - "use" or "do not use" a predefined policy.
>>>>>> For example, in USB you are able to enable/disable autosuspend rule,
>>>>>> don't know if it's possible to disable it at runtime though.
>>>>>>> Any
>>>>>>> time I've seen systems designed that way, those pluggabilty
>>>>>>> hooks have been a major drag on performance and maintainability.
>>>>>>>
>>>>>>> Most components don't actually _need_ a choice of policies.
>>>>>>>
>>>>>>>
>>>>>> Yes, but they at least need a mechanism to disable an associated policy,
>>>>>> upper layers should be able to decide where the policy well be kept,
>>>>>> they may delegate the keeping to lower layers but also may want to
>>>>>> keep the policy themselves for some reason.
>>>>> That sounds more like it's driven by IP than by technical reasons.
>>>> Linux - Operating system, the key word is "system", doing things on
>>>> one layer you have to think how it is refleceted to other layers, so
>>>> here are not bare technical questions.
>>>>> Anyway in the USB example mentioned the rule is very high level, 
>>>> What level can be lower than driver level?
>>> Enable/disable autosuspend is certainly higher level command than
>>> forcing a clock that a driver is using.
>> Please re-read what you wrote, "Anyway in the USB example mentioned the
>> rule is very high level",
>> by the rule I assume "suspend a device if it doesn't need for USB
>> interface for now",  I didn't consider
>> enabling/disabling the rule itself here.
> And you were correct. It _is_ high level if it refers to the whole
> device rather than directly to a resource used by the device, like a
> clock.

If you consider a bit that controls a clock for particular device (and
only for
this device) then I agree with you. But I initially referred to system
wide  clocks
that are propagated to different devices.


>>>> I agree, USB stack consist of a number of drivers but all of them
>>>> belongs to one, USB, subsystem, more precisely USB host subsystem
>>>> (considering the case with AUTOSUSPEND)
>>>>
>>>> Driver level is serving the lowest level - interacting with particular
>>>> system device, while layer serving system clocks is higher by default
>>>> since clocks are usually distributed for several devices  in the system,
>>>> and you need a common knowledge then. Putting the knowledge to driver
>>>> layer means duplicating of the knowledge for each driver.
>>> I really can't immagine how you have read this into what i wrote.
>> Just think of that then, I don't insist though :)
>>>>> while
>>>>> here the proposal is to meddle with the internals of a driver.
>>>>> It seems more logical to implement policies/rules at driver level,
>>>>> rather than going straight for the resources of the driver.
>>>>>
>>>>> Why can't the driver itself be able to translate whatever high-level
>>>>> command/hint it receives into the platform/arch/board specific actions?
>>>> Assuming we consider normal device driver, because it needs to coordinate
>>>> actions with a number of other drivers in this case, and also for the reason
>>>> I mentioned  in previous comment.
>>>>
>>>> And you should definitely read the first chapter of the "Linux Device
>>>> drivers" book,
>>>> see "The Role of the Device Driver" section.
>>> I'll try to reformulate, so maybe this time you won't jump to such funny
>>> conclusions:
>> It's not a conclusion it's a suggestion, there you may find explanations
>> what is wrong
>> with delegating policies to device drivers.
> Your approach is to just label policy what you want to kick out of the
> driver =) Nice one!

I may want to kick out policy which may prevent higher layer PM
managers to work properly but leave only those rules that device's
specification defines to be kept.


>
> acquire_resource_before_proceeding/release_resource_when_done
>
> that's a mechanism; the _policy_ is to tell the driver what "when done"
> means. At it can be done dynamically.
>>> case 1-you are speaking in favour of a framework where policies can
>>> orchestrate system level decisions by having a strict control on every
>>> shared resource
>> I personally don't favor any approach I just want to sort out advantages
>> and disadvantages
>> of every approach.
>>> case 2-i'm saying that it is simpler to rely on drivers doing their job,
>> What criteria do you use to accept one approach is simpler than another?
> case 2 is already in place for OMAP2, to prevent retention in certain
> cases, you can check the code. It works and has very little impact in
> terms of code. Furthermore only the drivers that need it are affected.
>>> as they are doing right now, on OMAP, at least, of releasing resources
>> Did you think of other archs?
> well, this is linux-pm ml, so i felt that it was correct to delimit the
> current playground; it's to 
>>> when not needed. Based on this default behavior, there might be certain
>>> cases when a system-level manager can interact with them by making the
>> the system-level manager has to be arch-dependent in this case
> but very trivial; there's no point in having something
> cross-architecture if it initroduces unnecessary complexity just for the
> sake of it. Better stick to a cross platform interface and let the
> details to arch-specific code.
>>> _localised_ power saving less aggressive, thus optimiising the global
>>> saves/performance. So if others than you are happy with just the
>>> distributed approach, they can avoid using the framework, while you can
>>> just put it on top and achive the same power saving that you would get
>>> in case 1
>> Ok, seems you are happy with current clock framework and advocating it
>> to be as is.
> As I wrote several times i haven't seen yet a reason to replace it;
> certainly there is space for improvement but so far this proposal has
> not been on the lines of: how to improve the clk fw
>> Are you against addition some features to it, such as enable/disable 
>> "turn the unused clock off"
>> rule, set/get state of  associated  clock device  (to control low power
>> modes of the clock device itself)?
>> Besides that it is meaningless for you, do you have any technical
>> objections for that?
> Do you mean apart from the fact that it means hijacking the fw?
> Noooo, not at all.
> But why is it meaningful to you? Why can't you interact with the entity
> that is actually controlling a certain clock, rather than with the clock
> itself?

You interact with hardware entity when interacting with a clock,
so interacting with a corresponding software entity by appropriate
driver of software entity is the only meaning for now.
Making software entity to be cross-platform also makes sense.

Igor, Thanks for all of your valuable comments!


Regards,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20  9:45                                 ` Dmitry Krivoschekov
  2007-03-20 10:30                                   ` Igor Stoppa
@ 2007-03-20 19:58                                   ` David Brownell
  2007-03-24  0:47                                     ` charging batteries from USB [was: Re: Alternative Concept] Dmitry Krivoschekov
  1 sibling, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-20 19:58 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

> >  Do you really expect every single
> > algorithm choice to be packaged as a pluggable policy?
>
> I didn't say pluggable policy, I just said there are must be
> an alternative - "use" or "do not use" a predefined policy.

I'd call that a "choose between two policies" switch.


> For example, in USB you are able to enable/disable autosuspend rule,
> don't know if it's possible to disable it at runtime though.

There are patches to allow disabling it at runtime through sysfs
attributes of any given device.

The primary reason to have one is bugs in those external devices,
where they don't behave according to the USB spec.  That's the
same reason that the "wakeup" mechanism needs to be disabled
for various devices:  they don't work correctly.

With billions of USB devices in the world, it's impractical for
Linux to rely on lists of device quirks.  No other subsystem has
that particular burden.


> > Most components don't actually _need_ a choice of policies.
> 
> Yes, but they at least need a mechanism to disable an associated policy,

No, most of them don't "need" any such thing.


> upper layers should be able to decide where the policy well be kept,
> they may delegate the keeping to lower layers but also may want to
> keep the policy themselves for some reason.

This keeps sounding worse and worse.  Most of the time, any one
layer ("upper" or otherwise) can't have a clue about the issues
that affect another layer ("lower" or otherwise).

This is a basic fallout of good systems design.  The principle
is called "information hiding", and it minimizes needless
coupling.  It's what lets systems evolve incrementally, and
without needing to constantly change interfaces.

 
> > That's not a choice; it's how the API is defined.  It's not "policy".
> >
> > Arithmetic is defined so that 2 + 2 == 4.  Should we have a "policy"
> > allowing it to == 5 instead?  Or should we just accept that as how
> > things are defined, and move on?
>
> We should accept this if we agree that benefit of using the rule always
> exist, but if the rule constrain some functionality we may want to
> disable the rule.

Another basic "how to design systems" guideline is stick to simple rules,
and let the complex behaviors come (when needed) by applying those rules
correctly and consistently.

Or to put it differently:  if you keep having to define exceptions to
the rules (as you propose), you're doing something *very* wrong.  You're
building a house of cards; one that will not stand up to much stress.
It's the same reason that "spaghetti code" is bad.  Once the system gets
too chaotic, it stops being comprehensible ... and you lose the ability
to maintain or enhance it.


> Considering the case with clocks,  we may want  to leave the clock running
> even if there is no users of the clock, but there is a timing constraint
> for readiness  of  a clock device  (PLLs can't be started immediately).

As Igor pointed out:  that may be a reason to keep an extra reference
to PLL output clock.  In the cases I've happened across, those latencies
are not troublesome ... they're not incurred in time-critical transitions.

Whatever keeps that extra reference would need extra information to
decide whether to keep it running.  Presumably you have some real-world
scenario in mind..

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 15:36                                           ` Igor Stoppa
  2007-03-20 19:17                                             ` Dmitry Krivoschekov
@ 2007-03-20 20:17                                             ` David Brownell
  1 sibling, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-20 20:17 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: Dominik Brodowski, linux-pm, Pavel Machek

On Tuesday 20 March 2007 8:36 am, Igor Stoppa wrote:

> Your approach is to just label policy what you want to kick out of the
> driver =) Nice one!

I had a similar reaction ... although, I was also wondering exactly
what benefit would come from kicking that stuff out of drivers, and
thus needing to rewrite/retest a lot of them.


> > Ok, seems you are happy with current clock framework and advocating it
> > to be as is.
>
> As I wrote several times i haven't seen yet a reason to replace it;
> certainly there is space for improvement but so far this proposal has
> not been on the lines of: how to improve the clk fw

Yes.  This thread is unfortunately very much a "where's the beef".
I was afraid that was what would happen, given the complete lack
of interface proposal.

I suspect that until some non-troll content is provided, I'll tune out.


> > Are you against addition some features to it, such as enable/disable 
> > "turn the unused clock off" rule,

That'd be what we call a "misfeature", or "bug".  Are you suggesting
this be done to the IRQ subsystem too?  It has a rule that unused
IRQs must be turned off.  Likewise that IRQs not used as wake events
shouldn't be enabled as wake events.  I hope you're not restricting
your addition of misfeatures to just the clock framework!


> > Besides that it is meaningless for you, do you have any technical
> > objections for that?
>
> Do you mean apart from the fact that it means hijacking the fw?
> Noooo, not at all.

I take it that was meant to be sarcasm...

Me, yes I have already presented technical objections, all of which
appear to have been ignored.


> But why is it meaningful to you? Why can't you interact with the entity
> that is actually controlling a certain clock, rather than with the clock
> itself?

Good question.  I hope that one doesn't get ignored, too.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 13:07                                     ` Dmitry Krivoschekov
  2007-03-20 13:52                                       ` Igor Stoppa
@ 2007-03-20 20:21                                       ` David Brownell
  1 sibling, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-20 20:21 UTC (permalink / raw)
  To: Dmitry Krivoschekov
  Cc: Igor Stoppa, Dominik Brodowski, Pavel Machek, linux-pm

On Tuesday 20 March 2007 6:07 am, Dmitry Krivoschekov wrote:

> And you should definitely read the first chapter of the "Linux Device
> drivers" book, see "The Role of the Device Driver" section.

While it's a fine book for beginners, I wouldn't treat anything
it says as being "the final word" even for the kernel version
that it was written against.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 19:16                                             ` Dmitry Krivoschekov
@ 2007-03-20 20:45                                               ` Pavel Machek
  2007-03-20 22:04                                                 ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2007-03-20 20:45 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, linux-pm

Hi!

> >> Ok, seems you are happy with current clock framework and advocating it
> >> to be as is.
> >> Are you against addition some features to it, such as enable/disable 
> >> "turn the unused clock off"
> >
> > What kind of debate is this?!
> >
> > Of course I and everyone else is against adding features without
> > _really good_ explanation why this is needed. And no, you should not
> > even ask unless you have patch ready, 
> 
> If I understand you correctly, you are against big debates unless a
> patch is ready,

You understood it well.

> I agree to that if someone tries to speculate on adding some feature
> which is easily

But we do not want new subsystem. We want power management to
work. Take a look how Alan added pm to usb... and just do it like
him. If some code makes sense to be shared, share it. But start with
support for platform you care about and don't overdesign it.

> can be shown as a patch, but before doing something more complex, say
> new subsystem,
> it is worth to make sure whether your intention makes sense at all, and
> if it does

No, your intention of adding subsystems does not make any
sense. Satisfied?

> the first question would be:
> 
> Q: Does Linux Power management take care about personal computers only?
> A: ????

Linux Power management is not person, it is piece of code. Me, I care
about arm, too.

> P.S. Are there special Etiquette notes for the list?

I guess linux-kernel has nice set of rules, and they should apply
here, too. Expect to defend your design on l-k sooner or later. 
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 12:39                                       ` Igor Stoppa
  2007-03-20 13:44                                         ` Dmitry Krivoschekov
@ 2007-03-20 21:03                                         ` David Brownell
  1 sibling, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-20 21:03 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: Dominik Brodowski, linux-pm, Pavel Machek

On Tuesday 20 March 2007 5:39 am, Igor Stoppa wrote:

> No, the point is that by default every driver would try to go to the
> lowest power state autonomously.

Yes.  Anyone assuming otherwise should go back to the drawing board,
since so much of the most relevant information will BY DESIGN never
be exposed outside that driver.  And must not be, since each version
of silicon will generally have its own localized quirks.


Plus there may be groups of drivers that work together.  SOC audio
is probably a decent example:  the high level driver presents just
playback and record interfaces, but internally there will be several
busses involved, a SOC-specific serial data stream controller (AC97,
McBSP, I2S, etc), an external codec, and control interfaces which
manage each of those.  That means a couple drivers collaborating...
(I have no idea if this new ALSA SOC support stuff is sanely design
from a PM perspective...)

The same thing comes up with some USB OTG configurations too.
For example, on OMAP1 or PXA27x chips, there are three on-chip
controllers (OHCI, UDC, OTG) plus an external (I2C) transceiver.
That's three or four drivers which must work together to make
sure power isn't needlessly wasted.

But in all those cases, all the PM-relevant information is local
to those drivers, and all they need is private interfaces to sort
things out.  No new "system-wide" component (policy or otherewise)
is needed.


> 	The policy layer (or whatever would 
> take care of the system point of view, in case the decentralised
> approach is not enough) would _prevent_ temporarily certain deeper power
> saving states.

Yes, but ...

There are two "power management" models.  I think you are talking
about the one most relevant to low power operation:  "runtime PM".
Something like an N770 or N800 tablet needs that(*), and of course
a Linux laptop would benefit from it quite nicely too.

The "system suspend" model works the other way around:  when the
system is forced into a specific low power state, that prevents
certain drivers from working.  For example, it's common that USB
must in effect be powered off, unable to serve as a wakeup event
source (or maintain power sessions) because a key clock must be
turned off in order to enter that state.

- Dave

(*) Plus a smart idle loop ... using NO_IDLE_HZ and entering into
    SOC-wide deep sleep states rather than just CPU idle.  A partial
    PC analogue would be using the new NO_HZ and C3 or deeper.

    To me, it's suggestive that all this discussion about power
    frameworks has mentioned neither that, nor several other
    essential mechanisms, like the "run from SRAM" case that I
    mentioned earlier.  Without addressing a few such real
    problems, the email discussions can't produce good results.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 20:45                                               ` Pavel Machek
@ 2007-03-20 22:04                                                 ` David Brownell
  2007-03-20 22:06                                                   ` Pavel Machek
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-20 22:04 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Dominik Brodowski, linux-pm

On Tuesday 20 March 2007 1:45 pm, Pavel Machek wrote:

> > If I understand you correctly, you are against big debates unless a
> > patch is ready,
> 
> You understood it well.

Thing is, this hasn't even gotten to the level of "debate".
The "alternative" is at this point pure handwaving.

The reason to insist on a patch is primarily to ensure that
peoples' time doesn't get wasted.  There are ways to make a
concrete proposal that don't involve patches.  But in this
case, we haven't seen any of them.  So it's fair to ask for
something more concrete, such as a few patches...


> But we do not want new subsystem. We want power management to
> work. Take a look how Alan added pm to usb... and just do it like
> him. If some code makes sense to be shared, share it. But start with
> support for platform you care about and don't overdesign it.

Note, Alan didn't do it by himself.  The autosuspend stuff built on
previous work in USB PM, and this DID get discussed -- as concrete
proposals -- before he did all that good stuff.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 22:04                                                 ` David Brownell
@ 2007-03-20 22:06                                                   ` Pavel Machek
  2007-03-20 23:29                                                     ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Pavel Machek @ 2007-03-20 22:06 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, linux-pm

Hi!

> > But we do not want new subsystem. We want power management to
> > work. Take a look how Alan added pm to usb... and just do it like
> > him. If some code makes sense to be shared, share it. But start with
> > support for platform you care about and don't overdesign it.
> 
> Note, Alan didn't do it by himself.  The autosuspend stuff built on
> previous work in USB PM, and this DID get discussed -- as concrete
> proposals -- before he did all that good stuff.

Okay, you are right. But I do not remember any "handwaving"
phase... (And sorry for not giving credit where it was due, obviously
Alan did not do it all by himself).

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 22:06                                                   ` Pavel Machek
@ 2007-03-20 23:29                                                     ` David Brownell
  0 siblings, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-20 23:29 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Dominik Brodowski, linux-pm

On Tuesday 20 March 2007 3:06 pm, Pavel Machek wrote:
> Hi!
> 
> > > But we do not want new subsystem. We want power management to
> > > work. Take a look how Alan added pm to usb... and just do it like
> > > him. If some code makes sense to be shared, share it. But start with
> > > support for platform you care about and don't overdesign it.
> > 
> > Note, Alan didn't do it by himself.  The autosuspend stuff built on
> > previous work in USB PM, and this DID get discussed -- as concrete
> > proposals -- before he did all that good stuff.
> 
> Okay, you are right. But I do not remember any "handwaving"
> phase... (And sorry for not giving credit where it was due, obviously
> Alan did not do it all by himself).

No, I don't think there was a handwaving phase there either.

There were specific problems to be addressed, and specific solutions
came out of discussions of how to address them.  (Which is not at all
the way these "concept" discusions have gone.)

I think the canonical example was one I first heard from Len Brown a
few years back:  laptop with USB mouse.  If that mouse can (auto)suspend,
with remote wakeup to kick it out of that state, then the USB host
controller doesn't need be active.  If that controller isn't doing DMA
every millisecond, the CPU can enter C3.  So the savings cascade in a
very clean way, once all the parts are present.

To get there, first we needed USB suspend, and remote wakeup, to work.
That was a bunch of work.  Then we needed proper "runtime suspend" to
behave for the host controller drivers too.  More work ... they all
had to act the same.  (I'd say that work first started around the time
of the 2.6.9 kernels...)  Once all that was stable, the autosuspend
work could begin.


All this "alternative concept" stuff seems to be starting from the
"big piece of blank paper, and dreams" school of design.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* charging batteries from USB [was: Re: Alternative Concept]
  2007-03-20 19:58                                   ` David Brownell
@ 2007-03-24  0:47                                     ` Dmitry Krivoschekov
  2007-03-24  1:17                                       ` David Brownell
  2007-03-24  8:36                                       ` Oliver Neukum
  0 siblings, 2 replies; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-24  0:47 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
>> For example, in USB you are able to enable/disable autosuspend rule,
>> don't know if it's possible to disable it at runtime though.
>
> There are patches to allow disabling it at runtime through sysfs
> attributes of any given device.
>
> The primary reason to have one is bugs in those external devices,
> where they don't behave according to the USB spec.  That's the

If I didn't miss something, the primary reason was to allow
devices to charge batteries from the bus, so, does USB
specs restrict this somehow? Or, what's wrong with that?


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: charging batteries from USB [was: Re: Alternative Concept]
  2007-03-24  0:47                                     ` charging batteries from USB [was: Re: Alternative Concept] Dmitry Krivoschekov
@ 2007-03-24  1:17                                       ` David Brownell
  2007-03-24  1:48                                         ` Dmitry Krivoschekov
  2007-03-24  8:36                                       ` Oliver Neukum
  1 sibling, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-24  1:17 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Friday 23 March 2007 5:47 pm, Dmitry Krivoschekov wrote:
> David Brownell wrote:
> >> For example, in USB you are able to enable/disable autosuspend rule,
> >> don't know if it's possible to disable it at runtime though.
> >
> > There are patches to allow disabling it at runtime through sysfs
> > attributes of any given device.
> >
> > The primary reason to have one is bugs in those external devices,
> > where they don't behave according to the USB spec.  That's the
> 
> If I didn't miss something, the primary reason was to allow
> devices to charge batteries from the bus, so, does USB
> specs restrict this somehow? Or, what's wrong with that?

You're mixing up two distinct issues:

 - Minimizing power use by the USB host ... to stretch its battery
   life, or otherwise shrink its power usage, starting with current
   delivered between on USB (between VBUS and GND).

 - How the peripheral uses whatever VBUS current it draws ... which
   can power arbitrary electronics, including a battery charger, but
   might be nothing more than powering the USB link.  (FWIW that's all
   the Nokia 800 does with VBUS; that helps stretch battery life.)

Clearly there's some competition there.  The default policy allows
autosuspend.  The primary reason to disable autosuspend is, as I
said, that some devices don't work well with it.  (Flakey circuits
or firmware, etc.)

However, enabling autosuspend *does* have a side effect that would
matter for those few devices that use VBUS to recharge a battery:
the VBUS current going to a suspended device is almost certainly
not enough to recharge anything substantial.  So -- for those few
devices that do recharge batteries over USB -- yes, another reason
to disable autosuspend might be to help recharge a battery.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: charging batteries from USB [was: Re: Alternative Concept]
  2007-03-24  1:17                                       ` David Brownell
@ 2007-03-24  1:48                                         ` Dmitry Krivoschekov
  2007-03-24  2:35                                           ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-24  1:48 UTC (permalink / raw)
  To: David Brownell; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

David Brownell wrote:
> On Friday 23 March 2007 5:47 pm, Dmitry Krivoschekov wrote:
>> David Brownell wrote:
>>>> For example, in USB you are able to enable/disable autosuspend rule,
>>>> don't know if it's possible to disable it at runtime though.
>>> There are patches to allow disabling it at runtime through sysfs
>>> attributes of any given device.
>>>
>>> The primary reason to have one is bugs in those external devices,
>>> where they don't behave according to the USB spec.  That's the
>> If I didn't miss something, the primary reason was to allow
>> devices to charge batteries from the bus, so, does USB
>> specs restrict this somehow? Or, what's wrong with that?
>
> You're mixing up two distinct issues:
>
>  - Minimizing power use by the USB host ... to stretch its battery
>    life, or otherwise shrink its power usage, starting with current
>    delivered between on USB (between VBUS and GND).
>
>  - How the peripheral uses whatever VBUS current it draws ... which
>    can power arbitrary electronics, including a battery charger, but
>    might be nothing more than powering the USB link.  (FWIW that's all
>    the Nokia 800 does with VBUS; that helps stretch battery life.)
>
> Clearly there's some competition there.  The default policy allows
> autosuspend.  The primary reason to disable autosuspend is, as I
> said, that some devices don't work well with it.  (Flakey circuits
> or firmware, etc.)
>
> However, enabling autosuspend *does* have a side effect that would
> matter for those few devices that use VBUS to recharge a battery:
> the VBUS current going to a suspended device is almost certainly
> not enough to recharge anything substantial.  So -- for those few
> devices that do recharge batteries over USB -- yes, another reason
> to disable autosuspend might be to help recharge a battery.
>
I clearly understand these distinct issues, I read Oliver's
comments for "switching off autosuspend through sysfs" patch,
where he said "This is needed for devices which recharge
their batteries of the bus" (but nothing regarding improper
behavior of some devices) so I decided you was referring
to that issue, that made me think that I overlooked something
in USB spec :)

Anyway, thanks for your explanation.



Regards,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: charging batteries from USB [was: Re: Alternative Concept]
  2007-03-24  1:48                                         ` Dmitry Krivoschekov
@ 2007-03-24  2:35                                           ` David Brownell
  2007-03-24 10:20                                             ` Oliver Neukum
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-24  2:35 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: Dominik Brodowski, Pavel Machek, linux-pm

On Friday 23 March 2007 6:48 pm, Dmitry Krivoschekov wrote:

> I clearly understand these distinct issues, I read Oliver's
> comments for "switching off autosuspend through sysfs" patch,
> where he said "This is needed for devices which recharge
> their batteries of the bus" (but nothing regarding improper
> behavior of some devices)

Sometimes people have been known to focus too much on
secondary issues.  ;)

> so I decided you was referring 
> to that issue, that made me think that I overlooked something
> in USB spec :)
> 
> Anyway, thanks for your explanation.
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: charging batteries from USB [was: Re: Alternative Concept]
  2007-03-24  0:47                                     ` charging batteries from USB [was: Re: Alternative Concept] Dmitry Krivoschekov
  2007-03-24  1:17                                       ` David Brownell
@ 2007-03-24  8:36                                       ` Oliver Neukum
  1 sibling, 0 replies; 84+ messages in thread
From: Oliver Neukum @ 2007-03-24  8:36 UTC (permalink / raw)
  To: linux-pm; +Cc: Pavel Machek, linux-pm, Dominik Brodowski

Am Samstag, 24. März 2007 01:47 schrieb Dmitry Krivoschekov:
> If I didn't miss something, the primary reason was to allow
> devices to charge batteries from the bus, so, does USB
> specs restrict this somehow? Or, what's wrong with that?

Suspended devices have a very low limit on power consumption.
It's not enough to charge a battery.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: charging batteries from USB [was: Re: Alternative Concept]
  2007-03-24  2:35                                           ` David Brownell
@ 2007-03-24 10:20                                             ` Oliver Neukum
  0 siblings, 0 replies; 84+ messages in thread
From: Oliver Neukum @ 2007-03-24 10:20 UTC (permalink / raw)
  To: linux-pm; +Cc: Pavel Machek, linux-pm, Dominik Brodowski

Am Samstag, 24. März 2007 03:35 schrieb David Brownell:
> On Friday 23 March 2007 6:48 pm, Dmitry Krivoschekov wrote:
> 
> > I clearly understand these distinct issues, I read Oliver's
> > comments for "switching off autosuspend through sysfs" patch,
> > where he said "This is needed for devices which recharge
> > their batteries of the bus" (but nothing regarding improper
> > behavior of some devices)
> 
> Sometimes people have been known to focus too much on
> secondary issues.  ;)

I did write a patch to introduce a black list for such devices ;-)

Seriously. The kernel can learn to deal with broken devices on its own.
It's a question of black listing or introducing workarounds. Cumbersome
but doable. The question of whether to charge a battery in a device or
conserve the laptop's batteries is not solvable in the kernel. How would
you decide?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 15:08     ` Dmitry Krivoschekov
@ 2007-03-20 17:04       ` David Brownell
  0 siblings, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-20 17:04 UTC (permalink / raw)
  To: Dmitry Krivoschekov; +Cc: linux-pm

On Tuesday 20 March 2007 8:08 am, Dmitry Krivoschekov wrote:
> Amit Kucheria wrote:
> > On 3/20/07, David Brownell <david-b@pacbell.net> wrote:
> >> On Monday 19 March 2007 7:12 am, Scott E. Preece wrote:
> >>> Could you guys present a clear definition of exactly what you mean by
> >>> "clock domain" and "power domain"? I can think of several different ways
> >>> to interpret the phrases, and I'd like to end up with the same meaning
> >>> that you are arguing from...
> >> A set of devices that use the same power supply or clock are
> >> in the same "power domain" or "clock domain" (respectively).

Dmitry, would you please stop removing all the blank lines
separating different peoples' replies?  That removal makes
it extremely difficult to follow what's going on in any
thread to which you contribute ... because quoting what you
wrote significantly reduces the readability of the text.


> > For the sake of completeness, there is also the Voltage domain. A
> > groups of modules supplied by the same regulator would belong to a
> > single Voltage domain. Multiple voltage domains allow independent
> > scaling of voltages to different domains.

That's a kind of power domain.  They can be hierarchical in the
same way as clock domains.  A regulator would be analagous to a
PLL, a switch would be analagous to a clock gate, etc.

Example:  a 5V supply derived by charge pump from a 3.3V supply;
a 1.8V supply derived from that same 3.3V supply.  Without 3.3V
neither of the other two voltages exist.  But maybe the 3.3V and
1.8V are separately gated.  Depending on the issue at hand, it can
be important to talk about a one or all of those as the "domain"
of interest.


> If you forget ability of scaling voltage for the domain, does the domain
> become a power domain, IOW is a power domain  the  simplest case of
> voltage domain

Voltage domains are **types of** power domains...

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20 14:26   ` Amit Kucheria
@ 2007-03-20 15:08     ` Dmitry Krivoschekov
  2007-03-20 17:04       ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Dmitry Krivoschekov @ 2007-03-20 15:08 UTC (permalink / raw)
  To: Amit Kucheria; +Cc: linux-pm

Amit Kucheria wrote:
> On 3/20/07, David Brownell <david-b@pacbell.net> wrote:
>> On Monday 19 March 2007 7:12 am, Scott E. Preece wrote:
>>> Could you guys present a clear definition of exactly what you mean by
>>> "clock domain" and "power domain"? I can think of several different ways
>>> to interpret the phrases, and I'd like to end up with the same meaning
>>> that you are arguing from...
>> A set of devices that use the same power supply or clock are
>> in the same "power domain" or "clock domain" (respectively).
>>
>> The domains will often be hierarchical, e.g. a base clock
>> rooting other clocks, derived from it by dividers, PLL,
>> or clock gates.
>>
>> Sometimes domains overlap ... e.g. a controller that needs
>> to use one logic level for on-chip logic and another for the
>> external interface; or similarly, different clock rates.
>>
>> Simple chips may not have many domains.  Nowadays I think
>> most SOCs have at least a decent selection of clock domains,
>> to eliminate the power drain involved in driving transistors
>> through clock ticks.  I understand it's more complicated to
>> have multiple power domains, but the incentive to shrink the
>> leakage current is strong.  (So adding on-chip power domains
>> involves tricks to constrain leakage, and not just an ability
>> to operate without a given power rail.)
>
> For the sake of completeness, there is also the Voltage domain. A
> groups of modules supplied by the same regulator would belong to a
> single Voltage domain. Multiple voltage domains allow independent
> scaling of voltages to different domains.
If you forget ability of scaling voltage for the domain, does the domain
become a power domain, IOW is a power domain  the  simplest case of
voltage domain which  doesn't  not support  voltage scalinig in any of
its states (no scaling in active mode, no scaling in idle mode, no
scaling in
retention mode)? 


Thanks,
Dmitry

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-20  7:56 ` David Brownell
@ 2007-03-20 14:26   ` Amit Kucheria
  2007-03-20 15:08     ` Dmitry Krivoschekov
  0 siblings, 1 reply; 84+ messages in thread
From: Amit Kucheria @ 2007-03-20 14:26 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm

On 3/20/07, David Brownell <david-b@pacbell.net> wrote:
> On Monday 19 March 2007 7:12 am, Scott E. Preece wrote:
> >
> > Could you guys present a clear definition of exactly what you mean by
> > "clock domain" and "power domain"? I can think of several different ways
> > to interpret the phrases, and I'd like to end up with the same meaning
> > that you are arguing from...
>
> A set of devices that use the same power supply or clock are
> in the same "power domain" or "clock domain" (respectively).
>
> The domains will often be hierarchical, e.g. a base clock
> rooting other clocks, derived from it by dividers, PLL,
> or clock gates.
>
> Sometimes domains overlap ... e.g. a controller that needs
> to use one logic level for on-chip logic and another for the
> external interface; or similarly, different clock rates.
>
> Simple chips may not have many domains.  Nowadays I think
> most SOCs have at least a decent selection of clock domains,
> to eliminate the power drain involved in driving transistors
> through clock ticks.  I understand it's more complicated to
> have multiple power domains, but the incentive to shrink the
> leakage current is strong.  (So adding on-chip power domains
> involves tricks to constrain leakage, and not just an ability
> to operate without a given power rail.)

For the sake of completeness, there is also the Voltage domain. A
groups of modules supplied by the same regulator would belong to a
single Voltage domain. Multiple voltage domains allow independent
scaling of voltages to different domains.
--
Amit Kucheria, Nokia

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-19 14:12 Scott E. Preece
@ 2007-03-20  7:56 ` David Brownell
  2007-03-20 14:26   ` Amit Kucheria
  0 siblings, 1 reply; 84+ messages in thread
From: David Brownell @ 2007-03-20  7:56 UTC (permalink / raw)
  To: Scott E. Preece; +Cc: linux-pm, pavel, linux

On Monday 19 March 2007 7:12 am, Scott E. Preece wrote:
> 
> Could you guys present a clear definition of exactly what you mean by
> "clock domain" and "power domain"? I can think of several different ways
> to interpret the phrases, and I'd like to end up with the same meaning
> that you are arguing from...

A set of devices that use the same power supply or clock are
in the same "power domain" or "clock domain" (respectively).

The domains will often be hierarchical, e.g. a base clock
rooting other clocks, derived from it by dividers, PLL,
or clock gates.

Sometimes domains overlap ... e.g. a controller that needs
to use one logic level for on-chip logic and another for the
external interface; or similarly, different clock rates.

Simple chips may not have many domains.  Nowadays I think
most SOCs have at least a decent selection of clock domains,
to eliminate the power drain involved in driving transistors
through clock ticks.  I understand it's more complicated to
have multiple power domains, but the incentive to shrink the
leakage current is strong.  (So adding on-chip power domains
involves tricks to constrain leakage, and not just an ability
to operate without a given power rail.)

I think that captures the basics... from a software perspective.
I'm sure a hardware guy could provide a more advanced course.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
@ 2007-03-19 14:12 Scott E. Preece
  2007-03-20  7:56 ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Scott E. Preece @ 2007-03-19 14:12 UTC (permalink / raw)
  To: david-b; +Cc: linux-pm, pavel, linux


Could you guys present a clear definition of exactly what you mean by
"clock domain" and "power domain"? I can think of several different ways
to interpret the phrases, and I'd like to end up with the same meaning
that you are arguing from...

thanks,
scott

| From: David Brownell<david-b@pacbell.net>
| 
| On Sunday 18 March 2007 7:27 pm, Ikhwan Lee wrote:
| > Hi,
| > 
| > On 3/16/07, David Brownell <david-b@pacbell.net> wrote:
| > > On Thursday 15 March 2007 8:56 pm, Ikhwan Lee wrote:
| > > > Hi,
| > > >
| > > > Although I agree that the current clock framework can handle power or
| > > > voltage domains in many platforms, having something like (struct clk
| > > > powerdomain1, powerdomain2;) does not seem like a good implementation,
| > > > a struct for clocks representing a power domain.
| > >
| > > Good thing that's not what I suggested then, right?  :)
| > >
| > > The point was that in the examples I've seen, the power domains
| > > are associated with clock domains, so that each clock is tied
| > > to one power domain.  And since you can't use the power domain
| > > without having a clock ... the implementation can tell if it's
| > > got to activate a power domain by looking at the clock.
| > 
| > True, although sometimes it gets dirty because multiple clock sources
| > are associated with one power domain
| 
| As clearly allowed for in what I wrote.  clock->power_domain.
| 
| > at the same time multiple power 
| > domains are associated with one clock source.
| 
| As also allowed for in what I wrote originally.  clock->power_domains[].
| 
| > Simple parent and child 
| > relationship provided by the clock framework is not always enough.
| 
| Not implied in what I wrote.
| 
| 
| > > There may be other models of power domain, but that's the one
| > > I've had reason to look at (which isn't synonymous with a straight
| > > voltage/current supply).
| > >
| > >
| > > > If a new framework is more straighforward and introduces a negligible
| > > > overhead to the current kernel, I think it is worthwhile to have a
| > > > look at it. Plus this new framework might be able to take care of
| > > > those platforms that are not nicely supported by the current clock
| > > > framework.
| > >
| > > Perhaps when we see one, we could discuss that as somethong other
| > > than pure handwaving.  But that still won't address the basic point
| > > that it's wrong to assume the clock framework should be written out
| > > of the picture.
| > 
| > I think we can reach an agreement. The clock framework does not need
| > to be replaced with a new one since it is serving its purpose well
| > enough. If extra functionalities are needed for clocks, we can extend
| > the existing clock framework. Such extensions will include functions
| > like clk_set_rate_pending() and power_transaction_commit(). However,
| > since clocks and voltages (or power domains) have different
| > characteristics, it is desirable to have a separate framework for
| > power domains and associate that framework with the existing clock
| > framework.
| 
| If the platform needs power domains to be exposed, yes.  But I gave
| examples where it does NOT need to be exposed, since each clock was
| in a single power domain.
| 
| 
| > I am not sure if this is the direction that the original PowerOp
| > people suggested. If we can agree on this, however, I think we can
| > proceed to look at the code.
| 
| I'm not sure why such agreement should be necessary before showing
| interface definitions.
| 
| - Dave
| _______________________________________________
| linux-pm mailing list
| linux-pm@lists.linux-foundation.org
| https://lists.linux-foundation.org/mailman/listinfo/linux-pm
| 

-- 
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
e-mail:	preece@motorola.com	fax:	+1-217-384-8550
phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15 13:29 Scott E. Preece
@ 2007-03-15 23:07 ` David Brownell
  0 siblings, 0 replies; 84+ messages in thread
From: David Brownell @ 2007-03-15 23:07 UTC (permalink / raw)
  To: Scott E. Preece; +Cc: linux-pm, linux, pavel

On Thursday 15 March 2007 6:29 am, Scott E. Preece wrote:

> | > For the rest of us, though, all the stuff you're currently 
> | > doing for power management is wasted effort and why should we incur
> | > costs to work around them? 
> | 
> | Me personally?  What specifically are you referring to, and
> | in what respects would that be "wasted" effort?
> ---
> 
> As noted in previous apology, I was speaking over-broadly. However, as I
> said, we currently configure out cpufreq and ACPI support,

ACPI -- goes without saying, unless you're on x86 or ia64.

cpufreq -- similar, although some non-x86 versions do exist, and seem
to provide limited power savings in a few cases (in conjunction with
voltage scaling, since the cost of N cpu cycles is otherwise constant).


> among other 
> things, so they represent wasted effort from the particular perspective
> of our products. I was speaking rhetorically - just saying that the work
> done on cpufreq and ACPI was "wasted effort" in exactly the same sense
> that work spent on supporting the PM needs of embedded devices would be.

I still don't follow.  I think I'll just count your original response
as one of those "should not have written that" posts most folk suffer
from on occasion.

- Dave


> ---
> | 
> | > Today, we just configure it all out and put 
> | > in our own stuff. We would prefer to have a mainstream framework that
> | > could be used to meet both Intel laptop needs and embedded device needs...
> | 
> | I don't think I ever said anything against that notion of having PM
> | infrastructure capable of handling both PC and embedded configs.  Not
> | that I've seen a framework that handles either one well -- yet! -- so
> | such notions haven't yet progressed to being testable theories.
> | 
> | Against the notion of infrastructure (PM or otherwise) that's not
> | well designed or defined -- certainly I've argued.  That includes
> | much current PM infrastructure, and most recent proposals.
> ---
> 
> Thanks - I can agree with that!
> 
> scott 
> 
> -- 
> scott preece
> motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
> e-mail:	preece@motorola.com	fax:	+1-217-384-8550
> phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com
> 
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15 14:00 Scott E. Preece
  2007-03-15 14:38 ` Eugeny S. Mints
@ 2007-03-15 17:33 ` Woodruff, Richard
  1 sibling, 0 replies; 84+ messages in thread
From: Woodruff, Richard @ 2007-03-15 17:33 UTC (permalink / raw)
  To: Scott E. Preece, igor.stoppa; +Cc: linux-pm, pavel, linux

> If you did it this way:
> 
> tid1 = power_transaction_start();
> clk_set_rate_pending(clk1, 300, tid1);
> clk_set_rate_pending(clk1, 600, tid1);
> power_transaction_commit(tid1);

You could do it in the way above or you can do it with virtual clocks
which allow grouped changes.

In the OMAP2 implementation this is currently hidden.  You can define a
virtual clock which encompasses several key root dividers.  Generally
for this type of change a pre-scale notifier goes out to registered
drivers.  That sequencing code does a clk_set_rate() against that
virtual clock.  The code internally sets up then jumps to SRAM and does
any associated voltage shift along with the multi-divider change.  On
the way back from SRAM the rate change propagates to update all child
nodes.  Finally a post scale happens.  The hardware does give some
assistance with a buffer for key clock groups.

Most simple clocks don't go through this process.

Doing a series of clk_round_rates() before doing your clk_set_rate() to
find available speeds given a certain parent allows you to adjust to
local needs like what a LCD panels end pixel clock divider needs to be
as bounded by panel specs.

The clk api is pretty useful.  Nothing says it can't be expanded or
absorbed.  Right now keeping api's like round/set available is good.

Regards,
Richard W.  

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15 14:00 Scott E. Preece
@ 2007-03-15 14:38 ` Eugeny S. Mints
  2007-03-15 17:33 ` Woodruff, Richard
  1 sibling, 0 replies; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-15 14:38 UTC (permalink / raw)
  To: Scott E. Preece; +Cc: linux-pm, linux, pavel

Scott E. Preece wrote:
> | From: Igor Stoppa<igor.stoppa@nokia.com>
> | 
> | What's wrong with expanding the clk_fw?
> | All is needed is:
> | clk_set_rate_buffered(clk1, 300);
> | clk_set_rate_buffered(clk2, 600);
> | clk_rate_flush(); /* can include validation of the set */
> | 
> | Which is, incidentally, what OMAP2 does in hw for all the relevant clk
> | dividers and it also provides validation for the new set of values.
> | 
> | Furthermore if the original assumption that complex transitions are
> | allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is
> | mandatory, otherwise the transition is impossible, no matter what fancy
> | sw fw is performing it.
> ---
> 
> If you did it this way:
> 
> tid1 = power_transaction_start();
> clk_set_rate_pending(clk1, 300, tid1);
> clk_set_rate_pending(clk1, 600, tid1);
> power_transaction_commit(tid1);
> 
> Then you can conveniently be constructing more than one transaction at a
> time and would have a little more information for debugging and for
> canceling partial transactions.
> 
> I agree that there needs to be some use of hardware magic underneath to
> make the changes truly atomic; the flush/commit operation would have to
> use that magic. 

i was talking about absolutely concrete example of SH7722 where you have 3 
clocks managed via one register FRQCR. With this hw you:

1) have to keep predefined ratio between clocks like 1:n:n
2) can write the whole register at once (is it that hw magic you are talking about?)

So I want this hw to be supported by the framework. The transaction approach 
pointed by Scott could be one of possible solution while I'd hardly accept 
"anonymous" (not coupled with switching clocks) clk_rate_flush() approach.

Meanwhile we are proposing an alternative concept which completely hides such 
clocks dependency inside the framework, behind the API: the dependency is coded 
via clock tree structure (such clocks are linked with an arc), Further, assuming 
clk_set() accepting variable number of clk_id/rate pairs as input parameters 
such approach eliminates a need in adding transaction concept to the framework API.

The latter approach significantly simplify building group layer (operating 
points) on top of parameter framework. Indeed, while with transaction approach 
the knowledge of clocks dependencies in order to properly set up transactions 
has to be exported to the group layer with our concept the group layer would be 
as simple as mapping between a string name ("mp3") to single call to 
clk_set_rate() (actually param_set()) where the whole list of parameters an 
operating point is constructed from is passed to such a call.


> That has the advantage of putting all the complexity of
> undedstanding that magic in one place, at the cost of making that one
> place possibly arbitrarily complex.

yes, we are looking towards hiding this completely inside parameter framework.

Eugeny
> 
> scott
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
@ 2007-03-15 14:00 Scott E. Preece
  2007-03-15 14:38 ` Eugeny S. Mints
  2007-03-15 17:33 ` Woodruff, Richard
  0 siblings, 2 replies; 84+ messages in thread
From: Scott E. Preece @ 2007-03-15 14:00 UTC (permalink / raw)
  To: igor.stoppa; +Cc: linux-pm, linux, pavel


| From: Igor Stoppa<igor.stoppa@nokia.com>
| 
| What's wrong with expanding the clk_fw?
| All is needed is:
| clk_set_rate_buffered(clk1, 300);
| clk_set_rate_buffered(clk2, 600);
| clk_rate_flush(); /* can include validation of the set */
| 
| Which is, incidentally, what OMAP2 does in hw for all the relevant clk
| dividers and it also provides validation for the new set of values.
| 
| Furthermore if the original assumption that complex transitions are
| allowed only atomically (p1A, p1B) => (p2A, p2B), hw support is
| mandatory, otherwise the transition is impossible, no matter what fancy
| sw fw is performing it.
---

If you did it this way:

tid1 = power_transaction_start();
clk_set_rate_pending(clk1, 300, tid1);
clk_set_rate_pending(clk1, 600, tid1);
power_transaction_commit(tid1);

Then you can conveniently be constructing more than one transaction at a
time and would have a little more information for debugging and for
canceling partial transactions.

I agree that there needs to be some use of hardware magic underneath to
make the changes truly atomic; the flush/commit operation would have to
use that magic. That has the advantage of putting all the complexity of
undedstanding that magic in one place, at the cost of making that one
place possibly arbitrarily complex.

scott

-- 
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
e-mail:	preece@motorola.com	fax:	+1-217-384-8550
phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
@ 2007-03-15 13:29 Scott E. Preece
  2007-03-15 23:07 ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Scott E. Preece @ 2007-03-15 13:29 UTC (permalink / raw)
  To: david-b; +Cc: linux-pm, linux, pavel


| From: David Brownell<david-b@pacbell.net>
| 
| On Wednesday 14 March 2007 3:08 pm, Scott E. Preece wrote:
| > | > 
| > | > But shouldn't it be useful on every platform? ..
| > | 
| > | I couldn't know.  This "alternative concept" hasn't gotten very far
| > | into the hand-waving stage, much less beyond it into proposed interface
| > | or (gasp!) implementations.  Platforms that don't *have* those particular
| > | interdependencies should not of course incur costs to implement them...
| > ---
| > 
| > Well, that's fine if the platform you use is the current design
| > center.
| 
| So you think that platforms which don't have such interdependencies
| should incur costs and complexity to address problems they don't have.
| Why?
---

Well, yes. That's part of having a solution that addresses the whole
community and not a subset. Linux is already full of things that trade
off benefits for one platform against costs for another platform.

---
| 
| > For the rest of us, though, all the stuff you're currently 
| > doing for power management is wasted effort and why should we incur
| > costs to work around them? 
| 
| Me personally?  What specifically are you referring to, and
| in what respects would that be "wasted" effort?
---

As noted in previous apology, I was speaking over-broadly. However, as I
said, we currently configure out cpufreq and ACPI support, among other
things, so they represent wasted effort from the particular perspective
of our products. I was speaking rhetorically - just saying that the work
done on cpufreq and ACPI was "wasted effort" in exactly the same sense
that work spent on supporting the PM needs of embedded devices would be.

---
| 
| > Today, we just configure it all out and put 
| > in our own stuff. We would prefer to have a mainstream framework that
| > could be used to meet both Intel laptop needs and embedded device needs...
| 
| I don't think I ever said anything against that notion of having PM
| infrastructure capable of handling both PC and embedded configs.  Not
| that I've seen a framework that handles either one well -- yet! -- so
| such notions haven't yet progressed to being testable theories.
| 
| Against the notion of infrastructure (PM or otherwise) that's not
| well designed or defined -- certainly I've argued.  That includes
| much current PM infrastructure, and most recent proposals.
---

Thanks - I can agree with that!

scott 

-- 
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
e-mail:	preece@motorola.com	fax:	+1-217-384-8550
phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
@ 2007-03-15 13:21 Scott E. Preece
  0 siblings, 0 replies; 84+ messages in thread
From: Scott E. Preece @ 2007-03-15 13:21 UTC (permalink / raw)
  To: preece; +Cc: linux-pm, linux, pavel


| From: "Scott E. Preece" <preece@motorola.com>
| 
| | From: David Brownell<david-b@pacbell.net>
| | 
| | I couldn't know.  This "alternative concept" hasn't gotten very far
| | into the hand-waving stage, much less beyond it into proposed interface
| | or (gasp!) implementations.  Platforms that don't *have* those particular
| | interdependencies should not of course incur costs to implement them...
| ---
| 
| Well, that's fine if the platform you use is the current design
| center. For the rest of us, though, all the stuff you're currently
| doing for power management is wasted effort and why should we incur
| costs to work around them?  Today, we just configure it all out and put
| in our own stuff. We would prefer to have a mainstream framework that
| could be used to meet both Intel laptop needs and embedded device needs...
---

I have to apologize for this comment. I wrote it in a hurry as I left
for a meeting and tried to condense too many thoughts and not enough
thinking into the number of words I had time to type.

The clock framework is reasonably inoffensive, and I think it might be
reasonable to retain the current interfaces for clock-like devices while
adding on support for dependency modeling.  Today the dependencies
behind the clocks aren't modeled anywhere visible (in our implementation
they are managed by low-level assembler interfaces). We would like to be
able to have a power management system that WAS aware of the
interdependencies and able to make decisions based on concerns deeper
than the current clock frequency. Part of the reson for this is to
enable portability of at least some parts of our PM code (we build
products on a number of platforms, each with different parts in similar
roles).

I agree with David that typed interfaces are highly preferable.  Moving
the typing to data (having a type field associated with nodes in a
network) is more dangerous and less readable, at least in the absence of
a language with first-class classes and inheritance. So, I would prefer
to see the model have different kinds of nodes for different kinds of
devices; that should be possible while using common mechanisms for
implementing the dependencies between them.

However, I think it would be more appropriate to debate that when there
is a proposal for interfaces and at least a little bit of code. It is
possible that the current clock framework would appear broken in
comparison to something that was clearly superior. We don't know, yet,
whether what is proposed will look clearly superior, so it's probably
too early to argue about that aspect. The goal at this point was just to
surface the underlying concepts and see if people thought they
adequately model the problem space.

scott

-- 
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
e-mail:	preece@motorola.com	fax:	+1-217-384-8550
phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15  8:14     ` Amit Kucheria
@ 2007-03-15 10:55       ` Eugeny S. Mints
  0 siblings, 0 replies; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-15 10:55 UTC (permalink / raw)
  To: Amit Kucheria; +Cc: linux-pm, pavel, linux

Amit Kucheria wrote:
> On 3/15/07, Ikhwan Lee <dlrghks@gmail.com> wrote:
>> Hi,
>>
>> On 3/15/07, David Brownell <david-b@pacbell.net> wrote:
>>> So you think that platforms which don't have such interdependencies
>>> should incur costs and complexity to address problems they don't have.
>>> Why?
>> Not every platform implements the clock interface. I think same can be
>> done with the proposed power parameter framework. The basic codes
>> defining the power parameter interface need not be costly and complex.
> 
> Exactly! Maybe once we get to the stage of interface discussion, Matt
> and Eugeny could provide a roadmap on the evolution of the PM
> framework. Personally, I don't see clock framework disappearing
> overnight for platforms that do use it.

dissolving might be a better wording in the sense that for a platform which 
would not gain from adding any node and arc to the clock nodes and arcs already 
exited in the current clock tree for the platform would stick with exactly the 
same tree(graph) without incurring any additional costs and complexity.

Eugeny

> 
> /Amit
> --
> Amit Kucheria, Nokia
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15  7:25   ` Ikhwan Lee
  2007-03-15  8:14     ` Amit Kucheria
@ 2007-03-15 10:46     ` Eugeny S. Mints
  1 sibling, 0 replies; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-15 10:46 UTC (permalink / raw)
  To: Ikhwan Lee; +Cc: linux-pm, linux, pavel

Ikhwan Lee wrote:
> Hi,
> 
> On 3/15/07, David Brownell <david-b@pacbell.net> wrote:
>> On Wednesday 14 March 2007 3:08 pm, Scott E. Preece wrote:
>>> | >
>>> | > But shouldn't it be useful on every platform? ..
>>> |
>>> | I couldn't know.  This "alternative concept" hasn't gotten very far
>>> | into the hand-waving stage, much less beyond it into proposed interface
>>> | or (gasp!) implementations.  Platforms that don't *have* those particular
>>> | interdependencies should not of course incur costs to implement them...
>>> ---
>>>
>>> Well, that's fine if the platform you use is the current design
>>> center.
>> So you think that platforms which don't have such interdependencies
>> should incur costs and complexity to address problems they don't have.
>> Why?
> 
> Not every platform implements the clock interface. I think same can be
> done with the proposed power parameter framework. The basic codes
> defining the power parameter interface need not be costly and complex.
> 
> Since interdependencies significantly vary among platforms, we can
> leave that to platform specific code and have something as simple as
> the current clock interface for voltage and power domains.

exactly. I touched this already in reply to David. The interdependencies for a 
particular platform affect configuration of nodes and arcs graph only while do 
not affect the API. The graph configuration is the only arch dependent thing.

Eugeny

> 
>>> For the rest of us, though, all the stuff you're currently
>>> doing for power management is wasted effort and why should we incur
>>> costs to work around them?
>> Me personally?  What specifically are you referring to, and
>> in what respects would that be "wasted" effort?
>>
>>
>>> Today, we just configure it all out and put
>>> in our own stuff. We would prefer to have a mainstream framework that
>>> could be used to meet both Intel laptop needs and embedded device needs...
>> I don't think I ever said anything against that notion of having PM
>> infrastructure capable of handling both PC and embedded configs.  Not
>> that I've seen a framework that handles either one well -- yet! -- so
>> such notions haven't yet progressed to being testable theories.
>>
>> Against the notion of infrastructure (PM or otherwise) that's not
>> well designed or defined -- certainly I've argued.  That includes
>> much current PM infrastructure, and most recent proposals.
>>
>> - Dave
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.osdl.org
>> https://lists.osdl.org/mailman/listinfo/linux-pm
>>
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 23:23 ` David Brownell
  2007-03-15  7:25   ` Ikhwan Lee
@ 2007-03-15 10:33   ` Eugeny S. Mints
  1 sibling, 0 replies; 84+ messages in thread
From: Eugeny S. Mints @ 2007-03-15 10:33 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm, pavel, linux

David Brownell wrote:
> On Wednesday 14 March 2007 3:08 pm, Scott E. Preece wrote:
>> | > 
>> | > But shouldn't it be useful on every platform? ..
>> | 
>> | I couldn't know.  This "alternative concept" hasn't gotten very far
>> | into the hand-waving stage, much less beyond it into proposed interface
>> | or (gasp!) implementations.  Platforms that don't *have* those particular
>> | interdependencies should not of course incur costs to implement them...
>> ---
>>
>> Well, that's fine if the platform you use is the current design
>> center.
> 
> So you think that platforms which don't have such interdependencies
> should incur costs and complexity to address problems they don't have.
> Why?

they don't. and what is more important they wouldn't.

Our idea is to provide building bricks: a clock provider node, a voltage 
provider node, a domain node, an arc of parent-child type, an arc of domain 
type, etc as well as tools/means (API) to construct an arbitrary graph from 
these nodes and arcs. These are pieces which parameter framework would bring to 
the plate.

Further, an arch/platform system designer reading hw manual defines which nodes 
are required for a particular arch/platform and how a graph of nodes and arcs
should look like for the arch/platform. Parameter framework API allows to build 
an arbitrary graph of nodes and arcs either statically or at runtime.

This way a particular arch/platform graph configuration is the only arch 
dependent thing while the rest is handled in arch independent way.

Am arch/platform designer uses only set of nodes and set of arcs required for 
his particular platform eliminating costs and complexity non-related to the 
specific platform.

Of course, any generalization has size and performance penalties but with the 
proposed approach it's basically narrowed down to size of the structure 
representing a node in the tree and performance of tree traversing. Both leave 
opportunity for optimization without impact on generic ideas, structures and API 
though.

Eugeny
> 
> 
>> For the rest of us, though, all the stuff you're currently 
>> doing for power management is wasted effort and why should we incur
>> costs to work around them? 
> 
> Me personally?  What specifically are you referring to, and
> in what respects would that be "wasted" effort?
> 
> 
>> Today, we just configure it all out and put 
>> in our own stuff. We would prefer to have a mainstream framework that
>> could be used to meet both Intel laptop needs and embedded device needs...
> 
> I don't think I ever said anything against that notion of having PM
> infrastructure capable of handling both PC and embedded configs.  Not
> that I've seen a framework that handles either one well -- yet! -- so
> such notions haven't yet progressed to being testable theories.
> 
> Against the notion of infrastructure (PM or otherwise) that's not
> well designed or defined -- certainly I've argued.  That includes
> much current PM infrastructure, and most recent proposals.
> 
> - Dave
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-15  7:25   ` Ikhwan Lee
@ 2007-03-15  8:14     ` Amit Kucheria
  2007-03-15 10:55       ` Eugeny S. Mints
  2007-03-15 10:46     ` Eugeny S. Mints
  1 sibling, 1 reply; 84+ messages in thread
From: Amit Kucheria @ 2007-03-15  8:14 UTC (permalink / raw)
  To: Ikhwan Lee; +Cc: linux-pm, linux, pavel

On 3/15/07, Ikhwan Lee <dlrghks@gmail.com> wrote:
> Hi,
>
> On 3/15/07, David Brownell <david-b@pacbell.net> wrote:
> >
> > So you think that platforms which don't have such interdependencies
> > should incur costs and complexity to address problems they don't have.
> > Why?
>
> Not every platform implements the clock interface. I think same can be
> done with the proposed power parameter framework. The basic codes
> defining the power parameter interface need not be costly and complex.

Exactly! Maybe once we get to the stage of interface discussion, Matt
and Eugeny could provide a roadmap on the evolution of the PM
framework. Personally, I don't see clock framework disappearing
overnight for platforms that do use it.

/Amit
--
Amit Kucheria, Nokia

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 23:23 ` David Brownell
@ 2007-03-15  7:25   ` Ikhwan Lee
  2007-03-15  8:14     ` Amit Kucheria
  2007-03-15 10:46     ` Eugeny S. Mints
  2007-03-15 10:33   ` Eugeny S. Mints
  1 sibling, 2 replies; 84+ messages in thread
From: Ikhwan Lee @ 2007-03-15  7:25 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm, pavel, linux

Hi,

On 3/15/07, David Brownell <david-b@pacbell.net> wrote:
> On Wednesday 14 March 2007 3:08 pm, Scott E. Preece wrote:
> > | >
> > | > But shouldn't it be useful on every platform? ..
> > |
> > | I couldn't know.  This "alternative concept" hasn't gotten very far
> > | into the hand-waving stage, much less beyond it into proposed interface
> > | or (gasp!) implementations.  Platforms that don't *have* those particular
> > | interdependencies should not of course incur costs to implement them...
> > ---
> >
> > Well, that's fine if the platform you use is the current design
> > center.
>
> So you think that platforms which don't have such interdependencies
> should incur costs and complexity to address problems they don't have.
> Why?

Not every platform implements the clock interface. I think same can be
done with the proposed power parameter framework. The basic codes
defining the power parameter interface need not be costly and complex.

Since interdependencies significantly vary among platforms, we can
leave that to platform specific code and have something as simple as
the current clock interface for voltage and power domains.

>
> > For the rest of us, though, all the stuff you're currently
> > doing for power management is wasted effort and why should we incur
> > costs to work around them?
>
> Me personally?  What specifically are you referring to, and
> in what respects would that be "wasted" effort?
>
>
> > Today, we just configure it all out and put
> > in our own stuff. We would prefer to have a mainstream framework that
> > could be used to meet both Intel laptop needs and embedded device needs...
>
> I don't think I ever said anything against that notion of having PM
> infrastructure capable of handling both PC and embedded configs.  Not
> that I've seen a framework that handles either one well -- yet! -- so
> such notions haven't yet progressed to being testable theories.
>
> Against the notion of infrastructure (PM or otherwise) that's not
> well designed or defined -- certainly I've argued.  That includes
> much current PM infrastructure, and most recent proposals.
>
> - Dave
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
>

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
  2007-03-14 22:08 Scott E. Preece
@ 2007-03-14 23:23 ` David Brownell
  2007-03-15  7:25   ` Ikhwan Lee
  2007-03-15 10:33   ` Eugeny S. Mints
  0 siblings, 2 replies; 84+ messages in thread
From: David Brownell @ 2007-03-14 23:23 UTC (permalink / raw)
  To: Scott E. Preece; +Cc: linux-pm, linux, pavel

On Wednesday 14 March 2007 3:08 pm, Scott E. Preece wrote:
> | > 
> | > But shouldn't it be useful on every platform? ..
> | 
> | I couldn't know.  This "alternative concept" hasn't gotten very far
> | into the hand-waving stage, much less beyond it into proposed interface
> | or (gasp!) implementations.  Platforms that don't *have* those particular
> | interdependencies should not of course incur costs to implement them...
> ---
> 
> Well, that's fine if the platform you use is the current design
> center.

So you think that platforms which don't have such interdependencies
should incur costs and complexity to address problems they don't have.
Why?


> For the rest of us, though, all the stuff you're currently 
> doing for power management is wasted effort and why should we incur
> costs to work around them? 

Me personally?  What specifically are you referring to, and
in what respects would that be "wasted" effort?


> Today, we just configure it all out and put 
> in our own stuff. We would prefer to have a mainstream framework that
> could be used to meet both Intel laptop needs and embedded device needs...

I don't think I ever said anything against that notion of having PM
infrastructure capable of handling both PC and embedded configs.  Not
that I've seen a framework that handles either one well -- yet! -- so
such notions haven't yet progressed to being testable theories.

Against the notion of infrastructure (PM or otherwise) that's not
well designed or defined -- certainly I've argued.  That includes
much current PM infrastructure, and most recent proposals.

- Dave

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Alternative Concept
@ 2007-03-14 22:08 Scott E. Preece
  2007-03-14 23:23 ` David Brownell
  0 siblings, 1 reply; 84+ messages in thread
From: Scott E. Preece @ 2007-03-14 22:08 UTC (permalink / raw)
  To: david-b; +Cc: linux-pm, linux, pavel



| From: David Brownell<david-b@pacbell.net>
| ...
| > > > Basically a good way of thinking about parameter framework is that parameter 
| > > > framework would start from existed clock framework and gradually evolve by 
| > > > addressing voltages, relationship between clocks and voltages, domains. 
| > > > Eventually clock framework functionality would be a part of power parameter 
| > > > framework.
| > > 
| > > A better way would be to say that implementions of the clock interface
| > > on a given platform can build on whatever they need to build.  That might
| > > include a "parameter" framework, if such a thing were defined in such
| > > a way that it became useful to such implementations.
| > > 
| > But shouldn't it be useful on every platform? As a sort of resource
| > manager (because that's what it would become if it would start adressing
| > interdependencies between clocks and voltages).
| 
| I couldn't know.  This "alternative concept" hasn't gotten very far
| into the hand-waving stage, much less beyond it into proposed interface
| or (gasp!) implementations.  Platforms that don't *have* those particular
| interdependencies should not of course incur costs to implement them...
---

Well, that's fine if the platform you use is the current design
center. For the rest of us, though, all the stuff you're currently
doing for power management is wasted effort and why should we incur
costs to work around them?  Today, we just configure it all out and put
in our own stuff. We would prefer to have a mainstream framework that
could be used to meet both Intel laptop needs and embedded device needs...

scott
-- 
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il  61820  
e-mail:	preece@motorola.com	fax:	+1-217-384-8550
phone:	+1-217-384-8589	cell: +1-217-433-6114	pager: 2174336114@vtext.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2007-03-24 10:20 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-08-24  1:23 [RFC] CPUFreq PowerOP integration, Intro 0/3 Eugeny S. Mints
2006-10-07  2:36 ` Alternative Concept [Was: Re: [RFC] CPUFreq PowerOP integration, Intro 0/3] Dominik Brodowski
2006-10-07  3:15   ` Dominik Brodowski
2006-10-08  7:16   ` Pavel Machek
2006-10-12 15:38     ` Mark Gross
2006-10-12 16:02       ` Dominik Brodowski
2006-10-16 21:56         ` Mark Gross
2006-10-17 21:40           ` Matthew Locke
2006-10-12 16:48       ` Pavel Machek
2006-10-12 17:12         ` Vitaly Wool
2006-10-12 17:23           ` Pavel Machek
2006-10-09 18:21   ` Mark Gross
2006-10-26  3:06     ` Dominik Brodowski
2006-10-12 22:43   ` Eugeny S. Mints
2006-10-13 10:55     ` Pavel Machek
2006-10-16 21:44       ` Mark Gross
2006-10-17  8:26         ` Pavel Machek
2006-10-26  3:05     ` Dominik Brodowski
2007-03-13  0:57   ` Alternative Concept Matthew Locke
2007-03-13 11:08     ` Pavel Machek
2007-03-13 20:34       ` Mark Gross
2007-03-14  2:30         ` Ikhwan Lee
2007-03-14 10:43           ` Eugeny S. Mints
2007-03-14 17:19             ` David Brownell
2007-03-14 18:12               ` Igor Stoppa
2007-03-14 18:45                 ` David Brownell
2007-03-15  9:53               ` Eugeny S. Mints
2007-03-15 13:04                 ` Igor Stoppa
2007-03-16  2:21                   ` David Brownell
2007-03-16  3:56                     ` Ikhwan Lee
2007-03-16  6:17                       ` David Brownell
2007-03-19  2:27                         ` Ikhwan Lee
2007-03-19  6:07                           ` David Brownell
2007-03-16 13:06                     ` Dmitry Krivoschekov
2007-03-16 18:03                       ` David Brownell
2007-03-18 20:25                         ` Dmitry Krivoschekov
2007-03-19  4:04                           ` David Brownell
2007-03-20  0:03                             ` Dmitry Krivoschekov
2007-03-20  8:07                               ` David Brownell
2007-03-20  9:45                                 ` Dmitry Krivoschekov
2007-03-20 10:30                                   ` Igor Stoppa
2007-03-20 12:13                                     ` Eugeny S. Mints
2007-03-20 12:39                                       ` Igor Stoppa
2007-03-20 13:44                                         ` Dmitry Krivoschekov
2007-03-20 21:03                                         ` David Brownell
2007-03-20 13:07                                     ` Dmitry Krivoschekov
2007-03-20 13:52                                       ` Igor Stoppa
2007-03-20 14:58                                         ` Dmitry Krivoschekov
2007-03-20 15:36                                           ` Pavel Machek
2007-03-20 19:16                                             ` Dmitry Krivoschekov
2007-03-20 20:45                                               ` Pavel Machek
2007-03-20 22:04                                                 ` David Brownell
2007-03-20 22:06                                                   ` Pavel Machek
2007-03-20 23:29                                                     ` David Brownell
2007-03-20 15:36                                           ` Igor Stoppa
2007-03-20 19:17                                             ` Dmitry Krivoschekov
2007-03-20 20:17                                             ` David Brownell
2007-03-20 20:21                                       ` David Brownell
2007-03-20 19:58                                   ` David Brownell
2007-03-24  0:47                                     ` charging batteries from USB [was: Re: Alternative Concept] Dmitry Krivoschekov
2007-03-24  1:17                                       ` David Brownell
2007-03-24  1:48                                         ` Dmitry Krivoschekov
2007-03-24  2:35                                           ` David Brownell
2007-03-24 10:20                                             ` Oliver Neukum
2007-03-24  8:36                                       ` Oliver Neukum
2007-03-14  3:19       ` Alternative Concept Dominik Brodowski
2007-03-14 22:08 Scott E. Preece
2007-03-14 23:23 ` David Brownell
2007-03-15  7:25   ` Ikhwan Lee
2007-03-15  8:14     ` Amit Kucheria
2007-03-15 10:55       ` Eugeny S. Mints
2007-03-15 10:46     ` Eugeny S. Mints
2007-03-15 10:33   ` Eugeny S. Mints
2007-03-15 13:21 Scott E. Preece
2007-03-15 13:29 Scott E. Preece
2007-03-15 23:07 ` David Brownell
2007-03-15 14:00 Scott E. Preece
2007-03-15 14:38 ` Eugeny S. Mints
2007-03-15 17:33 ` Woodruff, Richard
2007-03-19 14:12 Scott E. Preece
2007-03-20  7:56 ` David Brownell
2007-03-20 14:26   ` Amit Kucheria
2007-03-20 15:08     ` Dmitry Krivoschekov
2007-03-20 17:04       ` David Brownell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.