linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH 0/2] PM / Domains / cpuidle: Preliminary cpuidle support for generic PM domains
@ 2012-05-09 21:40 Rafael J. Wysocki
  2012-05-09 21:41 ` [RFC][PATCH 1/2] PM / cpuidle: Add driver reference counter Rafael J. Wysocki
  2012-05-09 21:43 ` [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support Rafael J. Wysocki
  0 siblings, 2 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-09 21:40 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Len Brown, Colin Cross, Kevin Hilman, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

Hi all,

Since people have been spending quite some time on cpuidle recently,
I thought it might be a good time to show some ideas about how to make cpuidle
work along with the generic PM domains framework (and the other way around).

The first patch is just an addition that's used by the second one.  The
second patch is not functional code and untested, so it is for comments only
right now (no signoff), although I did my best to make is as close to
functional as possible.

The details are in the changelogs, please tell me what you think.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC][PATCH 1/2] PM / cpuidle: Add driver reference counter
  2012-05-09 21:40 [RFC][PATCH 0/2] PM / Domains / cpuidle: Preliminary cpuidle support for generic PM domains Rafael J. Wysocki
@ 2012-05-09 21:41 ` Rafael J. Wysocki
  2012-05-09 21:43 ` [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support Rafael J. Wysocki
  1 sibling, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-09 21:41 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Len Brown, Colin Cross, Kevin Hilman, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

From: Rafael J. Wysocki <rjw@sisk.pl>

Add a reference counter for the cpuidle driver, so that it can't
be unregistered when it is in use.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/cpuidle/driver.c |   29 ++++++++++++++++++++++++++++-
 include/linux/cpuidle.h  |    6 +++++-
 2 files changed, 33 insertions(+), 2 deletions(-)

Index: linux/drivers/cpuidle/driver.c
===================================================================
--- linux.orig/drivers/cpuidle/driver.c
+++ linux/drivers/cpuidle/driver.c
@@ -16,6 +16,7 @@
 
 static struct cpuidle_driver *cpuidle_curr_driver;
 DEFINE_SPINLOCK(cpuidle_driver_lock);
+int cpuidle_driver_refcount;
 
 static void __cpuidle_register_driver(struct cpuidle_driver *drv)
 {
@@ -89,8 +90,34 @@ void cpuidle_unregister_driver(struct cp
 	}
 
 	spin_lock(&cpuidle_driver_lock);
-	cpuidle_curr_driver = NULL;
+
+	if (!WARN_ON(cpuidle_driver_refcount > 0))
+		cpuidle_curr_driver = NULL;
+
 	spin_unlock(&cpuidle_driver_lock);
 }
 
 EXPORT_SYMBOL_GPL(cpuidle_unregister_driver);
+
+struct cpuidle_driver *cpuidle_driver_ref(void)
+{
+	struct cpuidle_driver *drv;
+
+	spin_lock(&cpuidle_driver_lock);
+
+	drv = cpuidle_curr_driver;
+	cpuidle_driver_refcount++;
+
+	spin_unlock(&cpuidle_driver_lock);
+	return drv;
+}
+
+void cpuidle_driver_unref(void)
+{
+	spin_lock(&cpuidle_driver_lock);
+
+	if (!WARN_ON(cpuidle_driver_refcount <= 0))
+		cpuidle_driver_refcount--;
+
+	spin_unlock(&cpuidle_driver_lock);
+}
Index: linux/include/linux/cpuidle.h
===================================================================
--- linux.orig/include/linux/cpuidle.h
+++ linux/include/linux/cpuidle.h
@@ -136,7 +136,9 @@ struct cpuidle_driver {
 extern void disable_cpuidle(void);
 extern int cpuidle_idle_call(void);
 extern int cpuidle_register_driver(struct cpuidle_driver *drv);
-struct cpuidle_driver *cpuidle_get_driver(void);
+extern struct cpuidle_driver *cpuidle_get_driver(void);
+extern struct cpuidle_driver *cpuidle_driver_ref(void);
+extern void cpuidle_driver_unref(void);
 extern void cpuidle_unregister_driver(struct cpuidle_driver *drv);
 extern int cpuidle_register_device(struct cpuidle_device *dev);
 extern void cpuidle_unregister_device(struct cpuidle_device *dev);
@@ -157,6 +159,8 @@ static inline int cpuidle_idle_call(void
 static inline int cpuidle_register_driver(struct cpuidle_driver *drv)
 {return -ENODEV; }
 static inline struct cpuidle_driver *cpuidle_get_driver(void) {return NULL; }
+static inline struct cpuidle_driver *cpuidle_driver_ref(void) {return NULL; }
+static inline void cpuidle_driver_unref(void) {}
 static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { }
 static inline int cpuidle_register_device(struct cpuidle_device *dev)
 {return -ENODEV; }


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-09 21:40 [RFC][PATCH 0/2] PM / Domains / cpuidle: Preliminary cpuidle support for generic PM domains Rafael J. Wysocki
  2012-05-09 21:41 ` [RFC][PATCH 1/2] PM / cpuidle: Add driver reference counter Rafael J. Wysocki
@ 2012-05-09 21:43 ` Rafael J. Wysocki
  2012-05-10 10:10   ` Santosh Shilimkar
  2012-05-12 19:40   ` [Update][RFC][PATCH " Rafael J. Wysocki
  1 sibling, 2 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-09 21:43 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Len Brown, Colin Cross, Kevin Hilman, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

From: Rafael J. Wysocki <rjw@sisk.pl>

On some systems there are CPU cores located in the same power
domains as I/O devices.  Then, power can only be removed from the
domain if all I/O devices in it are not in use and the CPU core
is idle.  Add preliminary support for that to the generic PM domains
framework.

This assumes that there is only one CPU core in the system and it is
supposed to be set up in the following way.

First, the platform is expected to provide a cpuidle driver with one
extra state designated for the generic PM domains code to handle.
This state should be initially disabled and its exit_latency value
should be set to whatever time is needed to bring up the CPU core
itself after restoring power to it, not including the domain's
power on latency.  Its .enter() callback should point to a procedure
that will save the CPU core's state as appropriate before power
removal.  On success, it should return the same value as it has
been passed as its third argument, but it shouldn't put the CPU
core into a C-state.  If it is about to return the index of
a different cpuidle state, however, it should make sure that the CPU
be put into that state before it returns.

The remaining characteristics of the extra cpuidle state, referred to
as the "domain" cpuidle state below, (e.g. power usage, target
residency) should be populated in accordance with the properties of
the hardware.

Next, the platform should execute genpd_attach_cpuidle() on the PM
domain containing the CPU core.  That will cause the generic PM
domains framework to treat that domain in a special way such that:

 * When all devices in the domain have been suspended and it is about
   to be turned off, the states of the devices will be saved, but
   power will not be removed from the domain.  Instead, the "domain"
   cpuidle state will be enabled so that power can be removed from
   the domain when the CPU core is idle and the state has been chosen
   as the target by the cpuidle governor.  In that case, before
   removing power from the domain, the framework will execute the
   .enter() callback initially defined for the "domain" state.

 * When the first I/O device in the domain is resumed and
   __pm_genpd_poweron(() is called for the first time after
   power has been removed from the domain, the "domain" cpuidle
   state will be disabled to avoid subsequent surprise power removals
   via cpuidle.

---
 drivers/base/power/domain.c |  152 +++++++++++++++++++++++++++++++++++++++++++-
 include/linux/cpuidle.h     |    2 
 include/linux/pm_domain.h   |   20 +++++
 3 files changed, 173 insertions(+), 1 deletion(-)

Index: linux/include/linux/pm_domain.h
===================================================================
--- linux.orig/include/linux/pm_domain.h
+++ linux/include/linux/pm_domain.h
@@ -15,6 +15,7 @@
 #include <linux/err.h>
 #include <linux/of.h>
 #include <linux/notifier.h>
+#include <linux/cpuidle.h>
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -45,6 +46,14 @@ struct gpd_dev_ops {
 	bool (*active_wakeup)(struct device *dev);
 };
 
+struct gpd_cpu_data {
+	unsigned int saved_exit_latency;
+	struct cpuidle_state *idle_state;
+	int (*idle_enter)(struct cpuidle_device *dev,
+			  struct cpuidle_driver *drv,
+			  int index);
+};
+
 struct generic_pm_domain {
 	struct dev_pm_domain domain;	/* PM domain operations */
 	struct list_head gpd_list_node;	/* Node in the global PM domains list */
@@ -75,6 +84,7 @@ struct generic_pm_domain {
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
 	struct device_node *of_node; /* Node in device tree */
+	struct gpd_cpu_data *cpu_data;
 };
 
 static inline struct generic_pm_domain *pd_to_genpd(struct dev_pm_domain *pd)
@@ -154,6 +164,8 @@ extern int pm_genpd_add_callbacks(struct
 				  struct gpd_dev_ops *ops,
 				  struct gpd_timing_data *td);
 extern int __pm_genpd_remove_callbacks(struct device *dev, bool clear_td);
+extern int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state);
+extern int genpd_detach_cpuidle(struct generic_pm_domain *genpd);
 extern void pm_genpd_init(struct generic_pm_domain *genpd,
 			  struct dev_power_governor *gov, bool is_off);
 
@@ -209,6 +221,14 @@ static inline int __pm_genpd_remove_call
 {
 	return -ENOSYS;
 }
+static inline int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int st)
+{
+	return -ENOSYS;
+}
+static inline int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	return -ENOSYS;
+}
 static inline void pm_genpd_init(struct generic_pm_domain *genpd,
 				 struct dev_power_governor *gov, bool is_off)
 {
Index: linux/drivers/base/power/domain.c
===================================================================
--- linux.orig/drivers/base/power/domain.c
+++ linux/drivers/base/power/domain.c
@@ -139,6 +139,19 @@ static void genpd_set_active(struct gene
 		genpd->status = GPD_STATE_ACTIVE;
 }
 
+static void genpd_recalc_cpu_exit_latency(struct generic_pm_domain *genpd)
+{
+	s64 usecs64;
+
+	if (!genpd->cpu_data)
+		return;
+
+	usecs64 = genpd->power_on_latency_ns;
+	do_div(usecs64, NSEC_PER_USEC);
+	usecs64 += genpd->cpu_data->saved_exit_latency;
+	genpd->cpu_data->idle_state->exit_latency = usecs64;
+}
+
 /**
  * __pm_genpd_poweron - Restore power to a given PM domain and its masters.
  * @genpd: PM domain to power up.
@@ -203,7 +216,11 @@ int __pm_genpd_poweron(struct generic_pm
 		}
 	}
 
-	if (genpd->power_on) {
+	if (genpd->cpu_data) {
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = true;
+		cpuidle_resume_and_unlock();
+	} else if (genpd->power_on) {
 		ktime_t time_start = ktime_get();
 		s64 elapsed_ns;
 
@@ -215,6 +232,7 @@ int __pm_genpd_poweron(struct generic_pm
 		if (elapsed_ns > genpd->power_on_latency_ns) {
 			genpd->power_on_latency_ns = elapsed_ns;
 			genpd->max_off_time_changed = true;
+			genpd_recalc_cpu_exit_latency(genpd);
 			if (genpd->name)
 				pr_warning("%s: Power-on latency exceeded, "
 					"new value %lld ns\n", genpd->name,
@@ -464,6 +482,21 @@ static int pm_genpd_poweroff(struct gene
 		}
 	}
 
+	if (genpd->cpu_data) {
+		/*
+		 * If cpu_data is set, cpuidle should turn the domain off when
+		 * the CPU in it is idle.  In that case we don't decrement the
+		 * subdomain counts of the master domains, so that power is not
+		 * removed from the current domain prematurely as a result of
+		 * cutting off the masters' power.
+		 */
+		genpd->status = GPD_STATE_POWER_OFF;
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = false;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	if (genpd->power_off) {
 		ktime_t time_start;
 		s64 elapsed_ns;
@@ -1597,6 +1630,123 @@ int __pm_genpd_remove_callbacks(struct d
 }
 EXPORT_SYMBOL_GPL(__pm_genpd_remove_callbacks);
 
+int genpd_cpuidle_enter(struct cpuidle_device *dev, struct cpuidle_driver *drv,
+			int index)
+{
+	struct cpuidle_state *target = &drv->states[index];
+	struct generic_pm_domain *genpd = target->platform_data;
+	struct gpd_cpu_data *cpu_data;
+	int state;
+
+	if (!genpd || !genpd->cpu_data)
+		goto fall_back;
+
+	cpu_data = genpd->cpu_data;
+	state = cpu_data->idle_enter(dev, drv, index);
+	if (state != index)
+		return state;
+
+	if (!genpd->power_off)
+		goto fall_back;
+
+	/*
+	 * We can safely power off here, because __pm_genpd_poweron(() has to
+	 * run in process context, so the CPU has to exit idle before that
+	 * function runs.
+	 */
+	if (!genpd->power_off(genpd))
+		return index;
+
+ fall_back:
+	index = drv->safe_state_index;
+	return drv->states[index].enter(dev, drv, index);
+}
+
+int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state)
+{
+	struct cpuidle_driver *cpuidle_drv;
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd) || state < 0)
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	if (genpd->cpu_data) {
+		ret = -EEXIST;
+		goto out;
+	}
+	cpu_data = kzalloc(sizeof(*cpu_data), GFP_KERNEL);
+	if (!cpu_data) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpuidle_drv = cpuidle_driver_ref();
+	if (!cpuidle_drv) {
+		ret = -ENODEV;
+		goto out;
+	}
+	if (cpuidle_drv->state_count <= state) {
+		ret = -EINVAL;
+		goto err;
+	}
+	idle_state = &cpuidle_drv->states[state];
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto err;
+	}
+	cpu_data->idle_state = idle_state;
+	cpu_data->saved_exit_latency = idle_state->exit_latency;
+	cpu_data->idle_enter = idle_state->enter;
+	genpd->cpu_data = cpu_data;
+	idle_state->platform_data = genpd;
+	idle_state->enter = genpd_cpuidle_enter;
+	genpd_recalc_cpu_exit_latency(genpd);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+
+ err:
+	cpuidle_driver_unref();
+	goto out;
+}
+
+int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd))
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	cpu_data = genpd->cpu_data;
+	if (!cpu_data) {
+		ret = -ENODEV;
+		goto out;
+	}
+	idle_state = cpu_data->idle_state;
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto out;
+	}
+	idle_state->enter = cpu_data->idle_enter;
+	idle_state->exit_latency = cpu_data->saved_exit_latency;
+	idle_state->platform_data = NULL;
+	cpuidle_driver_unref();
+	genpd->cpu_data = NULL;
+	kfree(cpu_data);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+}
+
 /* Default device callbacks for generic PM domains. */
 
 /**
Index: linux/include/linux/cpuidle.h
===================================================================
--- linux.orig/include/linux/cpuidle.h
+++ linux/include/linux/cpuidle.h
@@ -53,6 +53,8 @@ struct cpuidle_state {
 			int index);
 
 	int (*enter_dead) (struct cpuidle_device *dev, int index);
+
+	void		*platform_data;
 };
 
 /* Idle State Flags */


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-09 21:43 ` [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support Rafael J. Wysocki
@ 2012-05-10 10:10   ` Santosh Shilimkar
  2012-05-10 18:41     ` Rafael J. Wysocki
  2012-05-12 19:40   ` [Update][RFC][PATCH " Rafael J. Wysocki
  1 sibling, 1 reply; 19+ messages in thread
From: Santosh Shilimkar @ 2012-05-10 10:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

Rafael,

On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> On some systems there are CPU cores located in the same power
> domains as I/O devices.  Then, power can only be removed from the
> domain if all I/O devices in it are not in use and the CPU core
> is idle.  Add preliminary support for that to the generic PM domains
> framework.
> 
I am just curious to know, what kind of IO devices, you are
talking here? And also how those devices linked with CPU low power
states apart from being part of same power domain. And is it
the power domain or more of voltage domain, we are talking here.

> This assumes that there is only one CPU core in the system and it is
> supposed to be set up in the following way.
> 
> First, the platform is expected to provide a cpuidle driver with one
> extra state designated for the generic PM domains code to handle.
> This state should be initially disabled and its exit_latency value
> should be set to whatever time is needed to bring up the CPU core
> itself after restoring power to it, not including the domain's
> power on latency.  Its .enter() callback should point to a procedure
> that will save the CPU core's state as appropriate before power
> removal.  On success, it should return the same value as it has
> been passed as its third argument, but it shouldn't put the CPU
> core into a C-state.  If it is about to return the index of
> a different cpuidle state, however, it should make sure that the CPU
> be put into that state before it returns.
> 
> The remaining characteristics of the extra cpuidle state, referred to
> as the "domain" cpuidle state below, (e.g. power usage, target
> residency) should be populated in accordance with the properties of
> the hardware.
> 
> Next, the platform should execute genpd_attach_cpuidle() on the PM
> domain containing the CPU core.  That will cause the generic PM
> domains framework to treat that domain in a special way such that:
> 
>  * When all devices in the domain have been suspended and it is about
>    to be turned off, the states of the devices will be saved, but
>    power will not be removed from the domain.  Instead, the "domain"
>    cpuidle state will be enabled so that power can be removed from
>    the domain when the CPU core is idle and the state has been chosen
>    as the target by the cpuidle governor.  In that case, before
>    removing power from the domain, the framework will execute the
>    .enter() callback initially defined for the "domain" state.
> 
>  * When the first I/O device in the domain is resumed and
>    __pm_genpd_poweron(() is called for the first time after
>    power has been removed from the domain, the "domain" cpuidle
>    state will be disabled to avoid subsequent surprise power removals
>    via cpuidle.
> 

If these are CPU cluster/package specific IO's like interrupt
controller, cache controller, Coherency interconnect etc and
if the intention is to ensure that these devices context
is saved/restored in cpuidle entry/exit, it can be handled with
CPU PM notifiers. We already do that for ARM SOCs.

>From the patch description it seems, they are general purpose
peripherals. We had one thermal sensor on OMAP which
wrongly clocked from the CPU clock source and needed
some idle notifier infrastructure to prepare/resume
this device for idle entry/exit.

Regards
Santosh





^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-10 10:10   ` Santosh Shilimkar
@ 2012-05-10 18:41     ` Rafael J. Wysocki
  2012-05-11  8:23       ` Santosh Shilimkar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-10 18:41 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Thursday, May 10, 2012, Santosh Shilimkar wrote:
> Rafael,
> 
> On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > On some systems there are CPU cores located in the same power
> > domains as I/O devices.  Then, power can only be removed from the
> > domain if all I/O devices in it are not in use and the CPU core
> > is idle.  Add preliminary support for that to the generic PM domains
> > framework.
> > 
> I am just curious to know, what kind of IO devices, you are
> talking here?

Nothing specific, really.  It can be any kind of I/O devices that happen
to be in the same power domain.  This includes USB, SDHI, MMCIF controllers
on the SoC I have in mind in particular.

> And also how those devices linked with CPU low power
> states apart from being part of same power domain. And is it
> the power domain or more of voltage domain, we are talking here.

Depending on the definitions I guess.  How do you define a power domain and
a voltage domain?

> > This assumes that there is only one CPU core in the system and it is
> > supposed to be set up in the following way.
> > 
> > First, the platform is expected to provide a cpuidle driver with one
> > extra state designated for the generic PM domains code to handle.
> > This state should be initially disabled and its exit_latency value
> > should be set to whatever time is needed to bring up the CPU core
> > itself after restoring power to it, not including the domain's
> > power on latency.  Its .enter() callback should point to a procedure
> > that will save the CPU core's state as appropriate before power
> > removal.  On success, it should return the same value as it has
> > been passed as its third argument, but it shouldn't put the CPU
> > core into a C-state.  If it is about to return the index of
> > a different cpuidle state, however, it should make sure that the CPU
> > be put into that state before it returns.
> > 
> > The remaining characteristics of the extra cpuidle state, referred to
> > as the "domain" cpuidle state below, (e.g. power usage, target
> > residency) should be populated in accordance with the properties of
> > the hardware.
> > 
> > Next, the platform should execute genpd_attach_cpuidle() on the PM
> > domain containing the CPU core.  That will cause the generic PM
> > domains framework to treat that domain in a special way such that:
> > 
> >  * When all devices in the domain have been suspended and it is about
> >    to be turned off, the states of the devices will be saved, but
> >    power will not be removed from the domain.  Instead, the "domain"
> >    cpuidle state will be enabled so that power can be removed from
> >    the domain when the CPU core is idle and the state has been chosen
> >    as the target by the cpuidle governor.  In that case, before
> >    removing power from the domain, the framework will execute the
> >    .enter() callback initially defined for the "domain" state.
> > 
> >  * When the first I/O device in the domain is resumed and
> >    __pm_genpd_poweron(() is called for the first time after
> >    power has been removed from the domain, the "domain" cpuidle
> >    state will be disabled to avoid subsequent surprise power removals
> >    via cpuidle.
> > 
> 
> If these are CPU cluster/package specific IO's like interrupt
> controller, cache controller, Coherency interconnect etc and
> if the intention is to ensure that these devices context
> is saved/restored in cpuidle entry/exit, it can be handled with
> CPU PM notifiers.

Maybe it can, but I'm not so sure of that in general.

> We already do that for ARM SOCs.

Surely not all of them?  I know of a few at least where this isn't done.

> From the patch description it seems, they are general purpose
> peripherals.

Yes, they are.

> We had one thermal sensor on OMAP which
> wrongly clocked from the CPU clock source and needed
> some idle notifier infrastructure to prepare/resume
> this device for idle entry/exit.

The system I have in mind is designed in such a way that there is a power
domain with three subdomains, one of which contains the CPU core and the
remaining two contain I/O devices of various kinds.  General purpose as well
as "core".

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-10 18:41     ` Rafael J. Wysocki
@ 2012-05-11  8:23       ` Santosh Shilimkar
  2012-05-11  8:35         ` Magnus Damm
  2012-05-11 18:59         ` Rafael J. Wysocki
  0 siblings, 2 replies; 19+ messages in thread
From: Santosh Shilimkar @ 2012-05-11  8:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
> On Thursday, May 10, 2012, Santosh Shilimkar wrote:
>> Rafael,
>>
>> On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rjw@sisk.pl>
>>>
>>> On some systems there are CPU cores located in the same power
>>> domains as I/O devices.  Then, power can only be removed from the
>>> domain if all I/O devices in it are not in use and the CPU core
>>> is idle.  Add preliminary support for that to the generic PM domains
>>> framework.
>>>
>> I am just curious to know, what kind of IO devices, you are
>> talking here?
> 
> Nothing specific, really.  It can be any kind of I/O devices that happen
> to be in the same power domain.  This includes USB, SDHI, MMCIF controllers
> on the SoC I have in mind in particular.
> 
OK.
These are more of generic devices and actually not related to CPU/CPU
clusters as such.

>> And also how those devices linked with CPU low power
>> states apart from being part of same power domain. And is it
>> the power domain or more of voltage domain, we are talking here.
> 
> Depending on the definitions I guess.  How do you define a power domain and
> a voltage domain?
>

A voltage domain can be a section of the device supplied by a dedicated
voltage rail. A voltage domain can have many power-domains like
CPU cluster domain, Interconnect domain, peripheral domains.
And each power domain then can have many sub-modules like UART, SPI,
USB etc

>>> This assumes that there is only one CPU core in the system and it is
>>> supposed to be set up in the following way.
>>>
>>> First, the platform is expected to provide a cpuidle driver with one
>>> extra state designated for the generic PM domains code to handle.
>>> This state should be initially disabled and its exit_latency value
>>> should be set to whatever time is needed to bring up the CPU core
>>> itself after restoring power to it, not including the domain's
>>> power on latency.  Its .enter() callback should point to a procedure
>>> that will save the CPU core's state as appropriate before power
>>> removal.  On success, it should return the same value as it has
>>> been passed as its third argument, but it shouldn't put the CPU
>>> core into a C-state.  If it is about to return the index of
>>> a different cpuidle state, however, it should make sure that the CPU
>>> be put into that state before it returns.
>>>
>>> The remaining characteristics of the extra cpuidle state, referred to
>>> as the "domain" cpuidle state below, (e.g. power usage, target
>>> residency) should be populated in accordance with the properties of
>>> the hardware.
>>>
>>> Next, the platform should execute genpd_attach_cpuidle() on the PM
>>> domain containing the CPU core.  That will cause the generic PM
>>> domains framework to treat that domain in a special way such that:
>>>
>>>  * When all devices in the domain have been suspended and it is about
>>>    to be turned off, the states of the devices will be saved, but
>>>    power will not be removed from the domain.  Instead, the "domain"
>>>    cpuidle state will be enabled so that power can be removed from
>>>    the domain when the CPU core is idle and the state has been chosen
>>>    as the target by the cpuidle governor.  In that case, before
>>>    removing power from the domain, the framework will execute the
>>>    .enter() callback initially defined for the "domain" state.
>>>
>>>  * When the first I/O device in the domain is resumed and
>>>    __pm_genpd_poweron(() is called for the first time after
>>>    power has been removed from the domain, the "domain" cpuidle
>>>    state will be disabled to avoid subsequent surprise power removals
>>>    via cpuidle.
>>>
>>
>> If these are CPU cluster/package specific IO's like interrupt
>> controller, cache controller, Coherency interconnect etc and
>> if the intention is to ensure that these devices context
>> is saved/restored in cpuidle entry/exit, it can be handled with
>> CPU PM notifiers.
> 
> Maybe it can, but I'm not so sure of that in general.
> 
>> We already do that for ARM SOCs.
> 
> Surely not all of them?  I know of a few at least where this isn't done.
>
You are right these are not for general purpose IO's

>> From the patch description it seems, they are general purpose
>> peripherals.
> 
> Yes, they are.
> 
>> We had one thermal sensor on OMAP which
>> wrongly clocked from the CPU clock source and needed
>> some idle notifier infrastructure to prepare/resume
>> this device for idle entry/exit.
> 
> The system I have in mind is designed in such a way that there is a power
> domain with three subdomains, one of which contains the CPU core and the
> remaining two contain I/O devices of various kinds.  General purpose as well
> as "core".
> 
I am not sure CPUIDLE is suppose to take care of these kind of general
purpose IO's. CPUIDLE should take care of CPU and CPU cluster power
management. Any other peripherals as you mentioned should be already
have some sort of device drivers and they should be using runtime PM for
it, no? And for the constraints, PM-Qos can be used. So far CPUIDLE
core code has maintained that distinction and all the C-state latencies
are of the CPU clusters rather than the SOC.

If you have a voltage rail dependency then that should be handled
in the voltage layer/regulator layer. If there is a power domain
dependency then the power domain framework should do the use
counting yo handle such scenarios.

Please correct me but, IIUC, your proposal wants to use CPUIDLE
for the SOC level power management.
Will you be able to expand your requirements and explain why can't
you manage PM for the general purpose devices like MMC, USB etc
in their own device drivers ?

Regards
Santosh





^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-11  8:23       ` Santosh Shilimkar
@ 2012-05-11  8:35         ` Magnus Damm
  2012-05-11 18:59         ` Rafael J. Wysocki
  1 sibling, 0 replies; 19+ messages in thread
From: Magnus Damm @ 2012-05-11  8:35 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Len Brown, Colin Cross,
	Kevin Hilman, Arjan van de Ven

Hi Santosh,

On Fri, May 11, 2012 at 5:23 PM, Santosh Shilimkar
<santosh.shilimkar@ti.com> wrote:
> On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
>> On Thursday, May 10, 2012, Santosh Shilimkar wrote:
>>> Rafael,
>>>
>>> On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rjw@sisk.pl>
>>>>
>>>> On some systems there are CPU cores located in the same power
>>>> domains as I/O devices.  Then, power can only be removed from the
>>>> domain if all I/O devices in it are not in use and the CPU core
>>>> is idle.  Add preliminary support for that to the generic PM domains
>>>> framework.
>>>>
>>> I am just curious to know, what kind of IO devices, you are
>>> talking here?
>>
>> Nothing specific, really.  It can be any kind of I/O devices that happen
>> to be in the same power domain.  This includes USB, SDHI, MMCIF controllers
>> on the SoC I have in mind in particular.
>>
> OK.
> These are more of generic devices and actually not related to CPU/CPU
> clusters as such.
>
>>> And also how those devices linked with CPU low power
>>> states apart from being part of same power domain. And is it
>>> the power domain or more of voltage domain, we are talking here.
>>
>> Depending on the definitions I guess.  How do you define a power domain and
>> a voltage domain?
>>
>
> A voltage domain can be a section of the device supplied by a dedicated
> voltage rail. A voltage domain can have many power-domains like
> CPU cluster domain, Interconnect domain, peripheral domains.
> And each power domain then can have many sub-modules like UART, SPI,
> USB etc

There are no software controllable voltage domains on the
sh7372/Mackerel SoC/board that Rafael is using. The SoC does however
have multiple hierarchical power domains, and in the case when we want
to have transparent runtime power management of the root power domain
then we will have to take both I/O devices and CPU cores into
consideration. I suppose exactly how to do that is up for discussion,
but CPUIdle is certainly a natural step in some direction - this since
the CPU-only power domains are already managed by CPUIdle.

Thanks,

/ magnus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-11  8:23       ` Santosh Shilimkar
  2012-05-11  8:35         ` Magnus Damm
@ 2012-05-11 18:59         ` Rafael J. Wysocki
  2012-05-12  6:32           ` Shilimkar, Santosh
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-11 18:59 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Friday, May 11, 2012, Santosh Shilimkar wrote:
> On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
> > On Thursday, May 10, 2012, Santosh Shilimkar wrote:
> >> Rafael,
> >>
> >> On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <rjw@sisk.pl>
> >>>
> >>> On some systems there are CPU cores located in the same power
> >>> domains as I/O devices.  Then, power can only be removed from the
> >>> domain if all I/O devices in it are not in use and the CPU core
> >>> is idle.  Add preliminary support for that to the generic PM domains
> >>> framework.
> >>>
> >> I am just curious to know, what kind of IO devices, you are
> >> talking here?
> > 
> > Nothing specific, really.  It can be any kind of I/O devices that happen
> > to be in the same power domain.  This includes USB, SDHI, MMCIF controllers
> > on the SoC I have in mind in particular.
> > 
> OK.
> These are more of generic devices and actually not related to CPU/CPU
> clusters as such.
> 
> >> And also how those devices linked with CPU low power
> >> states apart from being part of same power domain. And is it
> >> the power domain or more of voltage domain, we are talking here.
> > 
> > Depending on the definitions I guess.  How do you define a power domain and
> > a voltage domain?
> >
> 
> A voltage domain can be a section of the device supplied by a dedicated
> voltage rail. A voltage domain can have many power-domains like
> CPU cluster domain, Interconnect domain, peripheral domains.
> And each power domain then can have many sub-modules like UART, SPI,
> USB etc

OK, so this is not the level of detail the code in question is about.

In my terminology a power domain is a set of devices such that it only is
possible to remove power from (and restore power to) all of them together.

[...]
> > 
> > The system I have in mind is designed in such a way that there is a power
> > domain with three subdomains, one of which contains the CPU core and the
> > remaining two contain I/O devices of various kinds.  General purpose as well
> > as "core".
> > 
> I am not sure CPUIDLE is suppose to take care of these kind of general
> purpose IO's. CPUIDLE should take care of CPU and CPU cluster power
> management. Any other peripherals as you mentioned should be already
> have some sort of device drivers and they should be using runtime PM for
> it, no?

Yes.

> And for the constraints, PM-Qos can be used. So far CPUIDLE
> core code has maintained that distinction and all the C-state latencies
> are of the CPU clusters rather than the SOC.

Well, that need not be the case and my patch is about that.

> If you have a voltage rail dependency then that should be handled
> in the voltage layer/regulator layer. If there is a power domain
> dependency then the power domain framework should do the use
> counting yo handle such scenarios.

The power domain framework, though, only covers I/O devices at the moment
and this is an attempt to extend it to cover domains containing CPU cores
as well as I/O devices.

> Please correct me but, IIUC, your proposal wants to use CPUIDLE
> for the SOC level power management.

That depends on what you mean by SoC-level.

> Will you be able to expand your requirements and explain why can't
> you manage PM for the general purpose devices like MMC, USB etc
> in their own device drivers ?

Because it is not possible to remove power from those devices individually.

Say you have a bunch of I/O devices such that you can only remove power from
all of them simultaneously (a power domain, that is).  Suppose that there
is a register such that if a specific value is written to it, power is cut
from all of those devices at the same time (and there's an analogous value
for restoring power).  Then, you can use the generic PM domains framework
for power management of those devices.

Suppose, however, that if you write the "cut power" value to the register,
your CPU core will lose power too.  This case is beyond the scope of the
existing generic PM domains framework, because it has to take the CPU power
management into account, which is cpuidle in this particular case.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-11 18:59         ` Rafael J. Wysocki
@ 2012-05-12  6:32           ` Shilimkar, Santosh
  2012-05-12 19:33             ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Shilimkar, Santosh @ 2012-05-12  6:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Sat, May 12, 2012 at 12:29 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, May 11, 2012, Santosh Shilimkar wrote:
>> On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
>> > On Thursday, May 10, 2012, Santosh Shilimkar wrote:
>> >> Rafael,
>> >>
>> >> On Thursday 10 May 2012 03:13 AM, Rafael J. Wysocki wrote:
>> >>> From: Rafael J. Wysocki <rjw@sisk.pl>
>> >>>
>> >>> On some systems there are CPU cores located in the same power
>> >>> domains as I/O devices.  Then, power can only be removed from the
>> >>> domain if all I/O devices in it are not in use and the CPU core
>> >>> is idle.  Add preliminary support for that to the generic PM domains
>> >>> framework.
>> >>>
>> >> I am just curious to know, what kind of IO devices, you are
>> >> talking here?
>> >
>> > Nothing specific, really.  It can be any kind of I/O devices that happen
>> > to be in the same power domain.  This includes USB, SDHI, MMCIF controllers
>> > on the SoC I have in mind in particular.
>> >
>> OK.
>> These are more of generic devices and actually not related to CPU/CPU
>> clusters as such.
>>
>> >> And also how those devices linked with CPU low power
>> >> states apart from being part of same power domain. And is it
>> >> the power domain or more of voltage domain, we are talking here.
>> >
>> > Depending on the definitions I guess.  How do you define a power domain and
>> > a voltage domain?
>> >
>>
>> A voltage domain can be a section of the device supplied by a dedicated
>> voltage rail. A voltage domain can have many power-domains like
>> CPU cluster domain, Interconnect domain, peripheral domains.
>> And each power domain then can have many sub-modules like UART, SPI,
>> USB etc
>
> OK, so this is not the level of detail the code in question is about.
>
> In my terminology a power domain is a set of devices such that it only is
> possible to remove power from (and restore power to) all of them together.
>
> [...]
>> >
>> > The system I have in mind is designed in such a way that there is a power
>> > domain with three subdomains, one of which contains the CPU core and the
>> > remaining two contain I/O devices of various kinds.  General purpose as well
>> > as "core".
>> >
>> I am not sure CPUIDLE is suppose to take care of these kind of general
>> purpose IO's. CPUIDLE should take care of CPU and CPU cluster power
>> management. Any other peripherals as you mentioned should be already
>> have some sort of device drivers and they should be using runtime PM for
>> it, no?
>
> Yes.
>
>> And for the constraints, PM-Qos can be used. So far CPUIDLE
>> core code has maintained that distinction and all the C-state latencies
>> are of the CPU clusters rather than the SOC.
>
> Well, that need not be the case and my patch is about that.
>
>> If you have a voltage rail dependency then that should be handled
>> in the voltage layer/regulator layer. If there is a power domain
>> dependency then the power domain framework should do the use
>> counting yo handle such scenarios.
>
> The power domain framework, though, only covers I/O devices at the moment
> and this is an attempt to extend it to cover domains containing CPU cores
> as well as I/O devices.
>
>> Please correct me but, IIUC, your proposal wants to use CPUIDLE
>> for the SOC level power management.
>
> That depends on what you mean by SoC-level.
>
>> Will you be able to expand your requirements and explain why can't
>> you manage PM for the general purpose devices like MMC, USB etc
>> in their own device drivers ?
>
> Because it is not possible to remove power from those devices individually.
>
> Say you have a bunch of I/O devices such that you can only remove power from
> all of them simultaneously (a power domain, that is).  Suppose that there
> is a register such that if a specific value is written to it, power is cut
> from all of those devices at the same time (and there's an analogous value
> for restoring power).  Then, you can use the generic PM domains framework
> for power management of those devices.
>
> Suppose, however, that if you write the "cut power" value to the register,
> your CPU core will lose power too.  This case is beyond the scope of the
> existing generic PM domains framework, because it has to take the CPU power
> management into account, which is cpuidle in this particular case.
>
Thanks for the explanation and now I understand your hardware and the
patch-set intention bit better. Multiple devices in a power domain is very
common design and OMAP also has similar concepts. If there are
more than one module, a powerdomain can maintain an usecount.
And whenever individual modules idle, they will decrements it.
And when all module in a power-domain idle, the power-domain
can do low power transition.

Your case is bit special because CPU is also part of the same power
domain as IO's. Since CPU is part of the root power domain along
with peripherals, you want CPU to be the device to cut the power in
the end(idle) and all the notifiers prepares the IO's for the 'cut
power' which CPU can do in the cpuidle. And then on the reverse
chain, notifier will help devices to restore the context so that can
resume without any issues.

Well if CPU is added as one more device along with other
peripherals, you can still make it work at power domain framework
level. This can be debated further but I think, we need an agreement
on CPUIDLE exclusivity.

Based on your hardware, say the USB is doing a huge DMA transfer
and CPU is not doing anything so it will hit idle thread and can idle.
At least your CPU sub-domain can idle because it is quite independent
even if you can't cut the power for whole domain. Even with
your patchset its not possible to cut the power because one of the
device in the domain is busy. There can be scenario, where say
SPI is not used for thousands of CPU idle entry/exit so there
is no need to worry about its context save/restore for every
idle entry/exit.

The way I am looking at your issue is, every device in a power domain
including CPU will decrement the powerdomain usecount. This can be
handled through runtime PM. When they idle, they can save the context
and the context can be restored only when next time they needs to be
used. CPU can also decrements the usecount in the idle entry and then check
whether the usecound of the common PD is zero. If it is, it can cut the
power else just do CPU sub-domain idle.

I might be off track if above can't be managed for your hardware.

Regards
Santosh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-12  6:32           ` Shilimkar, Santosh
@ 2012-05-12 19:33             ` Rafael J. Wysocki
  2012-05-14  6:14               ` Shilimkar, Santosh
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-12 19:33 UTC (permalink / raw)
  To: Shilimkar, Santosh
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Saturday, May 12, 2012, Shilimkar, Santosh wrote:
> On Sat, May 12, 2012 at 12:29 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, May 11, 2012, Santosh Shilimkar wrote:
> >> On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
> >> > On Thursday, May 10, 2012, Santosh Shilimkar wrote:
[...]
> Your case is bit special because CPU is also part of the same power
> domain as IO's. Since CPU is part of the root power domain along
> with peripherals, you want CPU to be the device to cut the power in
> the end(idle) and all the notifiers prepares the IO's for the 'cut
> power' which CPU can do in the cpuidle.

No, no notifiers, please, they are clearly suboptimal.

> And then on the reverse chain, notifier will help devices to restore
> the context so that can resume without any issues.

The idea here is to avoid using notifiers.
 
> Well if CPU is added as one more device along with other
> peripherals, you can still make it work at power domain framework
> level.

What exactly do you mean by "power domain framework level"?  My patch
is at the power domain framework level as far as I can say.

> This can be debated further but I think, we need an agreement
> on CPUIDLE exclusivity.

What do you mean by cpuidle exclusivity?  And what agreement?

> Based on your hardware, say the USB is doing a huge DMA transfer
> and CPU is not doing anything so it will hit idle thread and can idle.

Sure, it will.

> At least your CPU sub-domain can idle because it is quite independent
> even if you can't cut the power for whole domain.

That's correct.  The CPU sub-domain is beyond the scope of my patch.

> Even with your patchset its not possible to cut the power because one of the
> device in the domain is busy. There can be scenario, where say
> SPI is not used for thousands of CPU idle entry/exit so there
> is no need to worry about its context save/restore for every
> idle entry/exit.

The CPU idle state I'm referring to is not the only CPU idle state.
It is an extra state allowing the CPU to go to even deeper idle that usually
if all of the devices in its domain happen to be idle.

Quite obviously there are other idle states along this one that will be used
if this extra state is not enabled.

> The way I am looking at your issue is, every device in a power domain
> including CPU will decrement the powerdomain usecount.

The CPU will not do that, only devices.

> This can be handled through runtime PM.

But this patchset is a part of runtime PM!  What exactly do you mean?

> When they idle, they can save the context
> and the context can be restored only when next time they needs to be
> used. CPU can also decrements the usecount in the idle entry and then check
> whether the usecound of the common PD is zero.

This is not how it is supposed to work.

It doesn't make sense for cpuidle to even try to use the "domain" state if
it is known that some I/O devices in the domain aren't idle.  That's why
that state is disabled in my code whenever one (the first) of the I/O devices
is resumed.  It is then enabled when the last I/O device in the domain goes
idle (the states of all devices are saved at this point, to allow cpuidle to
use the "domain" state without worrying of the devices).

I'll post a new version of the $subject patch shortly, because I found that
the previous one wasn't correct.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Update][RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-09 21:43 ` [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support Rafael J. Wysocki
  2012-05-10 10:10   ` Santosh Shilimkar
@ 2012-05-12 19:40   ` Rafael J. Wysocki
  2012-05-16 20:29     ` [Update 2x][RFC][PATCH " Rafael J. Wysocki
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-12 19:40 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Len Brown, Colin Cross, Kevin Hilman, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

From: Rafael J. Wysocki <rjw@sisk.pl>

On some systems there are CPU cores located in the same power
domains as I/O devices.  Then, power can only be removed from the
domain if all I/O devices in it are not in use and the CPU core
is idle.  Add preliminary support for that to the generic PM domains
framework.

This assumes that all CPU cores in the system will be located in the
same power domain is supposed to be set up in the following way.

First, the platform is expected to provide a cpuidle driver with one
extra state designated for use with the generic PM domains code.
This state should be initially disabled and its exit_latency value
should be set to whatever time is needed to bring up the CPU core
itself after restoring power to it, not including the domain's
power on latency.  Its .enter() callback should point to a procedure
that will remove power from the domain containing the CPU core at
the end of the CPU power transition.

The remaining characteristics of the extra cpuidle state, referred to
as the "domain" cpuidle state below, (e.g. power usage, target
residency) should be populated in accordance with the properties of
the hardware.

Next, the platform should execute genpd_attach_cpuidle() on the PM
domain containing the CPU core.  That will cause the generic PM
domains framework to treat that domain in a special way such that:

 * When all devices in the domain have been suspended and it is about
   to be turned off, the states of the devices will be saved, but
   power will not be removed from the domain.  Instead, the "domain"
   cpuidle state will be enabled so that power can be removed from
   the domain when the CPU core is idle and the "domain" state has
   been chosen as the target by the cpuidle governor.  The "domain"
   state's .enter() callback is then reposible for removing power
   from the domain.

 * When the first I/O device in the domain is resumed and
   __pm_genpd_poweron(() is called for the first time after
   power has been removed from the domain, the "domain" cpuidle
   state will be disabled to avoid subsequent surprise power removals
   via cpuidle.

---
 drivers/base/power/domain.c |  115 +++++++++++++++++++++++++++++++++++++++++++-
 include/linux/pm_domain.h   |   17 ++++++
 2 files changed, 131 insertions(+), 1 deletion(-)

Index: linux/include/linux/pm_domain.h
===================================================================
--- linux.orig/include/linux/pm_domain.h
+++ linux/include/linux/pm_domain.h
@@ -15,6 +15,7 @@
 #include <linux/err.h>
 #include <linux/of.h>
 #include <linux/notifier.h>
+#include <linux/cpuidle.h>
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -45,6 +46,11 @@ struct gpd_dev_ops {
 	bool (*active_wakeup)(struct device *dev);
 };
 
+struct gpd_cpu_data {
+	unsigned int saved_exit_latency;
+	struct cpuidle_state *idle_state;
+};
+
 struct generic_pm_domain {
 	struct dev_pm_domain domain;	/* PM domain operations */
 	struct list_head gpd_list_node;	/* Node in the global PM domains list */
@@ -75,6 +81,7 @@ struct generic_pm_domain {
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
 	struct device_node *of_node; /* Node in device tree */
+	struct gpd_cpu_data *cpu_data;
 };
 
 static inline struct generic_pm_domain *pd_to_genpd(struct dev_pm_domain *pd)
@@ -155,6 +162,8 @@ extern int pm_genpd_add_callbacks(struct
 				  struct gpd_dev_ops *ops,
 				  struct gpd_timing_data *td);
 extern int __pm_genpd_remove_callbacks(struct device *dev, bool clear_td);
+extern int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state);
+extern int genpd_detach_cpuidle(struct generic_pm_domain *genpd);
 extern void pm_genpd_init(struct generic_pm_domain *genpd,
 			  struct dev_power_governor *gov, bool is_off);
 
@@ -211,6 +220,14 @@ static inline int __pm_genpd_remove_call
 {
 	return -ENOSYS;
 }
+static inline int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int st)
+{
+	return -ENOSYS;
+}
+static inline int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	return -ENOSYS;
+}
 static inline void pm_genpd_init(struct generic_pm_domain *genpd,
 				 struct dev_power_governor *gov, bool is_off)
 {
Index: linux/drivers/base/power/domain.c
===================================================================
--- linux.orig/drivers/base/power/domain.c
+++ linux/drivers/base/power/domain.c
@@ -139,6 +139,19 @@ static void genpd_set_active(struct gene
 		genpd->status = GPD_STATE_ACTIVE;
 }
 
+static void genpd_recalc_cpu_exit_latency(struct generic_pm_domain *genpd)
+{
+	s64 usecs64;
+
+	if (!genpd->cpu_data)
+		return;
+
+	usecs64 = genpd->power_on_latency_ns;
+	do_div(usecs64, NSEC_PER_USEC);
+	usecs64 += genpd->cpu_data->saved_exit_latency;
+	genpd->cpu_data->idle_state->exit_latency = usecs64;
+}
+
 /**
  * __pm_genpd_poweron - Restore power to a given PM domain and its masters.
  * @genpd: PM domain to power up.
@@ -203,7 +216,11 @@ int __pm_genpd_poweron(struct generic_pm
 		}
 	}
 
-	if (genpd->power_on) {
+	if (genpd->cpu_data) {
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = true;
+		cpuidle_resume_and_unlock();
+	} else if (genpd->power_on) {
 		ktime_t time_start = ktime_get();
 		s64 elapsed_ns;
 
@@ -215,6 +232,7 @@ int __pm_genpd_poweron(struct generic_pm
 		if (elapsed_ns > genpd->power_on_latency_ns) {
 			genpd->power_on_latency_ns = elapsed_ns;
 			genpd->max_off_time_changed = true;
+			genpd_recalc_cpu_exit_latency(genpd);
 			if (genpd->name)
 				pr_warning("%s: Power-on latency exceeded, "
 					"new value %lld ns\n", genpd->name,
@@ -464,6 +482,21 @@ static int pm_genpd_poweroff(struct gene
 		}
 	}
 
+	if (genpd->cpu_data) {
+		/*
+		 * If cpu_data is set, cpuidle should turn the domain off when
+		 * the CPU in it is idle.  In that case we don't decrement the
+		 * subdomain counts of the master domains, so that power is not
+		 * removed from the current domain prematurely as a result of
+		 * cutting off the masters' power.
+		 */
+		genpd->status = GPD_STATE_POWER_OFF;
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = false;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	if (genpd->power_off) {
 		ktime_t time_start;
 		s64 elapsed_ns;
@@ -1612,6 +1645,86 @@ int __pm_genpd_remove_callbacks(struct d
 }
 EXPORT_SYMBOL_GPL(__pm_genpd_remove_callbacks);
 
+int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state)
+{
+	struct cpuidle_driver *cpuidle_drv;
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd) || state < 0)
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	if (genpd->cpu_data) {
+		ret = -EEXIST;
+		goto out;
+	}
+	cpu_data = kzalloc(sizeof(*cpu_data), GFP_KERNEL);
+	if (!cpu_data) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpuidle_drv = cpuidle_driver_ref();
+	if (!cpuidle_drv) {
+		ret = -ENODEV;
+		goto out;
+	}
+	if (cpuidle_drv->state_count <= state) {
+		ret = -EINVAL;
+		goto err;
+	}
+	idle_state = &cpuidle_drv->states[state];
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto err;
+	}
+	cpu_data->idle_state = idle_state;
+	cpu_data->saved_exit_latency = idle_state->exit_latency;
+	genpd->cpu_data = cpu_data;
+	genpd_recalc_cpu_exit_latency(genpd);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+
+ err:
+	cpuidle_driver_unref();
+	goto out;
+}
+
+int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd))
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	cpu_data = genpd->cpu_data;
+	if (!cpu_data) {
+		ret = -ENODEV;
+		goto out;
+	}
+	idle_state = cpu_data->idle_state;
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto out;
+	}
+	idle_state->exit_latency = cpu_data->saved_exit_latency;
+	cpuidle_driver_unref();
+	genpd->cpu_data = NULL;
+	kfree(cpu_data);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+}
+
 /* Default device callbacks for generic PM domains. */
 
 /**

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-12 19:33             ` Rafael J. Wysocki
@ 2012-05-14  6:14               ` Shilimkar, Santosh
  0 siblings, 0 replies; 19+ messages in thread
From: Shilimkar, Santosh @ 2012-05-14  6:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Kevin Hilman,
	Magnus Damm, Arjan van de Ven

On Sun, May 13, 2012 at 1:03 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday, May 12, 2012, Shilimkar, Santosh wrote:
>> On Sat, May 12, 2012 at 12:29 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > On Friday, May 11, 2012, Santosh Shilimkar wrote:
>> >> On Friday 11 May 2012 12:11 AM, Rafael J. Wysocki wrote:
>> >> > On Thursday, May 10, 2012, Santosh Shilimkar wrote:
> [...]
>> Your case is bit special because CPU is also part of the same power
>> domain as IO's. Since CPU is part of the root power domain along
>> with peripherals, you want CPU to be the device to cut the power in
>> the end(idle) and all the notifiers prepares the IO's for the 'cut
>> power' which CPU can do in the cpuidle.
>
> No, no notifiers, please, they are clearly suboptimal.
>
>> And then on the reverse chain, notifier will help devices to restore
>> the context so that can resume without any issues.
>
> The idea here is to avoid using notifiers.
>
Thanks for clarifying that. Somehow I miss-understood this
part in first place.

>> Well if CPU is added as one more device along with other
>> peripherals, you can still make it work at power domain framework
>> level.
>
> What exactly do you mean by "power domain framework level"?  My patch
> is at the power domain framework level as far as I can say.
>
>> This can be debated further but I think, we need an agreement
>> on CPUIDLE exclusivity.
>
> What do you mean by cpuidle exclusivity?  And what agreement?
>
>> Based on your hardware, say the USB is doing a huge DMA transfer
>> and CPU is not doing anything so it will hit idle thread and can idle.
>
> Sure, it will.
>
>> At least your CPU sub-domain can idle because it is quite independent
>> even if you can't cut the power for whole domain.
>
> That's correct.  The CPU sub-domain is beyond the scope of my patch.
>
>> Even with your patchset its not possible to cut the power because one of the
>> device in the domain is busy. There can be scenario, where say
>> SPI is not used for thousands of CPU idle entry/exit so there
>> is no need to worry about its context save/restore for every
>> idle entry/exit.
>
> The CPU idle state I'm referring to is not the only CPU idle state.
> It is an extra state allowing the CPU to go to even deeper idle that usually
> if all of the devices in its domain happen to be idle.
>
> Quite obviously there are other idle states along this one that will be used
> if this extra state is not enabled.
>
That make sense.

>> The way I am looking at your issue is, every device in a power domain
>> including CPU will decrement the powerdomain usecount.
>
> The CPU will not do that, only devices.
>
>> This can be handled through runtime PM.
>
> But this patchset is a part of runtime PM!  What exactly do you mean?
>
As I said for some reason I thought you were proposing the idle notifiers
which is not the case.

>> When they idle, they can save the context
>> and the context can be restored only when next time they needs to be
>> used. CPU can also decrements the usecount in the idle entry and then check
>> whether the usecound of the common PD is zero.
>
> This is not how it is supposed to work.
>
> It doesn't make sense for cpuidle to even try to use the "domain" state if
> it is known that some I/O devices in the domain aren't idle.  That's why
> that state is disabled in my code whenever one (the first) of the I/O devices
> is resumed.  It is then enabled when the last I/O device in the domain goes
> idle (the states of all devices are saved at this point, to allow cpuidle to
> use the "domain" state without worrying of the devices).
>
Now since the idle notifier confusion ( at least for me) is out, what you
proposed seems to be reasonable considering the power domain is
shared between multiple devices including CPU.

Regards
Santosh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Update 2x][RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-12 19:40   ` [Update][RFC][PATCH " Rafael J. Wysocki
@ 2012-05-16 20:29     ` Rafael J. Wysocki
  2012-05-23 22:23       ` Kevin Hilman
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-16 20:29 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Len Brown, Colin Cross, Kevin Hilman, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

From: Rafael J. Wysocki <rjw@sisk.pl>

On some systems there are CPU cores located in the same power
domains as I/O devices.  Then, power can only be removed from the
domain if all I/O devices in it are not in use and the CPU core
is idle.  Add preliminary support for that to the generic PM domains
framework.

First, the platform is expected to provide a cpuidle driver with one
extra state designated for use with the generic PM domains code.
This state should be initially disabled and its exit_latency value
should be set to whatever time is needed to bring up the CPU core
itself after restoring power to it, not including the domain's
power on latency.  Its .enter() callback should point to a procedure
that will remove power from the domain containing the CPU core at
the end of the CPU power transition.

The remaining characteristics of the extra cpuidle state, referred to
as the "domain" cpuidle state below, (e.g. power usage, target
residency) should be populated in accordance with the properties of
the hardware.

Next, the platform should execute genpd_attach_cpuidle() on the PM
domain containing the CPU core.  That will cause the generic PM
domains framework to treat that domain in a special way such that:

 * When all devices in the domain have been suspended and it is about
   to be turned off, the states of the devices will be saved, but
   power will not be removed from the domain.  Instead, the "domain"
   cpuidle state will be enabled so that power can be removed from
   the domain when the CPU core is idle and the state has been chosen
   as the target by the cpuidle governor.

 * When the first I/O device in the domain is resumed and
   __pm_genpd_poweron(() is called for the first time after
   power has been removed from the domain, the "domain" cpuidle
   state will be disabled to avoid subsequent surprise power removals
   via cpuidle.

---

Since pm_genpd_poweroff() doesn't modify the masters' subdomain reference
counters if genpd->cpu_data is set, __pm_genpd_poweron() should avoid modifying
them too in that case.

Thanks,
Rafael

---
 drivers/base/power/domain.c |  117 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |   17 ++++++
 2 files changed, 134 insertions(+)

Index: linux/include/linux/pm_domain.h
===================================================================
--- linux.orig/include/linux/pm_domain.h
+++ linux/include/linux/pm_domain.h
@@ -15,6 +15,7 @@
 #include <linux/err.h>
 #include <linux/of.h>
 #include <linux/notifier.h>
+#include <linux/cpuidle.h>
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -45,6 +46,11 @@ struct gpd_dev_ops {
 	bool (*active_wakeup)(struct device *dev);
 };
 
+struct gpd_cpu_data {
+	unsigned int saved_exit_latency;
+	struct cpuidle_state *idle_state;
+};
+
 struct generic_pm_domain {
 	struct dev_pm_domain domain;	/* PM domain operations */
 	struct list_head gpd_list_node;	/* Node in the global PM domains list */
@@ -75,6 +81,7 @@ struct generic_pm_domain {
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
 	struct device_node *of_node; /* Node in device tree */
+	struct gpd_cpu_data *cpu_data;
 };
 
 static inline struct generic_pm_domain *pd_to_genpd(struct dev_pm_domain *pd)
@@ -155,6 +162,8 @@ extern int pm_genpd_add_callbacks(struct
 				  struct gpd_dev_ops *ops,
 				  struct gpd_timing_data *td);
 extern int __pm_genpd_remove_callbacks(struct device *dev, bool clear_td);
+extern int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state);
+extern int genpd_detach_cpuidle(struct generic_pm_domain *genpd);
 extern void pm_genpd_init(struct generic_pm_domain *genpd,
 			  struct dev_power_governor *gov, bool is_off);
 
@@ -211,6 +220,14 @@ static inline int __pm_genpd_remove_call
 {
 	return -ENOSYS;
 }
+static inline int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int st)
+{
+	return -ENOSYS;
+}
+static inline int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	return -ENOSYS;
+}
 static inline void pm_genpd_init(struct generic_pm_domain *genpd,
 				 struct dev_power_governor *gov, bool is_off)
 {
Index: linux/drivers/base/power/domain.c
===================================================================
--- linux.orig/drivers/base/power/domain.c
+++ linux/drivers/base/power/domain.c
@@ -139,6 +139,19 @@ static void genpd_set_active(struct gene
 		genpd->status = GPD_STATE_ACTIVE;
 }
 
+static void genpd_recalc_cpu_exit_latency(struct generic_pm_domain *genpd)
+{
+	s64 usecs64;
+
+	if (!genpd->cpu_data)
+		return;
+
+	usecs64 = genpd->power_on_latency_ns;
+	do_div(usecs64, NSEC_PER_USEC);
+	usecs64 += genpd->cpu_data->saved_exit_latency;
+	genpd->cpu_data->idle_state->exit_latency = usecs64;
+}
+
 /**
  * __pm_genpd_poweron - Restore power to a given PM domain and its masters.
  * @genpd: PM domain to power up.
@@ -176,6 +189,13 @@ int __pm_genpd_poweron(struct generic_pm
 		return 0;
 	}
 
+	if (genpd->cpu_data) {
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = true;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	/*
 	 * The list is guaranteed not to change while the loop below is being
 	 * executed, unless one of the masters' .power_on() callbacks fiddles
@@ -215,6 +235,7 @@ int __pm_genpd_poweron(struct generic_pm
 		if (elapsed_ns > genpd->power_on_latency_ns) {
 			genpd->power_on_latency_ns = elapsed_ns;
 			genpd->max_off_time_changed = true;
+			genpd_recalc_cpu_exit_latency(genpd);
 			if (genpd->name)
 				pr_warning("%s: Power-on latency exceeded, "
 					"new value %lld ns\n", genpd->name,
@@ -222,6 +243,7 @@ int __pm_genpd_poweron(struct generic_pm
 		}
 	}
 
+ out:
 	genpd_set_active(genpd);
 
 	return 0;
@@ -464,6 +486,21 @@ static int pm_genpd_poweroff(struct gene
 		}
 	}
 
+	if (genpd->cpu_data) {
+		/*
+		 * If cpu_data is set, cpuidle should turn the domain off when
+		 * the CPU in it is idle.  In that case we don't decrement the
+		 * subdomain counts of the master domains, so that power is not
+		 * removed from the current domain prematurely as a result of
+		 * cutting off the masters' power.
+		 */
+		genpd->status = GPD_STATE_POWER_OFF;
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = false;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	if (genpd->power_off) {
 		ktime_t time_start;
 		s64 elapsed_ns;
@@ -1612,6 +1649,86 @@ int __pm_genpd_remove_callbacks(struct d
 }
 EXPORT_SYMBOL_GPL(__pm_genpd_remove_callbacks);
 
+int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state)
+{
+	struct cpuidle_driver *cpuidle_drv;
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd) || state < 0)
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	if (genpd->cpu_data) {
+		ret = -EEXIST;
+		goto out;
+	}
+	cpu_data = kzalloc(sizeof(*cpu_data), GFP_KERNEL);
+	if (!cpu_data) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpuidle_drv = cpuidle_driver_ref();
+	if (!cpuidle_drv) {
+		ret = -ENODEV;
+		goto out;
+	}
+	if (cpuidle_drv->state_count <= state) {
+		ret = -EINVAL;
+		goto err;
+	}
+	idle_state = &cpuidle_drv->states[state];
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto err;
+	}
+	cpu_data->idle_state = idle_state;
+	cpu_data->saved_exit_latency = idle_state->exit_latency;
+	genpd->cpu_data = cpu_data;
+	genpd_recalc_cpu_exit_latency(genpd);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+
+ err:
+	cpuidle_driver_unref();
+	goto out;
+}
+
+int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd))
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	cpu_data = genpd->cpu_data;
+	if (!cpu_data) {
+		ret = -ENODEV;
+		goto out;
+	}
+	idle_state = cpu_data->idle_state;
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto out;
+	}
+	idle_state->exit_latency = cpu_data->saved_exit_latency;
+	cpuidle_driver_unref();
+	genpd->cpu_data = NULL;
+	kfree(cpu_data);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+}
+
 /* Default device callbacks for generic PM domains. */
 
 /**


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Update 2x][RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-16 20:29     ` [Update 2x][RFC][PATCH " Rafael J. Wysocki
@ 2012-05-23 22:23       ` Kevin Hilman
  2012-05-24 16:17         ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Hilman @ 2012-05-23 22:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

Hi Rafael,

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> On some systems there are CPU cores located in the same power
> domains as I/O devices.  Then, power can only be removed from the
> domain if all I/O devices in it are not in use and the CPU core
> is idle.  Add preliminary support for that to the generic PM domains
> framework.
>
> First, the platform is expected to provide a cpuidle driver with one
> extra state designated for use with the generic PM domains code.
> This state should be initially disabled and its exit_latency value
> should be set to whatever time is needed to bring up the CPU core
> itself after restoring power to it, not including the domain's
> power on latency.  Its .enter() callback should point to a procedure
> that will remove power from the domain containing the CPU core at
> the end of the CPU power transition.
>
> The remaining characteristics of the extra cpuidle state, referred to
> as the "domain" cpuidle state below, (e.g. power usage, target
> residency) should be populated in accordance with the properties of
> the hardware.
>
> Next, the platform should execute genpd_attach_cpuidle() on the PM
> domain containing the CPU core.  That will cause the generic PM
> domains framework to treat that domain in a special way such that:
>
>  * When all devices in the domain have been suspended and it is about
>    to be turned off, the states of the devices will be saved, but
>    power will not be removed from the domain.  Instead, the "domain"
>    cpuidle state will be enabled so that power can be removed from
>    the domain when the CPU core is idle and the state has been chosen
>    as the target by the cpuidle governor.
>
>  * When the first I/O device in the domain is resumed and
>    __pm_genpd_poweron(() is called for the first time after
>    power has been removed from the domain, the "domain" cpuidle
>    state will be disabled to avoid subsequent surprise power removals
>    via cpuidle.

This looks like a good approach.  I like that it keeps a pretty clean
separation between CPUidle and PM domains.

My only comment would be that the recalc of the exit_latency should be
described a bit more.  Specifically, I'm not sure why it's adjused at
every genpd poweron.  At first I thought it was just supposed to be
adjusted upon attach, then adjusted back on detatch, but with the recalc
also in every poweron, I'm a little confused.  Care to clarify?

Thanks,

Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Update 2x][RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-23 22:23       ` Kevin Hilman
@ 2012-05-24 16:17         ` Rafael J. Wysocki
  2012-05-24 23:59           ` Kevin Hilman
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-24 16:17 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

On Thursday, May 24, 2012, Kevin Hilman wrote:
> Hi Rafael,
> 
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >
> > On some systems there are CPU cores located in the same power
> > domains as I/O devices.  Then, power can only be removed from the
> > domain if all I/O devices in it are not in use and the CPU core
> > is idle.  Add preliminary support for that to the generic PM domains
> > framework.
> >
> > First, the platform is expected to provide a cpuidle driver with one
> > extra state designated for use with the generic PM domains code.
> > This state should be initially disabled and its exit_latency value
> > should be set to whatever time is needed to bring up the CPU core
> > itself after restoring power to it, not including the domain's
> > power on latency.  Its .enter() callback should point to a procedure
> > that will remove power from the domain containing the CPU core at
> > the end of the CPU power transition.
> >
> > The remaining characteristics of the extra cpuidle state, referred to
> > as the "domain" cpuidle state below, (e.g. power usage, target
> > residency) should be populated in accordance with the properties of
> > the hardware.
> >
> > Next, the platform should execute genpd_attach_cpuidle() on the PM
> > domain containing the CPU core.  That will cause the generic PM
> > domains framework to treat that domain in a special way such that:
> >
> >  * When all devices in the domain have been suspended and it is about
> >    to be turned off, the states of the devices will be saved, but
> >    power will not be removed from the domain.  Instead, the "domain"
> >    cpuidle state will be enabled so that power can be removed from
> >    the domain when the CPU core is idle and the state has been chosen
> >    as the target by the cpuidle governor.
> >
> >  * When the first I/O device in the domain is resumed and
> >    __pm_genpd_poweron(() is called for the first time after
> >    power has been removed from the domain, the "domain" cpuidle
> >    state will be disabled to avoid subsequent surprise power removals
> >    via cpuidle.
> 
> This looks like a good approach.  I like that it keeps a pretty clean
> separation between CPUidle and PM domains.
> 
> My only comment would be that the recalc of the exit_latency should be
> described a bit more.  Specifically, I'm not sure why it's adjused at
> every genpd poweron.  At first I thought it was just supposed to be
> adjusted upon attach, then adjusted back on detatch, but with the recalc
> also in every poweron, I'm a little confused.  Care to clarify?

The problem is that the PM domains code measures the time it takes to
power off a domain and updates its power on latency parameter if the
measured time is greater.  This is done for PM QoS to operate on realistic
numbers (most of the time at least).

Of course, this also affects the CPU wakeup latency if the wakeup involves
turning a domain on.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Update 2x][RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support
  2012-05-24 16:17         ` Rafael J. Wysocki
@ 2012-05-24 23:59           ` Kevin Hilman
  2012-05-28 21:50             ` [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Hilman @ 2012-05-24 23:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Thursday, May 24, 2012, Kevin Hilman wrote:
>> Hi Rafael,
>> 
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >
>> > On some systems there are CPU cores located in the same power
>> > domains as I/O devices.  Then, power can only be removed from the
>> > domain if all I/O devices in it are not in use and the CPU core
>> > is idle.  Add preliminary support for that to the generic PM domains
>> > framework.
>> >
>> > First, the platform is expected to provide a cpuidle driver with one
>> > extra state designated for use with the generic PM domains code.
>> > This state should be initially disabled and its exit_latency value
>> > should be set to whatever time is needed to bring up the CPU core
>> > itself after restoring power to it, not including the domain's
>> > power on latency.  Its .enter() callback should point to a procedure
>> > that will remove power from the domain containing the CPU core at
>> > the end of the CPU power transition.
>> >
>> > The remaining characteristics of the extra cpuidle state, referred to
>> > as the "domain" cpuidle state below, (e.g. power usage, target
>> > residency) should be populated in accordance with the properties of
>> > the hardware.
>> >
>> > Next, the platform should execute genpd_attach_cpuidle() on the PM
>> > domain containing the CPU core.  That will cause the generic PM
>> > domains framework to treat that domain in a special way such that:
>> >
>> >  * When all devices in the domain have been suspended and it is about
>> >    to be turned off, the states of the devices will be saved, but
>> >    power will not be removed from the domain.  Instead, the "domain"
>> >    cpuidle state will be enabled so that power can be removed from
>> >    the domain when the CPU core is idle and the state has been chosen
>> >    as the target by the cpuidle governor.
>> >
>> >  * When the first I/O device in the domain is resumed and
>> >    __pm_genpd_poweron(() is called for the first time after
>> >    power has been removed from the domain, the "domain" cpuidle
>> >    state will be disabled to avoid subsequent surprise power removals
>> >    via cpuidle.
>> 
>> This looks like a good approach.  I like that it keeps a pretty clean
>> separation between CPUidle and PM domains.
>> 
>> My only comment would be that the recalc of the exit_latency should be
>> described a bit more.  Specifically, I'm not sure why it's adjused at
>> every genpd poweron.  At first I thought it was just supposed to be
>> adjusted upon attach, then adjusted back on detatch, but with the recalc
>> also in every poweron, I'm a little confused.  Care to clarify?
>
> The problem is that the PM domains code measures the time it takes to
> power off a domain and updates its power on latency parameter if the
> measured time is greater.  This is done for PM QoS to operate on realistic
> numbers (most of the time at least).

OK, I see.  Maybe clarifying that in the changelog would help make that
clearer.

> Of course, this also affects the CPU wakeup latency if the wakeup involves
> turning a domain on.

Right.

Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle
  2012-05-24 23:59           ` Kevin Hilman
@ 2012-05-28 21:50             ` Rafael J. Wysocki
  2012-05-30 22:18               ` Kevin Hilman
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-28 21:50 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

On Friday, May 25, 2012, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Thursday, May 24, 2012, Kevin Hilman wrote:
> >> Hi Rafael,
> >> 
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> >
> >> > On some systems there are CPU cores located in the same power
> >> > domains as I/O devices.  Then, power can only be removed from the
> >> > domain if all I/O devices in it are not in use and the CPU core
> >> > is idle.  Add preliminary support for that to the generic PM domains
> >> > framework.
> >> >
> >> > First, the platform is expected to provide a cpuidle driver with one
> >> > extra state designated for use with the generic PM domains code.
> >> > This state should be initially disabled and its exit_latency value
> >> > should be set to whatever time is needed to bring up the CPU core
> >> > itself after restoring power to it, not including the domain's
> >> > power on latency.  Its .enter() callback should point to a procedure
> >> > that will remove power from the domain containing the CPU core at
> >> > the end of the CPU power transition.
> >> >
> >> > The remaining characteristics of the extra cpuidle state, referred to
> >> > as the "domain" cpuidle state below, (e.g. power usage, target
> >> > residency) should be populated in accordance with the properties of
> >> > the hardware.
> >> >
> >> > Next, the platform should execute genpd_attach_cpuidle() on the PM
> >> > domain containing the CPU core.  That will cause the generic PM
> >> > domains framework to treat that domain in a special way such that:
> >> >
> >> >  * When all devices in the domain have been suspended and it is about
> >> >    to be turned off, the states of the devices will be saved, but
> >> >    power will not be removed from the domain.  Instead, the "domain"
> >> >    cpuidle state will be enabled so that power can be removed from
> >> >    the domain when the CPU core is idle and the state has been chosen
> >> >    as the target by the cpuidle governor.
> >> >
> >> >  * When the first I/O device in the domain is resumed and
> >> >    __pm_genpd_poweron(() is called for the first time after
> >> >    power has been removed from the domain, the "domain" cpuidle
> >> >    state will be disabled to avoid subsequent surprise power removals
> >> >    via cpuidle.
> >> 
> >> This looks like a good approach.  I like that it keeps a pretty clean
> >> separation between CPUidle and PM domains.
> >> 
> >> My only comment would be that the recalc of the exit_latency should be
> >> described a bit more.  Specifically, I'm not sure why it's adjused at
> >> every genpd poweron.  At first I thought it was just supposed to be
> >> adjusted upon attach, then adjusted back on detatch, but with the recalc
> >> also in every poweron, I'm a little confused.  Care to clarify?
> >
> > The problem is that the PM domains code measures the time it takes to
> > power off a domain and updates its power on latency parameter if the
> > measured time is greater.  This is done for PM QoS to operate on realistic
> > numbers (most of the time at least).
> 
> OK, I see.  Maybe clarifying that in the changelog would help make that
> clearer.

Sure.  I hope the updated changelog below is better.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Domains: Add preliminary support for cpuidle

On some systems there are CPU cores located in the same power
domains as I/O devices.  Then, power can only be removed from the
domain if all I/O devices in it are not in use and the CPU core
is idle.  Add preliminary support for that to the generic PM domains
framework.

First, the platform is expected to provide a cpuidle driver with one
extra state designated for use with the generic PM domains code.
This state should be initially disabled and its exit_latency value
should be set to whatever time is needed to bring up the CPU core
itself after restoring power to it, not including the domain's
power on latency.  Its .enter() callback should point to a procedure
that will remove power from the domain containing the CPU core at
the end of the CPU power transition.

The remaining characteristics of the extra cpuidle state, referred to
as the "domain" cpuidle state below, (e.g. power usage, target
residency) should be populated in accordance with the properties of
the hardware.

Next, the platform should execute genpd_attach_cpuidle() on the PM
domain containing the CPU core.  That will cause the generic PM
domains framework to treat that domain in a special way such that:

 * When all devices in the domain have been suspended and it is about
   to be turned off, the states of the devices will be saved, but
   power will not be removed from the domain.  Instead, the "domain"
   cpuidle state will be enabled so that power can be removed from
   the domain when the CPU core is idle and the state has been chosen
   as the target by the cpuidle governor.

 * When the first I/O device in the domain is resumed and
   __pm_genpd_poweron(() is called for the first time after
   power has been removed from the domain, the "domain" cpuidle
   state will be disabled to avoid subsequent surprise power removals
   via cpuidle.

The effective exit_latency value of the "domain" cpuidle state
depends on the time needed to bring up the CPU core itself after
restoring power to it as well as on the power on latency of the
domain containing the CPU core.  Thus the "domain" cpuidle state's
exit_latency has to be recomputed every time the domain's power on
latency is updated, which may happen every time power is restored
to the domain, if the measured power on latency is greater than
the latency stored in the corresponding generic_pm_domain structure.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/domain.c |  117 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |   17 ++++++
 2 files changed, 134 insertions(+)

Index: linux/include/linux/pm_domain.h
===================================================================
--- linux.orig/include/linux/pm_domain.h
+++ linux/include/linux/pm_domain.h
@@ -15,6 +15,7 @@
 #include <linux/err.h>
 #include <linux/of.h>
 #include <linux/notifier.h>
+#include <linux/cpuidle.h>
 
 enum gpd_status {
 	GPD_STATE_ACTIVE = 0,	/* PM domain is active */
@@ -45,6 +46,11 @@ struct gpd_dev_ops {
 	bool (*active_wakeup)(struct device *dev);
 };
 
+struct gpd_cpu_data {
+	unsigned int saved_exit_latency;
+	struct cpuidle_state *idle_state;
+};
+
 struct generic_pm_domain {
 	struct dev_pm_domain domain;	/* PM domain operations */
 	struct list_head gpd_list_node;	/* Node in the global PM domains list */
@@ -75,6 +81,7 @@ struct generic_pm_domain {
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
 	struct device_node *of_node; /* Node in device tree */
+	struct gpd_cpu_data *cpu_data;
 };
 
 static inline struct generic_pm_domain *pd_to_genpd(struct dev_pm_domain *pd)
@@ -155,6 +162,8 @@ extern int pm_genpd_add_callbacks(struct
 				  struct gpd_dev_ops *ops,
 				  struct gpd_timing_data *td);
 extern int __pm_genpd_remove_callbacks(struct device *dev, bool clear_td);
+extern int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state);
+extern int genpd_detach_cpuidle(struct generic_pm_domain *genpd);
 extern void pm_genpd_init(struct generic_pm_domain *genpd,
 			  struct dev_power_governor *gov, bool is_off);
 
@@ -211,6 +220,14 @@ static inline int __pm_genpd_remove_call
 {
 	return -ENOSYS;
 }
+static inline int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int st)
+{
+	return -ENOSYS;
+}
+static inline int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	return -ENOSYS;
+}
 static inline void pm_genpd_init(struct generic_pm_domain *genpd,
 				 struct dev_power_governor *gov, bool is_off)
 {
Index: linux/drivers/base/power/domain.c
===================================================================
--- linux.orig/drivers/base/power/domain.c
+++ linux/drivers/base/power/domain.c
@@ -139,6 +139,19 @@ static void genpd_set_active(struct gene
 		genpd->status = GPD_STATE_ACTIVE;
 }
 
+static void genpd_recalc_cpu_exit_latency(struct generic_pm_domain *genpd)
+{
+	s64 usecs64;
+
+	if (!genpd->cpu_data)
+		return;
+
+	usecs64 = genpd->power_on_latency_ns;
+	do_div(usecs64, NSEC_PER_USEC);
+	usecs64 += genpd->cpu_data->saved_exit_latency;
+	genpd->cpu_data->idle_state->exit_latency = usecs64;
+}
+
 /**
  * __pm_genpd_poweron - Restore power to a given PM domain and its masters.
  * @genpd: PM domain to power up.
@@ -176,6 +189,13 @@ int __pm_genpd_poweron(struct generic_pm
 		return 0;
 	}
 
+	if (genpd->cpu_data) {
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = true;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	/*
 	 * The list is guaranteed not to change while the loop below is being
 	 * executed, unless one of the masters' .power_on() callbacks fiddles
@@ -215,6 +235,7 @@ int __pm_genpd_poweron(struct generic_pm
 		if (elapsed_ns > genpd->power_on_latency_ns) {
 			genpd->power_on_latency_ns = elapsed_ns;
 			genpd->max_off_time_changed = true;
+			genpd_recalc_cpu_exit_latency(genpd);
 			if (genpd->name)
 				pr_warning("%s: Power-on latency exceeded, "
 					"new value %lld ns\n", genpd->name,
@@ -222,6 +243,7 @@ int __pm_genpd_poweron(struct generic_pm
 		}
 	}
 
+ out:
 	genpd_set_active(genpd);
 
 	return 0;
@@ -464,6 +486,21 @@ static int pm_genpd_poweroff(struct gene
 		}
 	}
 
+	if (genpd->cpu_data) {
+		/*
+		 * If cpu_data is set, cpuidle should turn the domain off when
+		 * the CPU in it is idle.  In that case we don't decrement the
+		 * subdomain counts of the master domains, so that power is not
+		 * removed from the current domain prematurely as a result of
+		 * cutting off the masters' power.
+		 */
+		genpd->status = GPD_STATE_POWER_OFF;
+		cpuidle_pause_and_lock();
+		genpd->cpu_data->idle_state->disable = false;
+		cpuidle_resume_and_unlock();
+		goto out;
+	}
+
 	if (genpd->power_off) {
 		ktime_t time_start;
 		s64 elapsed_ns;
@@ -1612,6 +1649,86 @@ int __pm_genpd_remove_callbacks(struct d
 }
 EXPORT_SYMBOL_GPL(__pm_genpd_remove_callbacks);
 
+int genpd_attach_cpuidle(struct generic_pm_domain *genpd, int state)
+{
+	struct cpuidle_driver *cpuidle_drv;
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd) || state < 0)
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	if (genpd->cpu_data) {
+		ret = -EEXIST;
+		goto out;
+	}
+	cpu_data = kzalloc(sizeof(*cpu_data), GFP_KERNEL);
+	if (!cpu_data) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	cpuidle_drv = cpuidle_driver_ref();
+	if (!cpuidle_drv) {
+		ret = -ENODEV;
+		goto out;
+	}
+	if (cpuidle_drv->state_count <= state) {
+		ret = -EINVAL;
+		goto err;
+	}
+	idle_state = &cpuidle_drv->states[state];
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto err;
+	}
+	cpu_data->idle_state = idle_state;
+	cpu_data->saved_exit_latency = idle_state->exit_latency;
+	genpd->cpu_data = cpu_data;
+	genpd_recalc_cpu_exit_latency(genpd);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+
+ err:
+	cpuidle_driver_unref();
+	goto out;
+}
+
+int genpd_detach_cpuidle(struct generic_pm_domain *genpd)
+{
+	struct gpd_cpu_data *cpu_data;
+	struct cpuidle_state *idle_state;
+	int ret = 0;
+
+	if (IS_ERR_OR_NULL(genpd))
+		return -EINVAL;
+
+	genpd_acquire_lock(genpd);
+
+	cpu_data = genpd->cpu_data;
+	if (!cpu_data) {
+		ret = -ENODEV;
+		goto out;
+	}
+	idle_state = cpu_data->idle_state;
+	if (!idle_state->disable) {
+		ret = -EAGAIN;
+		goto out;
+	}
+	idle_state->exit_latency = cpu_data->saved_exit_latency;
+	cpuidle_driver_unref();
+	genpd->cpu_data = NULL;
+	kfree(cpu_data);
+
+ out:
+	genpd_release_lock(genpd);
+	return ret;
+}
+
 /* Default device callbacks for generic PM domains. */
 
 /**

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle
  2012-05-28 21:50             ` [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle Rafael J. Wysocki
@ 2012-05-30 22:18               ` Kevin Hilman
  2012-05-31 19:02                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Hilman @ 2012-05-30 22:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Friday, May 25, 2012, Kevin Hilman wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>> > On Thursday, May 24, 2012, Kevin Hilman wrote:
>> >> Hi Rafael,
>> >> 
>> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> >> 
>> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >> >
>> >> > On some systems there are CPU cores located in the same power
>> >> > domains as I/O devices.  Then, power can only be removed from the
>> >> > domain if all I/O devices in it are not in use and the CPU core
>> >> > is idle.  Add preliminary support for that to the generic PM domains
>> >> > framework.
>> >> >
>> >> > First, the platform is expected to provide a cpuidle driver with one
>> >> > extra state designated for use with the generic PM domains code.
>> >> > This state should be initially disabled and its exit_latency value
>> >> > should be set to whatever time is needed to bring up the CPU core
>> >> > itself after restoring power to it, not including the domain's
>> >> > power on latency.  Its .enter() callback should point to a procedure
>> >> > that will remove power from the domain containing the CPU core at
>> >> > the end of the CPU power transition.
>> >> >
>> >> > The remaining characteristics of the extra cpuidle state, referred to
>> >> > as the "domain" cpuidle state below, (e.g. power usage, target
>> >> > residency) should be populated in accordance with the properties of
>> >> > the hardware.
>> >> >
>> >> > Next, the platform should execute genpd_attach_cpuidle() on the PM
>> >> > domain containing the CPU core.  That will cause the generic PM
>> >> > domains framework to treat that domain in a special way such that:
>> >> >
>> >> >  * When all devices in the domain have been suspended and it is about
>> >> >    to be turned off, the states of the devices will be saved, but
>> >> >    power will not be removed from the domain.  Instead, the "domain"
>> >> >    cpuidle state will be enabled so that power can be removed from
>> >> >    the domain when the CPU core is idle and the state has been chosen
>> >> >    as the target by the cpuidle governor.
>> >> >
>> >> >  * When the first I/O device in the domain is resumed and
>> >> >    __pm_genpd_poweron(() is called for the first time after
>> >> >    power has been removed from the domain, the "domain" cpuidle
>> >> >    state will be disabled to avoid subsequent surprise power removals
>> >> >    via cpuidle.
>> >> 
>> >> This looks like a good approach.  I like that it keeps a pretty clean
>> >> separation between CPUidle and PM domains.
>> >> 
>> >> My only comment would be that the recalc of the exit_latency should be
>> >> described a bit more.  Specifically, I'm not sure why it's adjused at
>> >> every genpd poweron.  At first I thought it was just supposed to be
>> >> adjusted upon attach, then adjusted back on detatch, but with the recalc
>> >> also in every poweron, I'm a little confused.  Care to clarify?
>> >
>> > The problem is that the PM domains code measures the time it takes to
>> > power off a domain and updates its power on latency parameter if the
>> > measured time is greater.  This is done for PM QoS to operate on realistic
>> > numbers (most of the time at least).
>> 
>> OK, I see.  Maybe clarifying that in the changelog would help make that
>> clearer.
>
> Sure.  I hope the updated changelog below is better.

Perfect, thanks.

Reviewed-by: Kevin Hilman <khilman@ti.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle
  2012-05-30 22:18               ` Kevin Hilman
@ 2012-05-31 19:02                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2012-05-31 19:02 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Linux PM list, LKML, Len Brown, Colin Cross, Magnus Damm,
	Arjan van de Ven, Santosh Shilimkar

On Thursday, May 31, 2012, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Friday, May 25, 2012, Kevin Hilman wrote:
> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> 
> >> > On Thursday, May 24, 2012, Kevin Hilman wrote:
> >> >> Hi Rafael,
> >> >> 
> >> >> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> >> >> 
> >> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> >> >
> >> >> > On some systems there are CPU cores located in the same power
> >> >> > domains as I/O devices.  Then, power can only be removed from the
> >> >> > domain if all I/O devices in it are not in use and the CPU core
> >> >> > is idle.  Add preliminary support for that to the generic PM domains
> >> >> > framework.
> >> >> >
> >> >> > First, the platform is expected to provide a cpuidle driver with one
> >> >> > extra state designated for use with the generic PM domains code.
> >> >> > This state should be initially disabled and its exit_latency value
> >> >> > should be set to whatever time is needed to bring up the CPU core
> >> >> > itself after restoring power to it, not including the domain's
> >> >> > power on latency.  Its .enter() callback should point to a procedure
> >> >> > that will remove power from the domain containing the CPU core at
> >> >> > the end of the CPU power transition.
> >> >> >
> >> >> > The remaining characteristics of the extra cpuidle state, referred to
> >> >> > as the "domain" cpuidle state below, (e.g. power usage, target
> >> >> > residency) should be populated in accordance with the properties of
> >> >> > the hardware.
> >> >> >
> >> >> > Next, the platform should execute genpd_attach_cpuidle() on the PM
> >> >> > domain containing the CPU core.  That will cause the generic PM
> >> >> > domains framework to treat that domain in a special way such that:
> >> >> >
> >> >> >  * When all devices in the domain have been suspended and it is about
> >> >> >    to be turned off, the states of the devices will be saved, but
> >> >> >    power will not be removed from the domain.  Instead, the "domain"
> >> >> >    cpuidle state will be enabled so that power can be removed from
> >> >> >    the domain when the CPU core is idle and the state has been chosen
> >> >> >    as the target by the cpuidle governor.
> >> >> >
> >> >> >  * When the first I/O device in the domain is resumed and
> >> >> >    __pm_genpd_poweron(() is called for the first time after
> >> >> >    power has been removed from the domain, the "domain" cpuidle
> >> >> >    state will be disabled to avoid subsequent surprise power removals
> >> >> >    via cpuidle.
> >> >> 
> >> >> This looks like a good approach.  I like that it keeps a pretty clean
> >> >> separation between CPUidle and PM domains.
> >> >> 
> >> >> My only comment would be that the recalc of the exit_latency should be
> >> >> described a bit more.  Specifically, I'm not sure why it's adjused at
> >> >> every genpd poweron.  At first I thought it was just supposed to be
> >> >> adjusted upon attach, then adjusted back on detatch, but with the recalc
> >> >> also in every poweron, I'm a little confused.  Care to clarify?
> >> >
> >> > The problem is that the PM domains code measures the time it takes to
> >> > power off a domain and updates its power on latency parameter if the
> >> > measured time is greater.  This is done for PM QoS to operate on realistic
> >> > numbers (most of the time at least).
> >> 
> >> OK, I see.  Maybe clarifying that in the changelog would help make that
> >> clearer.
> >
> > Sure.  I hope the updated changelog below is better.
> 
> Perfect, thanks.
> 
> Reviewed-by: Kevin Hilman <khilman@ti.com>

Cool, thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-05-31 18:57 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-09 21:40 [RFC][PATCH 0/2] PM / Domains / cpuidle: Preliminary cpuidle support for generic PM domains Rafael J. Wysocki
2012-05-09 21:41 ` [RFC][PATCH 1/2] PM / cpuidle: Add driver reference counter Rafael J. Wysocki
2012-05-09 21:43 ` [RFC][PATCH 2/2] PM / Domains: Add preliminary cpuidle support Rafael J. Wysocki
2012-05-10 10:10   ` Santosh Shilimkar
2012-05-10 18:41     ` Rafael J. Wysocki
2012-05-11  8:23       ` Santosh Shilimkar
2012-05-11  8:35         ` Magnus Damm
2012-05-11 18:59         ` Rafael J. Wysocki
2012-05-12  6:32           ` Shilimkar, Santosh
2012-05-12 19:33             ` Rafael J. Wysocki
2012-05-14  6:14               ` Shilimkar, Santosh
2012-05-12 19:40   ` [Update][RFC][PATCH " Rafael J. Wysocki
2012-05-16 20:29     ` [Update 2x][RFC][PATCH " Rafael J. Wysocki
2012-05-23 22:23       ` Kevin Hilman
2012-05-24 16:17         ` Rafael J. Wysocki
2012-05-24 23:59           ` Kevin Hilman
2012-05-28 21:50             ` [Update 3x][PATCH 2/2] PM / Domains: Add preliminary support for cpuidle Rafael J. Wysocki
2012-05-30 22:18               ` Kevin Hilman
2012-05-31 19:02                 ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).