* calling runtime PM from system PM methods
@ 2011-06-02  0:05 Kevin Hilman
  2011-06-02 14:18 ` Alan Stern
                   ` (3 more replies)
  0 siblings, 4 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-02  0:05 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

Hi Rafael,

Once again, I'm back to some problems with using runtime PM from system
PM methods.  On OMAP, many drivers don't need to do anything different
for runtime PM compared to system PM, so the system PM methods can
simply use runtime PM.

The obvious complication arises when runtime PM is disabled from
userspace, preventing system PM.

Taking into consideration that runtime PM can be disabled from
userspace, the system PM methods need to manually call the subsystem's
runtime PM callbacks.  An example of the resulting system PM methods can
be found in the current OMAP I2C driver (excerpt below[1]).

This was working, but now we have device power domains which complicate
the story.  My first take was to change the system PM methods to check
the device power domain callbacks as well[2], and take care of the
precedence.  That seems OK, but it's starting to feel like extra work
for each driver that is easy to screw up, and includes some assumptions
about how the PM core works (e.g. power domain precedence.)

It also has the disadvantage of not taking into consideration the
IRQ-safe capabilities of the PM core.

Rather than adding this additional logic to every driver, what would be
best is if we could just take advantage of all the existing logic in the
runtime PM core, rather than duplicating some of it in the drivers.

The ideal case would be for system PM methods to be able to simply call
pm_runtime_get_sync/_put_sync as well, but somehow force the
transitions, even when pm_runtime_forbid() has been called.

I suspect you won't like that idea, but am curious about your opinions.

In the process of experimenting with other solutions, I found an
interesting discovery:

In the driver's ->suspend() hook, I did something like this:

	priv->forced_suspend = false;
	if (!pm_runtime_suspended(dev)) {
		pm_runtime_put_sync(dev);
		priv->forced_suspend = true;
	}

and in the resume hook I did this:

	if (priv->forced_suspend)
		pm_runtime_get_sync(dev);

Even after disabling runtime PM from userspace via
/sys/devices/.../power/control, the ->suspend() hook triggered an actual
transition.  This is because pm_runtime_forbid() just uses the usage
counter, so the _put_sync() in the ->suspend callback decrements the
counter and triggers an rpm_idle().   Is this expected behavior?

If I can count on this behavior, then the above solution seems better
than my workaround below[2], although I kinda don't like making
assumptions about how pm_runtime_forbid() is implemented.
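
The usage-counter interaction described above can be reproduced in a toy
userspace model.  This is only a sketch, not kernel code: struct toy_dev
and the toy_* helpers are made up for illustration, and they assume (as
the message states) that pm_runtime_forbid() keeps the device active by
taking a usage-count reference.

```c
#include <stdbool.h>

/* Toy userspace model; NOT the kernel's struct device. */
struct toy_dev {
	int usage_count;	/* stands in for dev->power.usage_count */
	bool runtime_auto;	/* false once "forbid" has been called */
	bool suspended;		/* stands in for the runtime PM status */
};

/* Loosely models the idle check: suspend only if nobody holds a reference. */
static void toy_idle(struct toy_dev *d)
{
	if (d->usage_count == 0)
		d->suspended = true;
}

/* Loosely models pm_runtime_forbid(): take a reference and resume. */
static void toy_forbid(struct toy_dev *d)
{
	if (!d->runtime_auto)
		return;
	d->runtime_auto = false;
	d->usage_count++;
	d->suspended = false;
}

/* Loosely models pm_runtime_put_sync(): drop a reference, then try to idle. */
static void toy_put_sync(struct toy_dev *d)
{
	if (--d->usage_count == 0)
		toy_idle(d);
}

/* Returns true if an unbalanced put suspends a "forbidden" device. */
bool toy_unbalanced_put_suspends(void)
{
	struct toy_dev d = {
		.usage_count = 0,
		.runtime_auto = true,
		.suspended = true,
	};

	toy_forbid(&d);		/* usage_count 0 -> 1, device active */
	toy_put_sync(&d);	/* no matching get: usage_count 1 -> 0 */

	return d.suspended && !d.runtime_auto;
}
```

In this model the put drops the very reference that forbid took, so the
idle check sees a zero counter and the device suspends even though
runtime_auto is still false.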

Kevin

[1] from drivers/i2c/busses/i2c-omap.c

static int omap_i2c_suspend(struct device *dev)
{
	if (!pm_runtime_suspended(dev))
		if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_suspend)
			dev->bus->pm->runtime_suspend(dev);

	return 0;
}

static int omap_i2c_resume(struct device *dev)
{
	if (!pm_runtime_suspended(dev))
		if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume)
			dev->bus->pm->runtime_resume(dev);

	return 0;
}




[2] 
static int omap_i2c_suspend(struct device *dev)
{
	int (*callback)(struct device *) = NULL;
	int ret = 0;

	if (!pm_runtime_suspended(dev)) {
		if (dev->pwr_domain)
			callback = dev->pwr_domain->ops.runtime_suspend;
		else if (dev->bus && dev->bus->pm)
			callback = dev->bus->pm->runtime_suspend;

		/* Neither a power domain nor a bus PM callback may be set. */
		if (callback)
			ret = callback(dev);
	}

	return ret;
}

static int omap_i2c_resume(struct device *dev)
{
	int (*callback)(struct device *) = NULL;
	int ret = 0;

	if (!pm_runtime_suspended(dev)) {
		if (dev->pwr_domain)
			callback = dev->pwr_domain->ops.runtime_resume;
		else if (dev->bus && dev->bus->pm)
			callback = dev->bus->pm->runtime_resume;

		/* Neither a power domain nor a bus PM callback may be set. */
		if (callback)
			ret = callback(dev);
	}

	return ret;
}


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-02  0:05 calling runtime PM from system PM methods Kevin Hilman
@ 2011-06-02 14:18 ` Alan Stern
  2011-06-02 14:18 ` [linux-pm] " Alan Stern
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-02 14:18 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Wed, 1 Jun 2011, Kevin Hilman wrote:

> In the process of experimenting with other solutions, I found an
> interesting discovery:
> 
> In the driver's ->suspend() hook, I did something like this:
> 
> 	priv->forced_suspend = false;
> 	if (!pm_runtime_suspended(dev)) {
> 		pm_runtime_put_sync(dev);
> 		priv->forced_suspend = true;
> 	}
> 
> and in the resume hook I did this:
> 
> 	if (priv->forced_suspend)
> 		pm_runtime_get_sync(dev);
> 
> Even after disabling runtime PM from userspace via
> /sys/devices/.../power/control, the ->suspend() hook triggered an actual
> transition.  This is because pm_runtime_forbid() just uses the usage
> counter, so the _put_sync() in the ->suspend callback decrements the
> counter and triggers an rpm_idle().   Is this expected behavior?

Not really.  In fact it is a bug in your experimental code -- you are
decrementing the usage counter in a context where you did not
previously increment it.  In principle, the counter might already be 0
when the suspend hook runs.

Yes, it is indeed possible for a device to be active while the usage
counter is 0.  For example (assuming the counter is initially 0), this
will happen if you call

	pm_runtime_get_sync(dev);
	pm_runtime_put_noidle(dev);

or even if you simply call

	pm_runtime_resume(dev);

Of course, the drivers you're talking about may never do this.  Still, 
it's a logical mistake to do a *_put without previously doing a *_get.
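
The "active with a zero counter" case can be illustrated with a small
userspace model.  It is a sketch under loud assumptions: struct toy_dev
and the toy_* functions are hypothetical stand-ins, the counter is a
plain int rather than the kernel's atomic counter, and what the real
core does on an underflow is not modelled.

```c
#include <stdbool.h>

/* Toy userspace model; NOT the kernel's struct device. */
struct toy_dev {
	int usage_count;	/* stands in for the runtime PM usage counter */
	bool suspended;		/* stands in for the runtime PM status */
};

/* Loosely models pm_runtime_get_sync(): take a reference and resume. */
static void toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Loosely models pm_runtime_put_noidle(): drop the reference, no idle check. */
static void toy_put_noidle(struct toy_dev *d)
{
	d->usage_count--;
}

/* The experimental ->suspend() hook: a put with no matching get. */
static void toy_suspend_hook(struct toy_dev *d)
{
	if (!d->suspended)
		d->usage_count--;	/* unbalanced decrement */
}

/* Returns the counter value left behind by the unbalanced hook. */
int toy_counter_after_suspend_hook(void)
{
	struct toy_dev d = { .usage_count = 0, .suspended = true };

	toy_get_sync(&d);	/* counter 0 -> 1, device active */
	toy_put_noidle(&d);	/* counter 1 -> 0, device still active */
	toy_suspend_hook(&d);	/* device active, so the hook decrements again */

	return d.usage_count;
}
```

Here the pm_runtime_suspended()-style check passes because the device is
active, yet there is no reference left to drop, so the toy counter ends
up below zero.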

Alan Stern


* Re: calling runtime PM from system PM methods
  2011-06-02 14:18 ` [linux-pm] " Alan Stern
@ 2011-06-02 17:10   ` Kevin Hilman
  2011-06-02 17:10   ` [linux-pm] " Kevin Hilman
  1 sibling, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-02 17:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

Alan Stern <stern@rowland.harvard.edu> writes:

> On Wed, 1 Jun 2011, Kevin Hilman wrote:
>
>> In the process of experimenting with other solutions, I found an
>> interesting discovery:
>> 
>> In the driver's ->suspend() hook, I did something like this:
>> 
>> 	priv->forced_suspend = false;
>> 	if (!pm_runtime_suspended(dev)) {
>> 		pm_runtime_put_sync(dev);
>> 		priv->forced_suspend = true;
>> 	}
>> 
>> and in the resume hook I did this:
>> 
>> 	if (priv->forced_suspend)
>> 		pm_runtime_get_sync(dev);
>> 
>> Even after disabling runtime PM from userspace via
>> /sys/devices/.../power/control, the ->suspend() hook triggered an actual
>> transition.  This is because pm_runtime_forbid() just uses the usage
>> counter, so the _put_sync() in the ->suspend callback decrements the
>> counter and triggers an rpm_idle().   Is this expected behavior?
>
> Not really.  In fact it is a bug in your experimental code -- you are
> decrementing the usage counter in a context where you did not
> previously increment it.  In principle, the counter might already be 0
> when the suspend hook runs.
>
> Yes, it is indeed possible for a device to be active while the usage
> counter is 0.  For example (assuming the counter is initially 0), this
> will happen if you call
>
> 	pm_runtime_get_sync(dev);
> 	pm_runtime_put_noidle(dev);
>
> or even if you simply call
>
> 	pm_runtime_resume(dev);
>
> Of course, the drivers you're talking about may never do this.  Still, 
> it's a logical mistake to do a *_put without previously doing a *_get.

OK.  I was trying to catch that by checking pm_runtime_suspended(), but
now see that that cannot work in general.

The problem I'm trying to solve is how (or whether) I can use runtime PM
from the system PM methods, specifically in the case where runtime PM
has been disabled from userspace (or pm_runtime_forbid() has been
called.)  

In a nutshell, what I'm after is for any pm_runtime_forbid() calls to be
cancelled during system PM, thus allowing pending runtime PM events to
occur during system PM.

Basically, what I have is several drivers who don't really need suspend
hooks if runtime PM is enabled, because they use runtime PM on a per
transaction basis, handle all the HW stuff in the runtime PM callbacks,
and from a HW perspective, there is no difference in power state between
runtime and static suspend.  These devices are already runtime suspended
when the system PM callbacks run, so there is nothing for the system PM
callbacks to do.

If pm_runtime_forbid() has been called, but then a system suspend is
initiated, we'd like these devices to actually suspend, meaning allowing
any pending runtime PM transitions to happen during system suspend.
In order to force/trick/persuade the device to do this, something like
this works:

static int omap_i2c_suspend(struct device *dev)
{
	if (dev->power.runtime_auto == false)
		pm_runtime_put_sync(dev);

	return 0;
}

static int omap_i2c_resume(struct device *dev)
{
	if (dev->power.runtime_auto == false)
		pm_runtime_get_sync(dev);

	return 0;
}

Yes, this does a put without a get, but when runtime_auto == false,
there was an implicit _get_noresume() done by the runtime PM core.

Possibly a cleaner way, but one that would force the driver to keep
additional state would be something like

suspend (or prepare):

	if (dev->power.runtime_auto == false) {
		priv->rpm_forced = true;
		pm_runtime_allow(dev);
	}

resume (or complete):

	if (priv->rpm_forced)
		pm_runtime_forbid(dev);

If this is acceptable, I'd probably implement this at the device power
domain level instead of having to have every driver do this.

Kevin


* Re: calling runtime PM from system PM methods
  2011-06-02 17:10   ` [linux-pm] " Kevin Hilman
@ 2011-06-02 18:38     ` Alan Stern
  2011-06-02 18:38     ` [linux-pm] " Alan Stern
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-02 18:38 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Thu, 2 Jun 2011, Kevin Hilman wrote:

> OK.  I was trying to catch that by checking pm_runtime_suspended(), but
> now see that that cannot work in general.
> 
> The problem I'm trying to solve is how (or whether) I can use runtime PM
> from the system PM methods, specifically in the case where runtime PM
> has been disabled from userspace (or pm_runtime_forbid() has been
> called.)  
> 
> In a nutshell, what I'm after is for any pm_runtime_forbid() calls to be
> cancelled during system PM, thus allowing pending runtime PM events to
> occur during system PM.

There are a few approaches that might do what you want.  But I'd like 
to hear Rafael's opinion before making any suggestions.

Alan Stern


* Re: calling runtime PM from system PM methods
  2011-06-02  0:05 calling runtime PM from system PM methods Kevin Hilman
  2011-06-02 14:18 ` Alan Stern
  2011-06-02 14:18 ` [linux-pm] " Alan Stern
@ 2011-06-06 18:01 ` Rafael J. Wysocki
  2011-06-06 18:01 ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-06 18:01 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Thursday, June 02, 2011, Kevin Hilman wrote:
> Hi Rafael,

Hi,

> Once again, I'm back to some problems with using runtime PM from system
> PM methods.  On OMAP, many drivers don't need to do anything different
> for runtime PM compared to system PM, so the system PM methods can
> simply use runtime PM.

No, they can't (I was talking about that at LinuxCon Japan last week).

I'll give more details in replies to the other messages in this thread.

Thanks,
Rafael


* Re: calling runtime PM from system PM methods
  2011-06-02 17:10   ` [linux-pm] " Kevin Hilman
                       ` (2 preceding siblings ...)
  2011-06-06 18:29     ` Rafael J. Wysocki
@ 2011-06-06 18:29     ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-06 18:29 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

Hi,

Sorry for the delay.  After returning from Japan I found my cable modem
basically dead, so I have no Internet access from home at the moment, which is
a bit inconvenient, so to speak.

On Thursday, June 02, 2011, Kevin Hilman wrote:
> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> > On Wed, 1 Jun 2011, Kevin Hilman wrote:
> >
> >> In the process of experimenting with other solutions, I found an
> >> interesting discovery:
> >> 
> >> In the driver's ->suspend() hook, I did something like this:
> >> 
> >> 	priv->forced_suspend = false;
> >> 	if (!pm_runtime_suspended(dev)) {
> >> 		pm_runtime_put_sync(dev);
> >> 		priv->forced_suspend = true;
> >> 	}
> >> 
> >> and in the resume hook I did this:
> >> 
> >> 	if (priv->forced_suspend)
> >> 		pm_runtime_get_sync(dev);
> >> 
> >> Even after disabling runtime PM from userspace via
> >> /sys/devices/.../power/control, the ->suspend() hook triggered an actual
> >> transition.  This is because pm_runtime_forbid() just uses the usage
> >> counter, so the _put_sync() in the ->suspend callback decrements the
> >> counter and triggers an rpm_idle().   Is this expected behavior?
> >
> > Not really.  In fact it is a bug in your experimental code -- you are
> > decrementing the usage counter in a context where you did not
> > previously increment it.  In principle, the counter might already be 0
> > when the suspend hook runs.
> >
> > Yes, it is indeed possible for a device to be active while the usage
> > counter is 0.  For example (assuming the counter is initially 0), this
> > will happen if you call
> >
> > 	pm_runtime_get_sync(dev);
> > 	pm_runtime_put_noidle(dev);
> >
> > or even if you simply call
> >
> > 	pm_runtime_resume(dev);
> >
> > Of course, the drivers you're talking about may never do this.  Still, 
> > it's a logical mistake to do a *_put without previously doing a *_get.
> 
> OK.  I was trying to catch that by checking pm_runtime_suspended(), but
> now see that that cannot work in general.
> 
> The problem I'm trying to solve is how (or whether) I can use runtime PM
> from the system PM methods, specifically in the case where runtime PM
> has been disabled from userspace (or pm_runtime_forbid() has been
> called.)  

We discussed this exact issue with Magnus and Paul during LinuxCon Japan
and the answer to it is that you can't.

Apart from the problem with user space disabling runtime PM, there is
a problem with subsystem callbacks, which accidentally doesn't apply to
the platform bus type.  Namely, calling pm_runtime_suspend() or
pm_runtime_put_sync() from a driver's .suspend() routine will result in
the subsystem's .runtime_suspend() routine being called, which is incorrect,
because the driver's .suspend() routine itself is called by the subsystem's
.suspend() routine.  So, if one attempted to call pm_runtime_suspend() from
a driver's .suspend(), we'd get:

(subsystem)->suspend() => (driver)->suspend() => pm_runtime_suspend() =>
(subsystem)->runtime_suspend() ...

However, the driver cannot assume that the subsystem's .runtime_suspend()
may always be called from within its .suspend().  In fact, for many subsystems
this will cause interesting breakage to occur.
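
The nesting in that call chain can be made visible with a toy userspace
model.  This is a sketch only: struct toy_subsys and every function
below are hypothetical, and the model merely records that the
subsystem's runtime-suspend routine was entered while its own system
suspend routine was still running.

```c
#include <stdbool.h>

/* Toy subsystem state; NOT a real bus type. */
struct toy_subsys {
	bool in_system_suspend;	/* set while the subsystem's ->suspend() runs */
	bool nested_call_seen;	/* set if runtime_suspend ran inside ->suspend() */
};

/* Stands in for (subsystem)->runtime_suspend(). */
static int toy_subsys_runtime_suspend(struct toy_subsys *s)
{
	if (s->in_system_suspend)
		s->nested_call_seen = true;	/* entered from inside ->suspend() */
	return 0;
}

/* Stands in for pm_runtime_suspend(): the core calls into the subsystem. */
static int toy_pm_runtime_suspend(struct toy_subsys *s)
{
	return toy_subsys_runtime_suspend(s);
}

/* Stands in for (driver)->suspend() that tries to reuse runtime PM. */
static int toy_driver_suspend(struct toy_subsys *s)
{
	return toy_pm_runtime_suspend(s);
}

/* Stands in for (subsystem)->suspend(), which calls the driver. */
int toy_subsys_suspend(struct toy_subsys *s)
{
	int ret;

	s->in_system_suspend = true;
	ret = toy_driver_suspend(s);
	s->in_system_suspend = false;

	return ret;
}
```

A subsystem that takes a lock or changes its own state in ->suspend()
would see that state again in the nested call, which is the kind of
breakage the paragraph above warns about.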

> In a nutshell, what I'm after is for any pm_runtime_forbid() calls to be
> cancelled during system PM, thus allowing pending runtime PM events to
> occur during system PM.
> 
> Basically, what I have is several drivers who don't really need suspend
> hooks if runtime PM is enabled, because they use runtime PM on a per
> transaction basis, handle all the HW stuff in the runtime PM callbacks,
> and from a HW perspective, there is no difference in power state between
> runtime and static suspend.  These devices are already runtime suspended
> when the system PM callbacks run, so there is nothing for the system PM
> callbacks to do.

Well, I'm not quite sure this is the case.  You have to remember that
system suspend can happen at any time, so even if your runtime PM is used
around transactions, it theoretically is possible for system suspend to
happen while one of the transactions is in progress (unless you can guarantee
that the transactions can't be preempted).

> If pm_runtime_forbid() has been called, but then a system suspend is
> initiated, we'd like these devices to actually suspend, meaning allowing
> any pending runtime PM transitions to happen during system suspend.
> In order to force/trick/persuade the device to do this, something like
> this works:
> 
> static int omap_i2c_suspend(struct device *dev)
> {
> 	if (dev->power.runtime_auto == false)
> 		pm_runtime_put_sync(dev);

This is incorrect (see above).

> 	return 0;
> }
> 
> static int omap_i2c_resume(struct device *dev)
> {
> 	if (dev->power.runtime_auto == false)
> 		pm_runtime_get_sync(dev);

Likewise.

> 	return 0;
> }
> 
> Yes, this does a put without a get, but when runtime_auto == false,
> there was an implicit _get_noresume() done by the runtime PM core.
> 
> Possibly a cleaner way, but one that would force the driver to keep
> additional state would be something like
> 
> suspend (or prepare):
> 
> 	if (dev->power.runtime_auto == false) {
> 		priv->rpm_forced = true;
> 		pm_runtime_allow(dev);
> 	}
> 
> resume (or complete):
> 
>        	if (priv->rpm_forced)
> 		pm_runtime_forbid(dev);
> 
> If this is acceptable, I'd probably implement this at the device power
> domain level instead of having to have every driver do this.

While it is tempting to try to get away with only two PM callbacks per
driver instead of four (or even more), it generally is not doable, simply
because driver callbacks are not executed directly by the core.

The only way to address the problem of code duplication between .suspend()
and .runtime_suspend() callbacks (and analogously for resume) I see at the
moment is to make those callbacks execute common routines.
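
A minimal sketch of that idea, written as a userspace model rather than
real driver code: struct foo_priv and all foo_* names are hypothetical,
the callbacks take the private struct directly instead of a struct
device pointer, and tracking of the runtime PM status across system
resume is left out.

```c
#include <stdbool.h>

/* Hypothetical driver state; a real driver would program registers. */
struct foo_priv {
	bool powered;		/* true while the pretend hardware is on */
	int hw_writes;		/* counts how often the hardware was touched */
};

/* Common routine: the only code that powers the hardware down. */
static int foo_hw_quiesce(struct foo_priv *p)
{
	if (!p->powered)
		return 0;	/* already off, e.g. runtime suspended earlier */
	p->powered = false;
	p->hw_writes++;
	return 0;
}

/* Common routine: the only code that powers the hardware up. */
static int foo_hw_restart(struct foo_priv *p)
{
	if (p->powered)
		return 0;	/* already on */
	p->powered = true;
	p->hw_writes++;
	return 0;
}

/* Would be wired to .runtime_suspend / .runtime_resume. */
int foo_runtime_suspend(struct foo_priv *p) { return foo_hw_quiesce(p); }
int foo_runtime_resume(struct foo_priv *p) { return foo_hw_restart(p); }

/* Would be wired to .suspend / .resume; no pm_runtime_*() calls here. */
int foo_suspend(struct foo_priv *p) { return foo_hw_quiesce(p); }
int foo_resume(struct foo_priv *p) { return foo_hw_restart(p); }
```

Both pairs of callbacks share the hardware code, but neither pair calls
into the runtime PM core, so the system PM path does not depend on the
usage counter or on what user space wrote to power/control.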

Thanks,
Rafael


* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-02 17:10   ` [linux-pm] " Kevin Hilman
  2011-06-02 18:38     ` Alan Stern
  2011-06-02 18:38     ` [linux-pm] " Alan Stern
@ 2011-06-06 18:29     ` Rafael J. Wysocki
  2011-06-06 19:16       ` Alan Stern
                         ` (3 more replies)
  2011-06-06 18:29     ` Rafael J. Wysocki
  3 siblings, 4 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-06 18:29 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

Hi,

Sorry for the delay.  After returning from Japan I found my cable modem
basically dead, so I have no Internet access from home at the moment, which is
a bit inconvenient, so to speak.

On Thursday, June 02, 2011, Kevin Hilman wrote:
> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> > On Wed, 1 Jun 2011, Kevin Hilman wrote:
> >
> >> In the process of experimenting with other solutions, I found an
> >> interesting discovery:
> >> 
> >> In the driver's ->suspend() hook, I did something like this:
> >> 
> >> 	priv->forced_suspend = false;
> >> 	if (!pm_runtime_suspended(dev)) {
> >> 		pm_runtime_put_sync(dev);
> >> 		priv->forced_suspend = true;
> >> 	}
> >> 
> >> and in the resume hook I did this:
> >> 
> >> 	if (priv->forced_suspend)
> >> 		pm_runtime_get_sync(dev);
> >> 
> >> Even after disabling runtime PM from userspace via
> >> /sys/devices/.../power/control, the ->suspend() hook triggered an actual
> >> transition.  This is because pm_runtime_forbid() just uses the usage
> >> counter, so the _put_sync() in the ->suspend callback decrements the
> >> counter and triggers an rpm_idle().   Is this expected behavior?
> >
> > Not really.  In fact it is a bug in your experimental code -- you are
> > decrementing the usage counter in a context where you did not
> > previously increment it.  In principle, the counter might already be 0
> > when the suspend hook runs.
> >
> > Yes, it is indeed possible for a device to be active while the usage
> > counter is 0.  For example (assuming the counter is initially 0), this
> > will happen if you call
> >
> > 	pm_runtime_get_sync(dev);
> > 	pm_runtime_put_noidle(dev);
> >
> > or even if you simply call
> >
> > 	pm_runtime_resume(dev);
> >
> > Of course, the drivers you're talking about may never do this.  Still, 
> > it's a logical mistake to do a *_put without previously doing a *_get.
> 
> OK.  I was trying to catch that by checking pm_runtime_suspended(), but
> now see that that cannot work in general.
> 
> The problem I'm trying to solve is how (or whether) I can use runtime PM
> from the system PM methods, specifically in the case where runtime PM
> has been disabled from userspace (or pm_runtime_forbid() has been
> called.)  

We discussed this exact issue with Magnus and Paul during LinuxCon Japan
and the answer to it is that you can't.

Apart from the problem with user space disabling runtime PM, there is
a problem with subsystem callbacks, which accidentally doesn't apply to
the platform bus type.  Namely, calling pm_runtime_suspend() or
pm_runtime_put_sync() from a driver's .suspend() routine will result in
the subsystem's .runtime_suspend() routine being called, which is incorrect,
because the driver's .suspend() routine itself is called by the subsystem's
.suspend() routine.  So, if one attempted to call pm_runtime_suspend() from
a driver's .suspend(), we'd get:

(subsystem)->suspend() => (driver)->suspend() => pm_runtime_suspend() =>
(subsystem)->runtime_suspend() ...

However, the driver cannot assume that the subsystem's .runtime_suspend()
may always be called from withing its .suspend().  In fact, for many subsystems
this will cause interesting breakage to occur.

> In a nutshell, what I'm after is for any pm_runtime_forbid() calls to be
> cancelled during system PM, thus allowing pending runtime PM events to
> occur during system PM.
> 
> Basically, what I have is several drivers who don't really need suspend
> hooks if runtime PM is enabled, because they use runtime PM on a per
> transaction basis, handle all the HW stuff in the runtime PM callbacks,
> and from a HW perspective, there is no difference in power state between
> runtime and static suspend.  These devices are already runtime suspended
> when the system PM callbacks run, so there is nothing for the system PM
> callbacks to do.

Well, I'm not quite sure this is the case.  You have to remember that
system suspend can happen at any time, so even if your runtime PM is used
around transactions, it theoretically is possible for system suspend to
happen while one of the transactions is in progress (unless you can guarantee
that the transactions can't be preempted).

> If pm_runtime_forbid() has been called, but then a system suspend is
> initiated, we'd like these devices to actually suspend, meaning allowing
> any pending runtime PM transitions to happen during system suspend.
> In order to force/trick/pursuade the device to to this, something like
> this works:
> 
> static int omap_i2c_suspend(struct device *dev)
> {
> 	if (dev->power.runtime_auto == false)
> 		pm_runtime_put_sync(dev);

This is incorrect (see above).

> 	return 0;
> }
> 
> static int omap_i2c_resume(struct device *dev)
> {
> 	if (dev->power.runtime_auto == false)
> 		pm_runtime_get_sync(dev);

Likewise.

> 	return 0;
> }
> 
> Yes, this does a put without a get, but when runtime_auto == false,
> there was an implicit _get_noresume() done by the runtime PM core.
> 
> Possibly a cleaner way, but one that would force the driver to keep
> additional state would be something like
> 
> suspend (or prepare):
> 
> 	if (dev->power.runtime_auto == false) {
> 		priv->rpm_forced = true;
> 		pm_runtime_allow(dev);
> 	}
> 
> resume (or complete):
> 
>        	if (priv->rpm_forced)
> 		pm_runtime_forbid(dev);
> 
> If this is acceptable, I'd probably implement this at the device power
> domain level instead of having to have every driver do this.

While it is tempting to try to get away with only two PM callbacks per
driver instead of four (or even more), it generally is not doable, simply
because driver callbacks are not executed directly by the core.

The only way to address the problem of code duplication between .suspend()
and .runtime_suspend() callbacks (and analogously for resume) I see at the
moment is to make those callbacks execute common routines.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-06 18:29     ` Rafael J. Wysocki
@ 2011-06-06 19:16       ` Alan Stern
  2011-06-06 19:16       ` [linux-pm] " Alan Stern
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-06 19:16 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

On Mon, 6 Jun 2011, Rafael J. Wysocki wrote:

> > In a nutshell, what I'm after is for any pm_runtime_forbid() calls to be
> > cancelled during system PM, thus allowing pending runtime PM events to
> > occur during system PM.
> > 
> > Basically, what I have is several drivers who don't really need suspend
> > hooks if runtime PM is enabled, because they use runtime PM on a per
> > transaction basis, handle all the HW stuff in the runtime PM callbacks,
> > and from a HW perspective, there is no difference in power state between
> > runtime and static suspend.  These devices are already runtime suspended
> > when the system PM callbacks run, so there is nothing for the system PM
> > callbacks to do.
> 
> Well, I'm not quite sure this is the case.  You have to remember that
> system suspend can happen at any time, so even if your runtime PM is used
> around transactions, it theoretically is possible for system suspend to
> happen while one of the transactions is in progress (unless you can guarantee
> that the transactions can't be preempted).

That's right.  For example, there may be a queue of pending I/O 
transactions.  Runtime PM would leave the device active as long as the 
queue was non-empty.  System sleep should force the driver to stop 
processing the queue as well as suspending the device. 

> While it is tempting to try to get away with only two PM callbacks per
> driver instead of four (or even more), it generally is not doable, simply
> because driver callbacks are not executed directly by the core.
> 
> The only way to address the problem of code duplication between .suspend()
> and .runtime_suspend() callbacks (and analogously for resume) I see at the
> moment is to make those callbacks execute common routines.

For example, the power state might be protected by a device-specific
spinlock.  The suspend callback could acquire the spinlock, check to
see if the device is already in a low-power state and if not, put it in
one.  Then the same callback could be used for runtime suspend and
system suspend -- apart from issues like stopping the I/O request
queue.
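
A minimal userspace sketch of that idea, not kernel code: struct foo_dev,
foo_power_down() and the two wrappers are hypothetical names, and a C11
atomic_flag stands in for the device-specific spinlock.  Whichever path
runs first does the real transition; the other one finds the device
already in a low-power state and simply returns.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical per-device state.  A real driver would keep this in its
 * private data and use spinlock_t rather than atomic_flag.
 */
struct foo_dev {
	atomic_flag lock;	/* stand-in for the device-specific spinlock */
	bool low_power;		/* true once the hardware is in a low-power state */
	int transitions;	/* counts real hardware power-downs */
};

void foo_dev_init(struct foo_dev *foo)
{
	atomic_flag_clear(&foo->lock);
	foo->low_power = false;
	foo->transitions = 0;
}

/* Common routine: power the device down unless it already is. */
int foo_power_down(struct foo_dev *foo)
{
	while (atomic_flag_test_and_set_explicit(&foo->lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	if (!foo->low_power) {
		foo->low_power = true;	/* the hardware access would go here */
		foo->transitions++;
	}

	atomic_flag_clear_explicit(&foo->lock, memory_order_release);
	return 0;
}

/* Runtime PM path: the device is known to be idle. */
int foo_runtime_suspend(struct foo_dev *foo)
{
	return foo_power_down(foo);
}

/* System sleep path: same routine, after the I/O queue has been stopped. */
int foo_suspend(struct foo_dev *foo)
{
	/* stopping the request queue is driver-specific and omitted here */
	return foo_power_down(foo);
}
```

Calling foo_runtime_suspend() and then foo_suspend() on the same device
performs one hardware transition, not two, which is the property that
makes sharing the routine safe.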

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-06 18:29     ` Rafael J. Wysocki
                         ` (2 preceding siblings ...)
  2011-06-06 22:25       ` Kevin Hilman
@ 2011-06-06 22:25       ` Kevin Hilman
  3 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-06 22:25 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[..]

> While it is tempting to try to get away with only two PM callbacks per
> driver instead of four (or even more), it generally is not doable, simply
> because driver callbacks are not executed directly by the core.
>
> The only way to address the problem of code duplication between .suspend()
> and .runtime_suspend() callbacks (and analogously for resume) I see at the
> moment is to make those callbacks execute common routines.

Makes sense if the "common routines" are in the driver.  The problem
arises when the common routines are not actually in the driver, but are
instead at the subsystem (or in this case, device power domain) level.

Considering the OMAP I2C driver, it doesn't have (or need) runtime PM
callbacks.  It does runtime PM get/put per-xfer, and has no context to
save/restore, so it needs no runtime PM callbacks, or no special PM code
other than runtime PM get/put calls.  The device power domain level code
is managing the device clocks, power states, etc..

So what the system suspend needs to do is something like:

1. block any new xfers from starting
2. wait for any outstanding xfers
3. if device is already runtime suspended
    - nothing to do
4. trigger idle transition (at device power domain level)

If runtime PM has not been disabled from userspace, it should always end
at step 3, since the last xfer will always trigger a runtime suspend.

However, if runtime PM has been disabled from
userspace/pm_runtime_forbid(), we need some way to do step 4.  It's this
last 'trigger' step that I'm trying to figure out a clean way of
implementing, particularly for drivers with no runtime PM callbacks.
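
The difference between the two cases can be illustrated with a toy
userspace model of the usage counter.  This is not the kernel
implementation: the model_* names are made up for illustration, and the
model assumes that forbidding runtime PM holds an extra reference on the
device, so the put after the last xfer never drops the count to zero.

```c
#include <stdbool.h>

/* Toy model of the runtime PM bookkeeping; all names are hypothetical. */
struct rpm_model {
	int usage_count;	/* references keeping the device active */
	bool runtime_auto;	/* false while runtime PM is forbidden */
	bool suspended;		/* true while the device is runtime suspended */
};

void model_init(struct rpm_model *m)
{
	m->usage_count = 0;
	m->runtime_auto = true;
	m->suspended = true;
}

/* Take a reference and make sure the device is active. */
void model_get_sync(struct rpm_model *m)
{
	m->usage_count++;
	m->suspended = false;
}

/* Drop a reference; suspend when the last one goes away. */
void model_put_sync(struct rpm_model *m)
{
	if (--m->usage_count == 0)
		m->suspended = true;
}

/* Forbid: assumed to hold one extra reference until allowed again. */
void model_forbid(struct rpm_model *m)
{
	if (!m->runtime_auto)
		return;
	m->runtime_auto = false;
	model_get_sync(m);
}

void model_allow(struct rpm_model *m)
{
	if (m->runtime_auto)
		return;
	m->runtime_auto = true;
	model_put_sync(m);
}

/* One transfer, bracketed by get/put as in the per-xfer scheme. */
void model_xfer(struct rpm_model *m)
{
	model_get_sync(m);
	/* the transfer itself would happen here */
	model_put_sync(m);
}
```

In this model, a transfer on an unrestricted device ends with the device
suspended (step 3), while the same transfer after model_forbid() leaves
it active with one reference outstanding, which is the situation step 4
has to deal with.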

Unless I'm missing something else, if runtime PM was not prevented via
userspace/pm_runtime_forbid(), we would not need this last 'trigger'
step.  That's why a solution where any pending runtime PM transitions
would be allowed during system PM is the ideal solution (to me.)  It
avoids having to call runtime PM methods from system PM altogether.

The current OMAP I2C driver in mainline does this extra "trigger" step
by directly calling the subsystem (bus) callbacks. (It's also missing
the first two steps, which is a known bug and will be fixed once I
figure out the rest of this problem.)

However, now that we have device power domains, I was planning to extend
that to call device power domain callbacks first if they exist.  Since
that was starting to duplicate callback precedence assumptions in the
runtime PM core, I was thinking about ways to avoid that by simply using
runtime PM directly; that's what got me to start this thread in the
first place.

So, I see 2 ways forward

1. Having some per-device option/flag that would allow pending runtime
   PM transitions to happen during system PM, thus removing the need
   for step 4 above.

2. Decide the "right" way to trigger device power domain (or subsystem)
   transitions from the driver.   Directly calling the subsystem
   callbacks from the driver?   Any other options?

I have a rather strong preference for the first one, but am realizing
that I may be in the minority.  So what is the recommended solution for
2?

Kevin

P.S.  I'm glad you got to discuss this with Paul & Magnus at LinuxCon
Japan.  I wish I could've been there too.  Hope your return trip went
well and your internet is back working soon.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-06 22:25       ` Kevin Hilman
  2011-06-07 13:55         ` Alan Stern
@ 2011-06-07 13:55         ` Alan Stern
  2011-06-07 21:32         ` Rafael J. Wysocki
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-07 13:55 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Mon, 6 Jun 2011, Kevin Hilman wrote:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [..]
> 
> > While it is tempting to try to get away with only two PM callbacks per
> > driver instead of four (or even more), it generally is not doable, simply
> > because driver callbacks are not executed directly by the core.
> >
> > The only way to address the problem of code duplication between .suspend()
> > and .runtime_suspend() callbacks (and analogously for resume) I see at the
> > moment is to make those callbacks execute common routines.
> 
> Makes sense if the "common routines" are in the driver.  The problem
> arises when the common routines are not actually in the driver, but are
> instead at the subsystem (or in this case, device power domain) level.

Why should that be a problem?  System suspend also invokes the 
subsystem and power domain methods.

> Considering the OMAP I2C driver, it doesn't have (or need) runtime PM
> callbacks.  It does runtime PM get/put per-xfer, and has no context to
> save/restore, so it needs no runtime PM callbacks, or no special PM code
> other than runtime PM get/put calls.  The device power domain level code
> is managing the device clocks, power states, etc..
> 
> So what the system suspend needs to do is something like:
> 
> 1. block any new xfers from starting
> 2. wait for any outstanding xfers
> 3. if device is already runtime suspended
>     - nothing to do
> 4. trigger idle transition (at device power domain level)

1 - 2 should be handled at the device level, whereas 3 - 4 are handled
at the power domain level.  Those are separate callbacks during system
suspend.

> If runtime PM has not been disabled from userspace, it should always end
> at step 3, since the last xfer will always trigger a runtime suspend.
> 
> However, if runtime PM has been disabled from
> userspace/pm_runtime_forbid(), we need some way to do step 4.  It's this
> last 'trigger' step that I'm trying to figure out a clean way of
> implementing, particularly for drivers with no runtime PM callbacks.

You're too hung-up on runtime PM.  Forget about that and concentrate on
system PM instead.  Power domains have system-PM callbacks as well as
runtime-PM callbacks; use them to do what you want.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-06 22:25       ` Kevin Hilman
  2011-06-07 13:55         ` Alan Stern
  2011-06-07 13:55         ` Alan Stern
@ 2011-06-07 21:32         ` Rafael J. Wysocki
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-07 21:32 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Tuesday, June 07, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [..]
> 
> > While it is tempting to try to get away with only two PM callbacks per
> > driver instead of four (or even more), it generally is not doable, simply
> > because driver callbacks are not executed directly by the core.
> >
> > The only way to address the problem of code duplication between .suspend()
> > and .runtime_suspend() callbacks (and analogously for resume) I see at the
> > moment is to make those callbacks execute common routines.
> 
> Makes sense if the "common routines" are in the driver.  The problem
> arises when the common routines are not actually in the driver, but are
> instead at the subsystem (or in this case, device power domain) level.

As Alan said, I'm not sure why that is a problem, because device power
domain can (and most likely should) provide system suspend callbacks as well
as runtime PM callbacks.  Those callbacks can be designed to do whatever is
needed.

> Considering the OMAP I2C driver, it doesn't have (or need) runtime PM
> callbacks.  It does runtime PM get/put per-xfer, and has no context to
> save/restore, so it needs no runtime PM callbacks, or no special PM code
> other than runtime PM get/put calls.  The device power domain level code
> is managing the device clocks, power states, etc..
> 
> So what the system suspend needs to do is something like:
> 
> 1. block any new xfers from starting
> 2. wait for any outstanding xfers
> 3. if device is already runtime suspended
>     - nothing to do
> 4. trigger idle transition (at device power domain level)
> 
> If runtime PM has not been disabled from userspace, it should always end
> at step 3, since the last xfer will always trigger a runtime suspend.
> 
> However, if runtime PM has been disabled from
> userspace/pm_runtime_forbid(), we need some way to do step 4.

Well, it looks like my previous message wasn't clear enough. :-)

Whether or not user space has disabled runtime PM _doesn't_ _matter_ for
system suspend, because _you_ _can't_ call pm_runtime_suspend(), or
pm_runtime_put_sync(), from a driver's .suspend() callback _anyway_.
The reason is that doing that would cause the subsystem's (or power
domain's in this case) .runtime_suspend() callback to be invoked and
that's incorrect.  Namely, it would require the subsystem (power domain)
to expect that its .runtime_suspend() would always be executed indirectly
as a result of calling its .suspend() (through the driver's callback)
and that expectation may or may not be met (depending on the driver's
design).

The appropriate way to handle system suspend is to provide subsystem
(or power domain) .suspend()/.resume() callbacks that will do whatever is
needed at the subsystem level and will call the corresponding driver
callbacks.  There is no need whatsoever to involve runtime PM in this
in any way.
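
As a rough userspace sketch of that layering (not kernel code; every
foo_* name and field below is hypothetical), the power domain's system
sleep callback runs the driver's callback first and then idles the
device itself.  Steps 1 and 2 of the quoted list land in the driver,
steps 3 and 4 in the power domain, and no runtime PM helper is called
anywhere.

```c
#include <stdbool.h>

struct foo_dev;

/* Driver-level system sleep callbacks (hypothetical). */
struct foo_driver_ops {
	int (*suspend)(struct foo_dev *foo);
	int (*resume)(struct foo_dev *foo);
};

struct foo_dev {
	const struct foo_driver_ops *drv;	/* may be unset */
	bool accepting_xfers;	/* driver accepts new transfers */
	int xfers_in_flight;	/* transfers not yet completed */
	bool domain_idle;	/* clocks/power gated by the power domain */
};

/* Driver .suspend(): steps 1 and 2. */
int foo_drv_suspend(struct foo_dev *foo)
{
	foo->accepting_xfers = false;	/* block new transfers */
	foo->xfers_in_flight = 0;	/* models waiting for outstanding ones */
	return 0;
}

int foo_drv_resume(struct foo_dev *foo)
{
	foo->accepting_xfers = true;
	return 0;
}

/* Power domain .suspend(): driver callback first, then steps 3 and 4. */
int foo_domain_suspend(struct foo_dev *foo)
{
	int ret = 0;

	if (foo->drv && foo->drv->suspend)
		ret = foo->drv->suspend(foo);
	if (ret)
		return ret;

	if (!foo->domain_idle)		/* step 3: already idle? */
		foo->domain_idle = true;	/* step 4: idle the device */
	return 0;
}

/* Power domain .resume(): restore power, then let the driver restart. */
int foo_domain_resume(struct foo_dev *foo)
{
	/* simplification: the device is always left active after resume */
	foo->domain_idle = false;
	if (foo->drv && foo->drv->resume)
		return foo->drv->resume(foo);
	return 0;
}
```

Whether the device was runtime suspended beforehand only changes which
branch step 3 takes; the driver's callback does not need to know.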

> It's this last 'trigger' step that I'm trying to figure out a clean way of
> implementing, particularly for drivers with no runtime PM callbacks.
> 
> Unless I'm missing something else, if runtime PM was not prevented via
> userspace/pm_runtime_forbid(), we would not need this last 'trigger'
> step.  That's why a solution where any pending runtime PM transitions
> would be allowed during system PM is the ideal solution (to me.)  It
> avoids having to call runtime PM methods from system PM altogether.

Well, again: there's nothing to avoid, because the whole thing you'd like to do
is incorrect in the first place.

> The current OMAP I2C driver in mainline does this extra "trigger" step
> by directly calling the subsystem (bus) callbacks. (It's also missing
> the first two steps, which is a known bug and will be fixed once I
> figure out the rest of this problem.)
> 
> However, now that we have device power domains, I was planning to extend
> that to call device power domain callbacks first if they exist.  Since
> that was starting to duplicate callback precedence assumptions in the
> runtime PM core, I was thinking about ways to avoid that by simply using
> runtime PM directly; that's what got me to start this thread in the
> first place.
> 
> So, I see 2 ways forward
> 
> 1. Having some per-device option/flag that would allow pending runtime
>    PM transitions to happen during system PM, thus removing the need
>    for step 4 above.
> 
> 2. Decide the "right" way to trigger device power domain (or subsystem)
>    transitions from the driver.   Directly calling the subsystem
>    callbacks from the driver?   Any other options?

Yes, see above.

> I have a rather strong preference for the first one, but am realizing
> that I may be in the minority.  So what is the recommended solution for
> 2?

Well, let me repeat: you need to separate the system suspend handling from
runtime PM.  Each of them requires a different approach, because the goal
is really different in both cases (basically, runtime PM triggers when
the device is _known_ to be idle, while system suspend may trigger while
it's actually being used).

> P.S.  I'm glad you got to discuss this with Paul & Magnus at LinuxCon
> Japan.  I wish I could've been there too.

There is a good opportunity to discuss those things during the Linux
Plumbers Conference in September.  We are going to have a Linux PM
miniconference during that event which you are welcome to attend. :-)

> Hope your return trip went well and your internet is back working soon.

Yes, it did and the internet is already working, thanks!

Take care,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
  2011-06-07 22:34           ` Kevin Hilman
@ 2011-06-07 22:34           ` Kevin Hilman
  2011-06-08 22:50           ` Kevin Hilman
                             ` (3 subsequent siblings)
  5 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-07 22:34 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday, June 07, 2011, Kevin Hilman wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>> [..]
>> 
>> > While it is tempting to try to get away with only two PM callbacks per
>> > driver instead of four (or even more), it generally is not doable, simply
>> > because driver callbacks are not executed directly by the core.
>> >
>> > The only way to address the problem of code duplication between .suspend()
>> > and .runtime_suspend() callbacks (and analogously for resume) I see at the
>> > moment is to make those callbacks execute common routines.
>> 
>> Makes sense if the "common routines" are in the driver.  The problem
>> arises when the common routines are not actually in the driver, but are
>> instead at the subsystem (or in this case, device power domain) level.
>
> As Alan said, I'm not sure why that is a problem, because device power
> domain can (and most likely should) provide system suspend callbacks as well
> as runtime PM callbacks.  Those callbacks can be designed to do whatever is
> needed.

Yes, I see now.

For some reason, my runtime PM focus caused me to not consider the
system PM callbacks in the device power domains.  Taking care of things
there should solve my problems.

Sorry for being a bit blind,

Kevin
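
A minimal sketch of the arrangement agreed on above, where the device power
domain supplies both system suspend and runtime PM callbacks and the driver
keeps only its own quiesce logic.  Everything here is a simplified stand-in:
struct pm_ops only loosely mirrors the kernel's struct dev_pm_ops, and the
device model, the core_*() dispatchers and all my_*/drv_*/domain_* names are
hypothetical.  It assumes, as discussed in this thread, that power domain
callbacks take precedence over the driver's.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* Simplified stand-in for the kernel's struct dev_pm_ops (assumption). */
struct pm_ops {
	int (*suspend)(struct device *dev);
	int (*resume)(struct device *dev);
	int (*runtime_suspend)(struct device *dev);
	int (*runtime_resume)(struct device *dev);
};

/* Hypothetical device model, not the real struct device. */
struct device {
	const struct pm_ops *pwr_domain_ops;	/* preferred over driver_ops */
	const struct pm_ops *driver_ops;
	int quiesced;	/* driver has stopped I/O */
	int powered;	/* power domain state */
};

/* Driver level: only knows how to stop and restart its own I/O. */
static int drv_quiesce(struct device *dev)   { dev->quiesced = 1; return 0; }
static int drv_unquiesce(struct device *dev) { dev->quiesced = 0; return 0; }

const struct pm_ops my_driver_ops = {
	.suspend	 = drv_quiesce,
	.resume		 = drv_unquiesce,
	.runtime_suspend = drv_quiesce,
	.runtime_resume	 = drv_unquiesce,
};

/* Power domain level: HW-specific switching shared by all four callbacks. */
static int domain_power_off(struct device *dev) { dev->powered = 0; return 0; }
static int domain_power_on(struct device *dev)  { dev->powered = 1; return 0; }

static int domain_suspend(struct device *dev)
{
	int ret = dev->driver_ops->suspend(dev);

	return ret ? ret : domain_power_off(dev);
}

static int domain_resume(struct device *dev)
{
	int ret = domain_power_on(dev);

	return ret ? ret : dev->driver_ops->resume(dev);
}

static int domain_runtime_suspend(struct device *dev)
{
	int ret = dev->driver_ops->runtime_suspend(dev);

	return ret ? ret : domain_power_off(dev);
}

static int domain_runtime_resume(struct device *dev)
{
	int ret = domain_power_on(dev);

	return ret ? ret : dev->driver_ops->runtime_resume(dev);
}

const struct pm_ops my_domain_ops = {
	.suspend	 = domain_suspend,
	.resume		 = domain_resume,
	.runtime_suspend = domain_runtime_suspend,
	.runtime_resume	 = domain_runtime_resume,
};

/* Core-side dispatch: use the power domain's callbacks when present. */
static const struct pm_ops *pick_ops(const struct device *dev)
{
	assert(dev != NULL);
	return dev->pwr_domain_ops ? dev->pwr_domain_ops : dev->driver_ops;
}

int core_suspend(struct device *dev)	     { return pick_ops(dev)->suspend(dev); }
int core_resume(struct device *dev)	     { return pick_ops(dev)->resume(dev); }
int core_runtime_suspend(struct device *dev) { return pick_ops(dev)->runtime_suspend(dev); }
int core_runtime_resume(struct device *dev)  { return pick_ops(dev)->runtime_resume(dev); }
```

With this split the driver never calls power domain or subsystem callbacks
itself; the domain decides what "low power" means for both paths.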

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-07 22:34           ` Kevin Hilman
  2011-06-07 22:34           ` Kevin Hilman
                             ` (4 subsequent siblings)
  5 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-07 22:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Tuesday, June 07, 2011, Kevin Hilman wrote:
>> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>> 
>> [..]
>> 
>> > While it is tempting to try to get away with only two PM callbacks per
>> > driver instead of four (or even more), it generally is not doable, simply
>> > because driver callbacks are not executed directly by the core.
>> >
>> > The only way to address the problem of code duplication between .suspend()
>> > and .runtime_suspend() callbacks (and analogously for resume) I see at the
>> > moment is to make those callbacks execute common routines.
>> 
>> Makes sense if the "common routines" are in the driver.  The problem
>> arises when the common routines are not actually in the driver, but are
>> instead at the subsystem (or in this case, device power domain) level.
>
> As Alan said, I'm not sure why that is a problem, because device power
> domain can (and most likely should) provide system suspend callbacks as well
> as runtime PM callbacks.  Those callbacks can be designed to do whatever is
> needed.

Yes, I see now.

For some reason, my runtime PM focus caused me to not consider the
system PM callbacks in the device power domains.  Taking care of things
there should solve my problems.

Sorry for being a bit blind,

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
  2011-06-07 22:34           ` Kevin Hilman
  2011-06-07 22:34           ` Kevin Hilman
@ 2011-06-08 22:50           ` Kevin Hilman
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
                             ` (2 subsequent siblings)
  5 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-08 22:50 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[...]

> you need to separate the system suspend handling from runtime PM.

/me risks getting yelled at again and jumps back in ;)

> Each of them requires a different approach, because the goal is really
> different in both cases (basically, runtime PM triggers when the
> device is _known_ to be idle, while system suspend may trigger while
> it's actually being used).

OK, but from the driver's perspective, the goals do not seem all that
different to me:

Runtime suspend
1. activity
2. activity finishes
3. device is _known_ to be idle
4. trigger device low-power transition 

system suspend [echo mem > /sys/power/state]
1. activity
2. prevent future activity, halt/wait for current activity
3. device is _known_ to be idle
4. trigger device low-power transition

The only difference is step 2.  In runtime suspend, the activity
finishes on its own, in system suspend, the activity is forcibly
stopped.  In either case, after that point the device is known to be
idle, and proceeding from there is identical.  IOW, based on the above,
another way of looking at system PM is forcing idle so that runtime PM
can happen.

I think it's important to note the similarities as well as the
differences.  Maybe I'm still really blind here, but I cannot see how
they can be completely decoupled.

More specifically, what should be the approach in system suspend when a
device is already runtime suspended?  If you treat runtime and system PM
as completely independent, you would have to runtime resume the device
so that it can then be immediately system suspended.

For many (if not all) devices though, what I suspect we would want is
for devices that are runtime suspended to stay runtime suspended across
a system suspend *and* resume.  That would mean that the device power
domain would not call system PM callbacks on devices that are runtime
suspended.

At least for device power domains where the low-power state is identical
between runtime suspend and system suspend, this makes a lot of sense
(to me.)  Device power domain ->suspend might look something like this
(obviously not tested):

    if (pm_runtime_suspended(dev))
        return 0;

    pm_generic_suspend(dev);
    magic_device_idle(dev); /* HW-specific, shared with runtime PM */
    priv->flags |= MY_DEVICE_SYS_SUSPENDED;

and ->resume():

    if (priv->flags & MY_DEVICE_SYS_SUSPENDED) {
        magic_device_enable(dev);
        pm_generic_resume(dev);
    }

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
                             ` (2 preceding siblings ...)
  2011-06-08 22:50           ` Kevin Hilman
@ 2011-06-08 22:50           ` Kevin Hilman
  2011-06-09  5:29             ` Magnus Damm
                               ` (3 more replies)
  2011-06-10 23:14           ` [linux-pm] " Kevin Hilman
  2011-06-10 23:14           ` Kevin Hilman
  5 siblings, 4 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-08 22:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[...]

> you need to separate the system suspend handling from runtime PM.

/me risks getting yelled at again and jumps back in ;)

> Each of them requires a different approach, because the goal is really
> different in both cases (basically, runtime PM triggers when the
> device is _known_ to be idle, while system suspend may trigger while
> it's actually being used).

OK, but from the driver's perspective, the goals do not seem all that
different to me:

Runtime suspend
1. activity
2. activity finishes
3. device is _known_ to be idle
4. trigger device low-power transition 

system suspend [echo mem > /sys/power/state]
1. activity
2. prevent future activity, halt/wait for current activity
3. device is _known_ to be idle
4. trigger device low-power transition

The only difference is step 2.  In runtime suspend, the activity
finishes on its own, in system suspend, the activity is forcibly
stopped.  In either case, after that point the device is known to be
idle, and proceeding from there is identical.  IOW, based on the above,
another way of looking at system PM is forcing idle so that runtime PM
can happen.

I think it's important to note the similarities as well as the
differences.  Maybe I'm still really blind here, but I cannot see how
they can be completely decoupled.

More specifically, what should be the approach in system suspend when a
device is already runtime suspended?  If you treat runtime and system PM
as completely independent, you would have to runtime resume the device
so that it can then be immediately system suspended.

For many (if not all) devices though, what I suspect we would want is
for devices that are runtime suspended to stay runtime suspended across
a system suspend *and* resume.  That would mean that the device power
domain would not call system PM callbacks on devices that are runtime
suspended.

At least for device power domains where the low-power state is identical
between runtime suspend and system suspend, this makes a lot of sense
(to me.)  Device power domain ->suspend might look something like this
(obviously not tested):

    if (pm_runtime_suspended(dev))
        return 0;

    pm_generic_suspend(dev);
    magic_device_idle(dev); /* HW-specific, shared with runtime PM */
    priv->flags |= MY_DEVICE_SYS_SUSPENDED;

and ->resume():

    if (priv->flags & MY_DEVICE_SYS_SUSPENDED) {
        magic_device_enable(dev);
        pm_generic_resume(dev);
    }

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
@ 2011-06-09  5:29             ` Magnus Damm
  2011-06-09  5:29             ` [linux-pm] " Magnus Damm
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Magnus Damm @ 2011-06-09  5:29 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

Hi Kevin,

On Thu, Jun 9, 2011 at 7:50 AM, Kevin Hilman <khilman@ti.com> wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
> [...]
>
>> you need to separate the system suspend handling from runtime PM.
>
> /me risks getting yelled at again and jumps back in ;)

=)

>> Each of them requires a different approach, because the goal is really
>> different in both cases (basically, runtime PM triggers when the
>> device is _known_ to be idle, while system suspend may trigger while
>> it's actually being used).
>
> OK, but from the driver's perspective, the goals do not seem all that
> different to me:
>
> Runtime suspend
> 1. activity
> 2. activity finishes
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
>
> system suspend [echo mem > /sys/power/state]
> 1. activity
> 2. prevent future activity, halt/wait for current activity
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
>
> The only difference is step 2.  In runtime suspend, the activity
> finishes on its own, in system suspend, the activity is forcibly
> stopped.  In either case, after that point the device is known to be
> idle, and proceeding from there is identical.  IOW, based on the above,
> another way of looking at system PM is forcing idle so that runtime PM
> can happen.

I agree with the view that system wide suspend is similar to force
idle in the case of a non-wakeup device. If you flip that around then
from a device driver perspective, system wide suspend on a device
which is a wakeup source looks like forcing enable.

This is how I see the system wide suspend including wakeup support:

1. activity
(In case of an ethernet device for instance, the network may be up or down)
2. save current state
3. prevent future activity, halt/wait for current activity
4. device is _known_ to be idle
5. if wakeup is enabled, force enable regardless of state in step 1 above
6. trigger device low-power transition (if possible)

For system wide resume:

1. wake up from low-power state (if needed)
2. if wakeup was enabled, force idle - similar to suspend step 3 above
3. device is _known_ to be idle
4. restore state saved in suspend step 2 above
5. activity
(Also, make sure no interrupts are lost)

The two roles for each wakeup-capable driver, and switching between
those adds quite a bit of complexity. The absolute best part is the
interrupt leakage between the wakeup state and the real state. Almost
impossible to get right. =)

/ magnus

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
  2011-06-09  5:29             ` Magnus Damm
@ 2011-06-09  5:29             ` Magnus Damm
  2011-06-09 13:56             ` Alan Stern
  2011-06-09 13:56             ` Alan Stern
  3 siblings, 0 replies; 118+ messages in thread
From: Magnus Damm @ 2011-06-09  5:29 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Alan Stern, Linux-pm mailing list, linux-omap,
	Paul Walmsley

Hi Kevin,

On Thu, Jun 9, 2011 at 7:50 AM, Kevin Hilman <khilman@ti.com> wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
>
> [...]
>
>> you need to separate the system suspend handling from runtime PM.
>
> /me risks getting yelled at again and jumps back in ;)

=)

>> Each of them requires a different approach, because the goal is really
>> different in both cases (basically, runtime PM triggers when the
>> device is _known_ to be idle, while system suspend may trigger while
>> it's actually being used).
>
> OK, but from the driver's perspective, the goals do not seem all that
> different to me:
>
> Runtime suspend
> 1. activity
> 2. activity finishes
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
>
> system suspend [echo mem > /sys/power/state]
> 1. activity
> 2. prevent future activity, halt/wait for current activity
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
>
> The only difference is step 2.  In runtime suspend, the activity
> finishes on its own, in system suspend, the activity is forcibly
> stopped.  In either case, after that point the device is known to be
> idle, and proceeding from there is identical.  IOW, based on the above,
> another way of looking at system PM is forcing idle so that runtime PM
> can happen.

I agree with the view that system wide suspend is similar to force
idle in the case of a non-wakeup device. If you flip that around then
from a device driver perspective, system wide suspend on a device
which is a wakeup source looks like forcing enable.

This is how I see the system wide suspend including wakeup support:

1. activity
(In case of an ethernet device for instance, the network may be up or down)
2. save current state
3. prevent future activity, halt/wait for current activity
4. device is _known_ to be idle
5. if wakeup is enabled, force enable regardless of state in step 1 above
6. trigger device low-power transition (if possible)

For system wide resume:

1. wake up from low-power state (if needed)
2. if wakeup was enabled, force idle - similar to suspend step 3 above
3. device is _known_ to be idle
4. restore state saved in suspend step 2 above
5. activity
(Also, make sure no interrupts are lost)

The two roles for each wakeup-capable driver, and switching between
those adds quite a bit of complexity. The absolute best part is the
interrupt leakage between the wakeup state and the real state. Almost
impossible to get right. =)

/ magnus

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
                               ` (2 preceding siblings ...)
  2011-06-09 13:56             ` Alan Stern
@ 2011-06-09 13:56             ` Alan Stern
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-09 13:56 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Wed, 8 Jun 2011, Kevin Hilman wrote:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [...]
> 
> > you need to separate the system suspend handling from runtime PM.
> 
> /me risks getting yelled at again and jumps back in ;)
> 
> > Each of them requires a different approach, because the goal is really
> > different in both cases (basically, runtime PM triggers when the
> > device is _known_ to be idle, while system suspend may trigger while
> > it's actually being used).
> 
> OK, but from the driver's perspective, the goals do not seem all that
> different to me:
> 
> Runtime suspend
> 1. activity
> 2. activity finishes
> 3. device is _known_ to be idle
> 4. trigger device low-power transition 
> 
> system suspend [echo mem > /sys/power/state]
> 1. activity
> 2. prevent future activity, halt/wait for current activity
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
> 
> The only difference is step 2.

That _is_ the main difference, and it's a big one.  (As Magnus pointed 
out, wakeup-enabling is another difference).

>  In runtime suspend, the activity
> finishes on its own, in system suspend, the activity is forcibly
> stopped.  In either case, after that point the device is known to be
> idle, and proceeding from there is identical.  IOW, based on the above,
> another way of looking at system PM is forcing idle so that runtime PM
> can happen.
> 
> I think it's important to note the similarities as well as the
> differences.  Maybe I'm still really blind here, but I cannot see how
> they can be completely decoupled.

They don't have to be decoupled, and indeed they can share code.  The 
point Rafael and I are making is that they have to use different 
callback pointers, which gives you a chance to handle the differences 
as well as the similarities.

> More specifically, what should be the approach in system suspend when a
> device is already runtime suspended?  If you treat runtime and system PM
> as completely independent, you would have to runtime resume the device
> so that it can then be immediately system suspended.

Assuming the wakeup setting is correct, and assuming you use the same 
power level for runtime suspend and system suspend, then nothing needs 
to be done.

If the wakeup setting is not correct, it has to be changed.  That 
often implies going back to full power in order to change the 
wakeup setting, then going to low power again.

This is all described in various files under Documentation/power/, in 
particular, devices.txt and runtime_pm.txt.

> For many (if not all) devices though, what I suspect we would want is
> for devices that are runtime suspended to stay runtime suspended across
> a system suspend *and* resume.  That would mean that the device power
> domain would not call system PM callbacks on devices that are runtime
> suspended.

No, it's generally agreed that _all_ devices should return to full 
power during system resume -- even if they were runtime suspended 
before the system sleep.  This also is explained in the Documentation 
files.

> At least for device power domains where the low-power state is identical
> between runtime suspend and system suspend, this makes a lot of sense
> (to me.)  Device power domain ->suspend might look something like this
> (obviously not tested):
> 
>     if (pm_runtime_suspended(dev))
>         return 0;

You could test priv->flags here instead.  But I suppose this would
work.

>     pm_generic_suspend(dev);       

No, you shouldn't call the PM core here.

>     magic_device_idle(dev); /* HW-specific, shared with runtime PM */

At this point you should call magic_device_set_low_power(dev).  That
routine can also be shared with runtime PM.

>     priv->flags |= MY_DEVICE_SYS_SUSPENDED;

I don't see any point in having separate flags for system suspend and 
runtime suspend.  Just use MY_DEVICE_IS_SUSPENDED, and set it in the 
magic_device_set_low_power routine.

> and ->resume():
> 
>    if (priv->flags & MY_DEVICE_SYS_SUSPENDED) {
>        magic_device_enable(dev);
>        pm_generic_resume(dev);    
>   }

This should simply be:

	magic_device_set_full_power(dev);  // clears MY_DEVICE_IS_SUSPENDED
	magic_device_enable(dev);

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
  2011-06-09  5:29             ` Magnus Damm
  2011-06-09  5:29             ` [linux-pm] " Magnus Damm
@ 2011-06-09 13:56             ` Alan Stern
  2011-06-10 14:36               ` Mark Brown
                                 ` (3 more replies)
  2011-06-09 13:56             ` Alan Stern
  3 siblings, 4 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-09 13:56 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Linux-pm mailing list, linux-omap,
	Magnus Damm, Paul Walmsley

On Wed, 8 Jun 2011, Kevin Hilman wrote:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [...]
> 
> > you need to separate the system suspend handling from runtime PM.
> 
> /me risks getting yelled at again and jumps back in ;)
> 
> > Each of them requires a different approach, because the goal is really
> > different in both cases (basically, runtime PM triggers when the
> > device is _known_ to be idle, while system suspend may trigger while
> > it's actually being used).
> 
> OK, but from the driver's perspective, the goals do not seem all that
> different to me:
> 
> Runtime suspend
> 1. activity
> 2. activity finishes
> 3. device is _known_ to be idle
> 4. trigger device low-power transition 
> 
> system suspend [echo mem > /sys/power/state]
> 1. activity
> 2. prevent future activity, halt/wait for current activity
> 3. device is _known_ to be idle
> 4. trigger device low-power transition
> 
> The only difference is step 2.

That _is_ the main difference, and it's a big one.  (As Magnus pointed 
out, wakeup-enabling is another difference).

>  In runtime suspend, the activity
> finishes on its own, in system suspend, the activity is forcibly
> stopped.  In either case, after that point the device is known to be
> idle, and proceeding from there is identical.  IOW, based on the above,
> another way of looking at system PM is forcing idle so that runtime PM
> can happen.
> 
> I think it's important to note the similarities as well as the
> differences.  Maybe I'm still really blind here, but I cannot see how
> they can be completely decoupled.

They don't have to be decoupled, and indeed they can share code.  The 
point Rafael and I are making is that they have to use different 
callback pointers, which gives you a chance to handle the differences 
as well as the similarities.

> More specifically, what should be the approach in system suspend when a
> device is already runtime suspended?  If you treat runtime and system PM
> as completely independent, you would have to runtime resume the device
> so that it can then be immediately system suspended.

Assuming the wakeup setting is correct, and assuming you use the same 
power level for runtime suspend and system suspend, then nothing needs 
to be done.

If the wakeup setting is not correct, it has to be changed.  That 
often implies going back to full power in order to change the 
wakeup setting, then going to low power again.

This is all described in various files under Documentation/power/, in 
particular, devices.txt and runtime_pm.txt.

> For many (if not all) devices though, what I suspect we would want is
> for devices that are runtime suspended to stay runtime suspended across
> a system suspend *and* resume.  That would mean that the device power
> domain would not call system PM callbacks on devices that are runtime
> suspended.

No, it's generally agreed that _all_ devices should return to full 
power during system resume -- even if they were runtime suspended 
before the system sleep.  This also is explained in the Documentation 
files.

> At least for device power domains where the low-power state is identical
> between runtime suspend and system suspend, this makes a lot of sense
> (to me.)  Device power domain ->suspend might look something like this
> (obviously not tested):
> 
>     if (pm_runtime_suspended(dev))
>         return 0;

You could test priv->flags here instead.  But I suppose this would
work.

>     pm_generic_suspend(dev);       

No, you shouldn't call the PM core here.

>     magic_device_idle(dev); /* HW-specific, shared with runtime PM */

At this point you should call magic_device_set_low_power(dev).  That
routine can also be shared with runtime PM.

>     priv->flags |= MY_DEVICE_SYS_SUSPENDED;

I don't see any point in having separate flags for system suspend and 
runtime suspend.  Just use MY_DEVICE_IS_SUSPENDED, and set it in the 
magic_device_set_low_power routine.

> and ->resume():
> 
>    if (priv->flags & MY_DEVICE_SYS_SUSPENDED) {
>        magic_device_enable(dev);
>        pm_generic_resume(dev);    
>   }

This should simply be:

	magic_device_set_full_power(dev);  // clears MY_DEVICE_IS_SUSPENDED
	magic_device_enable(dev);

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-09 13:56             ` Alan Stern
  2011-06-10 14:36               ` Mark Brown
@ 2011-06-10 14:36               ` Mark Brown
  2011-06-10 23:52               ` Kevin Hilman
  2011-06-10 23:52               ` [linux-pm] " Kevin Hilman
  3 siblings, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 14:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Thu, Jun 09, 2011 at 09:56:48AM -0400, Alan Stern wrote:
> On Wed, 8 Jun 2011, Kevin Hilman wrote:

> > Runtime suspend
> > 2. activity finishes

> > system suspend [echo mem > /sys/power/state]
> > 2. prevent future activity, halt/wait for current activity

> > The only difference is step 2.

> That _is_ the main difference, and it's a big one.  (As Magnus pointed 
> out, wakeup-enabling is another difference).

...

> They don't have to be decoupled, and indeed they can share code.  The 
> point Rafael and I are making is that they have to use different 
> callback pointers, which gives you a chance to handle the differences 
> as well as the similarities.

It seems like everyone's agreeing with each other here - from the user
side the request seems to be largely for core infrastructure like
UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
that it doesn't do anything to handle the runtime/system interaction?).
Right now this all feels like more work than it should be in simpler
drivers.

> > For many (if not all) devices though, what I suspect we would want is
> > for devices that are runtime suspended to stay runtime suspended across
> > a system suspend *and* resume.  That would mean that the device power
> > domain would not call system PM callbacks on devices that are runtime
> > suspended.

> No, it's generally agreed that _all_ devices should return to full 
> power during system resume -- even if they were runtime suspended 
> before the system sleep.  This also is explained in the Documentation 
> files.

What is the reasoning behind this agreement?  It's not immediately
obvious why this is useful.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-09 13:56             ` Alan Stern
@ 2011-06-10 14:36               ` Mark Brown
  2011-06-10 14:51                 ` Alan Stern
                                   ` (3 more replies)
  2011-06-10 14:36               ` Mark Brown
                                 ` (2 subsequent siblings)
  3 siblings, 4 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 14:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kevin Hilman, Linux-pm mailing list, linux-omap

On Thu, Jun 09, 2011 at 09:56:48AM -0400, Alan Stern wrote:
> On Wed, 8 Jun 2011, Kevin Hilman wrote:

> > Runtime suspend
> > 2. activity finishes

> > system suspend [echo mem > /sys/power/state]
> > 2. prevent future activity, halt/wait for current activity

> > The only difference is step 2.

> That _is_ the main difference, and it's a big one.  (As Magnus pointed 
> out, wakeup-enabling is another difference).

...

> They don't have to be decoupled, and indeed they can share code.  The 
> point Rafael and I are making is that they have to use different 
> callback pointers, which gives you a chance to handle the differences 
> as well as the similarities.

It seems like everyone's agreeing with each other here - from the user
side the request seems to be largely for core infrastructure like
UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
that it doesn't do anything to handle the runtime/system interaction?).
Right now this all feels like more work than it should be in simpler
drivers.

> > For many (if not all) devices though, what I suspect we would want is
> > for devices that are runtime suspended to stay runtime suspended across
> > a system suspend *and* resume.  That would mean that the device power
> > domain would not call system PM callbacks on devices that are runtime
> > suspended.

> No, it's generally agreed that _all_ devices should return to full 
> power during system resume -- even if they were runtime suspended 
> before the system sleep.  This also is explained in the Documentation 
> files.

What is the reasoning behind this agreement?  It's not immediately
obvious why this is useful.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 14:36               ` Mark Brown
@ 2011-06-10 14:51                 ` Alan Stern
  2011-06-10 14:51                 ` [linux-pm] " Alan Stern
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 14:51 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> > No, it's generally agreed that _all_ devices should return to full 
> > power during system resume -- even if they were runtime suspended 
> > before the system sleep.  This also is explained in the Documentation 
> > files.
> 
> What is the reasoning behind this agreement?  It's not immediately
> obvious why this is useful.

See section 6 of Documentation/power/runtime_pm.txt.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 14:36               ` Mark Brown
  2011-06-10 14:51                 ` Alan Stern
@ 2011-06-10 14:51                 ` Alan Stern
  2011-06-10 15:21                   ` Mark Brown
  2011-06-10 15:21                   ` Mark Brown
  2011-06-10 18:49                 ` [linux-pm] " Rafael J. Wysocki
  2011-06-10 18:49                 ` Rafael J. Wysocki
  3 siblings, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 14:51 UTC (permalink / raw)
  To: Mark Brown; +Cc: Kevin Hilman, Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> > No, it's generally agreed that _all_ devices should return to full 
> > power during system resume -- even if they were runtime suspended 
> > before the system sleep.  This also is explained in the Documentation 
> > files.
> 
> What is the reasoning behind this agreement?  It's not immediately
> obvious why this is useful.

See section 6 of Documentation/power/runtime_pm.txt.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 14:51                 ` [linux-pm] " Alan Stern
  2011-06-10 15:21                   ` Mark Brown
@ 2011-06-10 15:21                   ` Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 15:21 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 10:51:02AM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > > No, it's generally agreed that _all_ devices should return to full 
> > > power during system resume -- even if they were runtime suspended 
> > > before the system sleep.  This also is explained in the Documentation 
> > > files.

> > What is the reasoning behind this agreement?  It's not immediately
> > obvious why this is useful.

> See section 6 of Documentation/power/runtime_pm.txt.

It's not massively clear to me how much sense that makes for the
embedded case where we've got a much better idea of what happened to the
hardware over suspend.  Note that I'm thinking here mostly of the case
where we've runtime suspended the device; if the kernel thought the
device was powered then it's much clearer that we should do this on
resume.

  * The device's children may need the device to be at full power in
    order to resume themselves.

Right, this does need to be handled - I'd expect that in most situations
we'd have sorted through it before we ever enter suspend.

  * The device might need to switch power levels, wake-up settings, etc.
  * Remote wake-up events might have been lost by the firmware.
  * The driver's idea of the device state may not agree with the device's
    physical state.  This can happen during resume from hibernation.
  * The device might need to be reset.

This is all much more under control in the embedded case, and of course
the device does know if it's coming back from suspend or hibernation
IIRC which seems to be the only difficult case.

  * Even though the device was suspended, if its usage counter was > 0 then most
    likely it would need a run-time resume in the near future anyway.

Presumably in that case it wouldn't be runtime suspended anyway, or we'd
otherwise be able to cope with the situation without actually fully
powering the device?

  * Always going back to full power is simplest.

It's simple, but spending time and power rewriting large numbers of
registers over a slow bus (which is the sort of thing you end up doing
with some devices) or fiddling about with analogue bringup for devices
affected by that only to immediately power it off again isn't terribly
useful either - it's at best slow.

What I'd have expected resume to do is to bring the system back to the
state it was in before we entered suspend.  I've probably got a fairly
odd view of the world here in that I mostly care about big devices
connected over slow buses where power up can be user visible, and mostly
work on subsystems where the concept of "full power" isn't terribly
meaningful.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 14:51                 ` [linux-pm] " Alan Stern
@ 2011-06-10 15:21                   ` Mark Brown
  2011-06-10 15:45                     ` Alan Stern
  2011-06-10 15:45                     ` Alan Stern
  2011-06-10 15:21                   ` Mark Brown
  1 sibling, 2 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 15:21 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 10:51:02AM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > > No, it's generally agreed that _all_ devices should return to full 
> > > power during system resume -- even if they were runtime suspended 
> > > before the system sleep.  This also is explained in the Documentation 
> > > files.

> > What is the reasoning behind this agreement?  It's not immediately
> > obvious why this is useful.

> See section 6 of Documentation/power/runtime_pm.txt.

It's not massively clear to me how much sense that makes for the
embedded case where we've got a much better idea of what happened to the
hardware over suspend.  Note that I'm thinking here mostly of the case
where we've runtime suspended the device; if the kernel thought the
device was powered then it's much clearer that we should do this on
resume.

  * The device's children may need the device to be at full power in
    order to resume themselves.

Right, this does need to be handled - I'd expect that in most situations
we'd have sorted through it before we ever enter suspend.

  * The device might need to switch power levels, wake-up settings, etc.
  * Remote wake-up events might have been lost by the firmware.
  * The driver's idea of the device state may not agree with the device's
    physical state.  This can happen during resume from hibernation.
  * The device might need to be reset.

This is all much more under control in the embedded case, and of course
the device does know if it's coming back from suspend or hibernation
IIRC which seems to be the only difficult case.

  * Even though the device was suspended, if its usage counter was > 0 then most
    likely it would need a run-time resume in the near future anyway.

Presumably in that case it wouldn't be runtime suspended anyway, or we'd
otherwise be able to cope with the situation without actually fully
powering the device?

  * Always going back to full power is simplest.

It's simple, but spending time and power rewriting large numbers of
registers over a slow bus (which is the sort of thing you end up doing
with some devices) or fiddling about with analogue bringup for devices
affected by that only to immediately power it off again isn't terribly
useful either - it's at best slow.

What I'd have expected resume to do is to bring the system back to the
state it was in before we entered suspend.  I've probably got a fairly
odd view of the world here in that I mostly care about big devices
connected over slow buses where power up can be user visible, and mostly
work on subsystems where the concept of "full power" isn't terribly
meaningful.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 15:21                   ` Mark Brown
  2011-06-10 15:45                     ` Alan Stern
@ 2011-06-10 15:45                     ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 15:45 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> On Fri, Jun 10, 2011 at 10:51:02AM -0400, Alan Stern wrote:
> > On Fri, 10 Jun 2011, Mark Brown wrote:
> 
> > > > No, it's generally agreed that _all_ devices should return to full 
> > > > power during system resume -- even if they were runtime suspended 
> > > > before the system sleep.  This also is explained in the Documentation 
> > > > files.
> 
> > > What is the reasoning behind this agreement?  It's not immediately
> > > obvious why this is useful.
> 
> > See section 6 of Documentation/power/runtime_pm.txt.
> 
> It's not massively clear to me how much sense that makes for the
> embedded case where we've got a much better idea of what happened to the
> hardware over suspend.  Note that I'm thinking here mostly of the case
> where we've runtime suspended the device, if the kernel thought the
> device was powered then it's much clearer that we should do this on
> resume.

Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
device in a low-power state, it can do so.  Just bear in mind that the
PM core's idea of the device's runtime power state may end up not
matching reality unless you're careful.

>   * The device's children may need the device to be at full power in
>     order to resume themselves.
> 
> Right, this does need to be handled - I'd expect that in most situations
> we'd have sorted through it before we ever enter suspend.

Maybe; it depends on the situation.  In the embedded world you're 
likely to have this under better control than in the desktop world.

>   * The device might need to switch power levels, wake-up settings, etc.
>   * Remote wake-up events might have been lost by the firmware.
>   * The driver's idea of the device state may not agree with the device's
>     physical state.  This can happen during resume from hibernation.
>   * The device might need to be reset.
> 
> This is all much more under control in the embedded case, and of course
> the device does know if it's coming back from suspend or hibernation
> IIRC which seems to be the only difficult case.

Yes; you could bring the device to full power during resume from 
hibernation and leave it at low power during resume from system 
suspend.
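
As a rough illustration of that split, here is a user-space C model, not
code from any real driver.  It gives the two resume paths separate
callbacks, in the same spirit as the distinct .resume and .restore
entries in struct dev_pm_ops.  The names model_dev, model_resume and
model_restore are hypothetical, and the sketch assumes the hardware
keeps its state across system suspend but not across hibernation.

```c
#include <stdbool.h>

/* Hypothetical user-space model of a device; not a kernel structure. */
struct model_dev {
	bool powered;            /* modelled physical power state */
	bool runtime_suspended;  /* what the driver believes about runtime PM */
};

/*
 * Resume from system suspend: hardware state is assumed to be preserved,
 * so a device that was runtime suspended beforehand stays at low power.
 */
void model_resume(struct model_dev *d)
{
	if (!d->runtime_suspended)
		d->powered = true;
}

/*
 * Restore from hibernation: the driver's view may not match the hardware,
 * so always return to full power and resynchronise the recorded state.
 */
void model_restore(struct model_dev *d)
{
	d->powered = true;
	d->runtime_suspended = false;
}
```

In a real driver the recorded runtime PM status would also have to be
kept in step with the PM core, which this model leaves out.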

>   * Even though the device was suspended, if its usage counter was > 0 then most
>     likely it would need a run-time resume in the near future anyway.
> 
> Presumably in that case it wouldn't be runtime suspended anyway, or we'd
> otherwise be able to cope with the situation without actually fully
> powering the device?

Again, this depends on the situation.  It probably won't come up in 
your case.

>   * Always going back to full power is simplest.
> 
> It's simple, but spending time and power rewriting large numbers of
> registers over a slow bus (which is the sort of thing you end up doing
> with some devices) or fiddling about with analogue bringup for devices
> affected by that only to immediately power it off again isn't terribly
> useful either - it's at best slow.
> 
> What I'd have expected resume to do is to bring the system back to the
> state it was in before we entered suspend.

That would indeed be the most logical approach.  But it turns out not 
to be entirely feasible for various odd reasons, which I don't remember 
at the moment.

Of course, even if all devices do get turned on during resume, one 
would expect the normal runtime PM mechanism to power them down again 
very shortly after the resume is finished.

>  I've probably got a fairly
> odd view of the world here in that I mostly care about big devices
> connected over slow buses where power up can be user visible, and mostly
> work on subsystems where the concept of "full power" isn't terribly
> meaningful.

The PM core tries to be flexible, but there are limitations.  In
particular, it's not very well suited for handling devices with
multiple power levels.  Drivers simply have to do the best they can to
fit in with the PM core's power model.  For example, all functional
states might be considered "full power" and all others might be
considered "low power".  Coordinating all the extra details would then
be the driver's responsibility.
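
A minimal sketch of that mapping, assuming a device with four made-up
power levels (the enum and function names are hypothetical and do not
come from any kernel header):

```c
#include <stdbool.h>

/* Hypothetical device power levels; a real device defines its own. */
enum model_level {
	LEVEL_OFF,        /* no power, no state retained */
	LEVEL_RETENTION,  /* state retained, not functional */
	LEVEL_REDUCED,    /* functional at reduced performance */
	LEVEL_ACTIVE,     /* fully functional */
};

/*
 * Collapse the device's levels into the two-state view described above:
 * every functional level is reported as "full power", everything else
 * as "low power".
 */
bool model_is_full_power(enum model_level level)
{
	return level == LEVEL_REDUCED || level == LEVEL_ACTIVE;
}
```

Choosing between LEVEL_REDUCED and LEVEL_ACTIVE, or between LEVEL_OFF
and LEVEL_RETENTION, would remain the driver's own business in such a
scheme.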

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 15:21                   ` Mark Brown
@ 2011-06-10 15:45                     ` Alan Stern
  2011-06-10 15:57                       ` Mark Brown
  2011-06-10 15:57                       ` Mark Brown
  2011-06-10 15:45                     ` Alan Stern
  1 sibling, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 15:45 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> On Fri, Jun 10, 2011 at 10:51:02AM -0400, Alan Stern wrote:
> > On Fri, 10 Jun 2011, Mark Brown wrote:
> 
> > > > No, it's generally agreed that _all_ devices should return to full 
> > > > power during system resume -- even if they were runtime suspended 
> > > > before the system sleep.  This also is explained in the Documentation 
> > > > files.
> 
> > > What is the reasoning behind this agreement?  It's not immediately
> > > obvious why this is useful.
> 
> > See section 6 of Documentation/power/runtime_pm.txt.
> 
> It's not massively clear to me how much sense that makes for the
> embedded case where we've got a much better idea of what happened to the
> hardware over suspend.  Note that I'm thinking here mostly of the case
> where we've runtime suspended the device, if the kernel thought the
> device was powered then it's much clearer that we should do this on
> resume.

Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
device in a low-power state, it can do so.  Just bear in mind that the
PM core's idea of the device's runtime power state may end up not
matching reality unless you're careful.

>   * The device's children may need the device to be at full power in
>     order to resume themselves.
> 
> Right, this does need to be handled - I'd expect that in most situations
> we'd have sorted through it before we ever enter suspend.

Maybe; it depends on the situation.  In the embedded world you're 
likely to have this under better control than in the desktop world.

>   * The device might need to switch power levels, wake-up settings, etc.
>   * Remote wake-up events might have been lost by the firmware.
>   * The driver's idea of the device state may not agree with the device's
>     physical state.  This can happen during resume from hibernation.
>   * The device might need to be reset.
> 
> This is all much more under control in the embedded case, and of course
> the device does know if it's coming back from suspend or hibernation
> IIRC which seems to be the only difficult case.

Yes; you could bring the device to full power during resume from 
hibernation and leave it at low power during resume from system 
suspend.

>   * Even though the device was suspended, if its usage counter was > 0 then most
>     likely it would need a run-time resume in the near future anyway.
> 
> Presumably in that case it wouldn't be runtime suspended anyway, or we'd
> otherwise be able to cope with the situation without actually fully
> powering the device?

Again, this depends on the situation.  It probably won't come up in 
your case.

>   * Always going back to full power is simplest.
> 
> It's simple, but spending time and power rewriting large numbers of
> registers over a slow bus (which is the sort of thing you end up doing
> with some devices) or fiddling about with analogue bringup for devices
> affected by that only to immediately power it off again isn't terribly
> useful either - it's at best slow.
> 
> What I'd have expected resume to do is to bring the system back to the
> state it was in before we entered suspend.

That would indeed be the most logical approach.  But it turns out not 
to be entirely feasible for various odd reasons, which I don't remember 
at the moment.

Of course, even if all devices do get turned on during resume, one 
would expect the normal runtime PM mechanism to power them down again 
very shortly after the resume is finished.

>  I've probably got a fairly
> odd view of the world here in that I mostly care about big devices
> connected over slow buses where power up can be user visible, and mostly
> work on subsystems where the concept of "full power" isn't terribly
> meaningful.

The PM core tries to be flexible, but there are limitations.  In
particular, it's not very well suited for handling devices with
multiple power levels.  Drivers simply have to do the best they can to
fit in with the PM core's power model.  For example, all functional
states might be considered "full power" and all others might be
considered "low power".  Coordinating all the extra details would then
be the driver's responsibility.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 15:45                     ` Alan Stern
  2011-06-10 15:57                       ` Mark Brown
@ 2011-06-10 15:57                       ` Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 15:57 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 11:45:30AM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > It's not massively clear to me how much sense that makes for the
> > embedded case where we've got a much better idea of what happened to the
> > hardware over suspend.  Note that I'm thinking here mostly of the case
> > where we've runtime suspended the device, if the kernel thought the
> > device was powered then it's much clearer that we should do this on
> > resume.

> Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
> device in a low-power state, it can do so.  Just bear in mind that the
> PM core's idea of the device's runtime power state may end up not
> matching reality unless you're careful.

This is part of the trouble, it all feels like a lot more work than it
should be for relatively common cases.  In the audio case we're fine as
the subsystem implements a completely independent PM infrastructure
which ignores the PM core except for system suspend (and sometimes
ignores that), it's noticeably harder to reason about what's going on
when I go outside there and when I think about what I'm doing it always
feels like it should be possible to factor it out of the drivers.

> Of course, even if all devices do get turned on during resume, one 
> would expect the normal runtime PM mechanism to power them down again 
> very shortly after the resume is finished.

Yeah, though of course if you're only going to be resumed for a very
brief time anyway the amount of time you spend powering up and down
suddenly gets a lot more interesting.  Things like responding to a
keepalive from the network can be done quickly enough that people get
annoyed if you burn 10ms or whatever powering up some irrelevant bit of
hardware.

> >  I've probably got a fairly
> > odd view of the world here in that I mostly care about big devices
> > connected over slow buses where power up can be user visible, and mostly
> > work on subsystems where the concept of "full power" isn't terribly
> > meaningful.
> 
> The PM core tries to be flexible, but there are limitations.  In
> particular, it's not very well suited for handling devices with
> multiple power levels.  Drivers simply have to do the best they can to
> fit in with the PM core's power model.  For example, all functional
> states might be considered "full power" and all others might be
> considered "low power".  Coordinating all the extra details would then
> be the driver's responsibility.

Right, I mostly agree - like I say I think the main thing that would be
useful here is to extend the helpers for the common case unified suspend
situation.  This may be totally separate to what Kevin needs, though.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 15:45                     ` Alan Stern
@ 2011-06-10 15:57                       ` Mark Brown
  2011-06-10 17:17                         ` Alan Stern
  2011-06-10 17:17                         ` Alan Stern
  2011-06-10 15:57                       ` Mark Brown
  1 sibling, 2 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 15:57 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 11:45:30AM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > It's not massively clear to me how much sense that makes for the
> > embedded case where we've got a much better idea of what happened to the
> > hardware over suspend.  Note that I'm thinking here mostly of the case
> > where we've runtime suspended the device, if the kernel thought the
> > device was powered then it's much clearer that we should do this on
> > resume.

> Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
> device in a low-power state, it can do so.  Just bear in mind that the
> PM core's idea of the device's runtime power state may end up not
> matching reality unless you're careful.

This is part of the trouble, it all feels like a lot more work than it
should be for relatively common cases.  In the audio case we're fine as
the subsystem implements a completely independent PM infrastructure
which ignores the PM core except for system suspend (and sometimes
ignores that), it's noticeably harder to reason about what's going on
when I go outside there and when I think about what I'm doing it always
feels like it should be possible to factor it out of the drivers.

> Of course, even if all devices do get turned on during resume, one 
> would expect the normal runtime PM mechanism to power them down again 
> very shortly after the resume is finished.

Yeah, though of course if you're only going to be resumed for a very
brief time anyway the amount of time you spend powering up and down
suddenly gets a lot more interesting.  Things like responding to a
keepalive from the network can be done quickly enough that people get
annoyed if you burn 10ms or whatever powering up some irrelevant bit of
hardware.

> >  I've probably got a fairly
> > odd view of the world here in that I mostly care about big devices
> > connected over slow buses where power up can be user visible, and mostly
> > work on subsystems where the concept of "full power" isn't terribly
> > meaningful.
> 
> The PM core tries to be flexible, but there are limitations.  In
> particular, it's not very well suited for handling devices with
> multiple power levels.  Drivers simply have to do the best they can to
> fit in with the PM core's power model.  For example, all functional
> states might be considered "full power" and all others might be
> considered "low power".  Coordinating all the extra details would then
> be the driver's responsibility.

Right, I mostly agree - like I say I think the main thing that would be
useful here is to extend the helpers for the common case unified suspend
situation.  This may be totally separate to what Kevin needs, though.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 15:57                       ` Mark Brown
  2011-06-10 17:17                         ` Alan Stern
@ 2011-06-10 17:17                         ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 17:17 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> > Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
> > device in a low-power state, it can do so.  Just bear in mind that the
> > PM core's idea of the device's runtime power state may end up not
> > matching reality unless you're careful.
> 
> This is part of the trouble, it all feels like a lot more work than it
> should be for relatively common cases.  In the audio case we're fine as
> the subsystem implements a completely independent PM infrastructure
> which ignores the PM core except for system suspend (and sometimes
> ignores that), it's noticeably harder to reason about what's going on
> when I go outside there and when I think about what I'm doing it always
> feels like it should be possible to factor it out of the drivers.

What would make the common cases easier?

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 15:57                       ` Mark Brown
@ 2011-06-10 17:17                         ` Alan Stern
  2011-06-10 17:31                           ` Mark Brown
  2011-06-10 17:31                           ` [linux-pm] " Mark Brown
  2011-06-10 17:17                         ` Alan Stern
  1 sibling, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 17:17 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Mark Brown wrote:

> > Well, this is a SHOULD, not a MUST.  If you want your driver to leave a
> > device in a low-power state, it can do so.  Just bear in mind that the
> > PM core's idea of the device's runtime power state may end up not
> > matching reality unless you're careful.
> 
> This is part of the trouble, it all feels like a lot more work than it
> should be for relatively common cases.  In the audio case we're fine as
> the subsystem implements a completely independent PM infrastructure
> which ignores the PM core except for system suspend (and sometimes
> ignores that), it's noticeably harder to reason about what's going on
> when I go outside there and when I think about what I'm doing it always
> feels like it should be possible to factor it out of the drivers.

What would make the common cases easier?

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 17:17                         ` Alan Stern
@ 2011-06-10 17:31                           ` Mark Brown
  2011-06-10 17:31                           ` [linux-pm] " Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 17:31 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 01:17:56PM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > ignores that), it's noticeably harder to reason about what's going on
> > when I go outside there and when I think about what I'm doing it always
> > feels like it should be possible to factor it out of the drivers.

> What would make the common cases easier?

I think from an interface point of view it's something like
UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
that can do the glue bits like enabling wakeup and quiescing activity.
I'd need to think harder about what exactly that'd look like - for my
cases the fundamental thing I want to say is that there's one suspend
routine and one resume routine and I'd like some framework code to work
out when they're called.
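
To make that a bit more concrete, here is a user-space C model of the
idea.  It is only a sketch: every name in it is hypothetical, and it is
not what UNIVERSAL_DEV_PM_OPS() does today (that macro just installs
the same callbacks in both the system sleep and runtime PM slots).  The
model assumes the framework tracks whether the device was already
runtime suspended, skips the suspend routine in that case, and on
system resume only undoes what system suspend itself did.  That
deliberately departs from the "everything back to full power"
recommendation discussed earlier in the thread.

```c
#include <stdbool.h>

/* Hypothetical model of a device with one suspend and one resume routine. */
struct uni_dev {
	bool runtime_suspended;    /* already suspended by runtime PM? */
	bool suspended_by_system;  /* did system suspend power it down? */
	int suspend_calls;         /* how often the driver routine ran */
	int resume_calls;
};

/* The driver's single pair of routines; they only count calls here. */
void uni_suspend(struct uni_dev *d) { d->suspend_calls++; }
void uni_resume(struct uni_dev *d)  { d->resume_calls++; }

/* Framework glue: runtime PM transitions. */
void framework_runtime_suspend(struct uni_dev *d)
{
	if (!d->runtime_suspended) {
		uni_suspend(d);
		d->runtime_suspended = true;
	}
}

void framework_runtime_resume(struct uni_dev *d)
{
	if (d->runtime_suspended) {
		uni_resume(d);
		d->runtime_suspended = false;
	}
}

/* Framework glue: system sleep only touches devices that are still active. */
void framework_system_suspend(struct uni_dev *d)
{
	d->suspended_by_system = !d->runtime_suspended;
	if (d->suspended_by_system)
		uni_suspend(d);
}

/* Restore the pre-suspend state rather than forcing full power. */
void framework_system_resume(struct uni_dev *d)
{
	if (d->suspended_by_system) {
		uni_resume(d);
		d->suspended_by_system = false;
	}
}
```

With this model a device that was runtime suspended before the system
sleep sees no extra suspend or resume call across the sleep cycle.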

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 17:17                         ` Alan Stern
  2011-06-10 17:31                           ` Mark Brown
@ 2011-06-10 17:31                           ` Mark Brown
  2011-06-10 18:38                             ` Rafael J. Wysocki
  2011-06-10 18:38                             ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 17:31 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Fri, Jun 10, 2011 at 01:17:56PM -0400, Alan Stern wrote:
> On Fri, 10 Jun 2011, Mark Brown wrote:

> > ignores that), it's noticeably harder to reason about what's going on
> > when I go outside there and when I think about what I'm doing it always
> > feels like it should be possible to factor it out of the drivers.

> What would make the common cases easier?

I think from an interface point of view it's something like
UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
that can do the glue bits like enabling wakeup and quiescing activity.
I'd need to think harder about what exactly that'd look like - for my
cases the fundamental thing I want to say is that there's one suspend
routine and one resume routine and I'd like some framework code to work
out when they're called.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 17:31                           ` [linux-pm] " Mark Brown
  2011-06-10 18:38                             ` Rafael J. Wysocki
@ 2011-06-10 18:38                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 18:38 UTC (permalink / raw)
  To: linux-pm; +Cc: linux-omap, Mark Brown

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 01:17:56PM -0400, Alan Stern wrote:
> > On Fri, 10 Jun 2011, Mark Brown wrote:
> 
> > > ignores that), it's noticeably harder to reason about what's going on
> > > when I go outside there and when I think about what I'm doing it always
> > > feels like it should be possible to factor it out of the drivers.
> 
> > What would make the common cases easier?
> 
> I think from an interface point of view it's something like
> UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> that can do the glue bits like enabling wakeup and quiescing activity.
> I'd need to think harder about what exactly that'd look like - for my
> cases the fundamental thing I want to say is that there's one suspend
> routine and one resume routine and I'd like some framework code to work
> out when they're called.

Can your device generate wakeup signals?

Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 17:31                           ` [linux-pm] " Mark Brown
@ 2011-06-10 18:38                             ` Rafael J. Wysocki
  2011-06-10 18:42                               ` Mark Brown
  2011-06-10 18:42                               ` [linux-pm] " Mark Brown
  2011-06-10 18:38                             ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 18:38 UTC (permalink / raw)
  To: linux-pm; +Cc: Mark Brown, Alan Stern, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 01:17:56PM -0400, Alan Stern wrote:
> > On Fri, 10 Jun 2011, Mark Brown wrote:
> 
> > > ignores that), it's noticeably harder to reason about what's going on
> > > when I go outside there and when I think about what I'm doing it always
> > > feels like it should be possible to factor it out of the drivers.
> 
> > What would make the common cases easier?
> 
> I think from an interface point of view it's something like
> UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> that can do the glue bits like enabling wakeup and quiescing activity.
> I'd need to think harder about what exactly that'd look like - for my
> cases the fundamental thing I want to say is that there's one suspend
> routine and one resume routine and I'd like some framework code to work
> out when they're called.

Can your device generate wakeup signals?

Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 18:38                             ` Rafael J. Wysocki
@ 2011-06-10 18:42                               ` Mark Brown
  2011-06-10 18:42                               ` [linux-pm] " Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 18:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Fri, Jun 10, 2011 at 08:38:22PM +0200, Rafael J. Wysocki wrote:
> On Friday, June 10, 2011, Mark Brown wrote:

> > I think from an interface point of view it's something like
> > UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> > that can do the glue bits like enabling wakeup and quiescing activity.
> > I'd need to think harder about what exactly that'd look like - for my
> > cases the fundamental thing I want to say is that there's one suspend
> > routine and one resume routine and I'd like some framework code to work
> > out when they're called.

> Can your device generate wakeup signals?

I am interested in a fairly large selection of devices but broadly
speaking any off-SoC device can generate wakeups if it can generate
interrupts.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 18:38                             ` Rafael J. Wysocki
  2011-06-10 18:42                               ` Mark Brown
@ 2011-06-10 18:42                               ` Mark Brown
  2011-06-10 20:27                                 ` Rafael J. Wysocki
  2011-06-10 20:27                                 ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 18:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Alan Stern, linux-omap

On Fri, Jun 10, 2011 at 08:38:22PM +0200, Rafael J. Wysocki wrote:
> On Friday, June 10, 2011, Mark Brown wrote:

> > I think from an interface point of view it's something like
> > UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> > that can do the glue bits like enabling wakeup and quiescing activity.
> > I'd need to think harder about what exactly that'd look like - for my
> > cases the fundamental thing I want to say is that there's one suspend
> > routine and one resume routine and I'd like some framework code to work
> > out when they're called.

> Can your device generate wakeup signals?

I am interested in a fairly large selection of devices but broadly
speaking any off-SoC device can generate wakeups if it can generate
interrupts.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 14:36               ` Mark Brown
                                   ` (2 preceding siblings ...)
  2011-06-10 18:49                 ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-10 18:49                 ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 18:49 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Thu, Jun 09, 2011 at 09:56:48AM -0400, Alan Stern wrote:
> > On Wed, 8 Jun 2011, Kevin Hilman wrote:
> 
> > > Runtime suspend
> > > 2. activity finishes
> 
> > > system suspend [echo mem > /sys/power/state]
> > > 2. prevent future activity, halt/wait for current activity
> 
> > > The only difference is step 2.
> 
> > That _is_ the main difference, and it's a big one.  (As Magnus pointed 
> > out, wakeup-enabling is another difference).
> 
> ...
> 
> > They don't have to be decoupled, and indeed they can share code.  The 
> > point Rafael and I are making is that they have to use different 
> > callback pointers, which gives you a chance to handle the differences 
> > as well as the similarities.
> 
> It seems like everyone's agreeing with each other here - from the user
> side the request seems to be largely for core infrastructure like
> UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> that it doesn't do anything to handle the runtime/system interaction?).

I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
does what it says, defines a set of operations to use for system suspend,
hibernation and runtime PM all the same.

> Right now this all feels like more work than it should be in simpler
> drivers.

My impression is that you and Kevin are talking of different things.

Kevin's original point was that it might be desirable to do things
like calling pm_runtime_suspend() from a driver's (system) .suspend()
callback, if I understood it correctly, and the answer was that it wasn't
the right thing to do (for reasons given elsewhere in the thread).

Your point seems to be that simple drivers should not be required to
define separate callback routines, for example, for system suspend and
runtime PM.  However, they aren't required to do so, they can point
all of their "suspend" callback pointers to the same routine, which is
what the UNIVERSAL_DEV_PM_OPS() macro does.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 14:36               ` Mark Brown
  2011-06-10 14:51                 ` Alan Stern
  2011-06-10 14:51                 ` [linux-pm] " Alan Stern
@ 2011-06-10 18:49                 ` Rafael J. Wysocki
  2011-06-10 18:54                   ` Mark Brown
  2011-06-10 18:54                   ` Mark Brown
  2011-06-10 18:49                 ` Rafael J. Wysocki
  3 siblings, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 18:49 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, Alan Stern, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Thu, Jun 09, 2011 at 09:56:48AM -0400, Alan Stern wrote:
> > On Wed, 8 Jun 2011, Kevin Hilman wrote:
> 
> > > Runtime suspend
> > > 2. activity finishes
> 
> > > system suspend [echo mem > /sys/power/state]
> > > 2. prevent future activity, halt/wait for current activity
> 
> > > The only difference is step 2.
> 
> > That _is_ the main difference, and it's a big one.  (As Magnus pointed 
> > out, wakeup-enabling is another difference).
> 
> ...
> 
> > They don't have to be decoupled, and indeed they can share code.  The 
> > point Rafael and I are making is that they have to use different 
> > callback pointers, which gives you a chance to handle the differences 
> > as well as the similarities.
> 
> It seems like everyone's agreeing with each other here - from the user
> side the request seems to be largely for core infrastructure like
> UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> that it doesn't do anything to handle the runtime/system interaction?).

I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
does what it says, defines a set of operations to use for system suspend,
hibernation and runtime PM all the same.

> Right now this all feels like more work than it should be in simpler
> drivers.

My impression is that you and Kevin are talking of different things.

Kevin's original point was that it might be desirable to do things
like calling pm_runtime_suspend() from a driver's (system) .suspend()
callback, if I understood it correctly, and the answer was that it wasn't
the right thing to do (for reasons given elsewhere in the thread).

Your point seems to be that simple drivers should not be required to
define separate callback routines, for example, for system suspend and
runtime PM.  However, they aren't required to do so, they can point
all of their "suspend" callback pointers to the same routine, which is
what the UNIVERSAL_DEV_PM_OPS() macro does.
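
To make the "same routine behind every pointer" idea concrete, here is a
minimal userspace sketch.  It is a model only, not kernel code: struct
model_device, struct model_pm_ops, MODEL_UNIVERSAL_PM_OPS() and the foo_*
routines are hypothetical names made up for illustration, and the member
list simply mirrors the callback names used in this thread.

```c
#include <stddef.h>

/* Userspace model only: a cut-down stand-in for a device. */
struct model_device {
	int suspended;	/* 1 while the pretend hardware is powered down */
};

/* Cut-down stand-in for a PM operations table (hypothetical type). */
struct model_pm_ops {
	int (*suspend)(struct model_device *dev);
	int (*resume)(struct model_device *dev);
	int (*freeze)(struct model_device *dev);
	int (*thaw)(struct model_device *dev);
	int (*poweroff)(struct model_device *dev);
	int (*restore)(struct model_device *dev);
	int (*runtime_suspend)(struct model_device *dev);
	int (*runtime_resume)(struct model_device *dev);
	int (*runtime_idle)(struct model_device *dev);
};

/* One suspend routine and one resume routine fill every matching slot. */
#define MODEL_UNIVERSAL_PM_OPS(name, suspend_fn, resume_fn, idle_fn) \
const struct model_pm_ops name = { \
	.suspend = suspend_fn, \
	.resume = resume_fn, \
	.freeze = suspend_fn, \
	.thaw = resume_fn, \
	.poweroff = suspend_fn, \
	.restore = resume_fn, \
	.runtime_suspend = suspend_fn, \
	.runtime_resume = resume_fn, \
	.runtime_idle = idle_fn, \
}

/* Hypothetical driver routines: the only PM code the driver writes. */
static int foo_suspend(struct model_device *dev)
{
	dev->suspended = 1;
	return 0;
}

static int foo_resume(struct model_device *dev)
{
	dev->suspended = 0;
	return 0;
}

MODEL_UNIVERSAL_PM_OPS(foo_pm_ops, foo_suspend, foo_resume, NULL);
```

In this model the system suspend, hibernation and runtime PM pointers all
resolve to foo_suspend()/foo_resume(), so nothing in the table itself
distinguishes "prevent future activity" from "activity has finished".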

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 18:49                 ` [linux-pm] " Rafael J. Wysocki
  2011-06-10 18:54                   ` Mark Brown
@ 2011-06-10 18:54                   ` Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 18:54 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Fri, Jun 10, 2011 at 08:49:03PM +0200, Rafael J. Wysocki wrote:
> On Friday, June 10, 2011, Mark Brown wrote:

> > It seems like everyone's agreeing with each other here - from the user
> > side the request seems to be largely for core infrastructure like
> > UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> > that it doesn't do anything to handle the runtime/system interaction?).

> I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
> does what it says, defines a set of operations to use for system suspend,
> hibernation and runtime PM all the same.

Right, but in the light of what you guys are saying about the
interactions between runtime suspend and resume I'm no longer clear that
that is actually sane for something which does use runtime PM, and of
course if a driver wants to support the wake configuration interface
then this might also fall out of the window.

> Kevin's original point was that it might be desirable to do things
> like calling pm_runtime_suspend() from a driver's (system) .suspend()
> callback, if I understood it correctly, and the answer was that it wasn't
> the right thing to do (for reasons given elsewhere in the thread).

Yeah, I think it is too.

> Your point seems to be that simple drivers should not be required to
> define separate callback routines, for example, for system suspend and
> runtime PM.  However, they aren't required to do so, they can point
> all of their "suspend" callback pointers to the same routine, which is
> what the UNIVERSAL_DEV_PM_OPS() macro does.

So that's definitely safe?  I guess this partly comes back to the thing
I'm saying about how I'm finding all this stuff difficult to reason
about, every time I see such discussion I get confused about needing to
worry about it or not.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 18:49                 ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-10 18:54                   ` Mark Brown
  2011-06-10 20:45                     ` Rafael J. Wysocki
  2011-06-10 20:45                     ` [linux-pm] " Rafael J. Wysocki
  2011-06-10 18:54                   ` Mark Brown
  1 sibling, 2 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-10 18:54 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Alan Stern, linux-omap

On Fri, Jun 10, 2011 at 08:49:03PM +0200, Rafael J. Wysocki wrote:
> On Friday, June 10, 2011, Mark Brown wrote:

> > It seems like everyone's agreeing with each other here - from the user
> > side the request seems to be largely for core infrastructure like
> > UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> > that it doesn't do anything to handle the runtime/system interaction?).

> I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
> does what it says, defines a set of operations to use for system suspend,
> hibernation and runtime PM all the same.

Right, but in the light of what you guys are saying about the
interactions between runtime suspend and resume I'm no longer clear that
that is actually sane for something which does use runtime PM, and of
course if a driver wants to support the wake configuration interface
then this might also fall out of the window.

> Kevin's original point was that it might be desirable to do things
> like calling pm_runtime_suspend() from a driver's (system) .suspend()
> callback, if I understood it correctly, and the answer was that it wasn't
> the right thing to do (for reasons given elsewhere in the thread).

Yeah, I think it is too.

> Your point seems to be that simple drivers should not be required to
> define separate callback routines, for example, for system suspend and
> runtime PM.  However, they aren't required to do so, they can point
> all of their "suspend" callback pointers to the same routine, which is
> what the UNIVERSAL_DEV_PM_OPS() macro does.

So that's definitely safe?  I guess this partly comes back to the thing
I'm saying about how I'm finding all this stuff difficult to reason
about, every time I see such discussion I get confused about needing to
worry about it or not.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 18:42                               ` [linux-pm] " Mark Brown
  2011-06-10 20:27                                 ` Rafael J. Wysocki
@ 2011-06-10 20:27                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 20:27 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 08:38:22PM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 10, 2011, Mark Brown wrote:
> 
> > > I think from an interface point of view it's something like
> > > UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> > > that can do the glue bits like enabling wakeup and quiescing activity.
> > > I'd need to think harder about what exactly that'd look like - for my
> > > cases the fundamental thing I want to say is that there's one suspend
> > > routine and one resume routine and I'd like some framework code to work
> > > out when they're called.
> 
> > Can your device generate wakeup signals?
> 
> I am interested in a fairly large selection of devices but broadly
> speaking any off-SoC device can generate wakeups if it can generate
> interrupts.

So, there are a few things to consider:

* Can the device do things like DMA?
* Does the driver use a workqueue?
* Does it use timers?

In all of the above cases your system suspend handling will require extra
care to make sure those things won't get in the way of the suspend process.

Next, what subsystem (e.g. bus type) is the driver going to work with?

If the subsystem is "smart" enough, it can take care of many things, like
"powering off", wakeup preparations and so on.

Now, there are a few combinations possible.  First, if the subsystem is
"smart" and the driver need not take care of the things listed above, then
very likely .runtime_suspend() and .suspend() can do the same and
UNIVERSAL_DEV_PM_OPS() can be used.  Next, if the subsystem is "smart",
but the driver needs to take care of those things, then .suspend() has
more to do, but very likely .runtime_suspend() and .suspend_noirq() can
do the same, while .suspend() may simply prepare the device for the next
stage.  And so on.

It's probably fair to say that everything depends on the subsystem, what it
does and what it expects from the driver.  In the extreme case, when the
subsystem is like the platform bus type, the driver unfortunately is on its
own and has to deal with the whole complexity.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 18:42                               ` [linux-pm] " Mark Brown
@ 2011-06-10 20:27                                 ` Rafael J. Wysocki
  2011-06-10 21:27                                   ` Alan Stern
                                                     ` (3 more replies)
  2011-06-10 20:27                                 ` Rafael J. Wysocki
  1 sibling, 4 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 20:27 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, Alan Stern, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 08:38:22PM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 10, 2011, Mark Brown wrote:
> 
> > > I think from an interface point of view it's something like
> > > UNIVERSAL_DEV_PM_OPS() and friends, probably with some additional ops
> > > that can do the glue bits like enabling wakeup and quiescing activity.
> > > I'd need to think harder about what exactly that'd look like - for my
> > > cases the fundamental thing I want to say is that there's one suspend
> > > routine and one resume routine and I'd like some framework code to work
> > > out when they're called.
> 
> > Can your device generate wakeup signals?
> 
> I am interested in a fairly large selection of devices but broadly
> speaking any off-SoC device can generate wakeups if it can generate
> interrupts.

So, there are a few things to consider:

* Can the device do things like DMA?
* Does the driver use a workqueue?
* Does it use timers?

In all of the above cases your system suspend handling will require extra
care to make sure those things won't get in the way of the suspend process.

Next, what subsystem (e.g. bus type) is the driver going to work with?

If the subsystem is "smart" enough, it can take care of many things, like
"powering off", wakeup preparations and so on.

Now, there are a few combinations possible.  First, if the subsystem is
"smart" and the driver need not take care of the things listed above, then
very likely .runtime_suspend() and .suspend() can do the same and
UNIVERSAL_DEV_PM_OPS() can be used.  Next, if the subsystem is "smart",
but the driver needs to take care of those things, then .suspend() has
more to do, but very likely .runtime_suspend() and .suspend_noirq() can
do the same, while .suspend() may simply prepare the device for the next
stage.  And so on.
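
The second combination above can be sketched roughly as follows.  This is a
userspace model under stated assumptions, not kernel code: struct
model_device, struct model_pm_ops, the foo_* routines and
model_system_suspend() are hypothetical, and "io_active" stands in for any
DMA, work items or timers the driver owns.

```c
#include <errno.h>

/* Userspace model only: hypothetical device state. */
struct model_device {
	int io_active;	/* DMA, work items or timers still running */
	int powered;	/* 1 while the pretend hardware is powered */
};

/* Only the three callbacks this sketch needs (hypothetical type). */
struct model_pm_ops {
	int (*suspend)(struct model_device *dev);
	int (*suspend_noirq)(struct model_device *dev);
	int (*runtime_suspend)(struct model_device *dev);
};

/* Shared low-level step: only valid once the device is quiescent. */
static int foo_power_down(struct model_device *dev)
{
	if (dev->io_active)
		return -EBUSY;
	dev->powered = 0;
	return 0;
}

/* System-suspend-only step: stop activity, prepare for the next stage. */
static int foo_quiesce(struct model_device *dev)
{
	dev->io_active = 0;
	return 0;
}

static const struct model_pm_ops foo_pm_ops = {
	.suspend = foo_quiesce,
	.suspend_noirq = foo_power_down,	/* same routine ... */
	.runtime_suspend = foo_power_down,	/* ... as runtime suspend */
};

/* Model of the ordering during system suspend: suspend, then suspend_noirq. */
static int model_system_suspend(const struct model_pm_ops *ops,
				struct model_device *dev)
{
	int ret = ops->suspend(dev);

	if (ret)
		return ret;
	return ops->suspend_noirq(dev);
}
```

Here runtime suspend refuses to power down a busy device, while the system
suspend path first forces the device quiescent and then reuses the same
power-down routine.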

It's probably fair to say that everything depends on the subsystem, what it
does and what it expects from the driver.  In the extreme case, when the
subsystem is like the platform bus type, the driver unfortunately is on its
own and has to deal with the whole complexity.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 18:54                   ` Mark Brown
@ 2011-06-10 20:45                     ` Rafael J. Wysocki
  2011-06-10 20:45                     ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 20:45 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 08:49:03PM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 10, 2011, Mark Brown wrote:
> 
> > > It seems like everyone's agreeing with each other here - from the user
> > > side the request seems to be largely for core infrastructure like
> > > UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> > > that it doesn't do anything to handle the runtime/system interaction?).
> 
> > I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
> > does what it says, defines a set of operations to use for system suspend,
> > hibernation and runtime PM all the same.
> 
> Right, but in the light of what you guys are saying about the
> interactions between runtime suspend and resume I'm no longer clear that
> that is actually sane for something which does use runtime PM, and of
> course if a driver wants to support the wake configuration interface
> then this might also fall out of the window.

It is not generally safe; whether it is depends on multiple factors
(see the message I've just sent for details).

> > Kevin's original point was that it might be desirable to do things
> > like calling pm_runtime_suspend() from a driver's (system) .suspend()
> > callback, if I understood it correctly, and the answer was that it wasn't
> > the right thing to do (for reasons given elsewhere in the thread).
> 
> Yeah, I think it is too.
> 
> > Your point seems to be that simple drivers should not be required to
> > define separate callback routines, for example, for system suspend and
> > runtime PM.  However, they aren't required to do so, they can point
> > all of their "suspend" callback pointers to the same routine, which is
> > what the UNIVERSAL_DEV_PM_OPS() macro does.
> 
> So that's definitely safe?  I guess this partly comes back to the thing
> I'm saying about how I'm finding all this stuff difficult to reason
> about, every time I see such discussion I get confused about needing to
> worry about it or not.

Well, it's just that multiple things come into play here: the subsystem, the
overall complexity of the driver and so on.  I can probably say what's
good for a PCI driver, but that may not be suitable for a USB driver, for
one example.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 18:54                   ` Mark Brown
  2011-06-10 20:45                     ` Rafael J. Wysocki
@ 2011-06-10 20:45                     ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-10 20:45 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, Alan Stern, linux-omap

On Friday, June 10, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 08:49:03PM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 10, 2011, Mark Brown wrote:
> 
> > > It seems like everyone's agreeing with each other here - from the user
> > > side the request seems to be largely for core infrastructure like
> > > UNIVERSAL_DEV_PM_OPS() (which I'm not sure is a good idea any more given
> > > that it doesn't do anything to handle the runtime/system interaction?).
> 
> > I'm not sure what you mean here.  First, UNIVERSAL_DEV_PM_OPS() actually
> > does what it says, defines a set of operations to use for system suspend,
> > hibernation and runtime PM all the same.
> 
> Right, but in the light of what you guys are saying about the
> interactions between runtime suspend and resume I'm no longer clear that
> that is actually sane for something which does use runtime PM, and of
> course if a driver wants to support the wake configuration interface
> then this might also fall out of the window.

It is not generally safe; whether it is depends on multiple factors
(see the message I've just sent for details).

> > Kevin's original point was that it might be desirable to do things
> > like calling pm_runtime_suspend() from a driver's (system) .suspend()
> > callback, if I understood it correctly, and the answer was that it wasn't
> > the right thing to do (for reasons given elsewhere in the thread).
> 
> Yeah, I think it is too.
> 
> > Your point seems to be that simple drivers should not be required to
> > define separate callback routines, for example, for system suspend and
> > runtime PM.  However, they aren't required to do so, they can point
> > all of their "suspend" callback pointers to the same routine, which is
> > what the UNIVERSAL_DEV_PM_OPS() macro does.
> 
> So that's definitely safe?  I guess this partly comes back to the thing
> I'm saying about how I'm finding all this stuff difficult to reason
> about, every time I see such discussion I get confused about needing to
> worry about it or not.

Well, it's just that multiple things come into play here: the subsystem, the
overall complexity of the driver and so on.  I can probably say what's
good for a PCI driver, but that may not be suitable for a USB driver, for
one example.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 20:27                                 ` Rafael J. Wysocki
  2011-06-10 21:27                                   ` Alan Stern
@ 2011-06-10 21:27                                   ` Alan Stern
  2011-06-11 11:42                                   ` Mark Brown
  2011-06-11 11:42                                   ` [linux-pm] " Mark Brown
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 21:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Mark Brown, linux-omap

On Fri, 10 Jun 2011, Rafael J. Wysocki wrote:

> Next, what subsystem (e.g. bus type) is the driver going to work with?
> 
> If the subsystem is "smart" enough, it can take care of many things, like
> "powering off", wakeup preparations and so on.
> 
> Now, there are a few combinations possible.  First, if the subsystem is
> "smart" and the driver need not take care of the things listed above, then
> very likely .runtime_suspend() and .suspend() can do the same and
> UNIVERSAL_DEV_PM_OPS() can be used.  Next, if the subsystem is "smart",
> but the driver needs to take care of those things, then .suspend() has
> more to do, but very likely .runtime_suspend() and .suspend_noirq() can
> do the same, while .suspend() may simply prepare the device for the next
> stage.  And so on.
> 
> It's probably fair to say that everything depends on the subsystem, what it
> does and what it expects from the driver.

Right.  For example, consider the USB subsystem.  It has separate entry
points for system suspend and runtime suspend.  These routines decide
whether or not wakeup should be enabled, check the device's current
power state, and do a few other things; then they call the driver-level
routines.  Each driver has a single suspend routine; all it does is
quiesce the device and make sure I/O queues are stopped.  The subsystem
code then takes care of setting the proper power state.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 20:27                                 ` Rafael J. Wysocki
@ 2011-06-10 21:27                                   ` Alan Stern
  2011-06-10 21:27                                   ` Alan Stern
                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-10 21:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Mark Brown, linux-pm, linux-omap

On Fri, 10 Jun 2011, Rafael J. Wysocki wrote:

> Next, what subsystem (e.g. bus type) is the driver going to work with?
> 
> If the subsystem is "smart" enough, it can take care of many things, like
> "powering off", wakeup preparations and so on.
> 
> Now, there are a few combinations possible.  First, if the subsystem is
> "smart" and the driver need not take care of the things listed above, then
> very likely .runtime_suspend() and .suspend() can do the same and
> UNIVERSAL_DEV_PM_OPS() can be used.  Next, if the subsystem is "smart",
> but the driver needs to take care of those things, then .suspend() has
> more to do, but very likely .runtime_suspend() and .suspend_noirq() can
> do the same, while .suspend() may simply prepare the device for the next
> stage.  And so on.
> 
> It's probably fair to say that everything depends on the subsystem, what it
> does and what it expects from the driver.

Right.  For example, consider the USB subsystem.  It has separate entry
points for system suspend and runtime suspend.  These routines decide
whether or not wakeup should be enabled, check the device's current
power state, and do a few other things; then they call the driver-level
routines.  Each driver has a single suspend routine; all it does is
quiesce the device and make sure I/O queues are stopped.  The subsystem
code then takes care of setting the proper power state.
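
The division of labour described above might be modelled like this.  It is
a simplified userspace sketch, not the real USB code: every name here
(struct model_device, struct model_driver, the subsys_* and foo_* routines)
is hypothetical, and the model assumes that runtime suspend always arms
wakeup while system suspend follows the device's wakeup policy.

```c
/* Userspace model only: hypothetical device state. */
struct model_device {
	int may_wakeup;		/* wakeup policy for system sleep */
	int wakeup_enabled;	/* what the subsystem actually armed */
	int io_stopped;		/* set by the driver's suspend routine */
	int low_power;		/* set by the subsystem afterwards */
};

/* The driver supplies exactly one suspend routine. */
struct model_driver {
	int (*suspend)(struct model_device *dev);
};

/* Driver level: quiesce the device and stop its I/O queues, nothing else. */
static int foo_suspend(struct model_device *dev)
{
	dev->io_stopped = 1;
	return 0;
}

static const struct model_driver foo_driver = {
	.suspend = foo_suspend,
};

/* Subsystem level: shared tail for both entry points. */
static int subsys_suspend_common(const struct model_driver *drv,
				 struct model_device *dev, int enable_wakeup)
{
	int ret;

	dev->wakeup_enabled = enable_wakeup;
	ret = drv->suspend(dev);
	if (ret)
		return ret;
	dev->low_power = 1;	/* the subsystem sets the power state */
	return 0;
}

/* Entry point for system suspend: wakeup follows the device's policy. */
static int subsys_system_suspend(const struct model_driver *drv,
				 struct model_device *dev)
{
	return subsys_suspend_common(drv, dev, dev->may_wakeup);
}

/* Entry point for runtime suspend: this model always arms wakeup. */
static int subsys_runtime_suspend(const struct model_driver *drv,
				  struct model_device *dev)
{
	return subsys_suspend_common(drv, dev, 1);
}
```

The driver routine is identical in both paths; only the subsystem entry
points differ in how they decide about wakeup.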

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
                             ` (4 preceding siblings ...)
  2011-06-10 23:14           ` [linux-pm] " Kevin Hilman
@ 2011-06-10 23:14           ` Kevin Hilman
  5 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-10 23:14 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[...]

> Whether or not user space has disabled runtime PM _doesn't_ _matter_ for
> system suspend, because _you_ _can't_ call pm_runtime_suspend(), or
> pm_runtime_put_sync(), from a driver's .suspend() callback _anyway_.
> The reason is that doing that would cause the subsystem's (or power
> domain's in this case) .runtime_suspend() callback to be invoked and
> that's incorrect.  Namely, it would require the subsystem (power domain)
> to expect that its .runtime_suspend() would always be executed indirectly
> as a result of calling its .suspend() (through the driver's callback)
> and that expectation may or may not be met (depending on the driver's
> design).

So here's an interesting scenario which I think triggers the same
problem as you highlight above.

Assume you have a driver that's using runtime PM on a per-xfer basis.
Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
does a pm_runtime_put_sync() (for this example, it's important that it's
a _put_sync()).  The _put_sync() might happen in an ISR, or possibly in
a thread waiting on a completion which is awoken by the ISR, etc. etc.
(the runtime PM callbacks are IRQ safe, and device is marked as such.)

The driver is in the middle of an xfer and a system suspend request
happens.

The driver's ->suspend() callback happens, and the driver

- enables/disables wakeups based on device_may_wakeup()
- prevents future xfers
- waits for current xfer to finish

As soon as the xfer finishes, the driver gets notified (completion,
callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
subsys->runtime_suspend --> driver->runtime_suspend.

While the driver's ->suspend() callback doesn't directly call
pm_runtime_put_sync(), the act of waiting for the xfer to finish
causes the subsystem/driver->runtime_suspend callbacks to be called
during the subsystem/driver->suspend callback, which is the same problem
as you highlight above.  

Based on your commit that removed incrementing the usage count across
suspend[1], you mentioned "we can rely on subsystems and device drivers
to avoid doing that unnecessarily."  The above example shows that this
type of thing might not be that obvious to detect and thus avoid.

I suspect the solution to the above will be to add back the usage count
increment across system suspend, but I'm hoping not.  IMO, it would be
more flexible to allow the subsystems to decide.  The subsystems could
provide locking (or manage dev->power.usage_count) themselves if
necessary.  For example, leave it to the subsystem->prepare() to
pm_runtime_get_noresume() if it wants to avoid the "nesting" of
callbacks.
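
The nesting scenario and the proposed ->prepare() workaround can be
sketched with a small userspace model.  This is an illustration under
simplifying assumptions, not the PM core: all model_* names are
hypothetical, and model_put_sync() suspends as soon as the count reaches
zero, which glosses over the idle step of the real pm_runtime_put_sync().

```c
/* Userspace model only: hypothetical per-device runtime PM state. */
struct model_device {
	int usage_count;		/* models dev->power.usage_count */
	int in_system_suspend;		/* 1 while ->suspend() is running */
	int runtime_suspend_calls;	/* how often runtime suspend ran */
	int nested_calls;		/* ... of which inside ->suspend() */
};

/* Models subsys->runtime_suspend --> driver->runtime_suspend. */
static void model_runtime_suspend(struct model_device *dev)
{
	dev->runtime_suspend_calls++;
	if (dev->in_system_suspend)
		dev->nested_calls++;
}

/* Models pm_runtime_get_noresume(): take a reference, touch nothing else. */
static void model_get_noresume(struct model_device *dev)
{
	dev->usage_count++;
}

/* Simplified pm_runtime_put_sync(): suspend when the count hits zero. */
static void model_put_sync(struct model_device *dev)
{
	if (--dev->usage_count == 0)
		model_runtime_suspend(dev);
}

/*
 * Driver ->suspend(): waits for the in-flight xfer.  The completion path
 * drops the per-xfer reference while ->suspend() is still running.
 */
static void model_driver_suspend(struct model_device *dev)
{
	dev->in_system_suspend = 1;
	model_put_sync(dev);	/* xfer finishes during the wait */
	dev->in_system_suspend = 0;
}

/* Optional subsystem ->prepare(): pin the device to avoid the nesting. */
static void model_subsys_prepare(struct model_device *dev)
{
	model_get_noresume(dev);
}
```

Starting with usage_count == 1 (one xfer in flight), calling
model_driver_suspend() alone runs the runtime suspend path inside
->suspend(); calling model_subsys_prepare() first leaves the count above
zero, so the nested call does not happen in this model.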

A related question: does the pm_wq need to be freezable?  From
Documentation/power/runtime_pm.txt:

* The power management workqueue pm_wq in which bus types and device drivers can
  put their PM-related work items.  It is strongly recommended that pm_wq be
  used for queuing all work items related to run-time PM, because this allows
  them to be synchronized with system-wide power transitions (suspend to RAM,
  hibernation and resume from system sleep states).  pm_wq is declared in
  include/linux/pm_runtime.h and defined in kernel/power/main.c.

Is "synchronized with system-wide power transitions" correct here?
Rather than synchronize, using a freezable workqueue actually _prevents_
runtime PM events (at least async ones.)

Again, proper locking (or management of dev->power.usage_count) at the
subsystem level would get you the same effect, but still leave
flexibility to the subsystem/pwr_domain layer.

Kevin

P.S. the commit below[1] removed the usage count increment/decrement
     across system suspend/resume, but Documentation/power/runtime_pm.txt 
     still refers to it.   Patch below[2] removes it, assuming you're
     not planning on adding it back.  ;)

[1]
commit e8665002477f0278f84f898145b1f141ba26ee26
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Sat Feb 12 01:42:41 2011 +0100

    PM: Allow pm_runtime_suspend() to succeed during system suspend
    
    The dpm_prepare() function increments the runtime PM reference
    counters of all devices to prevent pm_runtime_suspend() from
    executing subsystem-level callbacks.  However, this was supposed to
    guard against a specific race condition that cannot happen, because
    the power management workqueue is freezable, so pm_runtime_suspend()
    can only be called synchronously during system suspend and we can
    rely on subsystems and device drivers to avoid doing that
    unnecessarily.
    
    Make dpm_prepare() drop the runtime PM reference to each device
    after making sure that runtime resume is not pending for it.
    
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Acked-by: Kevin Hilman <khilman@ti.com>

[2]
>From 8968e3e41d785e7e5ce7584d64f6a55b303e7060 Mon Sep 17 00:00:00 2001
From: Kevin Hilman <khilman@ti.com>
Date: Fri, 10 Jun 2011 16:05:51 -0700
Subject: [PATCH] PM / Runtime: update doc: usage count no longer incremented across system PM

commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow
pm_runtime_suspend() to succeed during system suspend) removed usage
count increment across system PM.

Update doc to reflect this.

Signed-off-by: Kevin Hilman <khilman@ti.com>
---
Applies on v3.0-rc2

 Documentation/power/runtime_pm.txt |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 654097b..22accb3 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -566,11 +566,6 @@ to do this is:
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
 
-The PM core always increments the run-time usage counter before calling the
-->prepare() callback and decrements it after calling the ->complete() callback.
-Hence disabling run-time PM temporarily like this will not cause any run-time
-suspend callbacks to be lost.
-
 7. Generic subsystem callbacks
 
 Subsystems may wish to conserve code space by using the set of generic power
-- 
1.7.4

^ permalink raw reply related	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-07 21:32         ` [linux-pm] " Rafael J. Wysocki
                             ` (3 preceding siblings ...)
  2011-06-08 22:50           ` [linux-pm] " Kevin Hilman
@ 2011-06-10 23:14           ` Kevin Hilman
  2011-06-11 16:27             ` Alan Stern
                               ` (3 more replies)
  2011-06-10 23:14           ` Kevin Hilman
  5 siblings, 4 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-10 23:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[...]

> Whether or not user space has disabled runtime PM _doesn't_ _matter_ for
> system suspend, because _you_ _can't_ call pm_runtime_suspend(), or
> pm_runtime_put_sync(), from a driver's .suspend() callback _anyway_.
> The reason is that doing that would cause the subsystem's (or power
> domain's in this case) .runtime_suspend() callback to be invoked and
> that's incorrect.  Namely, it would require the subsystem (power domain)
> to expect that its .runtime_suspend() would always be executed indirectly
> as a result of calling its .suspend() (through the driver's callback)
> and that expectation may or may not be met (depending on the driver's
> design).

So here's an interesting scenario which I think triggers the same
problem as you highlight above.

Assume you have a driver that's using runtime PM on a per-xfer basis.
Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
does a pm_runtime_put_sync() (for this example, it's important that it's
a _put_sync()).  The _put_sync() might happen in an ISR, or possibly in
a thread waiting on a completion which is awoken by the ISR, etc. etc.
(the runtime PM callbacks are IRQ safe, and device is marked as such.)

The driver is in the middle of an xfer and a system suspend request
happens.

The driver's ->suspend() callback happens, and the driver

- enables/disables wakeups based on device_may_wakeup()
- prevents future xfers
- waits for current xfer to finish

As soon as the xfer finishes, the driver gets notified (completion,
callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
subsys->runtime_suspend --> driver->runtime_suspend.

While the driver's ->suspend() callback doesn't directly call
pm_runtime_put_sync(), the act of waiting for the xfer to finish
causes the subsystem/driver->runtime_suspend callbacks to be called
during the subsystem/driver->suspend callback, which is the same problem
as you highlight above.  

Based on your commit that removed incrementing the usage count across
suspend[1], you mentioned "we can rely on subsystems and device drivers
to avoid doing that unnecessarily."  The above example shows that this
type of thing might not be that obvious to detect and thus avoid.

I suspect the solution to the above will be to add back the usage count
increment across system suspend, but I'm hoping not.  IMO, it would be
more flexible to allow the subsystems to decide.  The subsystems could
provide locking (or manage dev->power.usage_count) themselves if
necessary.  For example, leave it to the subsystem->prepare() to
pm_runtime_get_noresume() if it wants to avoid the "nesting" of
callbacks.
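
The nesting in the scenario above, and the effect of pinning the usage
count from ->prepare(), can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not kernel code: struct
toy_dev and the toy_* functions are hypothetical stand-ins for
dev->power.usage_count, pm_runtime_get_noresume(), pm_runtime_get_sync()
and pm_runtime_put_sync(), and the real runtime PM core does much more
(locking, status checks, parent handling).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM state of one device. */
struct toy_dev {
	int usage_count;           /* models dev->power.usage_count */
	bool runtime_suspended;    /* models the runtime PM status */
	bool in_system_suspend;    /* set while ->suspend() is running */
	int nested_suspends;       /* runtime suspends seen inside ->suspend() */
};

/* Models pm_runtime_get_noresume(): take a reference, touch nothing else. */
static void toy_get_noresume(struct toy_dev *d)
{
	d->usage_count++;
}

/* Models pm_runtime_get_sync(): take a reference and resume if needed. */
static void toy_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	d->runtime_suspended = false;
}

/* Models pm_runtime_put_sync(): drop a reference, suspend when it hits 0. */
static void toy_put_sync(struct toy_dev *d)
{
	if (--d->usage_count > 0)
		return;
	d->runtime_suspended = true;   /* ->runtime_suspend() would run here */
	if (d->in_system_suspend)
		d->nested_suspends++;
}

/*
 * Models the scenario: an xfer is in flight when system suspend starts,
 * and it completes while the driver's ->suspend() is waiting for it.
 * If pin_in_prepare is true, ->prepare() takes an extra reference first.
 * Returns the number of runtime suspends nested inside ->suspend().
 */
static int toy_system_suspend(struct toy_dev *d, bool pin_in_prepare)
{
	toy_get_sync(d);               /* xfer starts */
	if (pin_in_prepare)
		toy_get_noresume(d);   /* subsystem ->prepare() */
	d->in_system_suspend = true;   /* driver ->suspend() begins */
	toy_put_sync(d);               /* xfer completes, e.g. from the ISR */
	d->in_system_suspend = false;  /* driver ->suspend() returns */
	return d->nested_suspends;
}
```

In this model, calling toy_system_suspend() on a zeroed device with
pin_in_prepare false reports one nested runtime suspend; with
pin_in_prepare true it reports none, and the device keeps one reference
that the subsystem would have to drop again later (for example from
->complete()).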

A related question: does the pm_wq need to be freezable?  From
Documentation/power/runtime_pm.txt:

* The power management workqueue pm_wq in which bus types and device drivers can
  put their PM-related work items.  It is strongly recommended that pm_wq be
  used for queuing all work items related to run-time PM, because this allows
  them to be synchronized with system-wide power transitions (suspend to RAM,
  hibernation and resume from system sleep states).  pm_wq is declared in
  include/linux/pm_runtime.h and defined in kernel/power/main.c.

Is "synchronized with system-wide power transitions" correct here?
Rather than synchronize, using a freezable workqueue actually _prevents_
runtime PM events (at least async ones.)

Again, proper locking (or management of dev->power.usage_count) at the
subsystem level would get you the same effect, but still leave
flexibility to the subsystem/pwr_domain layer.

Kevin

P.S. the commit below[1] removed the usage count increment/decrement
     across system suspend/resume, but Documentation/power/runtime_pm.txt 
     still refers to it.   Patch below[2] removes it, assuming you're
     not planning on adding it back.  ;)

[1]
commit e8665002477f0278f84f898145b1f141ba26ee26
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Sat Feb 12 01:42:41 2011 +0100

    PM: Allow pm_runtime_suspend() to succeed during system suspend
    
    The dpm_prepare() function increments the runtime PM reference
    counters of all devices to prevent pm_runtime_suspend() from
    executing subsystem-level callbacks.  However, this was supposed to
    guard against a specific race condition that cannot happen, because
    the power management workqueue is freezable, so pm_runtime_suspend()
    can only be called synchronously during system suspend and we can
    rely on subsystems and device drivers to avoid doing that
    unnecessarily.
    
    Make dpm_prepare() drop the runtime PM reference to each device
    after making sure that runtime resume is not pending for it.
    
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Acked-by: Kevin Hilman <khilman@ti.com>

[2]
From 8968e3e41d785e7e5ce7584d64f6a55b303e7060 Mon Sep 17 00:00:00 2001
From: Kevin Hilman <khilman@ti.com>
Date: Fri, 10 Jun 2011 16:05:51 -0700
Subject: [PATCH] PM / Runtime: update doc: usage count no longer incremented across system PM

commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow
pm_runtime_suspend() to succeed during system suspend) removed usage
count increment across system PM.

Update doc to reflect this.

Signed-off-by: Kevin Hilman <khilman@ti.com>
---
Applies on v3.0-rc2

 Documentation/power/runtime_pm.txt |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 654097b..22accb3 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -566,11 +566,6 @@ to do this is:
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
 
-The PM core always increments the run-time usage counter before calling the
-->prepare() callback and decrements it after calling the ->complete() callback.
-Hence disabling run-time PM temporarily like this will not cause any run-time
-suspend callbacks to be lost.
-
 7. Generic subsystem callbacks
 
 Subsystems may wish to conserve code space by using the set of generic power
-- 
1.7.4


^ permalink raw reply related	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-09 13:56             ` Alan Stern
  2011-06-10 14:36               ` Mark Brown
  2011-06-10 14:36               ` Mark Brown
@ 2011-06-10 23:52               ` Kevin Hilman
  2011-06-10 23:52               ` [linux-pm] " Kevin Hilman
  3 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-10 23:52 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

Alan Stern <stern@rowland.harvard.edu> writes:

[...]

>> More specifically, what should be the approach in system suspend when a
>> device is already runtime suspended?  If you treat runtime and system PM
>> as completely independent, you would have to runtime resume the device
>> so that it can then be immediately system suspended.
>
> Assuming the wakeup setting is correct, and assuming you use the same 
> power level for runtime suspend and system suspend, then nothing needs 
> to be done.
>
> If the wakeup setting is not correct, it has to be changed.  That 
> often implies going back to full power in order to change the 
> wakeup setting, then going to low power again.

OK, but how should this be implemented?  

If the device is runtime suspended at system suspend time, it implies
that somewhere in the system suspend path, the device has to be powered
on and enabled (a.k.a. runtime resumed.)

From a driver writer's perspective, doing a pm_runtime_get_sync() would
be the obvious choice, but that causes nesting of ->runtime_resume
callbacks within ->suspend callbacks which is apparently forbidden (or
rather strongly recommended against :)

Now, assuming the driver's suspend can't do a pm_runtime_get()...

In order to power on & enable the device, the driver has to essentially
duplicate everything that would be done by a runtime resume.

The problem comes because this work is shared between the driver and the
subsystem.  IOW, it's the driver's ->suspend() callback that decides
whether or not the device needs to be powered-on/enabled (e.g. to
enable/disable wakeups), but it might be the subsystem that actually
does the magic_device_set_full_power(), magic_device_enable().

So once the driver's ->suspend() realizes it needs to power on & enable
the device, it has no way to tell the subsystem to do so, wait for it to
happen, and then enable/disable its wakeups.

Maybe I'm being really dense, really blind, or really stubborn (or all
three), but it seems to me that using runtime PM calls to implement
these things would be the most obvious and the most readable.

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-09 13:56             ` Alan Stern
                                 ` (2 preceding siblings ...)
  2011-06-10 23:52               ` Kevin Hilman
@ 2011-06-10 23:52               ` Kevin Hilman
  2011-06-11 16:42                 ` Alan Stern
  2011-06-11 16:42                 ` [linux-pm] " Alan Stern
  3 siblings, 2 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-10 23:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux-pm mailing list, linux-omap,
	Magnus Damm, Paul Walmsley

Alan Stern <stern@rowland.harvard.edu> writes:

[...]

>> More specifically, what should be the approach in system suspend when a
>> device is already runtime suspended?  If you treat runtime and system PM
>> as completely independent, you would have to runtime resume the device
>> so that it can then be immediately system suspended.
>
> Assuming the wakeup setting is correct, and assuming you use the same 
> power level for runtime suspend and system suspend, then nothing needs 
> to be done.
>
> If the wakeup setting is not correct, it has to be changed.  That 
> often implies going back to full power in order to change the 
> wakeup setting, then going to low power again.

OK, but how should this be implemented?  

If the device is runtime suspended at system suspend time, it implies
that somewhere in the system suspend path, the device has to be powered
on and enabled (a.k.a. runtime resumed.)

From a driver writer's perspective, doing a pm_runtime_get_sync() would
be the obvious choice, but that causes nesting of ->runtime_resume
callbacks within ->suspend callbacks which is apparently forbidden (or
rather strongly recommended against :)

Now, assuming the driver's suspend can't do a pm_runtime_get()...

In order to power on & enable the device, the driver has to essentially
duplicate everything that would be done by a runtime resume.

The problem comes because this work is shared between the driver and the
subsystem.  IOW, it's the driver's ->suspend() callback that decides
whether or not the device needs to be powered-on/enabled (e.g. to
enable/disable wakeups), but it might be the subsystem that actually
does the magic_device_set_full_power(), magic_device_enable().

So once the driver's ->suspend() realizes it needs to power on & enable
the device, it has no way to tell the subsystem to do so, wait for it to
happen, and then enable/disable its wakeups.

Maybe I'm being really dense, really blind, or really stubborn (or all
three), but it seems to me that using runtime PM calls to implement
these things would be the most obvious and the most readable.

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 20:27                                 ` Rafael J. Wysocki
  2011-06-10 21:27                                   ` Alan Stern
  2011-06-10 21:27                                   ` Alan Stern
@ 2011-06-11 11:42                                   ` Mark Brown
  2011-06-11 11:42                                   ` [linux-pm] " Mark Brown
  3 siblings, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-11 11:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Fri, Jun 10, 2011 at 10:27:25PM +0200, Rafael J. Wysocki wrote:

> So, there are a few things to consider:

> * Can the device do things like DMA?
> * Does the driver use a workqueue?
> * Does it use timers?

> In all of the above cases your system suspend handling will require extra
> care to make sure those things won't get in the way of the suspend process.

Yes, that's the quiesce operation I think either Alan or I mentioned.

> It's probably fair to say that everything depends on the subsystem, what it
> does and what it expects from the driver.  In the extreme case, when the
> subsystem is like the platform bus type, the driver unfortunately is on its
> own and has to deal with the whole complexity.

I'm pretty much only working with buses that have no infrastructure and
for which power is essentially orthogonal to the control bus itself -
that's a very large proportion of the embedded space.  It really feels
like we could be doing a better job for drivers using these buses,
there's a lot of similarities in what many of them need but I can never
find the time to get my head round it confidently enough to actually
propose anything.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 20:27                                 ` Rafael J. Wysocki
                                                     ` (2 preceding siblings ...)
  2011-06-11 11:42                                   ` Mark Brown
@ 2011-06-11 11:42                                   ` Mark Brown
  2011-06-11 20:56                                     ` Rafael J. Wysocki
  3 siblings, 1 reply; 118+ messages in thread
From: Mark Brown @ 2011-06-11 11:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Alan Stern, linux-omap

On Fri, Jun 10, 2011 at 10:27:25PM +0200, Rafael J. Wysocki wrote:

> So, there are a few things to consider:

> * Can the device do things like DMA?
> * Does the driver use a workqueue?
> * Does it use timers?

> In all of the above cases your system suspend handling will require extra
> care to make sure those things won't get in the way of the suspend process.

Yes, that's the quiesce operation I think either Alan or I mentioned.

> It's probably fair to say that everything depends on the subsystem, what it
> does and what it expects from the driver.  In the extreme case, when the
> subsystem is like the platform bus type, the driver unfortunately is on its
> own and has to deal with the whole complexity.

I'm pretty much only working with buses that have no infrastructure and
for which power is essentially orthogonal to the control bus itself -
that's a very large proportion of the embedded space.  It really feels
like we could be doing a better job for drivers using these buses,
there's a lot of similarities in what many of them need but I can never
find the time to get my head round it confidently enough to actually
propose anything.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 23:14           ` [linux-pm] " Kevin Hilman
@ 2011-06-11 16:27             ` Alan Stern
  2011-06-11 16:27             ` [linux-pm] " Alan Stern
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-11 16:27 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Kevin Hilman wrote:

> So here's an interesting scenario which I think triggers the same
> problem as you highlight above.
> 
> Assume you have a driver that's using runtime PM on a per-xfer basis.
> Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
> does a pm_runtime_put_sync() (for this example, it's important that it's
> a _put_sync()).  The _put_sync() might happen in an ISR, or possibly in
> a thread waiting on a completion which is awoken by the ISR, etc. etc.
> (the runtime PM callbacks are IRQ safe, and device is marked as such.)
> 
> The driver is in the middle of an xfer and a system suspend request
> happens.
> 
> The driver's ->suspend() callback happens, and the driver
> 
> - enables/disables wakeups based on device_may_wakeup()
> - prevents future xfers
> - waits for current xfer to finish
> 
> As soon as the xfer finishes, the driver gets notified (completion,
> callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
> subsys->runtime_suspend --> driver->runtime_suspend.
> 
> While the driver's ->suspend() callback doesn't directly call
> pm_runtime_put_sync(), the act of waiting for the xfer to finish
> causes the subsystem/driver->runtime_suspend callbacks to be called
> during the subsystem/driver->suspend callback, which is the same problem
> as you highlight above.  
> 
> Based on your commit that removed incrementing the usage count across
> suspend[1], you mentioned "we can rely on subsystems and device drivers
> to avoid doing that unnecessarily."  The above example shows that this
> type of thing might not be that obvious to detect and thus avoid.

As with so many other things, this depends entirely on how the 
subsystem and driver are designed.  If they are written to allow this 
sort of thing and handle it properly, there's no problem.

Nothing in the PM core itself cares whether the runtime PM routines are
invoked during system sleep.

> I suspect the solution to the above will be to add back the usage count
> increment across system suspend, but I'm hoping not.  IMO, it would be
> more flexible to allow the subsystems to decide.  The subsystems could
> provide locking (or manage dev->power.usage_count) themselves if
> necessary.  For example, leave it to the subsystem->prepare() to
> pm_runtime_get_noresume() if it wants to avoid the "nesting" of
> callbacks.

Exactly.

> A related question: does the pm_wq need to be freezable?  From
> Documentation/power/runtime_pm.txt:
> 
> * The power management workqueue pm_wq in which bus types and device drivers can
>   put their PM-related work items.  It is strongly recommended that pm_wq be
>   used for queuing all work items related to run-time PM, because this allows
>   them to be synchronized with system-wide power transitions (suspend to RAM,
>   hibernation and resume from system sleep states).  pm_wq is declared in
>   include/linux/pm_runtime.h and defined in kernel/power/main.c.
> 
> Is "synchronized with system-wide power transitions" correct here?
> Rather than synchronize, using a freezable workqueue actually _prevents_
> runtime PM events (at least async ones.)

Which prevents races -- the goal of synchronization.  If you use pm_wq 
for your asynchronous runtime PM events, you never have to worry about 
one of them occurring in the middle of a system sleep transition.
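
The effect of a freezable workqueue on asynchronous requests can be
sketched with a small userspace model.  This is a toy illustration under
stated assumptions, not the kernel's workqueue code: struct toy_wq and the
toy_wq_* functions are hypothetical names, and items run immediately on
queueing here instead of in a worker thread.  What it shows is only that
work queued while the queue is frozen stays pending until the thaw.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_WQ_MAX 8

typedef void (*toy_work_fn)(void *data);

/* Hypothetical model of a freezable workqueue such as pm_wq. */
struct toy_wq {
	toy_work_fn fn[TOY_WQ_MAX];
	void *data[TOY_WQ_MAX];
	size_t pending;
	bool frozen;
};

/* Run everything that is pending, unless the queue is frozen. */
static void toy_wq_run(struct toy_wq *wq)
{
	size_t i;

	if (wq->frozen)
		return;
	for (i = 0; i < wq->pending; i++)
		wq->fn[i](wq->data[i]);
	wq->pending = 0;
}

/* Queue an item; returns false if the queue is full. */
static bool toy_wq_queue(struct toy_wq *wq, toy_work_fn fn, void *data)
{
	if (wq->pending == TOY_WQ_MAX)
		return false;
	wq->fn[wq->pending] = fn;
	wq->data[wq->pending] = data;
	wq->pending++;
	toy_wq_run(wq);
	return true;
}

/* Models freezing the queue at the start of a system sleep transition. */
static void toy_wq_freeze(struct toy_wq *wq)
{
	wq->frozen = true;
}

/* Models thawing the queue after resume; deferred items run now. */
static void toy_wq_thaw(struct toy_wq *wq)
{
	wq->frozen = false;
	toy_wq_run(wq);
}

/* Example work item: counts how many async requests actually ran. */
static void toy_async_suspend(void *data)
{
	(*(int *)data)++;
}
```

If toy_async_suspend is queued after toy_wq_freeze(), the counter it
points at stays untouched until toy_wq_thaw() is called, which is the
"never in the middle of a system sleep transition" property in miniature.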

> Again, proper locking (or management of dev->power.usage_count) at the
> subsystem level would get you the same effect, but still leave
> flexibility to the subsystem/pwr_domain layer.

I'm not so sure about that.  For example, how would you prevent an 
async resume from interfering with a system suspend?

> Kevin
> 
> P.S. the commit below[1] removed the usage count increment/decrement
>      across system suspend/resume, but Documentation/power/runtime_pm.txt 
>      still refers to it.   Patch below[2] removes it, assuming you're
>      not planning on adding it back.  ;)

...

> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
> index 654097b..22accb3 100644
> --- a/Documentation/power/runtime_pm.txt
> +++ b/Documentation/power/runtime_pm.txt
> @@ -566,11 +566,6 @@ to do this is:
>  	pm_runtime_set_active(dev);
>  	pm_runtime_enable(dev);
>  
> -The PM core always increments the run-time usage counter before calling the
> -->prepare() callback and decrements it after calling the ->complete() callback.
> -Hence disabling run-time PM temporarily like this will not cause any run-time
> -suspend callbacks to be lost.
> -

Thank you for pointing this out.  I had forgotten about this; it
implies that temporarily disabling runtime PM during system resume is
no longer safe!

Maybe we should put the get_noresume and put_sync calls back into the
PM core, but only during the system resume stages.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 23:14           ` [linux-pm] " Kevin Hilman
  2011-06-11 16:27             ` Alan Stern
@ 2011-06-11 16:27             ` Alan Stern
  2011-06-11 23:13             ` Rafael J. Wysocki
  2011-06-11 23:13             ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-11 16:27 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Linux-pm mailing list, linux-omap,
	Magnus Damm, Paul Walmsley

On Fri, 10 Jun 2011, Kevin Hilman wrote:

> So here's an interesting scenario which I think triggers the same
> problem as you highlight above.
> 
> Assume you have a driver that's using runtime PM on a per-xfer basis.
> Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
> does a pm_runtime_put_sync() (for this example, it's important that it's
> a _put_sync()).  The _put_sync() might happen in an ISR, or possibly in
> a thread waiting on a completion which is awoken by the ISR, etc. etc.
> (the runtime PM callbacks are IRQ safe, and device is marked as such.)
> 
> The driver is in the middle of an xfer and a system suspend request
> happens.
> 
> The driver's ->suspend() callback happens, and the driver
> 
> - enables/disables wakeups based on device_may_wakeup()
> - prevents future xfers
> - waits for current xfer to finish
> 
> As soon as the xfer finishes, the driver gets notified (completion,
> callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
> subsys->runtime_suspend --> driver->runtime_suspend.
> 
> While the driver's ->suspend() callback doesn't directly call
> pm_runtime_put_sync(), the act of waiting for the xfer to finish
> causes the subsystem/driver->runtime_suspend callbacks to be called
> during the subsystem/driver->suspend callback, which is the same problem
> as you highlight above.  
> 
> Based on your commit that removed incrementing the usage count across
> suspend[1], you mentioned "we can rely on subsystems and device drivers
> to avoid doing that unnecessarily."  The above example shows that this
> type of thing might not be that obvious to detect and thus avoid.

As with so many other things, this depends entirely on how the 
subsystem and driver are designed.  If they are written to allow this 
sort of thing and handle it properly, there's no problem.

Nothing in the PM core itself cares whether the runtime PM routines are
invoked during system sleep.

> I suspect the solution to the above will be to add back the usage count
> increment across system suspend, but I'm hoping not.  IMO, it would be
> more flexible to allow the subsystems to decide.  The subsystems could
> provide locking (or manage dev->power.usage_count) themselves if
> necessary.  For example, leave it to the subsystem->prepare() to
> pm_runtime_get_noresume() if it wants to avoid the "nesting" of
> callbacks.

Exactly.

> A related question: does the pm_wq need to be freezable?  From
> Documentation/power/runtime_pm.txt:
> 
> * The power management workqueue pm_wq in which bus types and device drivers can
>   put their PM-related work items.  It is strongly recommended that pm_wq be
>   used for queuing all work items related to run-time PM, because this allows
>   them to be synchronized with system-wide power transitions (suspend to RAM,
>   hibernation and resume from system sleep states).  pm_wq is declared in
>   include/linux/pm_runtime.h and defined in kernel/power/main.c.
> 
> Is "synchronized with system-wide power transitions" correct here?
> Rather than synchronize, using a freezable workqueue actually _prevents_
> runtime PM events (at least async ones.)

Which prevents races -- the goal of synchronization.  If you use pm_wq 
for your asynchronous runtime PM events, you never have to worry about 
one of them occurring in the middle of a system sleep transition.

> Again, proper locking (or management of dev->power.usage_count) at the
> subsystem level would get you the same effect, but still leave
> flexibility to the subsystem/pwr_domain layer.

I'm not so sure about that.  For example, how would you prevent an 
async resume from interfering with a system suspend?

> Kevin
> 
> P.S. the commit below[1] removed the usage count increment/decrement
>      across system suspend/resume, but Documentation/power/runtime_pm.txt 
>      still refers to it.   Patch below[2] removes it, assuming you're
>      not planning on adding it back.  ;)

...

> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
> index 654097b..22accb3 100644
> --- a/Documentation/power/runtime_pm.txt
> +++ b/Documentation/power/runtime_pm.txt
> @@ -566,11 +566,6 @@ to do this is:
>  	pm_runtime_set_active(dev);
>  	pm_runtime_enable(dev);
>  
> -The PM core always increments the run-time usage counter before calling the
> -->prepare() callback and decrements it after calling the ->complete() callback.
> -Hence disabling run-time PM temporarily like this will not cause any run-time
> -suspend callbacks to be lost.
> -

Thank you for pointing this out.  I had forgotten about this; it
implies that temporarily disabling runtime PM during system resume is
no longer safe!

Maybe we should put the get_noresume and put_sync calls back into the
PM core, but only during the system resume stages.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 23:52               ` [linux-pm] " Kevin Hilman
@ 2011-06-11 16:42                 ` Alan Stern
  2011-06-11 16:42                 ` [linux-pm] " Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-11 16:42 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Fri, 10 Jun 2011, Kevin Hilman wrote:

> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> [...]

> > If the wakeup setting is not correct, it has to be changed.  That 
> > often implies going back to full power in order to change the 
> > wakeup setting, then going to low power again.
> 
> OK, but how should this be implemented?  
> 
> If the device is runtime suspended at system suspend time, it implies
> that somewhere in the system suspend path, the device has to be powered
> on and enabled (a.k.a. runtime resumed.)
> 
> From a driver writer's perspective, doing a pm_runtime_get_sync() would
> be the obvious choice, but that causes nesting of ->runtime_resume
> callbacks within ->suspend callbacks which is apparently forbidden (or
> rather strongly recommended against :)
> 
> Now, assuming the driver's suspend can't do a pm_runtime_get()...
> 
> In order to power on & enable the device, the driver has to essentially
> duplicate everything that would be done by a runtime resume.

Again, this depends on the subsystem and the driver.  For example, the
USB subsystem does call pm_runtime_resume() in order to bring a device
back to full power if the wakeup setting needs to be changed.  This is
done in the subsystem code, and the subsystem is designed to allow it.

(Actually, it could be improved.  In theory the driver doesn't need to
be involved at all; a USB device's wakeup setting can be changed purely
by the subsystem.  Nevertheless, the pm_runtime_resume call does wake
up the driver, which then needs to be quiesced again shortly thereafter
-- overall a waste of time.  This was the easiest approach.)
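
The sequence described above (do nothing if the wakeup setting is already
right, otherwise go back to full power, change it, and go to low power
again) can be sketched as a userspace model.  This is not the USB
subsystem's code; struct toy_wdev and the toy_* functions are hypothetical,
and the model assumes the same power level is used for runtime suspend and
system suspend.

```c
#include <stdbool.h>

/* Hypothetical device state; not taken from any real subsystem. */
struct toy_wdev {
	bool runtime_suspended;  /* device is already in low power */
	bool wakeup_enabled;     /* wakeup setting currently programmed */
	bool may_wakeup;         /* what device_may_wakeup() would report */
	int resumes;             /* how many times we went to full power */
};

/* Models bringing the device back to full power. */
static void toy_resume(struct toy_wdev *d)
{
	d->runtime_suspended = false;
	d->resumes++;
}

/* Models putting the device into low power with a given wakeup setting. */
static void toy_suspend(struct toy_wdev *d, bool wakeup)
{
	d->wakeup_enabled = wakeup;
	d->runtime_suspended = true;
}

/* Models a subsystem-level system suspend for one device. */
static void toy_subsys_suspend(struct toy_wdev *d)
{
	/* Already suspended with the right wakeup setting: nothing to do. */
	if (d->runtime_suspended && d->wakeup_enabled == d->may_wakeup)
		return;

	/* Wrong wakeup setting: go back to full power to change it. */
	if (d->runtime_suspended)
		toy_resume(d);

	toy_suspend(d, d->may_wakeup);
}
```

In this model the extra resume happens only for a device that is runtime
suspended with a wakeup setting that differs from may_wakeup; a device
that is still active is simply suspended with the desired setting.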

> The problem comes because this work is shared between the driver and the
> subsystem.  IOW, it's the driver's ->suspend() callback that decides
> whether or not the device needs to be powered-on/enabled (e.g. to
> enable/disable wakeups), but it might be the subsystem that actually
> does the magic_device_set_full_power(), magic_device_enable().
> 
> So once the driver's ->suspend() realizes it needs to power on & enable
> the device, it has no way to tell the subsystem to do so, wait for it to
> happen, and then enable/disable its wakeups.

Then the subsystem should _provide_ a way, if that's how you decide to
handle things.

> Maybe I'm being really dense, really blind, or really stubborn (or all
> three), but it seems to me that using runtime PM calls to implement
> these things would be the most obvious and the most readable.

Have you tried actually doing it in a situation where you control both
the driver and the subsystem?

Basically, I think what Rafael was saying before referred to the 
general case, where you don't know anything about the subsystem and 
can't afford to make assumptions.  But in the real world you'll be 
writing a driver for a particular subsystem and you'll know how that 
subsystem works.  If the subsystem permits runtime PM calls to be 
nested within the system PM routines, feel free to go ahead and use 
them.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-10 23:52               ` [linux-pm] " Kevin Hilman
  2011-06-11 16:42                 ` Alan Stern
@ 2011-06-11 16:42                 ` Alan Stern
  2011-06-11 22:46                   ` Rafael J. Wysocki
  2011-06-11 22:46                   ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-11 16:42 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Linux-pm mailing list, linux-omap,
	Magnus Damm, Paul Walmsley

On Fri, 10 Jun 2011, Kevin Hilman wrote:

> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> [...]

> > If the wakeup setting is not correct, it has to be changed.  That 
> > often implies going back to full power in order to change the 
> > wakeup setting, then going to low power again.
> 
> OK, but how should this be implemented?  
> 
> If the device is runtime suspended at system suspend time, it implies
> that somewhere in the system suspend path, the device has to be powered
> on and enabled (a.k.a. runtime resumed.)
> 
> From a driver writer's perspective, doing a pm_runtime_get_sync() would
> be the obvious choice, but that causes nesting of ->runtime_resume
> callbacks within ->suspend callbacks which is apparently forbidden (or
> rather strongly recommended against :)
> 
> Now, assuming the driver's suspend can't do a pm_runtime_get()...
> 
> In order to power on & enable the device, the driver has to essentially
> duplicate everything that would be done by a runtime resume.

Again, this depends on the subsystem and the driver.  For example, the
USB subsystem does call pm_runtime_resume() in order to bring a device
back to full power if the wakeup setting needs to be changed.  This is
done in the subsystem code, and the subsystem is designed to allow it.

(Actually, it could be improved.  In theory the driver doesn't need to
be involved at all; a USB device's wakeup setting can be changed purely
by the subsystem.  Nevertheless, the pm_runtime_resume call does wake
up the driver, which then needs to be quiesced again shortly thereafter
-- overall a waste of time.  This was the easiest approach.)

> The problem comes because this work is shared between the driver and the
> subsystem.  IOW, it's the driver's ->suspend() callback that decides
> whether or not the device needs to be powered-on/enabled (e.g. to
> enable/disable wakeups), but it might be the subsystem that actually
> does the magic_device_set_full_power(), magic_device_enable().
> 
> So once the driver's ->suspend() realizes it needs to power on & enable
> the device, it has no way to tell the subsystem to do so, wait for it to
> happen, and then enable/disable its wakeups.

Then the subsystem should _provide_ a way, if that's how you decide to
handle things.

> Maybe I'm being really dense, really blind, or really stubborn (or all
> three), but it seems to me that using runtime PM calls to implement
> these things would be the most obvious and the most readable.

Have you tried actually doing it in a situation where you control both
the driver and the subsystem?

Basically, I think what Rafael was saying before referred to the 
general case, where you don't know anything about the subsystem and 
can't afford to make assumptions.  But in the real world you'll be 
writing a driver for a particular subsystem and you'll know how that 
subsystem works.  If the subsystem permits runtime PM calls to be 
nested within the system PM routines, feel free to go ahead and use 
them.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-11 11:42                                   ` [linux-pm] " Mark Brown
@ 2011-06-11 20:56                                     ` Rafael J. Wysocki
  2011-06-13 12:22                                       ` [linux-pm] " Mark Brown
  2011-06-13 12:22                                       ` Mark Brown
  0 siblings, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-11 20:56 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-pm, linux-omap

On Saturday, June 11, 2011, Mark Brown wrote:
> On Fri, Jun 10, 2011 at 10:27:25PM +0200, Rafael J. Wysocki wrote:
> 
> > So, there are a few things to consider:
> 
> > * Can the device do things like DMA?
> > * Does the driver use a workqueue?
> > * Does it use timers?
> 
> > In all of the above cases your system suspend handling will require extra
> > care to make sure those things won't get in the way of the suspend process.
> 
> Yes, that's the quiesce operation I think myself or Alan mentioned.
> 
> > It's probably fair to say that everything depends on the subsystem, what it
> > does and what it expects from the driver.  In the extreme case, when the
> > subsystem is like the platform bus type, the driver unfortunately is on its
> > own and has to deal with the whole complexity.
> 
> I'm pretty much only working with buses that have no infrastructure and
> for which power is essentially orthogonal to the control bus itself -
> that's a very large proportion of the embedded space.  It really feels
> like we could be doing a better job for drivers using these buses,
> there's a lot of similarities in what many of them need but I can never
> find the time to get my head round it confidently enough to actually
> propose anything.

I agree.  That's one of the reasons why I introduced the struct dev_power_domain
thing a while ago and the generic PM domains patchset I've just posted is a step
in that direction.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-11 16:42                 ` [linux-pm] " Alan Stern
  2011-06-11 22:46                   ` Rafael J. Wysocki
@ 2011-06-11 22:46                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-11 22:46 UTC (permalink / raw)
  To: Alan Stern, Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Saturday, June 11, 2011, Alan Stern wrote:
> On Fri, 10 Jun 2011, Kevin Hilman wrote:
> 
> > Alan Stern <stern@rowland.harvard.edu> writes:
> > 
> > [...]
> 
> > > If the wakeup setting is not correct, it has to be changed.  That 
> > > often implies going back to full power in order to change the 
> > > wakeup setting, then going to low power again.
> > 
> > OK, but how should this be implemented?  
> > 
> > If the device is runtime suspended at system suspend time, it implies
> > that somewhere in the system suspend path, the device has to be powered
> > on and enabled (a.k.a. runtime resumed.)
> > 
> > From a driver writer's perspective, doing a pm_runtime_get_sync() would
> > be the obvious choice, but that causes nesting of ->runtime_resume
> > callbacks within ->suspend callbacks which is apparently forbidden (or
> > rather strongly recommended against :)
> > 
> > Now, assuming the driver's suspend can't do a pm_runtime_get()...
> > 
> > In order to power on & enable the device, the driver has to essentially
> > duplicate everything that would be done by a runtime resume.
> 
> Again, this depends on the subsystem and the driver.  For example, the
> USB subsystem does call pm_runtime_resume() in order to bring a device
> back to full power if the wakeup setting needs to be changed.  This is
> done in the subsystem code, and the subsystem is designed to allow it.
> 
> (Actually, it could be improved.  In theory the driver doesn't need to
> be involved at all; a USB device's wakeup setting can be changed purely
> by the subsystem.  Nevertheless, the pm_runtime_resume call does wake
> up the driver, which then needs to be quiesced again shortly thereafter
> -- overall a waste of time.  This was the easiest approach.)
> 
> > The problem comes because this work is shared between the driver and the
> > subsystem.  IOW, it's the driver's ->suspend() callback that decides
> > whether or not the device needs to be powered-on/enabled (e.g. to
> > enable/disable wakeups), but it might be the subsystem that actually
> > does the magic_device_set_full_power(), magic_device_enable().
> > 
> > So once the driver's ->suspend() realizes it needs to power on & enable
> > the device, it has no way to tell the subsystem to do so, wait for it to
> > happen, and then enable/disable its wakeups.
> 
> Then the subsystem should _provide_ a way, if that's how you decide to
> handle things.
> 
> > Maybe I'm being really dense, really blind, or really stubborn (or all
> > three), but it seems to me that using runtime PM calls to implement
> > these things would be the most obvious and the most readable.
> 
> Have you tried actually doing it in a situation where you control both
> the driver and the subsystem?
> 
> Basically, I think what Rafael was saying before referred to the 
> general case, where you don't know anything about the subsystem and 
> can't afford to make assumptions.  But in the real world you'll be 
> writing a driver for a particular subsystem and you'll know how that 
> subsystem works.  If the subsystem permits runtime PM calls to be 
> nested within the system PM routines, feel free to go ahead and use 
> them.

But then we get the problem that user space may echo "on" to the
device's "control" file in sysfs and the whole clever plan basically goes
south.

Moreover, on some systems devices will belong to PM domains and their
drivers may potentially be used with different PM domains on different
platforms.  This means that drivers really should not make any assumptions
about whether or not they can use runtime PM in their system suspend/resume
routines.  They can't.

Now, Kevin, I think that the problem you really want to address is this:
Suppose a driver needs to do one thing in its .runtime_suspend() callback
(e.g. "save state") and it wants to do two things in its .suspend()
callback (e.g. "quiesce device" and "save state").  Then, it seems, the
simplest approach would be to call its .runtime_suspend() routine from
its .suspend() routine (after doing the "quiesce device" thing).

So far, so good, but suppose there's a subsystem, different from the platform
bus type, or a PM domain such that it's not sufficient to call the driver's
.runtime_suspend() alone, because the subsystem-level .runtime_suspend() does
something that's necessary for "really suspending" the device.  Then,
apparently, one can simply call pm_runtime_suspend() from the driver's
.suspend() callback and that will take care of running the subsystem-level
.runtime_suspend() too.

Unfortunately, the problem with subsystem-level PM callbacks is that, in
general, the subsystem-level .runtime_suspend() needs to do something slightly
different from the subsystem-level system suspend callbacks.  The reason why is,
more or less, wakeup (plus the fact that hibernate callbacks need not power
down things, which is a detail and I'll ignore it from now on).  More precisely,
the set of wakeup devices for system suspend is determined by user space, while
for runtime PM all devices that can do remote wakeup should be set up to do it.
That's why, in general, the subsystem-level .runtime_suspend() may do wrong
things when it's invoked via the driver's .suspend() routine, during system
suspend.  Apart from this, of course, the subsystem-level .suspend() that
has invoked the driver's .suspend() might already do something that won't
play well with the subsystem-level .runtime_suspend(), if it's called at this
point, or even more likely the subsystem-level .suspend_noirq() that will be
run later may not play well with whatever the subsystem-level .runtime_suspend()
does.
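
To make the wakeup difference concrete, here is a rough userspace sketch,
not kernel code.  struct mock_wakeup and mock_should_arm_wakeup() are
hypothetical names; the two flags loosely correspond to what
device_can_wakeup() and device_may_wakeup() report in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Which kind of transition the (mock) subsystem callback is handling. */
enum suspend_kind { RUNTIME_SUSPEND, SYSTEM_SUSPEND };

/* Hypothetical per-device wakeup flags, for illustration only. */
struct mock_wakeup {
	bool can_wakeup;   /* hardware is capable of remote wakeup */
	bool user_enabled; /* user space asked for wakeup from system sleep */
};

/*
 * Decide whether remote wakeup should be armed before powering down.
 * Runtime PM arms every capable device; system suspend follows the
 * policy chosen by user space.
 */
static bool mock_should_arm_wakeup(const struct mock_wakeup *w,
				   enum suspend_kind kind)
{
	assert(w);
	if (!w->can_wakeup)
		return false;
	if (kind == RUNTIME_SUSPEND)
		return true;
	return w->user_enabled;
}
```

For a capable device that user space has not enabled for wakeup, the two
paths disagree, which is why running the runtime path during system
suspend can leave the wrong wakeup setting behind.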

So, we seem to be in a "Catch 22" situation, in which the driver needs to run
its .runtime_suspend() code during system suspend, but it has to do it through
the subsystem-level .runtime_suspend() that cannot be run at that time.
Fortunately, however, there is a way out of it, because the driver has an
option to point its .suspend_noirq() callback to the same routine pointed to
by its .runtime_suspend() and get the subsystem-level .suspend_noirq() to
execute it.  The subsystem-level (e.g. PM domain) callbacks, in turn, may be
designed so that this always works.
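
As a minimal sketch of that arrangement (a userspace mock, not kernel
code): struct mock_dev_pm_ops only mimics three fields of the kernel's
struct dev_pm_ops, and struct foo_dev, the foo_*() routines and
mock_system_suspend() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state, for illustration only. */
struct foo_dev {
	bool quiesced;     /* no new transfers accepted */
	int  state_saves;  /* how many times the context was saved */
};

/* Mock of the three struct dev_pm_ops fields discussed here. */
struct mock_dev_pm_ops {
	int (*suspend)(struct foo_dev *dev);
	int (*suspend_noirq)(struct foo_dev *dev);
	int (*runtime_suspend)(struct foo_dev *dev);
};

/* "Save state": all that runtime suspend has to do. */
static int foo_save_state(struct foo_dev *dev)
{
	dev->state_saves++;
	return 0;
}

/* "Quiesce device": needed for system suspend only. */
static int foo_suspend(struct foo_dev *dev)
{
	dev->quiesced = true;
	return 0;
}

static const struct mock_dev_pm_ops foo_pm_ops = {
	.suspend         = foo_suspend,
	.suspend_noirq   = foo_save_state, /* same routine ... */
	.runtime_suspend = foo_save_state, /* ... as runtime suspend */
};

/* Stand-in for the subsystem-level system suspend sequence. */
static int mock_system_suspend(const struct mock_dev_pm_ops *ops,
			       struct foo_dev *dev)
{
	int ret;

	assert(ops && dev);
	ret = ops->suspend ? ops->suspend(dev) : 0;
	if (ret)
		return ret;
	return ops->suspend_noirq ? ops->suspend_noirq(dev) : 0;
}
```

The driver never calls pm_runtime_suspend() here; the shared routine is
reached only through the callback that the subsystem invokes for the
phase in question.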

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-10 23:14           ` [linux-pm] " Kevin Hilman
                               ` (2 preceding siblings ...)
  2011-06-11 23:13             ` Rafael J. Wysocki
@ 2011-06-11 23:13             ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-11 23:13 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Saturday, June 11, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> [...]
> 
> > Whether or not user space has disabled runtime PM _doesn't_ _matter_ for
> > system suspend, because _you_ _can't_ call pm_runtime_suspend(), or
> > pm_runtime_put_sync(), from a driver's .suspend() callback _anyway_.
> > The reason is that doing that would cause the subsystem's (or power
> > domain's in this case) .runtime_suspend() callback to be invoked and
> > that's incorrect.  Namely, it would require the subsystem (power domain)
> > to expect that its .runtime_suspend() would always be executed indirectly
> > as a result of calling its .suspend() (through the driver's callback)
> > and that expectation may or may not be met (depending on the driver's
> > design).
> 
> So here's an interesting scenario which I think triggers the same
> problem as you highlight above.
> 
> Assume you have a driver that's using runtime PM on a per-xfer basis.
> Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
> does a pm_runtime_put_sync() (for this example, it's important that it's
> a _put_sync()).  The _put_sync() might happen in an ISR,

It can't happen in an ISR, to be precise.

> or possibly in a thread waiting on a completion which is awoken by the ISR,
> etc. etc. (the runtime PM callbacks are IRQ safe, and device is marked as such.)
> 
> The driver is in the middle of an xfer and a system suspend request
> happens.
> 
> The driver's ->suspend() callback happens, and the driver
> 
> - enables/disables wakeups based on device_may_wakeup()
> - prevents future xfers
> - waits for current xfer to finish
> 
> As soon as the xfer finishes, the driver gets notified (completion,
> callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
> subsys->runtime_suspend --> driver->runtime_suspend.
> 
> While the driver's ->suspend() callback doesn't directly call
> pm_runtime_put_sync(), the act of waiting for the xfer to finish
> causes the subsystem/driver->runtime_suspend callbacks to be called
> during the subsystem/driver->suspend callback, which is the same problem
> as you highlight above.  

It's not exactly the same.  The difference is that you're talking about race
conditions between runtime PM and system suspend (I kind of know why I wanted
system suspend to block runtime PM now :-)) that may be prevented by
subsystem-level code from happening (by using locking and some flags etc.),
while that code cannot do much if its .runtime_suspend() callback, for example,
is executed directly from the system suspend code path.
 
> Based on your commit that removed incrementing the usage count across
> suspend[1], you mentioned "we can rely on subsystems and device drivers
> to avoid doing that unnecessarily."  The above example shows that this
> type of thing might not be that obvious to detect and thus avoid.
> 
> I suspect the solution to the above will be to add back the usage count
> increment across system suspend, but I'm hoping not.  IMO, it would be
> more flexible to allow the subsystems to decide.  The subsystems could
> provide locking (or manage dev->power.usage_count) themselves if
> necessary.  For example, leave it to the subsystem->prepare() to
> pm_runtime_get_noresume() if it wants to avoid the "nesting" of
> callbacks.

I agree.
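
The ->prepare() idea quoted above can be illustrated with a toy
userspace model of the usage counter.  This is only a sketch under
simplifying assumptions: struct mock_dev and the mock_*() helpers are
hypothetical, and the real pm_runtime_put_sync() goes through the idle
callback rather than suspending directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the runtime PM usage counter; not struct device. */
struct mock_dev {
	int  usage_count;
	bool runtime_suspended;
	int  runtime_suspend_calls;
};

/* Models pm_runtime_get_noresume(): bump the counter, nothing else. */
static void mock_get_noresume(struct mock_dev *dev)
{
	dev->usage_count++;
}

/* Models pm_runtime_get_sync(): bump the counter and resume. */
static void mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	dev->runtime_suspended = false;
}

/* Models pm_runtime_put_sync(): suspend when the last reference goes. */
static void mock_put_sync(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0) {
		dev->runtime_suspend_calls++;
		dev->runtime_suspended = true;
	}
}

/* Hypothetical subsystem ->prepare(): pin the device during suspend. */
static void mock_subsys_prepare(struct mock_dev *dev)
{
	mock_get_noresume(dev);
}

/* Hypothetical subsystem ->complete(): drop the pin again. */
static void mock_subsys_complete(struct mock_dev *dev)
{
	mock_put_sync(dev);
}
```

With the pin taken in ->prepare(), the put that ends an in-flight xfer
during ->suspend() no longer drops the counter to zero, so the runtime
suspend callback is not nested inside the system suspend callback.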

> A related question: does the pm_wq need to be freezable?  From
> Documentation/power/runtime_pm.txt:
> 
> * The power management workqueue pm_wq in which bus types and device drivers can
>   put their PM-related work items.  It is strongly recommended that pm_wq be
>   used for queuing all work items related to run-time PM, because this allows
>   them to be synchronized with system-wide power transitions (suspend to RAM,
>   hibernation and resume from system sleep states).  pm_wq is declared in
>   include/linux/pm_runtime.h and defined in kernel/power/main.c.
> 
> Is "synchronized with system-wide power transitions" correct here?
> Rather than synchronize, using a freezable workqueue actually _prevents_
> runtime PM events (at least async ones.)
> 
> Again, proper locking (or management of dev->power.usage_count) at the
> subsystem level would get you the same effect, but still leave
> flexibility to the subsystem/pwr_domain layer.

No, please.

The problem here is that I don't want runtime PM stuff to be called during
the "noirq" stages of system suspend and resume which the freezing of the
workqueue takes care of nicely.


> P.S. the commit below[1] removed the usage count increment/decrement
>      across system suspend/resume, but Documentation/power/runtime_pm.txt 
>      still refers to it.   Patch below[2] removes it, assuming you're
>      not planning on adding it back.  ;)

No, I'm not.  In fact, I'm going to apply your patch. :-)

Thanks,
Rafael


> [1]
> commit e8665002477f0278f84f898145b1f141ba26ee26
> Author: Rafael J. Wysocki <rjw@sisk.pl>
> Date:   Sat Feb 12 01:42:41 2011 +0100
> 
>     PM: Allow pm_runtime_suspend() to succeed during system suspend
>     
>     The dpm_prepare() function increments the runtime PM reference
>     counters of all devices to prevent pm_runtime_suspend() from
>     executing subsystem-level callbacks.  However, this was supposed to
>     guard against a specific race condition that cannot happen, because
>     the power management workqueue is freezable, so pm_runtime_suspend()
>     can only be called synchronously during system suspend and we can
>     rely on subsystems and device drivers to avoid doing that
>     unnecessarily.
>     
>     Make dpm_prepare() drop the runtime PM reference to each device
>     after making sure that runtime resume is not pending for it.
>     
>     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
>     Acked-by: Kevin Hilman <khilman@ti.com>
> 
> [2]
> From 8968e3e41d785e7e5ce7584d64f6a55b303e7060 Mon Sep 17 00:00:00 2001
> From: Kevin Hilman <khilman@ti.com>
> Date: Fri, 10 Jun 2011 16:05:51 -0700
> Subject: [PATCH] PM / Runtime: update doc: usage count no longer incremented across system PM
> 
> commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow
> pm_runtime_suspend() to succeed during system suspend) removed usage
> count increment across system PM.
> 
> Update doc to reflect this.
> 
> Signed-off-by: Kevin Hilman <khilman@ti.com>
> ---
> Applies on v3.0-rc2
> 
>  Documentation/power/runtime_pm.txt |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
> index 654097b..22accb3 100644
> --- a/Documentation/power/runtime_pm.txt
> +++ b/Documentation/power/runtime_pm.txt
> @@ -566,11 +566,6 @@ to do this is:
>  	pm_runtime_set_active(dev);
>  	pm_runtime_enable(dev);
>  
> -The PM core always increments the run-time usage counter before calling the
> -->prepare() callback and decrements it after calling the ->complete() callback.
> -Hence disabling run-time PM temporarily like this will not cause any run-time
> -suspend callbacks to be lost.
> -
>  7. Generic subsystem callbacks
>  
>  Subsystems may wish to conserve code space by using the set of generic power
> 

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-11 22:46                   ` Rafael J. Wysocki
  2011-06-12 15:59                     ` Alan Stern
@ 2011-06-12 15:59                     ` Alan Stern
  2011-06-15 21:54                     ` Kevin Hilman
  2011-06-15 21:54                     ` [linux-pm] " Kevin Hilman
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-12 15:59 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

On Sun, 12 Jun 2011, Rafael J. Wysocki wrote:

> > Basically, I think what Rafael was saying before referred to the 
> > general case, where you don't know anything about the subsystem and 
> > can't afford to make assumptions.  But in the real world you'll be 
> > writing a driver for a particular subsystem and you'll know how that 
> > subsystem works.  If the subsystem permits runtime PM calls to be 
> > nested within the system PM routines, feel free to go ahead and use 
> > them.
> 
> But then we get the problem that user space may echo "on" to the
> device's "control" file in sysfs and the whole clever plan basically goes
> south.

That would indeed be a problem if we used pm_runtime_suspend() in the
system suspend code.  But it doesn't prevent us from calling
pm_runtime_resume() during system suspend.
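
The asymmetry can be sketched with a small userspace model.  This is not
kernel code: struct model_dev and the model_*() helpers are hypothetical
stand-ins, and the assumption is that writing "on" to the "control" file
behaves like pm_runtime_forbid(), i.e. it takes a usage count reference
and brings the device back to full power.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of one device's runtime PM state. */
struct model_dev {
	int usage_count;    /* stands in for dev->power.usage_count */
	bool runtime_auto;  /* false once "on" was written to power/control */
	bool suspended;     /* true while the device is runtime suspended */
};

/* Models writing "on" to power/control: pin the device at full power. */
static void model_forbid(struct model_dev *dev)
{
	if (!dev->runtime_auto)
		return;
	dev->runtime_auto = false;
	dev->usage_count++;
	dev->suspended = false;	/* forbidding also resumes the device */
}

/* Models a synchronous suspend request: refused while a reference is held. */
static int model_runtime_suspend(struct model_dev *dev)
{
	assert(dev->usage_count >= 0);
	if (dev->usage_count > 0)
		return -EAGAIN;
	dev->suspended = true;
	return 0;
}

/* Models a synchronous resume request: the usage count never blocks it. */
static int model_runtime_resume(struct model_dev *dev)
{
	dev->suspended = false;
	return 0;
}
```

In this model a suspend request made after model_forbid() fails with
-EAGAIN, while a resume request still succeeds, which is why only the
suspend direction is affected by user space.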

> Moreover, on some systems devices will belong to PM domains and their
> drivers may potentially be used with different PM domains on different
> platforms.  This means that drivers really should not make any assumptions
> about whether or not they can use runtime PM in their system suspend/resume
> routines.  They can't.

Yes, clearly one mustn't make assumptions about unknown subsystems or
PM domains.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-11 22:46                   ` Rafael J. Wysocki
@ 2011-06-12 15:59                     ` Alan Stern
  2011-06-12 18:27                       ` Rafael J. Wysocki
  2011-06-12 18:27                       ` [linux-pm] " Rafael J. Wysocki
  2011-06-12 15:59                     ` Alan Stern
                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-12 15:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Kevin Hilman, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

On Sun, 12 Jun 2011, Rafael J. Wysocki wrote:

> > Basically, I think what Rafael was saying before referred to the 
> > general case, where you don't know anything about the subsystem and 
> > can't afford to make assumptions.  But in the real world you'll be 
> > writing a driver for a particular subsystem and you'll know how that 
> > subsystem works.  If the subsystem permits runtime PM calls to be 
> > nested within the system PM routines, feel free to go ahead and use 
> > them.
> 
> But then we get the problem that user space may echo "on" to the
> device's "control" file in sysfs and the whole clever plan basically goes
> south.

That would indeed be a problem if we used pm_runtime_suspend() in the
system suspend code.  But it doesn't prevent us from calling
pm_runtime_resume() during system suspend.

> Moreover, on some systems devices will belong to PM domains and their
> drivers may potentially be used with different PM domains on different
> platforms.  This means that drivers really should not make any assumptions
> about whether or not they can use runtime PM in their system suspend/resume
> routines.  They can't.

Yes, clearly one mustn't make assumptions about unknown subsystems or
PM domains.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-12 15:59                     ` Alan Stern
@ 2011-06-12 18:27                       ` Rafael J. Wysocki
  2011-06-12 18:27                       ` [linux-pm] " Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-12 18:27 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Sunday, June 12, 2011, Alan Stern wrote:
> On Sun, 12 Jun 2011, Rafael J. Wysocki wrote:
> 
> > > Basically, I think what Rafael was saying before referred to the 
> > > general case, where you don't know anything about the subsystem and 
> > > can't afford to make assumptions.  But in the real world you'll be 
> > > writing a driver for a particular subsystem and you'll know how that 
> > > subsystem works.  If the subsystem permits runtime PM calls to be 
> > > nested within the system PM routines, feel free to go ahead and use 
> > > them.
> > 
> > But then we get the problem that user space may echo "on" to the
> > device's "control" file in sysfs and the whole clever plan basically goes
> > south.
> 
> That would indeed be a problem if we used pm_runtime_suspend() in the
> system suspend code.  But it doesn't prevent us from calling
> pm_runtime_resume() during system suspend.

I agree.  Moreover, calling pm_runtime_resume() from a .prepare() callback
is actually fine, because it makes .runtime_resume() run before .suspend()
and .suspend_noirq().  But if .runtime_resume() is run between .suspend()
and .suspend_noirq(), that may lead to some trouble, again, depending on
the subsystem (or PM domain) involved.

Also, if a subsystem calls pm_runtime_resume() in one of its system suspend
callbacks (e.g. .prepare()), that's pretty much OK, because the subsystem
is basically responsible for making things work at this point.  However, if
a driver does that, it's kind of like saying "here I'm smarter than the
subsystem and I know better".

Unfortunately, platform drivers deal with a dumb subsystem, so they have
the right to think so, but that's not the case in general.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-12 15:59                     ` Alan Stern
  2011-06-12 18:27                       ` Rafael J. Wysocki
@ 2011-06-12 18:27                       ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-12 18:27 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kevin Hilman, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

On Sunday, June 12, 2011, Alan Stern wrote:
> On Sun, 12 Jun 2011, Rafael J. Wysocki wrote:
> 
> > > Basically, I think what Rafael was saying before referred to the 
> > > general case, where you don't know anything about the subsystem and 
> > > can't afford to make assumptions.  But in the real world you'll be 
> > > writing a driver for a particular subsystem and you'll know how that 
> > > subsystem works.  If the subsystem permits runtime PM calls to be 
> > > nested within the system PM routines, feel free to go ahead and use 
> > > them.
> > 
> > But then we get the problem that user space may echo "on" to the
> > device's "control" file in sysfs and the whole clever plan basically goes
> > south.
> 
> That would indeed be a problem if we used pm_runtime_suspend() in the
> system suspend code.  But it doesn't prevent us from calling
> pm_runtime_resume() during system suspend.

I agree.  Moreover, calling pm_runtime_resume() from a .prepare() callback
is actually fine, because it makes .runtime_resume() run before .suspend()
and .suspend_noirq().  But if .runtime_resume() is run between .suspend()
and .suspend_noirq(), that may lead to some trouble, again, depending on
the subsystem (or PM domain) involved.

Also, if a subsystem calls pm_runtime_resume() in one of its system suspend
callbacks (e.g. .prepare()), that's pretty much OK, because the subsystem
is basically responsible for making things work at this point.  However, if
a driver does that, it's kind of like saying "here I'm smarter than the
subsystem and I know better".

Unfortunately, platform drivers deal with a dumb subsystem, so they have
the right to think so, but that's not the case in general.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-11 20:56                                     ` Rafael J. Wysocki
  2011-06-13 12:22                                       ` [linux-pm] " Mark Brown
@ 2011-06-13 12:22                                       ` Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-13 12:22 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Sat, Jun 11, 2011 at 10:56:36PM +0200, Rafael J. Wysocki wrote:
> On Saturday, June 11, 2011, Mark Brown wrote:

> > that's a very large proportion of the embedded space.  It really feels
> > like we could be doing a better job for drivers using these buses,
> > there's a lot of similarities in what many of them need but I can never
> > find the time to get my head round it confidently enough to actually
> > propose anything.

> I agree.  That's one of the reasons why I introduced the struct dev_power_domain
> thing a while ago and the generic PM domains patchset I've just posted is a step
> in that direction.

Ah, right - I'd not been thinking of those in the context of
simplifications for basic devices, I'd been thinking of them as being
for CPUs with more complex power domains that need managing.  I'll keep
more of an eye on what's going on there.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-11 20:56                                     ` Rafael J. Wysocki
@ 2011-06-13 12:22                                       ` Mark Brown
  2011-06-13 12:22                                       ` Mark Brown
  1 sibling, 0 replies; 118+ messages in thread
From: Mark Brown @ 2011-06-13 12:22 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Alan Stern, linux-omap

On Sat, Jun 11, 2011 at 10:56:36PM +0200, Rafael J. Wysocki wrote:
> On Saturday, June 11, 2011, Mark Brown wrote:

> > that's a very large proportion of the embedded space.  It really feels
> > like we could be doing a better job for drivers using these buses,
> > there's a lot of similarities in what many of them need but I can never
> > find the time to get my head round it confidently enough to actually
> > propose anything.

> I agree.  That's one of the reasons why I introduced the struct dev_power_domain
> thing a while ago and the generic PM domains patchset I've just posted is a step
> in that direction.

Ah, right - I'd not been thinking of those in the context of
simplifications for basic devices, I'd been thinking of them as being
for CPUs with more complex power domains that need managing.  I'll keep
more of an eye on what's going on there.

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-11 22:46                   ` Rafael J. Wysocki
  2011-06-12 15:59                     ` Alan Stern
  2011-06-12 15:59                     ` Alan Stern
@ 2011-06-15 21:54                     ` Kevin Hilman
  2011-06-15 21:54                     ` [linux-pm] " Kevin Hilman
  3 siblings, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-15 21:54 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday, June 11, 2011, Alan Stern wrote:
>> On Fri, 10 Jun 2011, Kevin Hilman wrote:
>> 
>> > Alan Stern <stern@rowland.harvard.edu> writes:
>> > 
>> > [...]
>> 
>> > > If the wakeup setting is not correct, it has to be changed.  That 
>> > > often implies going back to full power in order to change the 
>> > > wakeup setting, then going to low power again.
>> > 
>> > OK, but how should this be implemented?  
>> > 
>> > If the device is runtime suspended at system suspend time, it implies
>> > that somewhere in the system suspend path, the device has to be powered
>> > on and enabled (a.k.a. runtime resumed.)
>> > 
>> > From a driver writer's perspective, doing a pm_runtime_get_sync() would
>> > be the obvious choice, but that causes nesting of ->runtime_resume
>> > callbacks within ->suspend callbacks which is apparently forbidden (or
>> > rather strongly recommended against :)
>> > 
>> > Now, assuming the driver's suspend can't do a pm_runtime_get()...
>> > 
>> > In order to power on & enable the device, the driver has to essentially
>> > duplicate everything that would be done by a runtime resume.
>> 
>> Again, this depends on the subsystem and the driver.  For example, the
>> USB subsystem does call pm_runtime_resume() in order to bring a device
>> back to full power if the wakeup setting needs to be changed.  This is
>> done in the subsystem code, and the subsystem is designed to allow it.
>> 
>> (Actually, it could be improved.  In theory the driver doesn't need to
>> be involved at all; a USB device's wakeup setting can be changed purely
>> by the subsystem.  Nevertheless, the pm_runtime_resume call does wake
>> up the driver, which then needs to be quiesced again shortly thereafter
>> -- overall a waste of time.  This was the easiest approach.)
>> 
>> > The problem comes because this work is shared between the driver and the
>> > subsystem.  IOW, it's the driver's ->suspend() callback that decides
>> > whether or not the device needs to be powered-on/enabled (e.g. to
>> > enable/disable wakeups), but it might be the subsystem that actually
>> > does the magic_device_set_full_power(), magic_device_enable().
>> > 
>> > So once the driver's ->suspend() realizes it needs to power on & enable
>> > the device, it has no way to tell the subsystem to do so, wait for it to
>> > happen, and then enable/disable its wakeups.
>> 
>> Then the subsystem should _provide_ a way, if that's how you decide to
>> handle things.
>> 
>> > Maybe I'm being really dense, really blind, or really stubborn (or all
>> > three), but it seems to me that using runtime PM calls to implement
>> > these things would be the most obvious and the most readable.
>> 
>> Have you tried actually doing it in a situation where you control both
>> the driver and the subsystem?
>> 
>> Basically, I think what Rafael was saying before referred to the 
>> general case, where you don't know anything about the subsystem and 
>> can't afford to make assumptions.  But in the real world you'll be 
>> writing a driver for a particular subsystem and you'll know how that 
>> subsystem works.  If the subsystem permits runtime PM calls to be 
>> nested within the system PM routines, feel free to go ahead and use 
>> them.
>
> But then we get the problem that user space may echo "on" to the
> device's "control" file in sysfs and the whole clever plan basically goes
> south.
>
> Moreover, on some systems devices will belong to PM domains and their
> drivers may potentially be used with different PM domains on different
> platforms.  This means that drivers really should not make any assumptions
> about whether or not they can use runtime PM in their system suspend/resume
> routines.  They can't.

Sure, but it's easy enough for subsystems that need protection to add
it.  Why not just better document that driver & subsystem runtime PM
callbacks *could* be called during a system suspend (and same for
resume.)  Any subsystems that want/need protection can prevent nesting
simply with pm_runtime_get_noresume() and _put_noidle().
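
A minimal userspace sketch of that protection, assuming the subsystem
takes the reference in its ->prepare() and drops it in ->complete().
The model_*() names and struct model_dev are hypothetical; they only
mimic the usage count behaviour of pm_runtime_get_noresume() and
pm_runtime_put_noidle().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of one device's runtime PM state. */
struct model_dev {
	int usage_count;  /* stands in for dev->power.usage_count */
	bool suspended;   /* true while the device is runtime suspended */
};

/* Mimics pm_runtime_get_noresume(): take a reference, do not resume. */
static void model_get_noresume(struct model_dev *dev)
{
	dev->usage_count++;
}

/* Mimics pm_runtime_put_noidle(): drop the reference, no idle check. */
static void model_put_noidle(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* A suspend request is refused while any reference is held. */
static int model_runtime_suspend(struct model_dev *dev)
{
	if (dev->usage_count > 0)
		return -EAGAIN;
	dev->suspended = true;
	return 0;
}

/* Subsystem ->prepare(): block nested runtime suspend from here on. */
static int model_subsys_prepare(struct model_dev *dev)
{
	model_get_noresume(dev);
	return 0;
}

/* Subsystem ->complete(): allow runtime suspend again. */
static void model_subsys_complete(struct model_dev *dev)
{
	model_put_noidle(dev);
}
```

Between the two callbacks any suspend request in the model returns
-EAGAIN, so a subsystem that cannot tolerate nesting opts out without
the driver having to know.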

As I mentioned earlier in the thread, this can already happen today
without .suspend() callbacks directly calling pm_runtime_suspend()
(e.g. driver xfer finishes and does pm_runtime_put_sync() anytime after
system suspend has started.)

> Now, Kevin, I think that the problem you really want to address is this:
> Suppose a driver needs to do one thing in its .runtime_suspend() callback
> (e.g. "save state") and it wants to do two things in its .suspend()
> callback (e.g. "quiesce device" and "save state").  Then, it seems, the
> simplest approach would be to call its .runtime_suspend() routine from
> its .suspend() routine (after doing the "quiesce device" thing).

Partially, yes.  But I'm not primarily concerned about the callbacks.
Many of our simple drivers don't even need runtime PM callbacks
(e.g. state is saved using shadow regs, or device is re-init'd for
every xfer etc.)

More important to me is how driver writers for embedded devices think
about PM for embedded systems.  IMO, driver writers should think
primarily in terms of runtime PM, and use that as the primary API for
all driver PM.

From my POV, system PM for embedded devices is just a special case of
runtime PM.  From a device driver perspective, system PM is just runtime
PM where the "idleness" was forced and only a subset of possible wakeup
sources are enabled.   I think this runtime-PM-centric view of the world
is maybe where our differences of opinion are coming from.

So with that perspective, I'd like the code to reflect a
runtime-PM-centric view as well.  The development effort is primarily
focused on implementing efficient runtime PM for an _active_ system.
When this is working, implementing system PM is easy: all that is needed
is to enable/disable relevant wakeups and force the device to idle.
This allows runtime PM to trigger, and the device is suspended.

> So far, so good, but suppose there's a subsystem, different from the platform
> bus type, or a PM domain such that it's not sufficient to call the driver's
> .runtime_suspend() alone, because the subsystem-level .runtime_suspend() does
> something that's necessary for "really suspending" the device.  

Yes, for OMAP, the "really suspending" work is done by the subsystem.

> Then, apparently, one can simply call pm_runtime_suspend() from the
> driver's .suspend() callback and that will take care of running the
> subsystem-level .runtime_suspend() too.

Exactly.

> Unfortunately, the problem with subsystem-level PM callbacks is that, in
> general, the subsystem-level .runtime_suspend() needs to do something slightly
> different than the subsystem-level system suspend callbacks.  The reason why is,
> more or less, wakeup (plus the fact that hibernate callbacks need not power
> down things, which is a detail and I'll ignore it from now on).  More precisely,
> the set of wakeup devices for system suspend is determined by user space, while
> for runtime PM all devices that can do remote wakeup should be set up to do it.
> That's why, in general, the subsystem-level .runtime_suspend() may do wrong
> things when it's invoked via the driver's .suspend() routine, during system
> suspend.  

I still don't quite see what runtime_suspend() would do wrong in terms
of wakeups.  Do you mean that subsys->runtime_suspend() might enable
wakeups even though subsys->suspend() has just disabled them?  If so, it
should be the responsibility of the subsystem to manage this correctly.  

It would be pretty straightforward for the subsystem to know if its
.runtime_suspend() is being called during system suspend (e.g. flag set
during ->prepare, etc.) and not mess with wakeup settings.
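
Roughly, as a userspace sketch: struct model_subsys_dev, its fields and
the model_subsys_*() functions are all hypothetical, and the wakeup
policy shown (runtime PM arms wakeup unconditionally, system suspend
follows the user space setting) is taken from the description quoted
above rather than from any particular subsystem.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state kept by a subsystem or PM domain. */
struct model_subsys_dev {
	bool in_system_suspend;  /* set in ->prepare(), cleared in ->complete() */
	bool may_wakeup;         /* user space policy for system suspend */
	bool wakeup_enabled;     /* what is currently programmed in "hardware" */
	bool low_power;          /* true once the device has been idled */
};

static int model_subsys_prepare(struct model_subsys_dev *d)
{
	d->in_system_suspend = true;
	return 0;
}

static void model_subsys_complete(struct model_subsys_dev *d)
{
	assert(d->in_system_suspend);
	d->in_system_suspend = false;
}

/* System suspend: program wakeup according to the user space policy. */
static int model_subsys_suspend(struct model_subsys_dev *d)
{
	d->wakeup_enabled = d->may_wakeup;
	return 0;
}

/*
 * Runtime suspend: normally arm wakeup for every capable device, but
 * leave the setting alone when nested inside a system suspend.
 */
static int model_subsys_runtime_suspend(struct model_subsys_dev *d)
{
	if (!d->in_system_suspend)
		d->wakeup_enabled = true;
	d->low_power = true;
	return 0;
}
```

With the flag set, a nested model_subsys_runtime_suspend() still idles
the device but keeps whatever wakeup setting the system suspend path
chose.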

At least on OMAP, this isn't an issue since the runtime PM path doesn't
touch wakeups at all.  Wakeup-capable devices have wakeups enabled
during device init, and remain wakeup capable during runtime PM.
Neither the driver nor subsystem runtime PM callbacks do anything for
wakeups.  Only the driver (or possibly subsystem) .suspend() and
.resume() do any changing of wakeup settings.

> Apart from this, of course, the subsystem-level .suspend() that
> has invoked the driver's .suspend() might already do something that won't
> play well with the subsystem-level .runtime_suspend(), if it's called at this
> point, or even more likely the subsystem-level .suspend_noirq() that will be
> run later may not play well with whatever the subsystem-level .runtime_suspend()
> does.

Do you have something in mind about how they wouldn't play well
together?  

I'm starting from the assumption that subsystems need to be aware of
potential nesting of callbacks (which can happen today), and either take
care of it or prevent it.

If the HW really needs different handling for system suspend and runtime
PM, then I see your point, and the subsystem is free to treat them more
independently, and even to prevent them from nesting.  My point is that
for embedded systems, there is no difference at the HW other than wakeup
programming, and wakeups are easy enough to handle.

Yes, all of this means that the subsystem has to be written with this
runtime-PM-centric view in mind, but I am persuaded that doing so is the
best model for the PM domains on embedded devices.

Put differently, with a runtime-PM-centric view of the world, the
subsystem .suspend really has nothing to do, so it is rather easy for it
to play well with .runtime_suspend().  The driver .suspend will
enable/disable wakeups, quiesce the HW, and as a result a runtime PM
transition will occur.   Then there's nothing left for the subsystem
.suspend to do.

Maybe it helps to show the flow of how I think this would work for a
typical device during system suspend:

subsys->suspend()
    driver->suspend()
        /* check device_may_wakeup(), enable/disable wakeups */
        /* quiesce HW, triggers runtime PM _put() or _suspend() */
        subsys->runtime_suspend()
            driver->runtime_suspend()
                driver_save_context()
            /* subsys idles HW, sets low-power state */
        /* nothing left for driver to do */
    /* nothing left for subsys to do */
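
The same nesting, written as a userspace sketch that records the order
in which the callbacks run.  Every function here is a hypothetical
stand-in; in particular the direct call from drv_suspend() to
subsys_runtime_suspend() replaces what would really be a runtime PM
_put() or _suspend() going through the PM core.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Records the callback order as "a > b > c". */
static char trace[256];

static void log_step(const char *step)
{
	size_t used = strlen(trace);

	assert(used < sizeof(trace));
	snprintf(trace + used, sizeof(trace) - used, "%s%s",
		 used ? " > " : "", step);
}

static int drv_runtime_suspend(void)
{
	log_step("driver.runtime_suspend");
	/* driver saves its context here */
	return 0;
}

static int subsys_runtime_suspend(void)
{
	int ret;

	log_step("subsys.runtime_suspend");
	ret = drv_runtime_suspend();
	/* subsystem idles the HW and sets the low-power state here */
	return ret;
}

static int drv_suspend(void)
{
	log_step("driver.suspend");
	/* check wakeup policy, enable/disable wakeups, quiesce the HW */
	return subsys_runtime_suspend();	/* stand-in for the runtime PM put */
}

static int subsys_suspend(void)
{
	log_step("subsys.suspend");
	return drv_suspend();
	/* nothing left for the subsystem to do afterwards */
}
```

Calling subsys_suspend() once leaves the four callbacks in trace in the
nested order shown in the diagram.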

> So, we seem to be in a "Catch 22" situation, in which the driver needs to run
> its .runtime_suspend() code during system suspend, but it has to do it through
> the subsystem-level .runtime_suspend() that cannot be run at that time.
> Fortunately, however, there is a way out of it, because the driver has an
> option to point its .suspend_noirq() callback to the same routine pointed to
> by its .runtime_suspend() and get the subsystem-level .suspend_noirq() to
> execute it.  The subsystem-level (e.g. PM domain) callbacks, in turn, may be
> designed so that this always works.

I don't follow this part.

So you're not OK with running the subsystem or driver .runtime_suspend()
during .suspend(), but it is OK during .suspend_noirq()? 

Also, where/when would the subsystem .runtime_suspend() be called?

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-11 22:46                   ` Rafael J. Wysocki
                                       ` (2 preceding siblings ...)
  2011-06-15 21:54                     ` Kevin Hilman
@ 2011-06-15 21:54                     ` Kevin Hilman
  2011-06-16  0:01                       ` Rafael J. Wysocki
  2011-06-16  0:01                       ` Rafael J. Wysocki
  3 siblings, 2 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-15 21:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Saturday, June 11, 2011, Alan Stern wrote:
>> On Fri, 10 Jun 2011, Kevin Hilman wrote:
>> 
>> > Alan Stern <stern@rowland.harvard.edu> writes:
>> > 
>> > [...]
>> 
>> > > If the wakeup setting is not correct, it has to be changed.  That 
>> > > often implies going back to full power in order to change the 
>> > > wakeup setting, then going to low power again.
>> > 
>> > OK, but how should this be implemented?  
>> > 
>> > If the device is runtime suspended at system suspend time, it implies
>> > that somewhere in the system suspend path, the device has to be powered
>> > on and enabled (a.k.a. runtime resumed.)
>> > 
>> > From a driver writer's perspective, doing a pm_runtime_get_sync() would
>> > be the obvious choice, but that causes nesting of ->runtime_resume
>> > callbacks within ->suspend callbacks which is apparently forbidden (or
>> > rather strongly recommended against :)
>> > 
>> > Now, assuming the driver's suspend can't do a pm_runtime_get()...
>> > 
>> > In order to power on & enable the device, the driver has to essentially
>> > duplicate everything that would be done by a runtime resume.
>> 
>> Again, this depends on the subsystem and the driver.  For example, the
>> USB subsystem does call pm_runtime_resume() in order to bring a device
>> back to full power if the wakeup setting needs to be changed.  This is
>> done in the subsystem code, and the subsystem is designed to allow it.
>> 
>> (Actually, it could be improved.  In theory the driver doesn't need to
>> be involved at all; a USB device's wakeup setting can be changed purely
>> by the subsystem.  Nevertheless, the pm_runtime_resume call does wake
>> up the driver, which then needs to be quiesced again shortly thereafter
>> -- overall a waste of time.  This was the easiest approach.)
>> 
>> > The problem comes because this work is shared between the driver and the
>> > subsystem.  IOW, it's the driver's ->suspend() callback that decides
>> > whether or not the device needs to be powered-on/enabled (e.g. to
>> > enable/disable wakeups), but it might be the subsystem that actually
>> > does the magic_device_set_full_power(), magic_device_enable().
>> > 
>> > So once the driver's ->suspend() realizes it needs to power on & enable
>> > the device, it has no way to tell the subsystem to do so, wait for it to
>> > happen, and then enable/disable its wakeups.
>> 
>> Then the subsystem should _provide_ a way, if that's how you decide to
>> handle things.
>> 
>> > Maybe I'm being really dense, really blind, or really stubborn (or all
>> > three), but it seems to me that using runtime PM calls to implement
>> > these things would be the most obvious and the most readable.
>> 
>> Have you tried actually doing it in a situation where you control both
>> the driver and the subsystem?
>> 
>> Basically, I think what Rafael was saying before referred to the 
>> general case, where you don't know anything about the subsystem and 
>> can't afford to make assumptions.  But in the real world you'll be 
>> writing a driver for a particular subsystem and you'll know how that 
>> subsystem works.  If the subsystem permits runtime PM calls to be 
>> nested within the system PM routines, feel free to go ahead and use 
>> them.
>
> But then we get the problem that user space may echo "on" to the
> device's "control" file in sysfs and the whole clever plan basically goes
> south.
>
> Moreover, on some systems devices will belong to PM domains and their
> drivers may potentially be used with different PM domains on different
> platforms.  This means that drivers really should not make any assumptions
> about whether or not they can use runtime PM in their system suspend/resume
> routines.  They can't.

Sure, but it's easy enough for subsystems that need protection to add
it.  Why not just better document that driver & subsystem runtime PM
callbacks *could* be called during a system suspend (and same for
resume.)  Any subsystems that want/need protection can prevent nesting
simply with pm_runtime_get_noresume() and _put_noidle().

As I mentioned earlier in the thread, this can already happen today
without .suspend() callbacks directly calling pm_runtime_suspend()
(e.g. driver xfer finishes and does pm_runtime_put_sync() anytime after
system suspend has started.)

> Now, Kevin, I think that the problem you really want to address is this:
> Suppose a driver needs to do one thing in its .runtime_suspend() callback
> (e.g. "save state") and it wants to do two things in its .suspend()
> callback (e.g. "quiesce device" and "save state").  Then, it seems, the
> simplest approach would be to call its .runtime_suspend() routine from
> its .suspend() routine (after doing the "quiesce device" thing).

Partially, yes.  But I'm not primarily concerned about the callbacks.
Many of our simple drivers don't even need runtime PM callbacks
(e.g. state is saved using shadow regs, or device is re-init'd for
every xfer etc.)

More important to me is how driver writers for embedded devices think
about PM for embedded systems.  IMO, driver writers should think
primarily in terms of runtime PM, and use that as the primary API for
all driver PM.

From my POV, system PM for embedded devices is just a special case of
runtime PM.  From a device driver perspective, system PM is just runtime
PM where the "idleness" was forced and only a subset of possible wakeup
sources are enabled.   I think this runtime-PM-centric view of the world
is maybe where our differences of opinion are coming from.

So with that perspective, I'd like the code to reflect a
runtime-PM-centric view as well.  The development effort is primarily
focused on implementing efficient runtime PM for an _active_ system.
When this is working, implementing system PM is easy: all that is needed
is to enable/disable relevant wakeups and force the device to idle.
This allows runtime PM to trigger, and the device is suspended.

> So far, so good, but suppose there's a subsystem, different from the platform
> bus type, or a PM domain such that it's not sufficient to call the driver's
> .runtime_suspend() alone, because the subsystem-level .runtime_suspend() does
> something that's necessary for "really suspending" the device.  

Yes, for OMAP, the "really suspending" work is done by the subsystem.

> Then, apparently, one can simply call pm_runtime_suspend() from the
> driver's .suspend() callback and that will take care of running the
> subsystem-level .runtime_suspend() too.

Exactly.

> Unfortunately, the problem with subsystem-level PM callbacks is that, in
> general, the subsystem-level .runtime_suspend() needs to do something slightly
> different than the subsystem-level system suspend callbacks.  The reason why is,
> more or less, wakeup (plus the fact that hibernate callbacks need not power
> down things, which is a detail and I'll ignore it from now on).  More precisely,
> the set of wakeup devices for system suspend is determined by user space, while
> for runtime PM all devices that can do remote wakeup should be set up to do it.
> That's why, in general, the subsystem-level .runtime_suspend() may do wrong
> things when it's invoked via the driver's .suspend() routine, during system
> suspend.  

I still don't quite see what runtime_suspend() would do wrong in terms
of wakeups.  Do you mean that subsys->runtime_suspend() might enable
wakeups even though subsys->suspend() has just disabled them?  If so, it
should be the responsibility of the subsystem to manage this correctly.  

It would be pretty straightforward for the subsystem to know if its
.runtime_suspend() is being called during system suspend (e.g. flag set
during ->prepare, etc.) and not mess with wakeup settings.

At least on OMAP, this isn't an issue since the runtime PM path doesn't
touch wakeups at all.  Wakeup-capable devices have wakeups enabled
during device init, and remain wakeup capable during runtime PM.
Neither the driver nor subsystem runtime PM callbacks do anything for
wakeups.  Only the driver (or possibly subsystem) .suspend() and
.resume() do any changing of wakeup settings.

> Apart from this, of course, the subsystem-level .suspend() that
> has invoked the driver's .suspend() might already do something that won't
> play well with the subsystem-level .runtime_suspend(), if it's called at this
> point, or even more likely the subsystem-level .suspend_noirq() that will be
> run later may not play well with whatever the subsystem-level .runtime_suspend()
> does.

Do you have something in mind about how they wouldn't play well
together?  

I'm starting from the assumption that subsystems need to be aware of
potential nesting of callbacks (which can happen today), and either take
care of it or prevent it.

If the HW really needs different handling for system suspend and runtime
PM, then I see your point, and the subsystem is free to treat them more
independently, and even to prevent them from nesting.  My point is that
for embedded systems, there is no difference at the HW other than wakeup
programming, and wakeups are easy enough to handle.

Yes, all of this means that the subsystem has to be written with this
runtime-PM-centric view in mind, but I am persuaded that doing so is the
best model for the PM domains on embedded devices.

Put differently, with a runtime-PM-centric view of the world, the
subsystem .suspend really has nothing to do, so it is rather easy for it
to play well with .runtime_suspend().  The driver .suspend will
enable/disable wakeups, quiesce the HW, and as a result a runtime PM
transition will occur.   Then there's nothing left for the subsystem
.suspend to do.

Maybe it helps to show the flow of how I think this would work for a
typical device during system suspend:

subsys->suspend()
    driver->suspend()
        /* check device_may_wakeup(), enable/disable wakeups */
        /* quiesce HW, triggers runtime PM _put() or _suspend() */
        subsys->runtime_suspend()
            driver->runtime_suspend()
                driver_save_context()
            /* subsys idles HW, sets low-power state */
        /* nothing left for driver to do */
    /* nothing left for subsys to do */

> So, we seem to be in a "Catch 22" situation, in which the driver needs to run
> its .runtime_suspend() code during system suspend, but it has to do it through
> the subsystem-level .runtime_suspend() that cannot be run at that time.
> Fortunately, however, there is a way out of it, because the driver has an
> option to point its .suspend_noirq() callback to the same routine pointed to
> by its .runtime_suspend() and get the subsystem-level .suspend_noirq() to
> execute it.  The subsystem-level (e.g. PM domain) callbacks, in turn, may be
> designed so that this always works.

I don't follow this part.

So you're not OK with running the subsystem or driver .runtime_suspend()
during .suspend(), but it is OK during .suspend_noirq()? 

Also, where/when would the subsystem .runtime_suspend() be called?

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-15 21:54                     ` [linux-pm] " Kevin Hilman
  2011-06-16  0:01                       ` Rafael J. Wysocki
@ 2011-06-16  0:01                       ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16  0:01 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Wednesday, June 15, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
...
> > Moreover, on some systems devices will belong to PM domains and their
> > drivers may potentially be used with different PM domains on different
> > platforms.  This means that drivers really should not make any assumptions
> > about whether or not they can use runtime PM in their system suspend/resume
> > routines.  They can't.
> 
> Sure, but it's easy enough for subsystems that need protection to add
> it.  Why not just better document that driver & subsystem runtime PM
> callbacks *could* be called during a system suspend (and same for
> resume.)

Because allowing that to happen was a mistake in the first place.

> Any subsystems that want/need protection can prevent nesting
> simply with pm_runtime_get_noresume() and _put_noidle().
> 
> As I mentioned earlier in the thread, this can already happen today
> without .suspend() callbacks directly calling pm_runtime_suspend()
> (e.g. driver xfer finishes and does pm_runtime_put_sync() anytime after
> system suspend has started.)

Which is wrong.  At least, as I said in a different thread, no runtime PM
stuff should be run after .suspend() has returned for the given device.
Otherwise it will violate some assumptions regarding the conditions in which
the _noirq() callbacks are run.

> > Now, Kevin, I think that the problem you really want to address is this:
> > Suppose a driver needs to do one thing in its .runtime_suspend() callback
> > (e.g. "save state") and it wants to do two things in its .suspend()
> > callback (e.g. "quiesce device" and "save state").  Then, it seems, the
> > simplest approach would be to call its .runtime_suspend() routine from
> > its .suspend() routine (after doing the "quiesce device" thing).
> 
> Partially, yes.  But I'm not primarily concerned about the callbacks.
> Many of our simple drivers don't even need runtime PM callbacks
> (e.g. state is saved using shadow regs, or device is re-init'd for
> every xfer etc.)
> 
> More important to me is how driver writers for embedded devices think
> about PM for embedded systems.  IMO, driver writers should think
> primarily in terms of runtime PM, and use that as the primary API for
> all driver PM. From my POV, system PM for embedded devices is just a
> special case of runtime PM.
>
> From a device driver perspective, system PM is just runtime
> PM where the "idleness" was forced and only a subset of possible wakeup
> sources are enabled.

Oh well, I wonder how much of a difference would make you think those things
are really different. ;-)

> I think this runtime-PM-centric view of the world
> is maybe where our differences of opinion are coming from.

Very much so.

> So with that perspective, I'd like the code to reflect a
> runtime-PM-centric view as well.

And I wouldn't.

> The development effort is primarily
> focused on implementing efficient runtime PM for an _active_ system.
> When this is working, implementing system PM is easy: all that is needed
> is to enable/disable relevant wakeups and force the device to idle.
> This allows runtime PM to trigger, and the device is suspended.

No, it doesn't.  What you're trying to do is to "manually" trigger runtime PM
when _you_ think it is suitable.

...
> 
> Maybe it helps to show the flow of how I think this would work for a
> typical device during system suspend:
> 
> subsys->suspend()
>     driver->suspend()
>         /* check device_may_wakeup(), enable/disable wakeups */
>         /* quiesce HW, triggers runtime PM _put() or _suspend() */
>         subsys->runtime_suspend()
>             driver->runtime_suspend()
>                 driver_save_context()
>             /* subsys idles HW, sets low-power state */
>         /* nothing left for driver to do */
>     /* nothing left for subsys to do */

Please.  Why do you want to use subsys->runtime_suspend() _directly_ instead of
implementing .suspend_noirq() in the subsystem and allowing the core to run it
for you?  Quite frankly, I don't get it.

> > So, we seem to be in a "Catch 22" situation, in which the driver needs to run
> > its .runtime_suspend() code during system suspend, but it has to do it through
> > the subsystem-level .runtime_suspend() that cannot be run at that time.
> > Fortunately, however, there is a way out of it, because the driver has an
> > option to point its .suspend_noirq() callback to the same routine pointed to
> > by its .runtime_suspend() and get the subsystem-level .suspend_noirq() to
> > execute it.  The subsystem-level (e.g. PM domain) callbacks, in turn, may be
> > designed so that this always works.
> 
> I don't follow this part.
> 
> So you're not OK with running the subsystem or driver .runtime_suspend()
> during .suspend(), but it is OK during .suspend_noirq()? 

No.  Please see above.

> Also, where/when would the subsystem .runtime_suspend() be called?

It won't be called during system suspend at all.  It doesn't have to be
called and it really shouldn't be called at that time.

Let me put it this way.  We have runtime PM callbacks and we have system suspend
callbacks _precisely_ because we want people to use the former for runtime PM
and the latter for system suspend.  We introduced different callback pointers
for the two things _on_ _purpose_ and I totally disagree with your trying to
play games to avoid using some of them.

I hope that's clear enough.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-15 21:54                     ` [linux-pm] " Kevin Hilman
@ 2011-06-16  0:01                       ` Rafael J. Wysocki
  2011-06-16  1:17                         ` Kevin Hilman
  2011-06-16  1:17                         ` Kevin Hilman
  2011-06-16  0:01                       ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16  0:01 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

On Wednesday, June 15, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
...
> > Moreover, on some systems devices will belong to PM domains and their
> > drivers may potentially be used with different PM domains on different
> > platforms.  This means that drivers really should not make any assumptions
> > about whether or not they can use runtime PM in their system suspend/resume
> > routines.  They can't.
> 
> Sure, but it's easy enough for subsystems that need protection to add
> it.  Why not just better document that driver & subsystem runtime PM
> callbacks *could* be called during a system suspend (and same for
> resume.)

Because allowing that to happen was a mistake in the first place.

> Any subsystems that want/need protection can prevent nesting
> simply with pm_runtime_get_noresume() and _put_noidle().
> 
> As I mentioned earlier in the thread, this can already happen today
> without .suspend() callbacks directly calling pm_runtime_suspend()
> (e.g. driver xfer finishes and does pm_runtime_put_sync() anytime after
> system suspend has started.)

Which is wrong.  At least, as I said in a different thread, no runtime PM
stuff should be run after .suspend() has returned for the given device.
Otherwise it will violate some assumptions regarding the conditions in which
the _noirq() callbacks are run.

> > Now, Kevin, I think that the problem you really want to address is this:
> > Suppose a driver needs to do one thing in its .runtime_suspend() callback
> > (e.g. "save state") and it wants to do two things in its .suspend()
> > callback (e.g. "quiesce device" and "save state").  Then, it seems, the
> > simplest approach would be to call its .runtime_suspend() routine from
> > its .suspend() routine (after doing the "quiesce device" thing).
> 
> Partially, yes.  But I'm not primarily concerned about the callbacks.
> Many of our simple drivers don't even need runtime PM callbacks
> (e.g. state is saved using shadow regs, or device is re-init'd for
> every xfer etc.)
> 
> More important to me is how driver writers for embedded devices think
> about PM for embedded systems.  IMO, driver writers should think
> primarily in terms of runtime PM, and use that as the primary API for
> all driver PM. From my POV, system PM for embedded devices is just a
> special case of runtime PM.
>
> From a device driver perspective, system PM is just runtime
> PM where the "idleness" was forced and only a subset of possible wakeup
> sources are enabled.

Oh well, I wonder how much of a difference would make you think those things
are really different. ;-)

> I think this runtime-PM-centric view of the world
> is maybe where our differences of opinion are coming from.

Very much so.

> So with that perspective, I'd like the code to reflect a
> runtime-PM-centric view as well.

And I wouldn't.

> The development effort is primarily
> focused on implementing efficient runtime PM for an _active_ system.
> When this is working, implementing system PM is easy: all that is needed
> is to enable/disable relevant wakeups and force the device to idle.
> This allows runtime PM to trigger, and the device is suspended.

No, it doesn't.  What you're trying to do is to "manually" trigger runtime PM
when _you_ think it is suitable.

...
> 
> Maybe it helps to show the flow of how I think this would work for a
> typical device during system suspend:
> 
> subsys->suspend()
>     driver->suspend()
>         /* check device_may_wakeup(), enable/disable wakeups */
>         /* quiesce HW, triggers runtime PM _put() or _suspend() */
>         subsys->runtime_suspend()
>             driver->runtime_suspend()
>                 driver_save_context()
>             /* subsys idles HW, sets low-power state */
>         /* nothing left for driver to do */
>     /* nothing left for subsys to do */

Please.  Why do you want to use subsys->runtime_suspend() _directly_ instead of
implementing .suspend_noirq() in the subsystem and allowing the core to run it
for you?  Quite frankly, I don't get it.

> > So, we seem to be in a "Catch 22" situation, in which the driver needs to run
> > its .runtime_suspend() code during system suspend, but it has to do it through
> > the subsystem-level .runtime_suspend() that cannot be run at that time.
> > Fortunately, however, there is a way out of it, because the driver has an
> > option to point its .suspend_noirq() callback to the same routine pointed to
> > by its .runtime_suspend() and get the subsystem-level .suspend_noirq() to
> > execute it.  The subsystem-level (e.g. PM domain) callbacks, in turn, may be
> > designed so that this always works.
> 
> I don't follow this part.
> 
> So you're not OK with running the subsystem or driver .runtime_suspend()
> during .suspend(), but it is OK during .suspend_noirq()? 

No.  Please see above.

> Also, where/when would the subsystem .runtime_suspend() be called?

It won't be called during system suspend at all.  It doesn't have to be
called and it really shouldn't be called at that time.

Let me put it this way.  We have runtime PM callbacks and we have system suspend
callbacks _precisely_ because we want people to use the former for runtime PM
and the latter for system suspend.  We introduced different callback pointers
for the two things _on_ _purpose_ and I totally disagree with your trying to
play games to avoid using some of them.

I hope that's clear enough.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-16  0:01                       ` Rafael J. Wysocki
  2011-06-16  1:17                         ` Kevin Hilman
@ 2011-06-16  1:17                         ` Kevin Hilman
  1 sibling, 0 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-16  1:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux-pm mailing list, linux-omap

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday, June 15, 2011, Kevin Hilman wrote:

[...]

>>
>> From a device driver perspective, system PM is just runtime
>> PM where the "idleness" was forced and only a subset of possible wakeup
>> sources are enabled.
>
> Oh well, I wonder how much of a difference would make you think those things
> are really different. ;-)

Seeing a description of the differences would help.  So far the list is
rather short: wakeups and forcibly quieting the hardware.

I guess I still don't see why system PM cannot be viewed as a special
case of runtime PM, so how about a specific question: From a device
driver perspective, how is system PM anything other than
manually/forcibly creating the right conditions for a runtime PM
transition to happen?

[...]

>> The development effort is primarily
>> focused on implementing efficient runtime PM for an _active_ system.
>> When this is working, implementing system PM is easy: all that is needed
>> is to enable/disable relevant wakeups and force the device to idle.
>> This allows runtime PM to trigger, and the device is suspended.
>
> No, it doesn't.  What you're trying to do is to "manually" trigger runtime PM

No.  

What I'm trying to do in .suspend() is create the conditions necessary
such that a runtime PM transition will occur *by itself*.  If the right
conditions exist (namely, idle HW, no pending activity, etc.) a runtime
PM transition will happen *on its own*, and will not need to be manually
triggered.  IOW, a runtime PM transition is a side effect of creating
the right idle conditions.

> when _you_ think is suitable.

No, only when a system suspend is requested by the user.

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-16  0:01                       ` Rafael J. Wysocki
@ 2011-06-16  1:17                         ` Kevin Hilman
  2011-06-16 14:27                           ` Alan Stern
                                             ` (3 more replies)
  2011-06-16  1:17                         ` Kevin Hilman
  1 sibling, 4 replies; 118+ messages in thread
From: Kevin Hilman @ 2011-06-16  1:17 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> On Wednesday, June 15, 2011, Kevin Hilman wrote:

[...]

>>
>> From a device driver perspective, system PM is just runtime
>> PM where the "idleness" was forced and only a subset of possible wakeup
>> sources are enabled.
>
> Oh well, I wonder how much of a difference would make you think those things
> are really different. ;-)

Seeing a description of the differences would help.  So far the list is
rather short: wakeups and forcibly quieting the hardware.

I guess I still don't see why system PM cannot be viewed as a special
case of runtime PM, so how about a specific question: From a device
driver perspective, how is system PM anything other than
manually/forcibly creating the right conditions for a runtime PM
transition to happen?

[...]

>> The development effort is primarily
>> focused on implementing efficient runtime PM for an _active_ system.
>> When this is working, implementing system PM is easy: all that is needed
>> is to enable/disable relevant wakeups and force the device to idle.
>> This allows runtime PM to trigger, and the device is suspended.
>
> No, it doesn't.  What you're trying to do is to "manually" trigger runtime PM

No.  

What I'm trying to do in .suspend() is create the conditions necessary
such that a runtime PM transition will occur *by itself*.  If the right
conditions exist (namely, idle HW, no pending activity, etc.) a runtime
PM transition will happen *on its own*, and will not need to be manually
triggered.  IOW, a runtime PM transition is a side effect of creating
the right idle conditions.

> when _you_ think is suitable.

No, only when a system suspend is requested by the user.

Kevin

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-16  1:17                         ` Kevin Hilman
@ 2011-06-16 14:27                           ` Alan Stern
  2011-06-16 14:27                           ` [linux-pm] " Alan Stern
                                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-16 14:27 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Wed, 15 Jun 2011, Kevin Hilman wrote:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> 
> [...]
> 
> >>
> >> From a device driver perspective, system PM is just runtime
> >> PM where the "idleness" was forced and only a subset of possible wakeup
> >> sources are enabled.
> >
> > Oh well, I wonder how much of a difference would make you think those things
> > are really different. ;-)
> 
> Seeing a description of the differences would help.  So far the list is
> rather short: wakeups and forcibly quieting the hardware.

Another difference is that the user can forbid runtime power management 
of any device through the power/control attribute, independently of 
system sleeps.

Yet another difference arises because during system PM, the PM
workqueue is frozen.  If a driver relies on asynchronous runtime PM
then nothing will happen.  This may not apply to you, but it applies to
plenty of other drivers.
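
To make the workqueue point concrete, here is a toy userspace model (not
kernel code; all names are made up for the illustration).  It assumes that an
asynchronous request is merely queued and only takes effect when the
workqueue runs, which is roughly the difference between pm_runtime_put() and
pm_runtime_put_sync():

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool suspended;
	bool suspend_queued;	/* an async suspend request is pending */
};

struct model_pm_wq {
	bool frozen;		/* true while system suspend is in progress */
};

/* Models an asynchronous request: nothing happens until the queue runs. */
void model_request_suspend_async(struct model_dev *d)
{
	d->suspend_queued = true;
}

/* Models the PM workqueue getting a chance to process pending work. */
void model_wq_run(const struct model_pm_wq *wq, struct model_dev *d)
{
	if (wq->frozen)
		return;		/* frozen: the request just sits there */

	if (d->suspend_queued) {
		d->suspend_queued = false;
		d->suspended = true;
	}
}

/* Models a synchronous request: it does not depend on the workqueue. */
void model_suspend_sync(struct model_dev *d)
{
	d->suspend_queued = false;
	d->suspended = true;
}
```

With the queue frozen, model_request_suspend_async() followed by
model_wq_run() leaves the device active; only the synchronous variant, or a
thawed queue, changes its state.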

> I guess I still don't see why system PM cannot be viewed as a special
> case of runtime PM, so how about a specific question: From a device
> driver perspective, how is system PM anything other than
> manually/forcibly creating the right conditions for a runtime PM
> transition to happen?

What you're missing is that runtime PM has two separate aspects: a 
hardware/power aspect and an administrative aspect.  In terms of 
hardware/power it is very similar to system PM, but in administrative 
terms it is quite different.

Another thing you need to realize: Rafael is open to the idea that
subsystems may be designed specifically to allow drivers to use runtime
PM during their ->suspend and ->resume callbacks.  However in the
period between ->suspend returning and ->resume being called, runtime
PM should _not_ be used.  In particular, this includes the times when
->suspend_noirq and ->resume_noirq are called -- and these are the
routines which are often expected to do the real work of setting the
device's power state.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-16  1:17                         ` Kevin Hilman
  2011-06-16 14:27                           ` Alan Stern
@ 2011-06-16 14:27                           ` Alan Stern
  2011-06-16 22:48                             ` Rafael J. Wysocki
  2011-06-16 22:48                             ` Rafael J. Wysocki
  2011-06-16 22:30                           ` Rafael J. Wysocki
  2011-06-16 22:30                           ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-16 14:27 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Linux-pm mailing list, linux-omap,
	Magnus Damm, Paul Walmsley

On Wed, 15 Jun 2011, Kevin Hilman wrote:

> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> 
> [...]
> 
> >>
> >> From a device driver perspective, system PM is just runtime
> >> PM where the "idleness" was forced and only a subset of possible wakeup
> >> sources are enabled.
> >
> > Oh well, I wonder how much of a difference would make you think those things
> > are really different. ;-)
> 
> Seeing a description of the differences would help.  So far the list is
> rather short: wakeups and forcibly quieting the hardware.

Another difference is that the user can forbid runtime power management 
of any device through the power/control attribute, independently of 
system sleeps.

Yet another difference arises because during system PM, the PM
workqueue is frozen.  If a driver relies on asynchronous runtime PM
then nothing will happen.  This may not apply to you, but it applies to
plenty of other drivers.

> I guess I still don't see why system PM cannot be viewed as a special
> case of runtime PM, so how about a specific question: From a device
> driver perspective, how is system PM anything other than
> manually/forcibly creating the right conditions for a runtime PM
> transition to happen?

What you're missing is that runtime PM has two separate aspects: a 
hardware/power aspect and an administrative aspect.  In terms of 
hardware/power it is very similar to system PM, but in administrative 
terms it is quite different.

Another thing you need to realize: Rafael is open to the idea that
subsystems may be designed specifically to allow drivers to use runtime
PM during their ->suspend and ->resume callbacks.  However in the
period between ->suspend returning and ->resume being called, runtime
PM should _not_ be used.  In particular, this includes the times when
->suspend_noirq and ->resume_noirq are called -- and these are the
routines which are often expected to do the real work of setting the
device's power state.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-16  1:17                         ` Kevin Hilman
  2011-06-16 14:27                           ` Alan Stern
  2011-06-16 14:27                           ` [linux-pm] " Alan Stern
@ 2011-06-16 22:30                           ` Rafael J. Wysocki
  2011-06-16 22:30                           ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 22:30 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: Linux-pm mailing list, linux-omap

On Thursday, June 16, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> 
> [...]
> 
> >>
> >> From a device driver perspective, system PM is just runtime
> >> PM where the "idleness" was forced and only a subset of possible wakeup
> >> sources are enabled.
> >
> > Oh well, I wonder how much of a difference would make you think those things
> > are really different. ;-)
> 
> Seeing a description of the differences would help.  So far the list is
> rather short: wakeups and forcibly quieting the hardware.

Alan has given some more examples in his reply already, I don't think
I'd do it better. :-)

> I guess I still don't see why system PM cannot be viewed as a special
> case of runtime PM, so how about a specific question: From a device
> driver perspective, how is system PM anything other than
> manually/forcibly creating the right conditions for a runtime PM
> transition to happen?

Because it doesn't create those conditions?  It doesn't make runtime PM
usage counters magically drop to zero, for one example, and it can't do
that because of the user space part.
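
A toy userspace model of the usage-counter point (not kernel code; every
name below is hypothetical).  It assumes that forbidding runtime PM through
power/control behaves like holding one extra reference that the driver
cannot drop, and that a suspend attempt with a nonzero counter fails with
-EAGAIN:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	int usage_count;
	bool suspended;
};

/* Models a driver taking a reference and making the device active. */
void model_get(struct model_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Models user space forbidding runtime PM: one more reference is held. */
void model_forbid(struct model_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/*
 * Models the driver dropping its own reference from .suspend() and hoping
 * that a runtime PM transition follows.
 */
int model_put_sync(struct model_dev *d)
{
	if (d->usage_count > 0)
		d->usage_count--;

	if (d->usage_count > 0)
		return -EAGAIN;	/* somebody else still holds the device */

	d->suspended = true;
	return 0;
}
```

With only the driver's reference held, model_put_sync() suspends the device.
Once model_forbid() has been called, the same call returns -EAGAIN and the
device stays active, no matter how idle the hardware is.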

Still, I'm not saying you can't use the same _code_ for both runtime PM
and system suspend.  In the majority of cases this really is necessary to
avoid code duplication.  However, you really need not use the same callback
pointers to that code in both cases.  You may use common functions that will
be called by your .suspend() or .suspend_noirq() callbacks and by your
.runtime_suspend().  You may make .suspend_noirq() and .runtime_suspend()
point to the same routine if that's suitable.  You don't necessarily have
to call pm_runtime_suspend() from your .suspend() callback to make that
code run and that applies to subsystems too.
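
A minimal userspace sketch of that arrangement (not kernel code; the struct
and function names are hypothetical and only loosely shaped after a table of
PM callbacks).  One routine is shared by .suspend_noirq() and
.runtime_suspend(), and the modeled system suspend path never goes through
the runtime PM entry point:

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	int live_ctx;		/* stands in for device register state */
	int saved_ctx;
	int save_calls;
	bool quiesced;
};

struct model_pm_ops {
	int (*suspend)(struct model_dev *d);
	int (*suspend_noirq)(struct model_dev *d);
	int (*runtime_suspend)(struct model_dev *d);
};

/* Common code: used for both runtime PM and system suspend. */
static int model_save_context(struct model_dev *d)
{
	d->saved_ctx = d->live_ctx;
	d->save_calls++;
	return 0;
}

/* System-suspend-only work: stop the device from doing anything new. */
static int model_quiesce(struct model_dev *d)
{
	d->quiesced = true;
	return 0;
}

/* Two callback pointers, one routine behind both of them. */
const struct model_pm_ops model_ops = {
	.suspend	 = model_quiesce,
	.suspend_noirq	 = model_save_context,
	.runtime_suspend = model_save_context,
};

/* Models the core's system suspend path: .runtime_suspend is not used. */
int model_core_system_suspend(const struct model_pm_ops *ops,
			      struct model_dev *d)
{
	int ret = ops->suspend ? ops->suspend(d) : 0;

	if (ret)
		return ret;

	return ops->suspend_noirq ? ops->suspend_noirq(d) : 0;
}
```

The context is saved exactly once per system suspend, by code that is also
reachable through .runtime_suspend, without any call into the runtime PM
framework from .suspend().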

As I said before, at one point we decided to use different PM callback
pointers for different purposes and that's why we have so many of them.
We could use multipurpose .suspend(dev, arg) instead, where arg would
determine the action to be taken (e.g. SYSTEM_SUSPEND, RUNTIME_SUSPEND etc.).
Now imagine we've done so and, at the subsystem level,
.suspend(dev, SYSTEM_SUSPEND) is called during system suspend by the PM core.
Suppose further that it calls a driver's .suspend(dev, SYSTEM_SUSPEND) and
that, in turn, does pm_runtime_suspend() which calls back to the very same
function it came from, but with different arg.  So we have:

subsys->suspend(dev, SYSTEM_SUSPEND)
    driver->suspend(dev, SYSTEM_SUSPEND)
        subsys->suspend(dev, RUNTIME_SUSPEND)

So it effectively makes subsys->suspend() call itself recursively with
a different second argument without knowing that this is going to happen.
Do you still not see any problem with that?
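
The re-entry can be reproduced in a small userspace model (not kernel code;
the multipurpose callback and all names are hypothetical, as in the thought
experiment above).  A depth counter records how many times the subsystem
routine is active at once for the same device:

```c
/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_action {
	MODEL_SYSTEM_SUSPEND,
	MODEL_RUNTIME_SUSPEND,
};

struct model_dev {
	int depth;	/* current nesting of model_subsys_suspend() */
	int max_depth;	/* deepest nesting seen so far */
};

int model_subsys_suspend(struct model_dev *d, enum model_action action);

/* Models the driver's multipurpose suspend callback. */
int model_driver_suspend(struct model_dev *d, enum model_action action)
{
	/*
	 * During system suspend the driver asks for a runtime suspend, which
	 * lands back in the subsystem routine that is already on the stack.
	 */
	if (action == MODEL_SYSTEM_SUSPEND)
		return model_subsys_suspend(d, MODEL_RUNTIME_SUSPEND);

	return 0;
}

/* Models the subsystem's multipurpose suspend callback. */
int model_subsys_suspend(struct model_dev *d, enum model_action action)
{
	int ret;

	d->depth++;
	if (d->depth > d->max_depth)
		d->max_depth = d->depth;

	ret = model_driver_suspend(d, action);

	d->depth--;
	return ret;
}
```

Calling model_subsys_suspend() with MODEL_SYSTEM_SUSPEND leaves max_depth at
2: the subsystem routine has been entered a second time before its first
invocation has finished.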

There's more, though, because pm_runtime_suspend() not only invokes the
subsystem's callback, but also uses infrastructure that's designed to be
used in the system's working state, when user space is available and so on.
The state of the system during system suspend/resume is different however.

So, what I'm saying is that (1) calling things like pm_runtime_suspend()
from system suspend callbacks is pointless, because it doesn't guarantee
that the device will be really suspended, and unnecessary, because the
code it is supposed to invoke may be called into in a different way, and
(2) you can always design your subsystem-level code to make use of system
suspend callbacks in such a way that all of the necessary code will be
executed without calling pm_runtime_suspend() etc. from system suspend
callbacks.

It really doesn't matter that in your picture of the world system suspend
is a degenerate case of runtime PM, just because your model hardware
platform doesn't provide any "suspend the system" magic.  We have two
different frameworks for both and that's for a reason (there are platforms
where we need to use different hardware/firmware mechanisms in both cases)
and those frameworks are kind of mutually exclusive (i.e. you're not supposed
to use them both at the same time).  Nevertheless, you _can_ use the same
code, when necessary, in each case, but each case requires that code to be
called in a different way.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-16  1:17                         ` Kevin Hilman
                                             ` (2 preceding siblings ...)
  2011-06-16 22:30                           ` Rafael J. Wysocki
@ 2011-06-16 22:30                           ` Rafael J. Wysocki
  3 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 22:30 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Alan Stern, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

On Thursday, June 16, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> 
> [...]
> 
> >>
> >> From a device driver perspective, system PM is just runtime
> >> PM where the "idleness" was forced and only a subset of possible wakeup
> >> sources are enabled.
> >
> > Oh well, I wonder how much of a difference would make you think those things
> > are really different. ;-)
> 
> Seeing a description of the differences would help.  So far the list is
> rather short: wakeups and forcibly quieting the hardware.

Alan has given some more examples in his reply already, I don't think
I'd do it better. :-)

> I guess I still don't see why system PM cannot be viewed as a special
> case of runtime PM, so how about a specific question: From a device
> driver perspective, how is system PM anything other than
> manually/forcibly creating the right conditions for a runtime PM
> transition to happen?

Because it doesn't create those conditions?  It doesn't make runtime PM
usage counters magically drop to zero, for one example, and it can't do
that because of the user space part.

Still, I'm not saying you can't use the same _code_ for both runtime PM
and system suspend.  In the majority of cases this really is necessary to
avoid code duplication.  However, you really need not use the same callback
pointers to that code in both cases.  You may use common functions that will
be called by your .suspend() or .suspend_noirq() callbacks and by your
.runtime_suspend().  You may make .suspend_noirq() and .runtime_suspend()
point to the same routine if that's suitable.  You don't necessarily have
to call pm_runtime_suspend() from your .suspend() callback to make that
code run and that applies to subsystems too.
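
A minimal sketch of that arrangement, written as a user-space model rather
than kernel code.  The struct device and struct dev_pm_ops below are cut-down
stand-ins, and foo_quiesce(), clocks_on and quiesce_calls are hypothetical
names used only for illustration.  It merely shows two callback pointers
reaching the same routine without .suspend_noirq() going through
pm_runtime_suspend().

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumptions, not the real ones). */
struct device {
	bool clocks_on;		/* made-up hardware state */
	int quiesce_calls;	/* how many times the common code ran */
};

struct dev_pm_ops {
	int (*suspend_noirq)(struct device *dev);
	int (*runtime_suspend)(struct device *dev);
};

/* Common code: the single place that knows how to power the device down. */
static int foo_quiesce(struct device *dev)
{
	dev->clocks_on = false;
	dev->quiesce_calls++;
	return 0;
}

/* Runtime PM entry point: just runs the common code. */
static int foo_runtime_suspend(struct device *dev)
{
	return foo_quiesce(dev);
}

/* System suspend entry point: calls the common code directly. */
static int foo_suspend_noirq(struct device *dev)
{
	/* Already powered down at run time: nothing left to do. */
	if (!dev->clocks_on)
		return 0;
	return foo_quiesce(dev);
}

static const struct dev_pm_ops foo_pm_ops = {
	.suspend_noirq   = foo_suspend_noirq,
	.runtime_suspend = foo_runtime_suspend,
};
```

If no extra check is needed, the two pointers could just as well be set to
the same function.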

As I said before, at one point we decided to use different PM callback
pointers for different purposes and that's why we have so many of them.
We could use multipurpose .suspend(dev, arg) instead, where arg would
determine the action to be taken (e.g. SYSTEM_SUSPEND, RUNTIME_SUSPEND etc.).
Now imagine we've done so and, at the subsystem level,
.suspend(dev, SYSTEM_SUSPEND) is called during system suspend by the PM core.
Suppose further that it calls a driver's .suspend(dev, SYSTEM_SUSPEND) and
that, in turn, does pm_runtime_suspend() which calls back to the very same
function it came from, but with different arg.  So we have:

subsys->suspend(dev, SYSTEM_SUSPEND)
    driver->suspend(dev, SYSTEM_SUSPEND)
        subsys->suspend(dev, RUNTIME_SUSPEND)

So it effectively makes subsys->suspend() call itself recursively with
a different second argument without knowing that this is going to happen.
Do you still not see any problem with that?
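
The call chain above can be reproduced in a small user-space model.  This is
only a sketch of the hypothetical multipurpose .suspend(dev, arg) design
discussed here, which the kernel does not use; pm_runtime_suspend_model() and
the depth counters are made-up names for illustration.

```c
/* Hypothetical action argument for a multipurpose suspend callback. */
enum pm_action { SYSTEM_SUSPEND, RUNTIME_SUSPEND };

struct device {
	int depth;		/* current nesting of subsys_suspend() */
	int max_depth;		/* deepest nesting observed */
	int subsys_calls;	/* total entries into subsys_suspend() */
};

static int subsys_suspend(struct device *dev, enum pm_action arg);

/* Stand-in for pm_runtime_suspend(): it ends up in the subsystem callback. */
static int pm_runtime_suspend_model(struct device *dev)
{
	return subsys_suspend(dev, RUNTIME_SUSPEND);
}

/* Driver callback that tries to reuse runtime PM for system suspend. */
static int driver_suspend(struct device *dev, enum pm_action arg)
{
	if (arg == SYSTEM_SUSPEND)
		return pm_runtime_suspend_model(dev);
	return 0;	/* RUNTIME_SUSPEND: do the real work here */
}

/* Subsystem callback: unaware that it may be re-entered via the driver. */
static int subsys_suspend(struct device *dev, enum pm_action arg)
{
	int ret;

	dev->subsys_calls++;
	dev->depth++;
	if (dev->depth > dev->max_depth)
		dev->max_depth = dev->depth;
	ret = driver_suspend(dev, arg);
	dev->depth--;
	return ret;
}
```

Starting from subsys_suspend(dev, SYSTEM_SUSPEND), the model enters the
subsystem callback a second time while the first invocation is still on the
stack.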

There's more, though, because pm_runtime_suspend() not only invokes the
subsystem's callback, but also uses infrastructure that's designed to be
used in the system's working state, when user space is available and so on.
The state of the system during system suspend/resume is different, however.

So, what I'm saying is that (1) calling things like pm_runtime_suspend()
from system suspend callbacks is pointless, because it doesn't guarantee
that the device will be really suspended, and unnecessary, because the
code it is supposed to invoke may be called into in a different way, and
(2) you can always design your subsystem-level code to make use of system
suspend callbacks in such a way that all of the necessary code will be
executed without calling pm_runtime_suspend() etc. from system suspend
callbacks.

It really doesn't matter that in your picture of the world system suspend
is a degenerate case of runtime PM, just because your model hardware
platform doesn't provide any "suspend the system" magic.  We have two
different frameworks for both and that's for a reason (there are platforms
where we need to use different hardware/firmware mechanisms in both cases)
and those frameworks are kind of mutually exclusive (i.e. you're not supposed
to use them both at the same time).  Nevertheless, you _can_ use the same
code, when necessary, in each case, but each case requires that code to be
called in a different way.

Thanks,
Rafael

* Re: calling runtime PM from system PM methods
  2011-06-16 14:27                           ` [linux-pm] " Alan Stern
  2011-06-16 22:48                             ` Rafael J. Wysocki
@ 2011-06-16 22:48                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 22:48 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list, linux-omap

On Thursday, June 16, 2011, Alan Stern wrote:
> On Wed, 15 Jun 2011, Kevin Hilman wrote:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> > 
> > > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> > 
> > [...]
> > 
> > >>
> > >> From a device driver perspective, system PM is just runtime
> > >> PM where the "idleness" was forced and only a subset of possible wakeup
> > >> sources are enabled.
> > >
> > > Oh well, I wonder how much of a difference would make you think those things
> > > are really different. ;-)
> > 
> > Seeing a description of the differences would help.  So far the list is
> > rather short: wakeups and forcibly quieting the hardware.
> 
> Another difference is that the user can forbid runtime power management 
> of any device through the power/control attribute, independently of 
> system sleeps.
> 
> Yet another difference arises because during system PM, the PM
> workqueue is frozen.  If a driver relies on asynchronous runtime PM
> then nothing will happen.  This may not apply to you, but it applies to
> plenty of other drivers.
> 
> > I guess I still don't see why system PM cannot be viewed as a special
> > case of runtime PM, so how about a specific question: From a device
> > driver perspective, how is system PM anything other than
> > manually/forcibly creating the right conditions for a runtime PM
> > transition to happen?
> 
> What you're missing is that runtime PM has two separate aspects: a 
> hardware/power aspect and an administrative aspect.  In terms of 
> hardware/power it is very similar to system PM, but in administrative 
> terms it is quite different.
> 
> Another thing you need to realize: Rafael is open to the idea that
> subsystems may be designed specifically to allow drivers to use runtime
> PM during their ->suspend and ->resume callbacks.  However in the
> period between ->suspend returning and ->resume being called, runtime
> PM should _not_ be used.  In particular, this includes the times when
> ->suspend_noirq and ->resume_noirq are called -- and these are the
> routines which are often expected to do the real work of setting the
> device's power state.

To be precise, my opinion is that calling pm_runtime_suspend() or
pm_runtime_put_sync() from a driver's .suspend() callback is always a bad
idea, because it leads to unnecessary complications and doesn't guarantee
that the desired action will take place at all.  That said, I don't really
think that the PM core should actively prevent that from being done,
because it's not a direct correctness issue.  Using the runtime PM framework
after suspend_device_irqs() has run is a different problem, though, and
in my opinion the PM core should prevent that from being done, one way or
another.

Moreover, I don't have any problems with subsystems calling pm_runtime_resume()
or pm_runtime_get_sync() from their .prepare() callbacks, because that simply
causes devices to be put into a well known state before the "real" system
suspend callbacks are run.  In principle, drivers can do the same thing if
their subsystems don't, but that's not so obviously clean (given that the
same driver may be used with a few different PM domain implementations, for
example).

Thanks,
Rafael

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-16 14:27                           ` [linux-pm] " Alan Stern
@ 2011-06-16 22:48                             ` Rafael J. Wysocki
  2011-06-17 19:47                               ` Rafael J. Wysocki
  2011-06-17 19:47                               ` Rafael J. Wysocki
  2011-06-16 22:48                             ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 22:48 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kevin Hilman, Linux-pm mailing list, linux-omap, Magnus Damm,
	Paul Walmsley

On Thursday, June 16, 2011, Alan Stern wrote:
> On Wed, 15 Jun 2011, Kevin Hilman wrote:
> 
> > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> > 
> > > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> > 
> > [...]
> > 
> > >>
> > >> From a device driver perspective, system PM is just runtime
> > >> PM where the "idleness" was forced and only a subset of possible wakeup
> > >> sources are enabled.
> > >
> > > Oh well, I wonder how much of a difference would make you think those things
> > > are really different. ;-)
> > 
> > Seeing a description of the differences would help.  So far the list is
> > rather short: wakeups and forcibly quieting the hardware.
> 
> Another difference is that the user can forbid runtime power management 
> of any device through the power/control attribute, independently of 
> system sleeps.
> 
> Yet another difference arises because during system PM, the PM
> workqueue is frozen.  If a driver relies on asynchronous runtime PM
> then nothing will happen.  This may not apply to you, but it applies to
> plenty of other drivers.
> 
> > I guess I still don't see why system PM cannot be viewed as a special
> > case of runtime PM, so how about a specific question: From a device
> > driver perspective, how is system PM anything other than
> > manually/forcibly creating the right conditions for a runtime PM
> > transition to happen?
> 
> What you're missing is that runtime PM has two separate aspects: a 
> hardware/power aspect and an administrative aspect.  In terms of 
> hardware/power it is very similar to system PM, but in administrative 
> terms it is quite different.
> 
> Another thing you need to realize: Rafael is open to the idea that
> subsystems may be designed specifically to allow drivers to use runtime
> PM during their ->suspend and ->resume callbacks.  However in the
> period between ->suspend returning and ->resume being called, runtime
> PM should _not_ be used.  In particular, this includes the times when
> ->suspend_noirq and ->resume_noirq are called -- and these are the
> routines which are often expected to do the real work of setting the
> device's power state.

To be precise, my opinion is that calling pm_runtime_suspend() or
pm_runtime_put_sync() from a driver's .suspend() callback is always a bad
idea, because it leads to unnecessary complications and doesn't guarantee
that the desired action will take place at all.  That said, I don't really
think that the PM core should actively prevent that from being done,
because it's not a direct correctness issue.  Using the runtime PM framework
after suspend_device_irqs() has run is a different problem, though, and
in my opinion the PM core should prevent that from being done, one way or
another.

Moreover, I don't have any problems with subsystems calling pm_runtime_resume()
or pm_runtime_get_sync() from their .prepare() callbacks, because that simply
causes devices to be put into a well known state before the "real" system
suspend callbacks are run.  In principle, drivers can do the same thing if
their subsystems don't, but that's not so obviously clean (given that the
same driver may be used with a few different PM domain implementations, for
example).

Thanks,
Rafael

* Re: calling runtime PM from system PM methods
  2011-06-16 22:48                             ` Rafael J. Wysocki
  2011-06-17 19:47                               ` Rafael J. Wysocki
@ 2011-06-17 19:47                               ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-17 19:47 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, linux-omap

On Friday, June 17, 2011, Rafael J. Wysocki wrote:
> On Thursday, June 16, 2011, Alan Stern wrote:
> > On Wed, 15 Jun 2011, Kevin Hilman wrote:
> > 
> > > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> > > 
> > > > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> > > 
> > > [...]
> > > 
> > > >>
> > > >> From a device driver perspective, system PM is just runtime
> > > >> PM where the "idleness" was forced and only a subset of possible wakeup
> > > >> sources are enabled.
> > > >
> > > > Oh well, I wonder how much of a difference would make you think those things
> > > > are really different. ;-)
> > > 
> > > Seeing a description of the differences would help.  So far the list is
> > > rather short: wakeups and forcibly quieting the hardware.
> > 
> > Another difference is that the user can forbid runtime power management 
> > of any device through the power/control attribute, independently of 
> > system sleeps.
> > 
> > Yet another difference arises because during system PM, the PM
> > workqueue is frozen.  If a driver relies on asynchronous runtime PM
> > then nothing will happen.  This may not apply to you, but it applies to
> > plenty of other drivers.
> > 
> > > I guess I still don't see why system PM cannot be viewed as a special
> > > case of runtime PM, so how about a specific question: From a device
> > > driver perspective, how is system PM anything other than
> > > manually/forcibly creating the right conditions for a runtime PM
> > > transition to happen?
> > 
> > What you're missing is that runtime PM has two separate aspects: a 
> > hardware/power aspect and an administrative aspect.  In terms of 
> > hardware/power it is very similar to system PM, but in administrative 
> > terms it is quite different.
> > 
> > Another thing you need to realize: Rafael is open to the idea that
> > subsystems may be designed specifically to allow drivers to use runtime
> > PM during their ->suspend and ->resume callbacks.  However in the
> > period between ->suspend returning and ->resume being called, runtime
> > PM should _not_ be used.  In particular, this includes the times when
> > ->suspend_noirq and ->resume_noirq are called -- and these are the
> > routines which are often expected to do the real work of setting the
> > device's power state.
> 
> To be precise, my opinion is that calling pm_runtime_suspend() or
> pm_runtime_put_sync() from a driver's .suspend() callback always is a bad
> idea, because it leads to unnecessary complications and doesn't guarantee
> that the desired action will take place at all.  That said I don't really
> think that the PM core should actively prevent that from being done,
> because it's not a direct correctness issue.  Using the runtime PM framework
> after suspend_device_irqs() has run is a different problem, though, and
> in my opinion the PM core should prevent that from being done, this way or
> another.

Having considered that a bit more I see that, in fact, commit
e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
succeed during system suspend) has introduced at least one regression.
Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
callback to guarantee that devices will be in a well known state before
the PCI .suspend() and .suspend_noirq() callbacks are executed.
Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
isn't valid any more, because devices can be runtime-suspended after the
pm_runtime_resume() in .prepare() has run.

USB seems to do something similar in choose_wakeup().

So, either both of these subsystems should be modified to use
pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
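
The difference between the two calls can be illustrated with a simplified
user-space model of the usage counter.  This is a sketch under stated
assumptions, not the PM core's code: the model_*() helpers are made-up
stand-ins for pm_runtime_resume(), pm_runtime_get_sync(),
pm_runtime_put_noidle() and pm_runtime_suspend(), and they ignore locking,
parents, and the real return-value conventions apart from -EAGAIN for a
non-zero usage count.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified model of the runtime PM state of one device. */
struct device {
	int usage_count;
	bool runtime_suspended;
};

/* Like pm_runtime_resume(): wakes the device but takes no reference. */
static int model_runtime_resume(struct device *dev)
{
	dev->runtime_suspended = false;
	return 0;
}

/* Like pm_runtime_get_sync(): takes a reference, then resumes. */
static int model_get_sync(struct device *dev)
{
	dev->usage_count++;
	return model_runtime_resume(dev);
}

/* Like pm_runtime_put_noidle(): drops the reference, no idle check. */
static void model_put_noidle(struct device *dev)
{
	dev->usage_count--;
}

/* Like pm_runtime_suspend(): refused while a reference is held. */
static int model_runtime_suspend(struct device *dev)
{
	if (dev->usage_count > 0)
		return -EAGAIN;
	dev->runtime_suspended = true;
	return 0;
}
```

In this model, a suspend attempt after model_runtime_resume() succeeds, while
one after model_get_sync() is refused until the matching put.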

Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
prefer to revert that commit for 3.0.

Thanks,
Rafael

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-16 22:48                             ` Rafael J. Wysocki
@ 2011-06-17 19:47                               ` Rafael J. Wysocki
  2011-06-17 20:04                                 ` Alan Stern
  2011-06-17 20:04                                 ` Alan Stern
  2011-06-17 19:47                               ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-17 19:47 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, linux-omap, Kevin Hilman, Magnus Damm, Paul Walmsley

On Friday, June 17, 2011, Rafael J. Wysocki wrote:
> On Thursday, June 16, 2011, Alan Stern wrote:
> > On Wed, 15 Jun 2011, Kevin Hilman wrote:
> > 
> > > "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> > > 
> > > > On Wednesday, June 15, 2011, Kevin Hilman wrote:
> > > 
> > > [...]
> > > 
> > > >>
> > > >> From a device driver perspective, system PM is just runtime
> > > >> PM where the "idleness" was forced and only a subset of possible wakeup
> > > >> sources are enabled.
> > > >
> > > > Oh well, I wonder how much of a difference would make you think those things
> > > > are really different. ;-)
> > > 
> > > Seeing a description of the differences would help.  So far the list is
> > > rather short: wakeups and forcibly quieting the hardware.
> > 
> > Another difference is that the user can forbid runtime power management 
> > of any device through the power/control attribute, independently of 
> > system sleeps.
> > 
> > Yet another difference arises because during system PM, the PM
> > workqueue is frozen.  If a driver relies on asynchronous runtime PM
> > then nothing will happen.  This may not apply to you, but it applies to
> > plenty of other drivers.
> > 
> > > I guess I still don't see why system PM cannot be viewed as a special
> > > case of runtime PM, so how about a specific question: From a device
> > > driver perspective, how is system PM anything other than
> > > manually/forcibly creating the right conditions for a runtime PM
> > > transition to happen?
> > 
> > What you're missing is that runtime PM has two separate aspects: a 
> > hardware/power aspect and an administrative aspect.  In terms of 
> > hardware/power it is very similar to system PM, but in administrative 
> > terms it is quite different.
> > 
> > Another thing you need to realize: Rafael is open to the idea that
> > subsystems may be designed specifically to allow drivers to use runtime
> > PM during their ->suspend and ->resume callbacks.  However in the
> > period between ->suspend returning and ->resume being called, runtime
> > PM should _not_ be used.  In particular, this includes the times when
> > ->suspend_noirq and ->resume_noirq are called -- and these are the
> > routines which are often expected to do the real work of setting the
> > device's power state.
> 
> To be precise, my opinion is that calling pm_runtime_suspend() or
> pm_runtime_put_sync() from a driver's .suspend() callback always is a bad
> idea, because it leads to unnecessary complications and doesn't guarantee
> that the desired action will take place at all.  That said I don't really
> think that the PM core should actively prevent that from being done,
> because it's not a direct correctness issue.  Using the runtime PM framework
> after suspend_device_irqs() has run is a different problem, though, and
> in my opinion the PM core should prevent that from being done, this way or
> another.

Having considered that a bit more I see that, in fact, commit
e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
succeed during system suspend) has introduced at least one regression.
Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
callback to guarantee that devices will be in a well known state before
the PCI .suspend() and .suspend_noirq() callbacks are executed.
Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
isn't valid any more, because devices can be runtime-suspended after the
pm_runtime_resume() in .prepare() has run.

USB seems to do something similar in choose_wakeup().

So, either both of these subsystems should be modified to use
pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.

Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
prefer to revert that commit for 3.0.

Thanks,
Rafael

* Re: calling runtime PM from system PM methods
  2011-06-17 19:47                               ` Rafael J. Wysocki
  2011-06-17 20:04                                 ` Alan Stern
@ 2011-06-17 20:04                                 ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-17 20:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:

> Having considered that a bit more I see that, in fact, commit
> e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> succeed during system suspend) has introduced at least one regression.
> Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> callback to guarantee that devices will be in a well known state before
> the PCI .suspend() and .suspend_noirq() callbacks are executed.
> Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> isn't valid any more, because devices can be runtime-suspended after the
> pm_runtime_resume() in .prepare() has run.
> 
> USB seems to do something similar in choose_wakeup().
> 
> So, either both of these subsystems should be modified to use
> pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.

pm_runtime_put_noidle would be appropriate.

> Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> prefer to revert that commit for 3.0.

Maybe we can compromise.  Instead of reverting that commit outright,
put the get_noresume just before the suspend callback and put the
put_sync just after the resume callback.

The point is that some embedded systems may rely heavily on runtime PM
to keep power usage low, so we shouldn't prevent it from doing its job
during the entire prepare -> suspend window.

Alan Stern

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-17 19:47                               ` Rafael J. Wysocki
@ 2011-06-17 20:04                                 ` Alan Stern
  2011-06-17 21:29                                   ` Rafael J. Wysocki
  2011-06-17 21:29                                   ` Rafael J. Wysocki
  2011-06-17 20:04                                 ` Alan Stern
  1 sibling, 2 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-17 20:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-omap, Kevin Hilman, Magnus Damm, Paul Walmsley

On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:

> Having considered that a bit more I see that, in fact, commit
> e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> succeed during system suspend) has introduced at least one regression.
> Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> callback to guarantee that devices will be in a well known state before
> the PCI .suspend() and .suspend_noirq() callbacks are executed.
> Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> isn't valid any more, because devices can be runtime-suspended after the
> pm_runtime_resume() in .prepare() has run.
> 
> USB seems to do something similar in choose_wakeup().
> 
> So, either both of these subsystems should be modified to use
> pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.

pm_runtime_put_noidle would be appropriate.

> Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> prefer to revert that commit for 3.0.

Maybe we can compromise.  Instead of reverting that commit outright,
put the get_noresume just before the suspend callback and put the
put_sync just after the resume callback.

The point is that some embedded systems may rely heavily on runtime PM
to keep power usage low, so we shouldn't prevent it from doing its job
during the entire prepare -> suspend window.

Alan Stern


* Re: calling runtime PM from system PM methods
  2011-06-17 20:04                                 ` Alan Stern
  2011-06-17 21:29                                   ` Rafael J. Wysocki
@ 2011-06-17 21:29                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-17 21:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, linux-omap

On Friday, June 17, 2011, Alan Stern wrote:
> On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:
> 
> > Having considered that a bit more I see that, in fact, commit
> > e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> > succeed during system suspend) has introduced at least one regression.
> > Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> > callback to guarantee that devices will be in a well known state before
> > the PCI .suspend() and .suspend_noirq() callbacks are executed.
> > Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> > isn't valid any more, because devices can be runtime-suspended after the
> > pm_runtime_resume() in .prepare() has run.
> > 
> > USB seems to do something similar in choose_wakeup().
> > 
> > So, either both of these subsystems should be modified to use
> > pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> > during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
> 
> pm_runtime_put_noidle would be appropriate.
> 
> > Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> > prefer to revert that commit for 3.0.
> 
> Maybe we can compromise.  Instead of reverting that commit outright,
> put the get_noresume just before the suspend callback and put the
> put_sync just after the resume callback.

That wouldn't fix the PCI problem, though, because it would leave a small
window in which the device could be suspended after the pm_runtime_resume()
in pci_pm_prepare() had run.
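
A small user-space model of that window, assuming the compromise ordering
quoted above (reference taken just before the suspend callback).  The
model_*() helpers and suspended_when_suspend_runs() are made-up names for
illustration, and the model leaves out locking and everything else the PM
core does.

```c
#include <stdbool.h>

/* Simplified model of the runtime PM state of one device. */
struct device {
	int usage_count;
	bool runtime_suspended;
};

/* Like pm_runtime_resume(): wakes the device but takes no reference. */
static void model_runtime_resume(struct device *dev)
{
	dev->runtime_suspended = false;
}

/* Like pm_runtime_get_noresume(): takes a reference, leaves state alone. */
static void model_get_noresume(struct device *dev)
{
	dev->usage_count++;
}

/* Like pm_runtime_suspend(): refused while a reference is held. */
static bool model_runtime_suspend(struct device *dev)
{
	if (dev->usage_count > 0)
		return false;
	dev->runtime_suspended = true;
	return true;
}

/*
 * Replays .prepare(), an optional runtime suspend racing in the gap, and
 * the reference taken just before .suspend().  Returns the state that the
 * .suspend() callback would then observe.
 */
static bool suspended_when_suspend_runs(struct device *dev, bool race_in_window)
{
	model_runtime_resume(dev);		/* what .prepare() does */
	if (race_in_window)
		model_runtime_suspend(dev);	/* still allowed here */
	model_get_noresume(dev);		/* too late to undo the race */
	return dev->runtime_suspended;
}
```

With race_in_window set, the model's .suspend() stage sees a runtime-suspended
device even though .prepare() resumed it.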

Thanks,
Rafael

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-17 20:04                                 ` Alan Stern
@ 2011-06-17 21:29                                   ` Rafael J. Wysocki
  2011-06-18 11:08                                     ` Rafael J. Wysocki
  2011-06-18 11:08                                     ` Rafael J. Wysocki
  2011-06-17 21:29                                   ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-17 21:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, linux-omap, Kevin Hilman, Magnus Damm, Paul Walmsley

On Friday, June 17, 2011, Alan Stern wrote:
> On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:
> 
> > Having considered that a bit more I see that, in fact, commit
> > e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> > succeed during system suspend) has introduced at least one regression.
> > Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> > callback to guarantee that devices will be in a well known state before
> > the PCI .suspend() and .suspend_noirq() callbacks are executed.
> > Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> > isn't valid any more, because devices can be runtime-suspended after the
> > pm_runtime_resume() in .prepare() has run.
> > 
> > USB seems to do something similar in choose_wakeup().
> > 
> > So, either both of these subsystems should be modified to use
> > pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> > during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
> 
> pm_runtime_put_noidle would be appropriate.
> 
> > Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> > prefer to revert that commit for 3.0.
> 
> Maybe we can compromise.  Instead of reverting that commit outright,
> put the get_noresume just before the suspend callback and put the
> put_sync just after the resume callback.

That wouldn't fix the PCI problem, though, because it would leave a small
window in which the device could be suspended after the pm_runtime_resume()
in pci_pm_prepare() had run.

Thanks,
Rafael

* Re: calling runtime PM from system PM methods
  2011-06-17 21:29                                   ` Rafael J. Wysocki
  2011-06-18 11:08                                     ` Rafael J. Wysocki
@ 2011-06-18 11:08                                     ` Rafael J. Wysocki
  1 sibling, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-18 11:08 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, linux-omap

On Friday, June 17, 2011, Rafael J. Wysocki wrote:
> On Friday, June 17, 2011, Alan Stern wrote:
> > On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:
> > 
> > > Having considered that a bit more I see that, in fact, commit
> > > e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> > > succeed during system suspend) has introduced at least one regression.
> > > Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> > > callback to guarantee that devices will be in a well known state before
> > > the PCI .suspend() and .suspend_noirq() callbacks are executed.
> > > Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> > > isn't valid any more, because devices can be runtime-suspended after the
> > > pm_runtime_resume() in .prepare() has run.
> > > 
> > > USB seems to do something similar in choose_wakeup().
> > > 
> > > So, either both of these subsystems should be modified to use
> > > pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> > > during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
> > 
> > pm_runtime_put_noidle would be appropriate.
> > 
> > > Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> > > prefer to revert that commit for 3.0.
> > 
> > Maybe we can compromise.  Instead of reverting that commit outright,
> > put the get_noresume just before the suspend callback and put the
> > put_sync just after the resume callback.
> 
> That wouldn't fix the PCI problem, though, because it would leave a small
> window in which the device could be suspended after the pm_runtime_resume()
> in pci_pm_prepare() had run.

That said, the PCI case can be solved with a separate patch and if the other
subsystems are not affected, perhaps that's the best approach.

Still, I'd like to make sure that there won't be any races between runtime
PM and .suspend_noirq() and .resume_noirq() callbacks, so I'd like to apply
the patch below.

Thanks,
Rafael

---
 drivers/base/power/main.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -591,6 +591,8 @@ void dpm_resume(pm_message_t state)
 	async_error = 0;
 
 	list_for_each_entry(dev, &dpm_suspended_list, power.entry) {
+		pm_runtime_get_noresume(dev);
+		pm_runtime_enable(dev);
 		INIT_COMPLETION(dev->power.completion);
 		if (is_async(dev)) {
 			get_device(dev);
@@ -614,6 +616,7 @@ void dpm_resume(pm_message_t state)
 		}
 		if (!list_empty(&dev->power.entry))
 			list_move_tail(&dev->power.entry, &dpm_prepared_list);
+		pm_runtime_put_noidle(dev);
 		put_device(dev);
 	}
 	mutex_unlock(&dpm_list_mtx);
@@ -939,8 +942,10 @@ int dpm_suspend(pm_message_t state)
 			put_device(dev);
 			break;
 		}
-		if (!list_empty(&dev->power.entry))
+		if (!list_empty(&dev->power.entry)) {
 			list_move(&dev->power.entry, &dpm_suspended_list);
+			pm_runtime_disable(dev);
+		}
 		put_device(dev);
 		if (async_error)
 			break;

* Re: calling runtime PM from system PM methods
  2011-06-18 11:08                                     ` Rafael J. Wysocki
@ 2011-06-18 15:31                                       ` Alan Stern
  2011-06-18 15:31                                       ` [linux-pm] " Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-18 15:31 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, linux-omap

On Sat, 18 Jun 2011, Rafael J. Wysocki wrote:

> On Friday, June 17, 2011, Rafael J. Wysocki wrote:
> > On Friday, June 17, 2011, Alan Stern wrote:
> > > On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:
> > > 
> > > > Having considered that a bit more I see that, in fact, commit
> > > > e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> > > > succeed during system suspend) has introduced at least one regression.
> > > > Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> > > > callback to guarantee that devices will be in a well known state before
> > > > the PCI .suspend() and .suspend_noirq() callbacks are executed.
> > > > Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> > > > > isn't valid any more, because devices can be runtime-suspended after the
> > > > pm_runtime_resume() in .prepare() has run.
> > > > 
> > > > USB seems to do something similar in choose_wakeup().
> > > > 
> > > > > So, either both of these subsystems should be modified to use
> > > > pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> > > > during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
> > > 
> > > pm_runtime_put_noidle would be appropriate.
> > > 
> > > > Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> > > > prefer to revert that commit for 3.0.
> > > 
> > > Maybe we can compromise.  Instead of reverting that commit outright,
> > > put the get_noresume just before the suspend callback and put the
> > > put_sync just after the resume callback.
> > 
> > That wouldn't fix the PCI problem, though, because it would leave a small
> > window in which the device could be suspended after the pm_runtime_resume()
> > in pci_pm_prepare() had run.
> 
> That said, the PCI case can be solved with a separate patch and if the other
> subsystems are not affected, perhaps that's the best approach.

Yes, it would be a simple change.

> Still, I'd like to make sure that there won't be any races between runtime
> PM and .suspend_noirq() and .resume_noirq() callbacks, so I'd like to apply
> the patch below.
> 
> Thanks,
> Rafael
> 
> ---
>  drivers/base/power/main.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -591,6 +591,8 @@ void dpm_resume(pm_message_t state)
>  	async_error = 0;
>  
>  	list_for_each_entry(dev, &dpm_suspended_list, power.entry) {
> +		pm_runtime_get_noresume(dev);
> +		pm_runtime_enable(dev);
>  		INIT_COMPLETION(dev->power.completion);
>  		if (is_async(dev)) {
>  			get_device(dev);
> @@ -614,6 +616,7 @@ void dpm_resume(pm_message_t state)
>  		}
>  		if (!list_empty(&dev->power.entry))
>  			list_move_tail(&dev->power.entry, &dpm_prepared_list);
> +		pm_runtime_put_noidle(dev);
>  		put_device(dev);
>  	}
>  	mutex_unlock(&dpm_list_mtx);
> @@ -939,8 +942,10 @@ int dpm_suspend(pm_message_t state)
>  			put_device(dev);
>  			break;
>  		}
> -		if (!list_empty(&dev->power.entry))
> +		if (!list_empty(&dev->power.entry)) {
>  			list_move(&dev->power.entry, &dpm_suspended_list);
> +			pm_runtime_disable(dev);
> +		}

The put_noidle is in the wrong place for async resumes.  Likewise for
the pm_runtime_disable() and async suspends.  Also this runs into
problems if a device is never suspended (i.e., if the sleep transition
aborts before suspending that device).
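
To illustrate the async ordering problem, here is a simplified userspace
model (not kernel code; the struct and all model_* names are hypothetical
stand-ins).  It assumes that an async resume callback is only scheduled
inside the loop and runs later, so a put_noidle issued in the loop drops
the usage count before the callback has run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct device fields that matter here. */
struct model_dev {
	int usage_count;	/* models power.usage_count */
	bool async;		/* models is_async(dev) */
	bool pending;		/* an async resume has been scheduled */
	bool ran_unprotected;	/* callback ran with usage_count == 0 */
};

/* Models the resume callback; records whether a reference was held. */
void model_resume_callback(struct model_dev *dev)
{
	if (dev->usage_count == 0)
		dev->ran_unprotected = true;
}

/* Models one iteration of the dpm_resume() loop with the put in the loop. */
void model_dpm_resume_iteration(struct model_dev *dev)
{
	/* pm_runtime_get_noresume() */
	dev->usage_count++;

	if (dev->async)
		dev->pending = true;		/* scheduled, runs later */
	else
		model_resume_callback(dev);	/* runs synchronously */

	/* pm_runtime_put_noidle(), issued before any async work has run */
	dev->usage_count--;
}

/* Models the async thread getting around to the scheduled callback. */
void model_run_pending(struct model_dev *dev)
{
	if (dev->pending) {
		dev->pending = false;
		model_resume_callback(dev);
	}
}
```

In this model a synchronous device never runs its callback unprotected,
while an async one does as soon as model_run_pending() is called.  That
suggests the get/put pair belongs in the per-device resume path rather
than in the loop.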

I have been working on a similar patch to do these things.  But it got
derailed by the problems mentioned earlier in the other email thread
(and the bug fix I posted yesterday).  Maybe I can send it in early
next week.

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-18 15:31                                       ` [linux-pm] " Alan Stern
@ 2011-06-18 21:01                                         ` Rafael J. Wysocki
  2011-06-18 23:57                                           ` Rafael J. Wysocki
  2011-06-18 23:57                                           ` Rafael J. Wysocki
  2011-06-18 21:01                                         ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-18 21:01 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm, LKML

On Saturday, June 18, 2011, Alan Stern wrote:
> On Sat, 18 Jun 2011, Rafael J. Wysocki wrote:
> 
> > On Friday, June 17, 2011, Rafael J. Wysocki wrote:
> > > On Friday, June 17, 2011, Alan Stern wrote:
> > > > On Fri, 17 Jun 2011, Rafael J. Wysocki wrote:
> > > > 
> > > > > Having considered that a bit more I see that, in fact, commit
> > > > > e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to
> > > > > succeed during system suspend) has introduced at least one regression.
> > > > > Namely, the PCI bus type runs pm_runtime_resume() in its .prepare()
> > > > > callback to guarantee that devices will be in a well known state before
> > > > > the PCI .suspend() and .suspend_noirq() callbacks are executed.
> > > > > Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 this
> > > > > > isn't valid any more, because devices can be runtime-suspended after the
> > > > > pm_runtime_resume() in .prepare() has run.
> > > > > 
> > > > > USB seems to do something similar in choose_wakeup().
> > > > > 
> > > > > > So, either both of these subsystems should be modified to use
> > > > > pm_runtime_get_sync() and then pm_runtime_put_<something>() some time
> > > > > during resume, or we should revert commit e8665002477f0278f84f898145b1f141ba26ee26.
> > > > 
> > > > pm_runtime_put_noidle would be appropriate.
> > > > 
> > > > > Quite frankly, which shouldn't be a surprise to anyone at this point, I'd
> > > > > prefer to revert that commit for 3.0.
> > > > 
> > > > Maybe we can compromise.  Instead of reverting that commit outright,
> > > > put the get_noresume just before the suspend callback and put the
> > > > put_sync just after the resume callback.
> > > 
> > > That wouldn't fix the PCI problem, though, because it would leave a small
> > > window in which the device could be suspended after the pm_runtime_resume()
> > > in pci_pm_prepare() had run.
> > 
> > That said, the PCI case can be solved with a separate patch and if the other
> > subsystems are not affected, perhaps that's the best approach.
> 
> Yes, it would be a simple change.
> 
> > Still, I'd like to make sure that there won't be any races between runtime
> > PM and .suspend_noirq() and .resume_noirq() callbacks, so I'd like to apply
> > the patch below.
> > 
> > Thanks,
> > Rafael
> > 
> > ---
> >  drivers/base/power/main.c |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > Index: linux-2.6/drivers/base/power/main.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/base/power/main.c
> > +++ linux-2.6/drivers/base/power/main.c
> > @@ -591,6 +591,8 @@ void dpm_resume(pm_message_t state)
> >  	async_error = 0;
> >  
> >  	list_for_each_entry(dev, &dpm_suspended_list, power.entry) {
> > +		pm_runtime_get_noresume(dev);
> > +		pm_runtime_enable(dev);
> >  		INIT_COMPLETION(dev->power.completion);
> >  		if (is_async(dev)) {
> >  			get_device(dev);
> > @@ -614,6 +616,7 @@ void dpm_resume(pm_message_t state)
> >  		}
> >  		if (!list_empty(&dev->power.entry))
> >  			list_move_tail(&dev->power.entry, &dpm_prepared_list);
> > +		pm_runtime_put_noidle(dev);
> >  		put_device(dev);
> >  	}
> >  	mutex_unlock(&dpm_list_mtx);
> > @@ -939,8 +942,10 @@ int dpm_suspend(pm_message_t state)
> >  			put_device(dev);
> >  			break;
> >  		}
> > -		if (!list_empty(&dev->power.entry))
> > +		if (!list_empty(&dev->power.entry)) {
> >  			list_move(&dev->power.entry, &dpm_suspended_list);
> > +			pm_runtime_disable(dev);
> > +		}
> 
> The put_noidle is in the wrong place for async resumes.  Likewise for
> the pm_runtime_disable() and async suspends.  Also this runs into
> problems if a device is never suspended (i.e., if the sleep transition
> aborts before suspending that device).

I overlooked that, thanks for pointing it out.

Well, assuming that https://patchwork.kernel.org/patch/893722/ is applied,
which it is going to be, I think we can put

+       pm_runtime_get_noresume(dev);
+       pm_runtime_enable(dev);

in device_resume() after the dev->power.is_suspended check and
pm_runtime_put_noidle() under the End label.  That causes them to
be called under the device lock, but that shouldn't be a big deal.

Accordingly, we can call pm_runtime_disable(dev) in __device_suspend(),
right next to the setting of power.is_suspended.
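
The pairing described above can be sketched as a small userspace model
(not kernel code; the struct and the model_* functions are hypothetical
and only mimic the counters involved).  It assumes that runtime PM is
disabled only when the suspend callback succeeded, and that resume undoes
exactly that for devices marked as suspended.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct device fields discussed here. */
struct model_dev {
	int usage_count;	/* models power.usage_count */
	int disable_depth;	/* models power.disable_depth, 0 = enabled */
	bool is_suspended;	/* models power.is_suspended */
};

/* Models __device_suspend(): disable runtime PM only on success. */
int model_device_suspend(struct model_dev *dev, int callback_error)
{
	if (!callback_error) {
		dev->is_suspended = true;
		dev->disable_depth++;	/* pm_runtime_disable() */
	}
	return callback_error;
}

/* Models device_resume(): undo exactly what the suspend side did. */
void model_device_resume(struct model_dev *dev)
{
	if (!dev->is_suspended)
		return;			/* never suspended, nothing to undo */

	dev->usage_count++;		/* pm_runtime_get_noresume() */
	dev->disable_depth--;		/* pm_runtime_enable() */

	/* ... the resume callbacks would run here ... */

	dev->is_suspended = false;
	dev->usage_count--;		/* pm_runtime_put_noidle() */
}
```

With this arrangement a device whose suspend failed, or which was never
reached because the transition aborted, passes through the resume path
with its counters untouched.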

This is implemented by the patch below.

Thanks,
Rafael

---
 drivers/base/power/main.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -521,6 +521,9 @@ static int device_resume(struct device *
 	if (!dev->power.is_suspended)
 		goto Unlock;
 
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
+
 	if (dev->pwr_domain) {
 		pm_dev_dbg(dev, state, "power domain ");
 		error = pm_op(dev, &dev->pwr_domain->ops, state);
@@ -557,6 +560,7 @@ static int device_resume(struct device *
 
  End:
 	dev->power.is_suspended = false;
+	pm_runtime_put_noidle(dev);
 
  Unlock:
 	device_unlock(dev);
@@ -888,7 +892,10 @@ static int __device_suspend(struct devic
 	}
 
  End:
-	dev->power.is_suspended = !error;
+	if (!error) {
+		dev->power.is_suspended = true;
+		pm_runtime_disable(dev);
+	}
 
  Unlock:
 	device_unlock(dev);

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-18 21:01                                         ` Rafael J. Wysocki
@ 2011-06-18 23:57                                           ` Rafael J. Wysocki
  2011-06-19  1:42                                               ` Alan Stern
  2011-06-19  1:42                                             ` Alan Stern
  2011-06-18 23:57                                           ` Rafael J. Wysocki
  1 sibling, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-18 23:57 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm, LKML

On Saturday, June 18, 2011, Rafael J. Wysocki wrote:
> On Saturday, June 18, 2011, Alan Stern wrote:
> > On Sat, 18 Jun 2011, Rafael J. Wysocki wrote:
...
> 
> Well, assuming that https://patchwork.kernel.org/patch/893722/ is applied,
> which it is going to be, I think we can put
> 
> +       pm_runtime_get_noresume(dev);
> +       pm_runtime_enable(dev);
> 
> in device_resume() after the dev->power.is_suspended check and
> pm_runtime_put_noidle() under the End label.  That causes them to
> be called under the device lock, but that shouldn't be a big deal.
> 
> Accordingly, we can call pm_runtime_disable(dev) in __device_suspend(),
> right next to the setting of power.is_suspended.
> 
> This is implemented by the patch below.

Well, it hangs suspend on my Toshiba test box, I'm not sure why exactly.

This happens even if the pm_runtime_disable() is replaced with a version
that only increments the disable depth, so it looks like something down
the road relies on disable_depth being zero.  Which is worrisome.

Trying to figure out what the problem is, I noticed that, for example,
the generic PM operations use pm_runtime_suspended() to decide whether or
not to execute system suspend callbacks, so the patch below would break it.
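
A rough userspace model of that interaction follows (not kernel code; all
names are hypothetical).  It assumes that pm_runtime_suspended() reports
true only while the status is "suspended" and runtime PM is still enabled,
i.e. disable_depth is zero.  That is an assumption about the helper's
semantics, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

/* Hypothetical stand-in for the struct device fields discussed here. */
struct model_dev {
	enum model_rpm_status runtime_status;
	int disable_depth;	/* 0 means runtime PM is enabled */
	int callbacks_run;	/* how often the driver callback was invoked */
};

/* Assumed semantics: "suspended" only counts while runtime PM is enabled. */
bool model_pm_runtime_suspended(const struct model_dev *dev)
{
	return dev->runtime_status == MODEL_RPM_SUSPENDED &&
	       dev->disable_depth == 0;
}

/* Models a generic system PM op that skips runtime-suspended devices. */
int model_generic_call(struct model_dev *dev)
{
	if (model_pm_runtime_suspended(dev))
		return 0;	/* device already suspended, skip callback */

	dev->callbacks_run++;	/* driver callback touches the device */
	return 0;
}
```

Under that assumption, a generic callback invoked after runtime PM has
been disabled no longer skips a runtime-suspended device, so the driver
callback runs against hardware that is already powered down.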

Also, after commit e8665002477f0278f84f898145b1f141ba26ee26 the
pm_runtime_suspended() check in __pm_generic_call() doesn't really make
sense.

Thanks,
Rafael


> ---
>  drivers/base/power/main.c |    9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -521,6 +521,9 @@ static int device_resume(struct device *
>  	if (!dev->power.is_suspended)
>  		goto Unlock;
>  
> +	pm_runtime_get_noresume(dev);
> +	pm_runtime_enable(dev);
> +
>  	if (dev->pwr_domain) {
>  		pm_dev_dbg(dev, state, "power domain ");
>  		error = pm_op(dev, &dev->pwr_domain->ops, state);
> @@ -557,6 +560,7 @@ static int device_resume(struct device *
>  
>   End:
>  	dev->power.is_suspended = false;
> +	pm_runtime_put_noidle(dev);
>  
>   Unlock:
>  	device_unlock(dev);
> @@ -888,7 +892,10 @@ static int __device_suspend(struct devic
>  	}
>  
>   End:
> -	dev->power.is_suspended = !error;
> +	if (!error) {
> +		dev->power.is_suspended = true;
> +		pm_runtime_disable(dev);
> +	}
>  
>   Unlock:
>  	device_unlock(dev);

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-18 23:57                                           ` Rafael J. Wysocki
@ 2011-06-19  1:42                                               ` Alan Stern
  2011-06-19  1:42                                             ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-19  1:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm, LKML

On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:

> On Saturday, June 18, 2011, Rafael J. Wysocki wrote:
> > On Saturday, June 18, 2011, Alan Stern wrote:
> > > On Sat, 18 Jun 2011, Rafael J. Wysocki wrote:
> ...
> > 
> > Well, assuming that https://patchwork.kernel.org/patch/893722/ is applied,
> > which it is going to be, I think we can put
> > 
> > +       pm_runtime_get_noresume(dev);
> > +       pm_runtime_enable(dev);
> > 
> > in device_resume() after the dev->power.is_suspended check and
> > pm_runtime_put_noidle() under the End label.  That causes them to
> > be called under the device lock, but that shouldn't be a big deal.
> > 
> > Accordingly, we can call pm_runtime_disable(dev) in __device_suspend(),
> > right next to the setting of power.is_suspended.
> > 
> > This is implemented by the patch below.
> 
> Well, it hangs suspend on my Toshiba test box, I'm not sure why exactly.
> 
> This happens even if the pm_runtime_disable() is replaced with a version
> that only increments the disable depth, so it looks like something down
> the road relies on disable_depth being zero.  Which is worrisome.

This is a sign that the PM subsystem is getting a little too 
complicated.  :-(

> Trying to figure out what the problem is I noticed that, for example,
> the generic PM operations use pm_runtime_suspended() to decide whether or
> not to execute system suspend callbacks, so the patch below would break it.
> 
> Also, after commit e8665002477f0278f84f898145b1f141ba26ee26 the
> pm_runtime_suspended() check in __pm_generic_call() doesn't really make
> sense.

In light of the recent changes, we should revisit the decisions behind 
the generic PM operations.

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-19  1:42                                               ` Alan Stern
  (?)
  (?)
@ 2011-06-19 14:04                                               ` Rafael J. Wysocki
  2011-06-19 15:01                                                 ` Alan Stern
  2011-06-19 15:01                                                   ` Alan Stern
  -1 siblings, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-19 14:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm,
	LKML, Tejun Heo

On Sunday, June 19, 2011, Alan Stern wrote:
> On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:
> 
> > On Saturday, June 18, 2011, Rafael J. Wysocki wrote:
> > > On Saturday, June 18, 2011, Alan Stern wrote:
> > > > On Sat, 18 Jun 2011, Rafael J. Wysocki wrote:
> > ...
> > > 
> > > Well, assuming that https://patchwork.kernel.org/patch/893722/ is applied,
> > > which it is going to be, I think we can put
> > > 
> > > +       pm_runtime_get_noresume(dev);
> > > +       pm_runtime_enable(dev);
> > > 
> > > in device_resume() after the dev->power.is_suspended check and
> > > pm_runtime_put_noidle() under the End label.  That causes them to
> > > be called under the device lock, but that shouldn't be a big deal.
> > > 
> > > Accordingly, we can call pm_runtime_disable(dev) in __device_suspend(),
> > > right next to the setting of power.is_suspended.
> > > 
> > > This is implemented by the patch below.
> > 
> > Well, it hangs suspend on my Toshiba test box, I'm not sure why exactly.
> > 
> > This happens even if the pm_runtime_disable() is replaced with a version
> > that only increments the disable depth, so it looks like something down
> > the road relies on disable_depth being zero.  Which is worrisome.
> 
> This is a sign that the PM subsystem is getting a little too 
> complicated.  :-(

Well, that was kind of difficult to debug, but not impossible. :-)

The problem here turns out to be related to the SCSI subsystem.
Namely, when the AHCI controller is suspended, it uses the SCSI error
handling mechanism for scheduling the suspend operation (I'm still at a bit
of a loss as to why that is necessary, but Tejun says it is :-)).  This (after
several convoluted operations) causes scsi_error_handler() to be woken up, and
it calls scsi_autopm_get_host(shost), which returns an error code (-EAGAIN)
because runtime PM has been disabled at the host level.

This happens because scsi_autopm_get_host() uses
pm_runtime_get_sync(&shost->shost_gendev), which returns an error code when
shost_gendev.power.disable_depth is nonzero.

So, the problem is either in scsi_autopm_get_host(), which should check the
error code returned by pm_runtime_get_sync(), or in rpm_resume(), which should
return 0 if RPM_GET_PUT is set in flags.  I'm inclined to say that the
problem should be fixed in rpm_resume(), hence the appended patch, which
works (well, it probably should be split into three separate patches).
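
The effect of the rpm_resume() hunk in the patch below can be illustrated
with a minimal userspace sketch.  It is only a model: struct model_dev, the
model_* helpers and the MODEL_RPM_GET_PUT value are hypothetical stand-ins
for the kernel's fields and flag, and the real function does much more after
this early check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's RPM_GET_PUT flag (illustrative value). */
#define MODEL_RPM_GET_PUT 0x04

/* Hypothetical model of the few dev->power fields the check looks at. */
struct model_dev {
	int runtime_error;	/* models dev->power.runtime_error */
	int disable_depth;	/* models dev->power.disable_depth */
	int usage_count;	/* models dev->power.usage_count */
};

/* Early-exit check as it looked before the patch. */
static int model_rpm_resume_old(const struct model_dev *dev, int rpmflags)
{
	(void)rpmflags;		/* the old check ignored the flags */
	if (dev->runtime_error)
		return -EINVAL;
	if (dev->disable_depth > 0)
		return -EAGAIN;
	return 0;		/* the real code would go on to resume the device */
}

/* Early-exit check with the patch applied. */
static int model_rpm_resume_new(const struct model_dev *dev, int rpmflags)
{
	if (dev->runtime_error)
		return -EINVAL;
	if (dev->disable_depth > 0)
		return (rpmflags & MODEL_RPM_GET_PUT) ? 0 : -EAGAIN;
	return 0;
}

/* Models pm_runtime_get_sync(): take a reference, then try to resume. */
static int model_get_sync(struct model_dev *dev,
			  int (*resume)(const struct model_dev *, int))
{
	assert(dev && resume);
	dev->usage_count++;
	return resume(dev, MODEL_RPM_GET_PUT);
}
```

With disable_depth at 1, model_get_sync() fails with -EAGAIN through the old
check and returns 0 through the new one, while a call without the get/put
flag still sees -EAGAIN.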

> > Trying to figure out what the problem is I noticed that, for example,
> > the generic PM operations use pm_runtime_suspended() to decide whether or
> > not to execute system suspend callbacks, so the patch below would break it.
> > 
> > Also, after commit e8665002477f0278f84f898145b1f141ba26ee26 the
> > pm_runtime_suspended() check in __pm_generic_call() doesn't really make
> > sense.
> 
> In light of the recent changes, we should revisit the decisions behind 
> the generic PM operations.

Certainly, although the situation is not as bad as I thought, because
__pm_generic_call() is executed after the patch below disables runtime PM
during system suspend.  Still, the pm_runtime_suspended() check in there is
pointless.

Thanks,
Rafael

---
 drivers/base/power/main.c    |    9 ++++++++-
 drivers/base/power/runtime.c |   41 ++++++++++++++++++++++++-----------------
 include/linux/pm_runtime.h   |   13 ++++++++++---
 3 files changed, 42 insertions(+), 21 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -521,6 +521,9 @@ static int device_resume(struct device *
 	if (!dev->power.is_suspended)
 		goto Unlock;
 
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
+
 	if (dev->pwr_domain) {
 		pm_dev_dbg(dev, state, "power domain ");
 		error = pm_op(dev, &dev->pwr_domain->ops, state);
@@ -557,6 +560,7 @@ static int device_resume(struct device *
 
  End:
 	dev->power.is_suspended = false;
+	pm_runtime_put_noidle(dev);
 
  Unlock:
 	device_unlock(dev);
@@ -888,7 +892,10 @@ static int __device_suspend(struct devic
 	}
 
  End:
-	dev->power.is_suspended = !error;
+	if (!error) {
+		dev->power.is_suspended = true;
+		__pm_runtime_disable(dev, PRD_DEPTH);
+	}
 
  Unlock:
 	device_unlock(dev);
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- linux-2.6.orig/drivers/base/power/runtime.c
+++ linux-2.6/drivers/base/power/runtime.c
@@ -455,12 +455,14 @@ static int rpm_resume(struct device *dev
 	dev_dbg(dev, "%s flags 0x%x\n", __func__, rpmflags);
 
  repeat:
-	if (dev->power.runtime_error)
+	if (dev->power.runtime_error) {
 		retval = -EINVAL;
-	else if (dev->power.disable_depth > 0)
-		retval = -EAGAIN;
-	if (retval)
 		goto out;
+	} else if (dev->power.disable_depth > 0) {
+		if (!(rpmflags & RPM_GET_PUT))
+			retval = -EAGAIN;
+		goto out;
+	}
 
 	/*
 	 * Other scheduled or pending requests need to be canceled.  Small
@@ -965,18 +967,23 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
 /**
  * __pm_runtime_disable - Disable run-time PM of a device.
  * @dev: Device to handle.
- * @check_resume: If set, check if there's a resume request for the device.
+ * @level: How much to do.
+ *
+ * Increment power.disable_depth for the device.
+ *
+ * If @level is at least PRD_BARRIER, additionally cancel all pending run-time
+ * PM requests for the device and wait for all operations in progress to
+ * complete.
+ *
+ * If @level is at least PRD_CHECK_RESUME and there's a resume request pending
+ * when this function is called, and power.disable_depth is zero, the device
+ * will be woken up before disabling its run-time PM.
+ *
+ * The device can be either active or suspended after its run-time PM has been
+ * disabled.
  *
- * Increment power.disable_depth for the device and if was zero previously,
- * cancel all pending run-time PM requests for the device and wait for all
- * operations in progress to complete.  The device can be either active or
- * suspended after its run-time PM has been disabled.
- *
- * If @check_resume is set and there's a resume request pending when
- * __pm_runtime_disable() is called and power.disable_depth is zero, the
- * function will wake up the device before disabling its run-time PM.
  */
-void __pm_runtime_disable(struct device *dev, bool check_resume)
+void __pm_runtime_disable(struct device *dev, enum prd_level level)
 {
 	spin_lock_irq(&dev->power.lock);
 
@@ -990,7 +997,7 @@ void __pm_runtime_disable(struct device
 	 * means there probably is some I/O to process and disabling run-time PM
 	 * shouldn't prevent the device from processing the I/O.
 	 */
-	if (check_resume && dev->power.request_pending
+	if (level >= PRD_CHECK_RESUME && dev->power.request_pending
 	    && dev->power.request == RPM_REQ_RESUME) {
 		/*
 		 * Prevent suspends and idle notifications from being carried
@@ -1003,7 +1010,7 @@ void __pm_runtime_disable(struct device
 		pm_runtime_put_noidle(dev);
 	}
 
-	if (!dev->power.disable_depth++)
+	if (!dev->power.disable_depth++ && level >= PRD_BARRIER)
 		__pm_runtime_barrier(dev);
 
  out:
@@ -1230,7 +1237,7 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_remove(struct device *dev)
 {
-	__pm_runtime_disable(dev, false);
+	__pm_runtime_disable(dev, PRD_BARRIER);
 
 	/* Change the status back to 'suspended' to match the initial status. */
 	if (dev->power.runtime_status == RPM_ACTIVE)
Index: linux-2.6/include/linux/pm_runtime.h
===================================================================
--- linux-2.6.orig/include/linux/pm_runtime.h
+++ linux-2.6/include/linux/pm_runtime.h
@@ -22,6 +22,13 @@
 					    usage_count */
 #define RPM_AUTO		0x08	/* Use autosuspend_delay */
 
+/* Runtime PM disable levels */
+enum prd_level {
+	PRD_DEPTH = 0,		/* Only increment disable depth */
+	PRD_BARRIER,		/* Additionally, act as a runtime PM barrier */
+	PRD_CHECK_RESUME,	/* Additionally, check if resume is pending */
+};
+
 #ifdef CONFIG_PM_RUNTIME
 
 extern struct workqueue_struct *pm_wq;
@@ -33,7 +40,7 @@ extern int pm_schedule_suspend(struct de
 extern int __pm_runtime_set_status(struct device *dev, unsigned int status);
 extern int pm_runtime_barrier(struct device *dev);
 extern void pm_runtime_enable(struct device *dev);
-extern void __pm_runtime_disable(struct device *dev, bool check_resume);
+extern void __pm_runtime_disable(struct device *dev, enum prd_level level);
 extern void pm_runtime_allow(struct device *dev);
 extern void pm_runtime_forbid(struct device *dev);
 extern int pm_generic_runtime_idle(struct device *dev);
@@ -119,7 +126,7 @@ static inline int __pm_runtime_set_statu
 					    unsigned int status) { return 0; }
 static inline int pm_runtime_barrier(struct device *dev) { return 0; }
 static inline void pm_runtime_enable(struct device *dev) {}
-static inline void __pm_runtime_disable(struct device *dev, bool c) {}
+static inline void __pm_runtime_disable(struct device *dev, enum prd_level l) {}
 static inline void pm_runtime_allow(struct device *dev) {}
 static inline void pm_runtime_forbid(struct device *dev) {}
 
@@ -232,7 +239,7 @@ static inline void pm_runtime_set_suspen
 
 static inline void pm_runtime_disable(struct device *dev)
 {
-	__pm_runtime_disable(dev, true);
+	__pm_runtime_disable(dev, PRD_CHECK_RESUME);
 }
 
 static inline void pm_runtime_use_autosuspend(struct device *dev)
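
The three disable levels introduced by the patch above can be sketched in
userspace as follows.  This is a simplified model under stated assumptions:
struct model_dev and its counters are hypothetical, locking is omitted, the
"already disabled" early path is assumed from the existing function rather
than shown in the hunks, and the modelled barrier is assumed to cancel any
pending request.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum added to include/linux/pm_runtime.h by the patch. */
enum prd_level {
	PRD_DEPTH = 0,		/* only increment disable depth */
	PRD_BARRIER,		/* additionally, act as a runtime PM barrier */
	PRD_CHECK_RESUME,	/* additionally, check if resume is pending */
};

/* Hypothetical userspace model of the relevant device state. */
struct model_dev {
	int disable_depth;	/* models dev->power.disable_depth */
	bool resume_pending;	/* models a pending RPM_REQ_RESUME request */
	int resumes_done;	/* counts modelled wake-ups */
	int barriers_done;	/* counts modelled __pm_runtime_barrier() calls */
};

static void model_runtime_disable(struct model_dev *dev, enum prd_level level)
{
	assert(dev);

	/* Already disabled: only the depth changes (assumed early path). */
	if (dev->disable_depth > 0) {
		dev->disable_depth++;
		return;
	}

	/* Highest level only: honor a pending resume before disabling. */
	if (level >= PRD_CHECK_RESUME && dev->resume_pending) {
		dev->resume_pending = false;
		dev->resumes_done++;
	}

	/* First disable at PRD_BARRIER or above also acts as a barrier. */
	if (!dev->disable_depth++ && level >= PRD_BARRIER) {
		dev->resume_pending = false;	/* pending requests are canceled */
		dev->barriers_done++;
	}
}
```

In this model PRD_DEPTH leaves a pending resume request untouched, which is
the behavior __device_suspend() gets from the patch.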

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-19 14:04                                               ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-19 15:01                                                   ` Alan Stern
  2011-06-19 15:01                                                   ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-19 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm,
	LKML, Tejun Heo

On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:

> Well, that was kind of difficult to debug, but not impossible. :-)
> 
> The problem here turns out to be related to the SCSI subsystem.
> Namely, when the AHCI controller is suspended, it uses the SCSI error
> handling mechanism for scheduling the suspend operation (I'm still at a little
> loss why that is necessary, but Tejun says it is :-)).  This (after several
> convoluted operations) causes scsi_error_handler() to be woken up and
> it calls scsi_autopm_get_host(shost), which returns error code (-EAGAIN),
> because the runtime PM has been disabled at the host level.

Oh no.  I was afraid something like this was going to happen 
eventually.

It's clear that we don't want runtime PM kicking in while the SCSI 
error handler is running.  That's why I added the 
scsi_autopm_get_host().  But this also means we will run into trouble 
if the error handler needs to be used during a power transition.

> This happens because scsi_autopm_get_host() uses
> pm_runtime_get_sync(&shost->shost_gendev) and returns error code when
> shost_gendev.power.disable_depth is nonzero.

Maybe get_sync doesn't need to return an error if the runtime status is 
already ACTIVE.  I'm not sure about this; it's just an idea...

> So, the problem is either in scsi_autopm_get_host(), which should check the
> error code returned by pm_runtime_get_sync(), or in rpm_resume(), which should
> return 0 if RPM_GET_PUT is set in flags.  I'm inclined to say that the
> problem should be fixed in rpm_resume(), hence the appended patch, which
> works (well, it probably should be split into three separate patches).

Maybe it would be good enough if the error handler ended up doing a 
get_noresume instead of get_sync?  Although I could be wrong, I don't 
think scsi_error_handler() will ever run in a situation where the host 
adapter is not runtime-active.
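
The difference between the two calls can be sketched with a small userspace
model.  Everything here is hypothetical and simplified: struct model_host and
the model_* functions only imitate the reference counting, and the
get_sync-based helper is an assumed shape for scsi_autopm_get_host(), not a
copy of it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the host's runtime PM bookkeeping. */
struct model_host {
	int usage_count;	/* models power.usage_count */
	int disable_depth;	/* models power.disable_depth */
};

/* Models pm_runtime_get_sync() without the rpm_resume() change. */
static int model_get_sync(struct model_host *h)
{
	assert(h);
	h->usage_count++;
	return h->disable_depth > 0 ? -EAGAIN : 0;
}

/* Models pm_runtime_get_noresume(): only takes a reference, cannot fail. */
static void model_get_noresume(struct model_host *h)
{
	assert(h);
	h->usage_count++;
}

/* Assumed shape of a get_sync-based helper: drop the reference on failure. */
static int model_autopm_get_host_sync(struct model_host *h)
{
	int err = model_get_sync(h);

	if (err < 0)
		h->usage_count--;
	return err;
}

/* The get_noresume idea: pin the host without touching its power state. */
static int model_autopm_get_host_noresume(struct model_host *h)
{
	model_get_noresume(h);
	return 0;
}
```

With runtime PM disabled, the get_sync variant fails and leaves no reference
behind, whereas the get_noresume variant still pins the host.  That is only
safe if the host is already runtime-active whenever the error handler runs.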

Tejun, does that sound right to you?

Alan Stern


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: calling runtime PM from system PM methods
  2011-06-19 14:04                                               ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-19 15:01                                                 ` Alan Stern
  2011-06-19 15:01                                                   ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-19 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: LKML, Tejun Heo, linux-pm, linux-omap

On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:

> Well, that was kind of difficult to debug, but not impossible. :-)
> 
> The problem here turns out to be related to the SCSI subsystem.
> Namely, when the AHCI controller is suspended, it uses the SCSI error
> handling mechanism for scheduling the suspend operation (I'm still at a little
> loss why that is necessary, but Tejun says it is :-)).  This (after several
> convoluted operations) causes scsi_error_handler() to be woken up and
> it calls scsi_autopm_get_host(shost), which returns error code (-EAGAIN),
> because the runtime PM has been disabled at the host level.

Oh no.  I was afraid something like this was going to happen 
eventually.

It's clear that we don't want runtime PM kicking in while the SCSI 
error handler is running.  That's why I added the 
scsi_autopm_get_host().  But this also means we will run into trouble 
if the error handler needs to be used during a power transition.

> This happens because scsi_autopm_get_host() uses
> pm_runtime_get_sync(&shost->shost_gendev) and returns error code when
> shost_gendev.power.disable_depth is nonzero.

Maybe get_sync doesn't need to return an error if the runtime status is 
already ACTIVE.  I'm not sure about this; it's just an idea...

> So, the problem is either in scsi_autopm_get_host() that should check the
> error code returned by pm_runtime_get_sync(), or in rpm_suspend() that should
> return 0 if RPM_GET_PUT is set in flags.  I'm inclined to say that the
> problem should be fixed in rpm_suspend() and hence the appended patch that
> works (well, it probably should be split into three separate patches).

Maybe it would be good enough if the error handler ended up doing a 
get_noresume instead of get_sync?  Although I could be wrong, I don't 
think scsi_error_handler() will ever run in a situation where the host 
adapter is not runtime-active.

Tejun, does that sound right to you?

Alan Stern

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [linux-pm] calling runtime PM from system PM methods
@ 2011-06-19 15:01                                                   ` Alan Stern
  0 siblings, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-19 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm,
	LKML, Tejun Heo

On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:

> Well, that was kind of difficult to debug, but not impossible. :-)
> 
> The problem here turns out to be related to the SCSI subsystem.
> Namely, when the AHCI controller is suspended, it uses the SCSI error
> handling mechanism for scheduling the suspend operation (I'm still at a little
> loss why that is necessary, but Tejun says it is :-)).  This (after several
> convoluted operations) causes scsi_error_handler() to be woken up and
> it calls scsi_autopm_get_host(shost), which returns error code (-EAGAIN),
> because the runtime PM has been disabled at the host level.

Oh no.  I was afraid something like this was going to happen 
eventually.

It's clear that we don't want runtime PM kicking in while the SCSI 
error handler is running.  That's why I added the 
scsi_autopm_get_host().  But this also means we will run into trouble 
if the error handler needs to be used during a power transition.

> This happens because scsi_autopm_get_host() uses
> pm_runtime_get_sync(&shost->shost_gendev) and returns error code when
> shost_gendev.power.disable_depth is nonzero.

Maybe get_sync doesn't need to return an error if the runtime status is 
already ACTIVE.  I'm not sure about this; it's just an idea...

> So, the problem is either in scsi_autopm_get_host(), which should check the
> error code returned by pm_runtime_get_sync(), or in rpm_resume(), which should
> return 0 if RPM_GET_PUT is set in flags.  I'm inclined to say that the
> problem should be fixed in rpm_resume(), hence the appended patch, which
> works (well, it probably should be split into three separate patches).

Maybe it would be good enough if the error handler ended up doing a 
get_noresume instead of get_sync?  Although I could be wrong, I don't 
think scsi_error_handler() will ever run in a situation where the host 
adapter is not runtime-active.

Tejun, does that sound right to you?

Alan Stern
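
The difference between the two calls mentioned above can be sketched in
plain userspace C.  This is only a simplified model, not kernel code:
struct model_dev and the model_*() helpers are hypothetical stand-ins for
struct device and for pm_runtime_get_sync(), pm_runtime_get_noresume() and
pm_runtime_put_noidle().  It assumes the behavior described in this thread,
namely that get_sync takes a reference and then fails with -EAGAIN while
runtime PM is disabled, whereas get_noresume only takes the reference.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the runtime PM fields of struct device. */
struct model_dev {
	int usage_count;	/* models dev->power.usage_count */
	int disable_depth;	/* models dev->power.disable_depth */
};

/*
 * Models pm_runtime_get_sync() as discussed in this thread: the reference
 * is taken first, then the resume attempt fails with -EAGAIN while runtime
 * PM is disabled.  The reference is kept even when the call fails.
 */
static int model_get_sync(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
	if (dev->disable_depth > 0)
		return -EAGAIN;
	return 0;
}

/* Models pm_runtime_get_noresume(): takes a reference only, cannot fail. */
static void model_get_noresume(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
}

/* Models pm_runtime_put_noidle(): drops a reference without idling. */
static void model_put_noidle(struct model_dev *dev)
{
	assert(dev != NULL && dev->usage_count > 0);
	dev->usage_count--;
}
```

In this model an error handler that only needs to block runtime suspend
would pair model_get_noresume() with model_put_noidle() and never see an
error, which is only safe under the assumption stated above that the host
is already runtime-active whenever the handler runs.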



* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-19 15:01                                                   ` Alan Stern
  (?)
  (?)
@ 2011-06-19 19:36                                                   ` Rafael J. Wysocki
  2011-06-20 14:39                                                     ` Alan Stern
  2011-06-20 14:39                                                       ` Alan Stern
  -1 siblings, 2 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-19 19:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, linux-omap, Kevin Hilman, Paul Walmsley, Magnus Damm,
	LKML, Tejun Heo

On Sunday, June 19, 2011, Alan Stern wrote:
> On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:
> 
> > Well, that was kind of difficult to debug, but not impossible. :-)
> > 
> > The problem here turns out to be related to the SCSI subsystem.
> > Namely, when the AHCI controller is suspended, it uses the SCSI error
> > handling mechanism for scheduling the suspend operation (I'm still at a bit
> > of a loss as to why that is necessary, but Tejun says it is :-)).  This
> > (after several convoluted operations) causes scsi_error_handler() to be
> > woken up, and it calls scsi_autopm_get_host(shost), which returns an error
> > code (-EAGAIN) because runtime PM has been disabled at the host level.
> 
> Oh no.  I was afraid something like this was going to happen 
> eventually.
> 
> It's clear that we don't want runtime PM kicking in while the SCSI 
> error handler is running.  That's why I added the 
> scsi_autopm_get_host().  But this also means we will run into trouble 
> if the error handler needs to be used during a power transition.
> 
> > This happens because scsi_autopm_get_host() uses
> > pm_runtime_get_sync(&shost->shost_gendev) and returns error code when
> > shost_gendev.power.disable_depth is nonzero.
> 
> Maybe get_sync doesn't need to return an error if the runtime status is 
> already ACTIVE.  I'm not sure about this; it's just an idea...

Well, if disable_depth > 0, ACTIVE isn't really well defined.  As I said,
though, I think it makes sense for pm_runtime_get_sync() to return 0 when
disable_depth > 0, because the grabbing of a reference is successful anyway and
the caller may assume that the device is accessible in that case.
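
The proposed return-value change can be modeled in userspace as follows.
This is a sketch under stated assumptions, not the kernel implementation:
struct model_dev, MODEL_RPM_GET_PUT and the model_*() functions are
hypothetical names, and only the entry check of rpm_resume() is modeled
(the actual resume path is left out).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RPM_GET_PUT 0x04	/* hypothetical flag value used by this model */

/* Hypothetical stand-in for the relevant dev->power fields. */
struct model_dev {
	int usage_count;
	int disable_depth;
	bool runtime_error;
};

/* Entry check before the change: disabled runtime PM always means -EAGAIN. */
static int model_resume_check_old(const struct model_dev *dev, int rpmflags)
{
	(void)rpmflags;
	if (dev->runtime_error)
		return -EINVAL;
	if (dev->disable_depth > 0)
		return -EAGAIN;
	return 0;
}

/* Entry check after the change: callers passing GET_PUT get 0 instead. */
static int model_resume_check_new(const struct model_dev *dev, int rpmflags)
{
	if (dev->runtime_error)
		return -EINVAL;
	if (dev->disable_depth > 0)
		return (rpmflags & MODEL_RPM_GET_PUT) ? 0 : -EAGAIN;
	return 0;
}

/* Models pm_runtime_get_sync(): take the reference, then run the check. */
static int model_get_sync(struct model_dev *dev,
			  int (*check)(const struct model_dev *, int))
{
	assert(dev != NULL && check != NULL);
	dev->usage_count++;
	return check(dev, MODEL_RPM_GET_PUT);
}
```

With disable_depth set to 1, model_get_sync() returns -EAGAIN with the old
check and 0 with the new one, while a call without the GET_PUT flag still
gets -EAGAIN in both cases.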

In the meantime I rethought the __pm_runtime_disable() part of my previous
patch, and I now think there is no need to complicate it any further.  Of
course, we need not check whether a runtime resume is pending in
__device_suspend(), because we've already done that in dpm_prepare(), but the
barrier part had better be done in there too.

Updated patch is appended.

Thanks,
Rafael

---
 drivers/base/power/main.c    |    6 ++++++
 drivers/base/power/runtime.c |   10 ++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -521,6 +521,9 @@ static int device_resume(struct device *
 	if (!dev->power.is_suspended)
 		goto Unlock;
 
+	pm_runtime_get_noresume(dev);
+	pm_runtime_enable(dev);
+
 	if (dev->pwr_domain) {
 		pm_dev_dbg(dev, state, "power domain ");
 		error = pm_op(dev, &dev->pwr_domain->ops, state);
@@ -557,6 +560,7 @@ static int device_resume(struct device *
 
  End:
 	dev->power.is_suspended = false;
+	pm_runtime_put_noidle(dev);
 
  Unlock:
 	device_unlock(dev);
@@ -896,6 +900,8 @@ static int __device_suspend(struct devic
 
 	if (error)
 		async_error = error;
+	else if (dev->power.is_suspended)
+		__pm_runtime_disable(dev, false);
 
 	return error;
 }
Index: linux-2.6/drivers/base/power/runtime.c
===================================================================
--- linux-2.6.orig/drivers/base/power/runtime.c
+++ linux-2.6/drivers/base/power/runtime.c
@@ -455,12 +455,14 @@ static int rpm_resume(struct device *dev
 	dev_dbg(dev, "%s flags 0x%x\n", __func__, rpmflags);
 
  repeat:
-	if (dev->power.runtime_error)
+	if (dev->power.runtime_error) {
 		retval = -EINVAL;
-	else if (dev->power.disable_depth > 0)
-		retval = -EAGAIN;
-	if (retval)
 		goto out;
+	} else if (dev->power.disable_depth > 0) {
+		if (!(rpmflags & RPM_GET_PUT))
+			retval = -EAGAIN;
+		goto out;
+	}
 
 	/*
 	 * Other scheduled or pending requests need to be canceled.  Small
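
The main.c hunks above pair a disable on the suspend side with an enable on
the resume side, bracketed by a reference so the device cannot be
runtime-suspended while its resume callbacks run.  A minimal userspace
sketch of that pairing follows; struct model_dev and the model_*()
functions are hypothetical, and the callbacks themselves are reduced to an
error code passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant dev->power fields. */
struct model_dev {
	int usage_count;
	int disable_depth;
	bool is_suspended;
};

/* Models the tail of __device_suspend(); cb_error is the callbacks' result. */
static int model_device_suspend(struct model_dev *dev, int cb_error)
{
	dev->is_suspended = (cb_error == 0);
	if (!cb_error && dev->is_suspended)
		dev->disable_depth++;	/* __pm_runtime_disable(dev, false) */
	return cb_error;
}

/* Models device_resume() with the added get/enable/put calls. */
static void model_device_resume(struct model_dev *dev)
{
	if (!dev->is_suspended)
		return;
	dev->usage_count++;		/* pm_runtime_get_noresume() */
	assert(dev->disable_depth > 0);
	dev->disable_depth--;		/* pm_runtime_enable() */
	/* The resume callbacks would run here. */
	dev->is_suspended = false;
	dev->usage_count--;		/* pm_runtime_put_noidle() */
}
```

In this model a successful suspend followed by a resume leaves both
counters where they started, and a failed suspend never touches
disable_depth, so the enable in the resume path stays balanced.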


* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-19 19:36                                                   ` [linux-pm] " Rafael J. Wysocki
@ 2011-06-20 14:39                                                       ` Alan Stern
  2011-06-20 14:39                                                       ` Alan Stern
  1 sibling, 0 replies; 118+ messages in thread
From: Alan Stern @ 2011-06-20 14:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux-pm mailing list, linux-omap, Kevin Hilman, Paul Walmsley,
	Magnus Damm, LKML, Tejun Heo

On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:

> In the meantime I rethought the __pm_runtime_disable() part of my previous
> patch and I now think it's not necessary to complicate it any more.  Of course,
> we need not check if runtime resume is pending in __device_suspend(), because
> we've done it already in dpm_prepare(), but the barrier part should better be
> done in there too.

Does this really make sense?  What use is a barrier in dpm_prepare() if 
runtime PM is allowed to continue functioning up to the 
suspend callback?

As I see it, we never want a suspend or suspend_noirq callback to call 
pm_runtime_suspend().  However it's okay for the suspend callback to 
invoke pm_runtime_resume(), as long as this is all done in subsystem 
code.

And in between the prepare and suspend callbacks, runtime PM should be
more or less fully functional, right?  For most devices it will never
be triggered, because it has to run in process context and both
userspace and pm_wq are frozen.  It may trigger for devices marked as
IRQ-safe, though.

Maybe the barrier should be moved into __device_suspend().

Alan Stern



* Re: [linux-pm] calling runtime PM from system PM methods
  2011-06-20 14:39                                                       ` Alan Stern
  (?)
  (?)
@ 2011-06-20 19:53                                                       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 118+ messages in thread
From: Rafael J. Wysocki @ 2011-06-20 19:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, linux-omap, Kevin Hilman, Paul Walmsley,
	Magnus Damm, LKML, Tejun Heo

On Monday, June 20, 2011, Alan Stern wrote:
> On Sun, 19 Jun 2011, Rafael J. Wysocki wrote:
> 
> > In the meantime I rethought the __pm_runtime_disable() part of my previous
> > patch and I now think it's not necessary to complicate it any more.  Of course,
> > we need not check if runtime resume is pending in __device_suspend(), because
> > we've done it already in dpm_prepare(), but the barrier part should better be
> > done in there too.
> 
> Does this really make sense?  What use is a barrier in dpm_prepare() if 
> runtime PM is allowed to continue functioning up to the 
> suspend callback?

It checks if a resume request is pending and executes runtime resume in that
case.
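
The barrier behavior described here can be sketched as a small userspace
model.  It is an illustration only: the enum and struct names are
hypothetical, and the model assumes that the barrier runs a pending resume
request immediately and discards any other pending request, without
modeling the wait for callbacks that are already in progress.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and status values for this model. */
enum model_req { MODEL_REQ_NONE, MODEL_REQ_SUSPEND, MODEL_REQ_RESUME };
enum model_status { MODEL_SUSPENDED, MODEL_ACTIVE };

struct model_dev {
	enum model_status status;
	enum model_req pending;
};

/*
 * Models the barrier: a pending resume request is executed right away,
 * any other pending request is simply cancelled.  Returns 1 if a resume
 * request was pending, 0 otherwise.
 */
static int model_barrier(struct model_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (dev->pending == MODEL_REQ_RESUME) {
		dev->status = MODEL_ACTIVE;
		ret = 1;
	}
	dev->pending = MODEL_REQ_NONE;
	return ret;
}
```

A nonzero return in this model corresponds to the case where the device
was about to be woken up, which is the situation the check in
dpm_prepare() is meant to catch.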

> As I see it, we never want a suspend or suspend_noirq callback to call 
> pm_runtime_suspend().  However it's okay for the suspend callback to 
> invoke pm_runtime_resume(), as long as this is all done in subsystem 
> code.

First off, I don't really see a reason for a subsystem to call
pm_runtime_resume() from its .suspend_noirq() callback.  Now, if
pm_runtime_resume() is to be called concurrently with the subsystem's
.suspend_noirq() callback, I'd rather not let that happen. :-)

> And in between the prepare and suspend callbacks, runtime PM should be
> more or less fully functional, right?  For most devices it will never
> be triggered, because it has to run in process context and both
> userspace and pm_wq are frozen.  It may trigger for devices marked as
> IRQ-safe, though.

It also may trigger for drivers using non-freezable workqueues and calling
runtime PM synchronously from there.

> Maybe the barrier should be moved into __device_suspend().

I _really_ think that the initial approach, i.e. before commit
e8665002477f0278f84f898145b1f141ba26ee26, made the most sense.  It didn't
cover the "pm_runtime_resume() called during system suspend" case, but
it did cover everything else.

So, I think there are serious technical arguments for reverting that commit.

I think we went really far trying to avoid that, but I'm not sure I want to go
any further.

Thanks,
Rafael

