All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-05  1:17 Huang Ying
  2012-11-05  1:56   ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-05  1:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-kernel, linux-pm, Huang Ying

In current runtime PM implementation, the active child count of the
parent device may be decreased if the runtime PM of the child device
is disabled and forbidden.  For example, to unbind a PCI driver with a
PCI device, the following code path is possible:

  pci_device_remove
    pm_runtime_set_suspended
      __pm_runtime_set_status
        atomic_add_unless(&parent->power.child_count, -1, 0)

That is, the parent device may be suspended, even if the runtime PM of
child device is forbidden to be suspended.  This violate the rule that
parent is allowed to be suspended only after all its children are
suspended, and may cause issue.

This issue is fixed via checking the usage count of the child device
because if the runtime PM of the child device is forbidden, the
usage_count of the child device will be non-zero.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/base/power/runtime.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -916,7 +916,7 @@ int __pm_runtime_set_status(struct devic
 
 	if (status == RPM_SUSPENDED) {
 		/* It always is possible to set the status to 'suspended'. */
-		if (parent) {
+		if (parent && atomic_read(&dev->power.usage_count) == 0) {
 			atomic_add_unless(&parent->power.child_count, -1, 0);
 			notify_parent = !parent->power.ignore_children;
 		}

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-05  1:17 [BUGFIX] PM: Fix active child counting when disabled and forbidden Huang Ying
@ 2012-11-05  1:56   ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-05  1:56 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 5 Nov 2012, Huang Ying wrote:

> In current runtime PM implementation, the active child count of the
> parent device may be decreased if the runtime PM of the child device
> is disabled and forbidden.  For example, to unbind a PCI driver with a
> PCI device, the following code path is possible:
> 
>   pci_device_remove
>     pm_runtime_set_suspended
>       __pm_runtime_set_status
>         atomic_add_unless(&parent->power.child_count, -1, 0)
> 
> That is, the parent device may be suspended, even if the runtime PM of
> child device is forbidden to be suspended.  This violate the rule that
> parent is allowed to be suspended only after all its children are
> suspended, and may cause issue.

This doesn't sound like a correct description of the situation.  The 
rule is not violated.  After pm_runtime_set_suspended runs, the child 
_is_ suspended.  Thus there's no reason not to allow the parent to be 
suspended.

The problem -- if there really is one -- is that a driver can put a 
device into the suspended state by calling pm_runtime_disable followed 
by pm_runtime_set_suspended, even if the usage count is > 0.

I'm not so sure this should count as a problem.  Generally devices 
aren't disabled for runtime PM unless something is wrong.  Under those 
circumstances, the meaning of pm_runtime_forbid isn't very well 
defined.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-05  1:56   ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-05  1:56 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 5 Nov 2012, Huang Ying wrote:

> In current runtime PM implementation, the active child count of the
> parent device may be decreased if the runtime PM of the child device
> is disabled and forbidden.  For example, to unbind a PCI driver with a
> PCI device, the following code path is possible:
> 
>   pci_device_remove
>     pm_runtime_set_suspended
>       __pm_runtime_set_status
>         atomic_add_unless(&parent->power.child_count, -1, 0)
> 
> That is, the parent device may be suspended, even if the runtime PM of
> child device is forbidden to be suspended.  This violate the rule that
> parent is allowed to be suspended only after all its children are
> suspended, and may cause issue.

This doesn't sound like a correct description of the situation.  The 
rule is not violated.  After pm_runtime_set_suspended runs, the child 
_is_ suspended.  Thus there's no reason not to allow the parent to be 
suspended.

The problem -- if there really is one -- is that a driver can put a 
device into the suspended state by calling pm_runtime_disable followed 
by pm_runtime_set_suspended, even if the usage count is > 0.

I'm not so sure this should count as a problem.  Generally devices 
aren't disabled for runtime PM unless something is wrong.  Under those 
circumstances, the meaning of pm_runtime_forbid isn't very well 
defined.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-05  1:56   ` Alan Stern
  (?)
@ 2012-11-06  0:43   ` Huang Ying
  2012-11-06 15:17       ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-06  0:43 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Sun, 2012-11-04 at 20:56 -0500, Alan Stern wrote:
> On Mon, 5 Nov 2012, Huang Ying wrote:
> 
> > In current runtime PM implementation, the active child count of the
> > parent device may be decreased if the runtime PM of the child device
> > is disabled and forbidden.  For example, to unbind a PCI driver with a
> > PCI device, the following code path is possible:
> > 
> >   pci_device_remove
> >     pm_runtime_set_suspended
> >       __pm_runtime_set_status
> >         atomic_add_unless(&parent->power.child_count, -1, 0)
> > 
> > That is, the parent device may be suspended, even if the runtime PM of
> > child device is forbidden to be suspended.  This violate the rule that
> > parent is allowed to be suspended only after all its children are
> > suspended, and may cause issue.
> 
> This doesn't sound like a correct description of the situation.  The 
> rule is not violated.  After pm_runtime_set_suspended runs, the child 
> _is_ suspended.  Thus there's no reason not to allow the parent to be 
> suspended.
> 
> The problem -- if there really is one -- is that a driver can put a 
> device into the suspended state by calling pm_runtime_disable followed 
> by pm_runtime_set_suspended, even if the usage count is > 0.
> 
> I'm not so sure this should count as a problem.  Generally devices 
> aren't disabled for runtime PM unless something is wrong.

Devices will be disabled if the PCI driver is unbound from the PCI
device.

So I think the rule could be: even if the device is suspended, the
device can be put into suspended state only if its usage count == 0.  In
this way, we can solve the issue for PCI driver unbound and that looks
more reasonable.

Best Regards,
Huang Ying

>   Under those 
> circumstances, the meaning of pm_runtime_forbid isn't very well 
> defined.
> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-06  0:43   ` Huang Ying
@ 2012-11-06 15:17       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-06 15:17 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 6 Nov 2012, Huang Ying wrote:

> On Sun, 2012-11-04 at 20:56 -0500, Alan Stern wrote:
> > On Mon, 5 Nov 2012, Huang Ying wrote:
> > 
> > > In current runtime PM implementation, the active child count of the
> > > parent device may be decreased if the runtime PM of the child device
> > > is disabled and forbidden.  For example, to unbind a PCI driver with a
> > > PCI device, the following code path is possible:
> > > 
> > >   pci_device_remove
> > >     pm_runtime_set_suspended
> > >       __pm_runtime_set_status
> > >         atomic_add_unless(&parent->power.child_count, -1, 0)
> > > 
> > > That is, the parent device may be suspended, even if the runtime PM of
> > > child device is forbidden to be suspended.  This violate the rule that
> > > parent is allowed to be suspended only after all its children are
> > > suspended, and may cause issue.
> > 
> > This doesn't sound like a correct description of the situation.  The 
> > rule is not violated.  After pm_runtime_set_suspended runs, the child 
> > _is_ suspended.  Thus there's no reason not to allow the parent to be 
> > suspended.
> > 
> > The problem -- if there really is one -- is that a driver can put a 
> > device into the suspended state by calling pm_runtime_disable followed 
> > by pm_runtime_set_suspended, even if the usage count is > 0.
> > 
> > I'm not so sure this should count as a problem.  Generally devices 
> > aren't disabled for runtime PM unless something is wrong.
> 
> Devices will be disabled if the PCI driver is unbound from the PCI
> device.

Yes.  But without a PCI driver, nothing will call 
pm_runtime_set_suspended.  And even if something does call 
pm_runtime_set_suspended, it's still not a problem -- the device can't 
be used without a driver.

> So I think the rule could be: even if the device is suspended, the
> device can be put into suspended state only if its usage count == 0.  In
> this way, we can solve the issue for PCI driver unbound and that looks
> more reasonable.

You still have not shown that there really is a problem.  Do you have 
any particular use case in mind?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-06 15:17       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-06 15:17 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 6 Nov 2012, Huang Ying wrote:

> On Sun, 2012-11-04 at 20:56 -0500, Alan Stern wrote:
> > On Mon, 5 Nov 2012, Huang Ying wrote:
> > 
> > > In current runtime PM implementation, the active child count of the
> > > parent device may be decreased if the runtime PM of the child device
> > > is disabled and forbidden.  For example, to unbind a PCI driver with a
> > > PCI device, the following code path is possible:
> > > 
> > >   pci_device_remove
> > >     pm_runtime_set_suspended
> > >       __pm_runtime_set_status
> > >         atomic_add_unless(&parent->power.child_count, -1, 0)
> > > 
> > > That is, the parent device may be suspended, even if the runtime PM of
> > > child device is forbidden to be suspended.  This violate the rule that
> > > parent is allowed to be suspended only after all its children are
> > > suspended, and may cause issue.
> > 
> > This doesn't sound like a correct description of the situation.  The 
> > rule is not violated.  After pm_runtime_set_suspended runs, the child 
> > _is_ suspended.  Thus there's no reason not to allow the parent to be 
> > suspended.
> > 
> > The problem -- if there really is one -- is that a driver can put a 
> > device into the suspended state by calling pm_runtime_disable followed 
> > by pm_runtime_set_suspended, even if the usage count is > 0.
> > 
> > I'm not so sure this should count as a problem.  Generally devices 
> > aren't disabled for runtime PM unless something is wrong.
> 
> Devices will be disabled if the PCI driver is unbound from the PCI
> device.

Yes.  But without a PCI driver, nothing will call 
pm_runtime_set_suspended.  And even if something does call 
pm_runtime_set_suspended, it's still not a problem -- the device can't 
be used without a driver.

> So I think the rule could be: even if the device is suspended, the
> device can be put into suspended state only if its usage count == 0.  In
> this way, we can solve the issue for PCI driver unbound and that looks
> more reasonable.

You still have not shown that there really is a problem.  Do you have 
any particular use case in mind?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-06 15:17       ` Alan Stern
  (?)
@ 2012-11-07  0:26       ` Huang Ying
  2012-11-07 15:49           ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-07  0:26 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 2012-11-06 at 10:17 -0500, Alan Stern wrote:
> On Tue, 6 Nov 2012, Huang Ying wrote:
> 
> > On Sun, 2012-11-04 at 20:56 -0500, Alan Stern wrote:
> > > On Mon, 5 Nov 2012, Huang Ying wrote:
> > > 
> > > > In current runtime PM implementation, the active child count of the
> > > > parent device may be decreased if the runtime PM of the child device
> > > > is disabled and forbidden.  For example, to unbind a PCI driver with a
> > > > PCI device, the following code path is possible:
> > > > 
> > > >   pci_device_remove
> > > >     pm_runtime_set_suspended
> > > >       __pm_runtime_set_status
> > > >         atomic_add_unless(&parent->power.child_count, -1, 0)
> > > > 
> > > > That is, the parent device may be suspended, even if the runtime PM of
> > > > child device is forbidden to be suspended.  This violate the rule that
> > > > parent is allowed to be suspended only after all its children are
> > > > suspended, and may cause issue.
> > > 
> > > This doesn't sound like a correct description of the situation.  The 
> > > rule is not violated.  After pm_runtime_set_suspended runs, the child 
> > > _is_ suspended.  Thus there's no reason not to allow the parent to be 
> > > suspended.
> > > 
> > > The problem -- if there really is one -- is that a driver can put a 
> > > device into the suspended state by calling pm_runtime_disable followed 
> > > by pm_runtime_set_suspended, even if the usage count is > 0.
> > > 
> > > I'm not so sure this should count as a problem.  Generally devices 
> > > aren't disabled for runtime PM unless something is wrong.
> > 
> > Devices will be disabled if the PCI driver is unbound from the PCI
> > device.
> 
> Yes.  But without a PCI driver, nothing will call 
> pm_runtime_set_suspended.

pm_runtime_set_suspended will be called in pci_device_remove or error
path of local_pci_probe.

>   And even if something does call 
> pm_runtime_set_suspended, it's still not a problem -- the device can't 
> be used without a driver.

The VGA device can be used without a driver.

Best Regards,
Huang Ying


> > So I think the rule could be: even if the device is suspended, the
> > device can be put into suspended state only if its usage count == 0.  In
> > this way, we can solve the issue for PCI driver unbound and that looks
> > more reasonable.
> 
> You still have not shown that there really is a problem.  Do you have 
> any particular use case in mind?
> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07  0:26       ` Huang Ying
@ 2012-11-07 15:49           ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 15:49 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Huang Ying wrote:

> > > Devices will be disabled if the PCI driver is unbound from the PCI
> > > device.
> > 
> > Yes.  But without a PCI driver, nothing will call 
> > pm_runtime_set_suspended.
> 
> pm_runtime_set_suspended will be called in pci_device_remove or error
> path of local_pci_probe.

Yes, okay.  But that's what we want.  Unused, driverless PCI devices
shouldn't force their parents to remain at full power.

> >   And even if something does call 
> > pm_runtime_set_suspended, it's still not a problem -- the device can't 
> > be used without a driver.
> 
> The VGA device can be used without a driver.

Ah, right, that's your _real_ problem.  You should have mentioned this 
in the original Changelog for the patch.

Rafael, this does need to be fixed.  The PCI subsystem assumes that 
driverless devices are not in use, so they are disabled for runtime PM 
and marked as suspended.  This is not appropriate for VGA devices, 
which can indeed be used without a driver.

I'm not sure what the best solution is.  Maybe we should Ying's 
proposal a step farther:

	Make pm_runtime_set_suspended() fail if runtime PM is 
	forbidden.

	Make pm_runtime_forbid() call pm_runtime_set_active()
	(and do a runtime resume of the parent) if disable_depth > 0.

	Make the PCI runtime-idle routine call 
	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
	do this for all devices, in the runtime PM core.

What do you think?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-07 15:49           ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 15:49 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Huang Ying wrote:

> > > Devices will be disabled if the PCI driver is unbound from the PCI
> > > device.
> > 
> > Yes.  But without a PCI driver, nothing will call 
> > pm_runtime_set_suspended.
> 
> pm_runtime_set_suspended will be called in pci_device_remove or error
> path of local_pci_probe.

Yes, okay.  But that's what we want.  Unused, driverless PCI devices
shouldn't force their parents to remain at full power.

> >   And even if something does call 
> > pm_runtime_set_suspended, it's still not a problem -- the device can't 
> > be used without a driver.
> 
> The VGA device can be used without a driver.

Ah, right, that's your _real_ problem.  You should have mentioned this 
in the original Changelog for the patch.

Rafael, this does need to be fixed.  The PCI subsystem assumes that 
driverless devices are not in use, so they are disabled for runtime PM 
and marked as suspended.  This is not appropriate for VGA devices, 
which can indeed be used without a driver.

I'm not sure what the best solution is.  Maybe we should Ying's 
proposal a step farther:

	Make pm_runtime_set_suspended() fail if runtime PM is 
	forbidden.

	Make pm_runtime_forbid() call pm_runtime_set_active()
	(and do a runtime resume of the parent) if disable_depth > 0.

	Make the PCI runtime-idle routine call 
	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
	do this for all devices, in the runtime PM core.

What do you think?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 15:49           ` Alan Stern
  (?)
@ 2012-11-07 16:09           ` Rafael J. Wysocki
  2012-11-07 17:17               ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-07 16:09 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 07, 2012 10:49:04 AM Alan Stern wrote:
> On Wed, 7 Nov 2012, Huang Ying wrote:
> 
> > > > Devices will be disabled if the PCI driver is unbound from the PCI
> > > > device.
> > > 
> > > Yes.  But without a PCI driver, nothing will call 
> > > pm_runtime_set_suspended.
> > 
> > pm_runtime_set_suspended will be called in pci_device_remove or error
> > path of local_pci_probe.
> 
> Yes, okay.  But that's what we want.  Unused, driverless PCI devices
> shouldn't force their parents to remain at full power.
> 
> > >   And even if something does call 
> > > pm_runtime_set_suspended, it's still not a problem -- the device can't 
> > > be used without a driver.
> > 
> > The VGA device can be used without a driver.
> 
> Ah, right, that's your _real_ problem.  You should have mentioned this 
> in the original Changelog for the patch.
> 
> Rafael, this does need to be fixed.

Yup.

> The PCI subsystem assumes that 
> driverless devices are not in use, so they are disabled for runtime PM 
> and marked as suspended.  This is not appropriate for VGA devices, 
> which can indeed be used without a driver.
> 
> I'm not sure what the best solution is.  Maybe we should Ying's 
> proposal a step farther:
> 
> 	Make pm_runtime_set_suspended() fail if runtime PM is 
> 	forbidden.
>
> 	Make pm_runtime_forbid() call pm_runtime_set_active()
> 	(and do a runtime resume of the parent) if disable_depth > 0.

I'd prefer this one.  The callers of pm_runtime_forbid() may actually
reasonably expect something like this to happen.

> 	Make the PCI runtime-idle routine call 
> 	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
> 	do this for all devices, in the runtime PM core.

That would mean calling it on every call to pm_runtime_put() and friends
which seems to be a bit wasteful.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 16:09           ` Rafael J. Wysocki
@ 2012-11-07 17:17               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 17:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> > The PCI subsystem assumes that 
> > driverless devices are not in use, so they are disabled for runtime PM 
> > and marked as suspended.  This is not appropriate for VGA devices, 
> > which can indeed be used without a driver.
> > 
> > I'm not sure what the best solution is.  Maybe we should Ying's 
> > proposal a step farther:
> > 
> > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > 	forbidden.
> >
> > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > 	(and do a runtime resume of the parent) if disable_depth > 0.
> 
> I'd prefer this one.

That wasn't meant to be a choice.  The first item is close to what the 
original patch did; I was suggesting that we should adopt all three 
items.

If you adopt the second item but not the first then things won't work 
when a driver is removed after power/control is set to "on".

>  The callers of pm_runtime_forbid() may actually
> reasonably expect something like this to happen.
> 
> > 	Make the PCI runtime-idle routine call 
> > 	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
> > 	do this for all devices, in the runtime PM core.
> 
> That would mean calling it on every call to pm_runtime_put() and friends
> which seems to be a bit wasteful.

Not on every call; only when disable_depth > 0.  We don't expect to see 
idle calls very often under those circumstances.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-07 17:17               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 17:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> > The PCI subsystem assumes that 
> > driverless devices are not in use, so they are disabled for runtime PM 
> > and marked as suspended.  This is not appropriate for VGA devices, 
> > which can indeed be used without a driver.
> > 
> > I'm not sure what the best solution is.  Maybe we should Ying's 
> > proposal a step farther:
> > 
> > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > 	forbidden.
> >
> > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > 	(and do a runtime resume of the parent) if disable_depth > 0.
> 
> I'd prefer this one.

That wasn't meant to be a choice.  The first item is close to what the 
original patch did; I was suggesting that we should adopt all three 
items.

If you adopt the second item but not the first then things won't work 
when a driver is removed after power/control is set to "on".

>  The callers of pm_runtime_forbid() may actually
> reasonably expect something like this to happen.
> 
> > 	Make the PCI runtime-idle routine call 
> > 	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
> > 	do this for all devices, in the runtime PM core.
> 
> That would mean calling it on every call to pm_runtime_put() and friends
> which seems to be a bit wasteful.

Not on every call; only when disable_depth > 0.  We don't expect to see 
idle calls very often under those circumstances.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 17:17               ` Alan Stern
  (?)
@ 2012-11-07 20:21               ` Rafael J. Wysocki
  2012-11-07 20:47                   ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-07 20:21 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 07, 2012 12:17:02 PM Alan Stern wrote:
> On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> 
> > > The PCI subsystem assumes that 
> > > driverless devices are not in use, so they are disabled for runtime PM 
> > > and marked as suspended.  This is not appropriate for VGA devices, 
> > > which can indeed be used without a driver.
> > > 
> > > I'm not sure what the best solution is.  Maybe we should Ying's 
> > > proposal a step farther:
> > > 
> > > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > > 	forbidden.
> > >
> > > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > > 	(and do a runtime resume of the parent) if disable_depth > 0.
> > 
> > I'd prefer this one.
> 
> That wasn't meant to be a choice.  The first item is close to what the 
> original patch did; I was suggesting that we should adopt all three 
> items.

OK, I need to think about this a bit more, then.

The problem seems to be that our initial assumption, ie. that driverless
devices won't be in use, is not satisfied in the relevant case.  It may
not be satisfied in more cases like this, I suppose, but so far we  don't
really know.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 20:21               ` Rafael J. Wysocki
@ 2012-11-07 20:47                   ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 20:47 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> On Wednesday, November 07, 2012 12:17:02 PM Alan Stern wrote:
> > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > The PCI subsystem assumes that 
> > > > driverless devices are not in use, so they are disabled for runtime PM 
> > > > and marked as suspended.  This is not appropriate for VGA devices, 
> > > > which can indeed be used without a driver.
> > > > 
> > > > I'm not sure what the best solution is.  Maybe we should Ying's 
> > > > proposal a step farther:
> > > > 
> > > > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > > > 	forbidden.
> > > >
> > > > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > > > 	(and do a runtime resume of the parent) if disable_depth > 0.
> > > 
> > > I'd prefer this one.
> > 
> > That wasn't meant to be a choice.  The first item is close to what the 
> > original patch did; I was suggesting that we should adopt all three 
> > items.
> 
> OK, I need to think about this a bit more, then.
> 
> The problem seems to be that our initial assumption, ie. that driverless
> devices won't be in use, is not satisfied in the relevant case.  It may
> not be satisfied in more cases like this, I suppose, but so far we  don't
> really know.

Right.  The reasoning behind my proposal goes like this: When there's
no driver, the subsystem can let userspace directly control the
device's power level through the power/control attribute.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-07 20:47                   ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 20:47 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> On Wednesday, November 07, 2012 12:17:02 PM Alan Stern wrote:
> > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > The PCI subsystem assumes that 
> > > > driverless devices are not in use, so they are disabled for runtime PM 
> > > > and marked as suspended.  This is not appropriate for VGA devices, 
> > > > which can indeed be used without a driver.
> > > > 
> > > > I'm not sure what the best solution is.  Maybe we should Ying's 
> > > > proposal a step farther:
> > > > 
> > > > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > > > 	forbidden.
> > > >
> > > > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > > > 	(and do a runtime resume of the parent) if disable_depth > 0.
> > > 
> > > I'd prefer this one.
> > 
> > That wasn't meant to be a choice.  The first item is close to what the 
> > original patch did; I was suggesting that we should adopt all three 
> > items.
> 
> OK, I need to think about this a bit more, then.
> 
> The problem seems to be that our initial assumption, ie. that driverless
> devices won't be in use, is not satisfied in the relevant case.  It may
> not be satisfied in more cases like this, I suppose, but so far we  don't
> really know.

Right.  The reasoning behind my proposal goes like this: When there's
no driver, the subsystem can let userspace directly control the
device's power level through the power/control attribute.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 20:47                   ` Alan Stern
  (?)
@ 2012-11-07 21:44                   ` Rafael J. Wysocki
  2012-11-07 21:56                       ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-07 21:44 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 07, 2012 03:47:02 PM Alan Stern wrote:
> On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> 
> > On Wednesday, November 07, 2012 12:17:02 PM Alan Stern wrote:
> > > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > > 
> > > > > The PCI subsystem assumes that 
> > > > > driverless devices are not in use, so they are disabled for runtime PM 
> > > > > and marked as suspended.  This is not appropriate for VGA devices, 
> > > > > which can indeed be used without a driver.
> > > > > 
> > > > > I'm not sure what the best solution is.  Maybe we should Ying's 
> > > > > proposal a step farther:
> > > > > 
> > > > > 	Make pm_runtime_set_suspended() fail if runtime PM is 
> > > > > 	forbidden.
> > > > >
> > > > > 	Make pm_runtime_forbid() call pm_runtime_set_active()
> > > > > 	(and do a runtime resume of the parent) if disable_depth > 0.
> > > > 
> > > > I'd prefer this one.
> > > 
> > > That wasn't meant to be a choice.  The first item is close to what the 
> > > original patch did; I was suggesting that we should adopt all three 
> > > items.
> > 
> > OK, I need to think about this a bit more, then.
> > 
> > The problem seems to be that our initial assumption, ie. that driverless
> > devices won't be in use, is not satisfied in the relevant case.  It may
> > not be satisfied in more cases like this, I suppose, but so far we  don't
> > really know.
> 
> Right.  The reasoning behind my proposal goes like this: When there's
> no driver, the subsystem can let userspace directly control the
> device's power level through the power/control attribute.

Well, we might as well just leave the runtime PM of PCI devices enabled, even
if they have no drivers, but modify the PCI bus type's runtime PM callbacks
to ignore devices with no drivers.

IIRC the reason why we decided to disable runtime PM for PCI device with no
drivers was that some of them refused to work again after being put by the
core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
we'd avoid this problem without modifying the core's behavior.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 21:44                   ` Rafael J. Wysocki
@ 2012-11-07 21:56                       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 21:56 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> > Right.  The reasoning behind my proposal goes like this: When there's
> > no driver, the subsystem can let userspace directly control the
> > device's power level through the power/control attribute.
> 
> Well, we might as well just leave the runtime PM of PCI devices enabled, even
> if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> to ignore devices with no drivers.
> 
> IIRC the reason why we decided to disable runtime PM for PCI device with no
> drivers was that some of them refused to work again after being put by the
> core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> we'd avoid this problem without modifying the core's behavior.

It comes down to a question of the parent.  If a driverless PCI device
isn't being used, shouldn't its parent be allowed to go into runtime
suspend?  As things stand now, we do allow it.

The problem is that we don't disallow it when the driverless device
_is_ being used.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-07 21:56                       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-07 21:56 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:

> > Right.  The reasoning behind my proposal goes like this: When there's
> > no driver, the subsystem can let userspace directly control the
> > device's power level through the power/control attribute.
> 
> Well, we might as well just leave the runtime PM of PCI devices enabled, even
> if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> to ignore devices with no drivers.
> 
> IIRC the reason why we decided to disable runtime PM for PCI device with no
> drivers was that some of them refused to work again after being put by the
> core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> we'd avoid this problem without modifying the core's behavior.

It comes down to a question of the parent.  If a driverless PCI device
isn't being used, shouldn't its parent be allowed to go into runtime
suspend?  As things stand now, we do allow it.

The problem is that we don't disallow it when the driverless device
_is_ being used.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 21:56                       ` Alan Stern
  (?)
@ 2012-11-07 22:51                       ` Rafael J. Wysocki
  2012-11-07 23:09                         ` Rafael J. Wysocki
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-07 22:51 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 07, 2012 04:56:49 PM Alan Stern wrote:
> On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> 
> > > Right.  The reasoning behind my proposal goes like this: When there's
> > > no driver, the subsystem can let userspace directly control the
> > > device's power level through the power/control attribute.
> > 
> > Well, we might as well just leave the runtime PM of PCI devices enabled, even
> > if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> > to ignore devices with no drivers.
> > 
> > IIRC the reason why we decided to disable runtime PM for PCI device with no
> > drivers was that some of them refused to work again after being put by the
> > core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> > we'd avoid this problem without modifying the core's behavior.
> 
> It comes down to a question of the parent.  If a driverless PCI device
> isn't being used, shouldn't its parent be allowed to go into runtime
> suspend?  As things stand now, we do allow it.
> 
> The problem is that we don't disallow it when the driverless device
> _is_ being used.

We can make it depend on what's there in the control file.  Let's say if that's
"on" (ie. the devices usage counter is not zero), we won't allow the parent
to be suspended.

So, as I said, why don't we keep the runtime PM of PCI devices always enabled,
regardless of whether or not there is a driver, and arrange things in such a
way that the device is automatically "suspended" if user space writes "auto"
to the control file.  IOW, I suppose we can do something like this:

---
 drivers/pci/pci-driver.c |   34 ++++++++++++----------------------
 drivers/pci/pci.c        |    2 ++
 2 files changed, 14 insertions(+), 22 deletions(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -263,22 +263,17 @@ static long local_pci_probe(void *_ddi)
 	/* The parent bridge must be in active state when probing */
 	if (parent)
 		pm_runtime_get_sync(parent);
-	/* Unbound PCI devices are always set to disabled and suspended.
-	 * During probe, the device is set to enabled and active and the
-	 * usage count is incremented.  If the driver supports runtime PM,
-	 * it should call pm_runtime_put_noidle() in its probe routine and
+	/*
+	 * During probe, the device is set to active and the usage count is
+	 * incremented.  If the driver supports runtime PM, it should call
+	 * pm_runtime_put_noidle() in its probe routine and
 	 * pm_runtime_get_noresume() in its remove routine.
 	 */
-	pm_runtime_get_noresume(dev);
-	pm_runtime_set_active(dev);
-	pm_runtime_enable(dev);
-
+	pm_runtime_get_sync(dev);
 	rc = ddi->drv->probe(ddi->dev, ddi->id);
-	if (rc) {
-		pm_runtime_disable(dev);
-		pm_runtime_set_suspended(dev);
-		pm_runtime_put_noidle(dev);
-	}
+	if (rc)
+		pm_runtime_put_sync(dev);
+
 	if (parent)
 		pm_runtime_put(parent);
 	return rc;
@@ -369,9 +364,7 @@ static int pci_device_remove(struct devi
 	}
 
 	/* Undo the runtime PM settings in local_pci_probe() */
-	pm_runtime_disable(dev);
-	pm_runtime_set_suspended(dev);
-	pm_runtime_put_noidle(dev);
+	pm_runtime_put_sync(dev);
 
 	/*
 	 * If the device is still on, set the power state as "unknown",
@@ -1003,7 +996,7 @@ static int pci_pm_runtime_suspend(struct
 	int error;
 
 	if (!pm || !pm->runtime_suspend)
-		return -ENOSYS;
+		return 0;
 
 	pci_dev->no_d3cold = false;
 	error = pm->runtime_suspend(dev);
@@ -1038,7 +1031,7 @@ static int pci_pm_runtime_resume(struct
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
 	if (!pm || !pm->runtime_resume)
-		return -ENOSYS;
+		return 0;
 
 	pci_restore_standard_config(pci_dev);
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
@@ -1056,10 +1049,7 @@ static int pci_pm_runtime_idle(struct de
 {
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
-	if (!pm)
-		return -ENOSYS;
-
-	if (pm->runtime_idle) {
+	if (pm && pm->runtime_idle) {
 		int ret = pm->runtime_idle(dev);
 		if (ret)
 			return ret;
Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -1868,6 +1868,8 @@ void pci_pm_init(struct pci_dev *dev)
 	u16 pmc;
 
 	pm_runtime_forbid(&dev->dev);
+	pm_runtime_set_active(dev);
+	pm_runtime_enable(&dev->dev);
 	device_enable_async_suspend(&dev->dev);
 	dev->wakeup_prepared = false;
 



-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 22:51                       ` Rafael J. Wysocki
@ 2012-11-07 23:09                         ` Rafael J. Wysocki
  2012-11-08  1:15                           ` Huang Ying
  0 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-07 23:09 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 07, 2012 11:51:15 PM Rafael J. Wysocki wrote:
> On Wednesday, November 07, 2012 04:56:49 PM Alan Stern wrote:
> > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > Right.  The reasoning behind my proposal goes like this: When there's
> > > > no driver, the subsystem can let userspace directly control the
> > > > device's power level through the power/control attribute.
> > > 
> > > Well, we might as well just leave the runtime PM of PCI devices enabled, even
> > > if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> > > to ignore devices with no drivers.
> > > 
> > > IIRC the reason why we decided to disable runtime PM for PCI device with no
> > > drivers was that some of them refused to work again after being put by the
> > > core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> > > we'd avoid this problem without modifying the core's behavior.
> > 
> > It comes down to a question of the parent.  If a driverless PCI device
> > isn't being used, shouldn't its parent be allowed to go into runtime
> > suspend?  As things stand now, we do allow it.
> > 
> > The problem is that we don't disallow it when the driverless device
> > _is_ being used.
> 
> We can make it depend on what's there in the control file.  Let's say if that's
> "on" (ie. the devices usage counter is not zero), we won't allow the parent
> to be suspended.
> 
> So, as I said, why don't we keep the runtime PM of PCI devices always enabled,
> regardless of whether or not there is a driver, and arrange things in such a
> way that the device is automatically "suspended" if user space writes "auto"
> to the control file.  IOW, I suppose we can do something like this:

It probably is better to treat the "no driver" case in a special way, though:

---
 drivers/pci/pci-driver.c |   45 +++++++++++++++++++++++++--------------------
 drivers/pci/pci.c        |    2 ++
 2 files changed, 27 insertions(+), 20 deletions(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -263,22 +263,17 @@ static long local_pci_probe(void *_ddi)
 	/* The parent bridge must be in active state when probing */
 	if (parent)
 		pm_runtime_get_sync(parent);
-	/* Unbound PCI devices are always set to disabled and suspended.
-	 * During probe, the device is set to enabled and active and the
-	 * usage count is incremented.  If the driver supports runtime PM,
-	 * it should call pm_runtime_put_noidle() in its probe routine and
+	/*
+	 * During probe, the device is set to active and the usage count is
+	 * incremented.  If the driver supports runtime PM, it should call
+	 * pm_runtime_put_noidle() in its probe routine and
 	 * pm_runtime_get_noresume() in its remove routine.
 	 */
-	pm_runtime_get_noresume(dev);
-	pm_runtime_set_active(dev);
-	pm_runtime_enable(dev);
-
+	pm_runtime_get_sync(dev);
 	rc = ddi->drv->probe(ddi->dev, ddi->id);
-	if (rc) {
-		pm_runtime_disable(dev);
-		pm_runtime_set_suspended(dev);
-		pm_runtime_put_noidle(dev);
-	}
+	if (rc)
+		pm_runtime_put_sync(dev);
+
 	if (parent)
 		pm_runtime_put(parent);
 	return rc;
@@ -369,9 +364,7 @@ static int pci_device_remove(struct devi
 	}
 
 	/* Undo the runtime PM settings in local_pci_probe() */
-	pm_runtime_disable(dev);
-	pm_runtime_set_suspended(dev);
-	pm_runtime_put_noidle(dev);
+	pm_runtime_put_sync(dev);
 
 	/*
 	 * If the device is still on, set the power state as "unknown",
@@ -998,10 +991,14 @@ static int pci_pm_restore(struct device
 static int pci_pm_runtime_suspend(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
+	const struct dev_pm_ops *pm;
 	pci_power_t prev = pci_dev->current_state;
 	int error;
 
+	if (!dev->driver)
+		return 0;
+
+	pm = dev->driver->pm;
 	if (!pm || !pm->runtime_suspend)
 		return -ENOSYS;
 
@@ -1035,8 +1032,12 @@ static int pci_pm_runtime_resume(struct
 {
 	int rc;
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
+	const struct dev_pm_ops *pm;
+
+	if (!dev->driver)
+		return 0;
 
+	pm = dev->driver->pm;
 	if (!pm || !pm->runtime_resume)
 		return -ENOSYS;
 
@@ -1054,8 +1055,12 @@ static int pci_pm_runtime_resume(struct
 
 static int pci_pm_runtime_idle(struct device *dev)
 {
-	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
+	const struct dev_pm_ops *pm;
+
+	if (!dev->driver)
+		goto out:
 
+	pm = dev->driver->pm;
 	if (!pm)
 		return -ENOSYS;
 
@@ -1065,8 +1070,8 @@ static int pci_pm_runtime_idle(struct de
 			return ret;
 	}
 
+ out:
 	pm_runtime_suspend(dev);
-
 	return 0;
 }
 
Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -1868,6 +1868,8 @@ void pci_pm_init(struct pci_dev *dev)
 	u16 pmc;
 
 	pm_runtime_forbid(&dev->dev);
+	pm_runtime_set_active(dev);
+	pm_runtime_enable(&dev->dev);
 	device_enable_async_suspend(&dev->dev);
 	dev->wakeup_prepared = false;
 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-07 23:09                         ` Rafael J. Wysocki
@ 2012-11-08  1:15                           ` Huang Ying
  2012-11-08  1:35                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-08  1:15 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Thu, 2012-11-08 at 00:09 +0100, Rafael J. Wysocki wrote:
> On Wednesday, November 07, 2012 11:51:15 PM Rafael J. Wysocki wrote:
> > On Wednesday, November 07, 2012 04:56:49 PM Alan Stern wrote:
> > > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > > 
> > > > > Right.  The reasoning behind my proposal goes like this: When there's
> > > > > no driver, the subsystem can let userspace directly control the
> > > > > device's power level through the power/control attribute.
> > > > 
> > > > Well, we might as well just leave the runtime PM of PCI devices enabled, even
> > > > if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> > > > to ignore devices with no drivers.
> > > > 
> > > > IIRC the reason why we decided to disable runtime PM for PCI device with no
> > > > drivers was that some of them refused to work again after being put by the
> > > > core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> > > > we'd avoid this problem without modifying the core's behavior.
> > > 
> > > It comes down to a question of the parent.  If a driverless PCI device
> > > isn't being used, shouldn't its parent be allowed to go into runtime
> > > suspend?  As things stand now, we do allow it.
> > > 
> > > The problem is that we don't disallow it when the driverless device
> > > _is_ being used.
> > 
> > We can make it depend on what's there in the control file.  Let's say if that's
> > "on" (ie. the devices usage counter is not zero), we won't allow the parent
> > to be suspended.
> > 
> > So, as I said, why don't we keep the runtime PM of PCI devices always enabled,
> > regardless of whether or not there is a driver, and arrange things in such a
> > way that the device is automatically "suspended" if user space writes "auto"
> > to the control file.  IOW, I suppose we can do something like this:
> 
> It probably is better to treat the "no driver" case in a special way, though:
> 
> ---
>  drivers/pci/pci-driver.c |   45 +++++++++++++++++++++++++--------------------
>  drivers/pci/pci.c        |    2 ++
>  2 files changed, 27 insertions(+), 20 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -263,22 +263,17 @@ static long local_pci_probe(void *_ddi)
>  	/* The parent bridge must be in active state when probing */
>  	if (parent)
>  		pm_runtime_get_sync(parent);
> -	/* Unbound PCI devices are always set to disabled and suspended.
> -	 * During probe, the device is set to enabled and active and the
> -	 * usage count is incremented.  If the driver supports runtime PM,
> -	 * it should call pm_runtime_put_noidle() in its probe routine and
> +	/*
> +	 * During probe, the device is set to active and the usage count is
> +	 * incremented.  If the driver supports runtime PM, it should call
> +	 * pm_runtime_put_noidle() in its probe routine and
>  	 * pm_runtime_get_noresume() in its remove routine.
>  	 */
> -	pm_runtime_get_noresume(dev);
> -	pm_runtime_set_active(dev);
> -	pm_runtime_enable(dev);
> -
> +	pm_runtime_get_sync(dev);
>  	rc = ddi->drv->probe(ddi->dev, ddi->id);
> -	if (rc) {
> -		pm_runtime_disable(dev);
> -		pm_runtime_set_suspended(dev);
> -		pm_runtime_put_noidle(dev);
> -	}
> +	if (rc)
> +		pm_runtime_put_sync(dev);
> +
>  	if (parent)
>  		pm_runtime_put(parent);
>  	return rc;
> @@ -369,9 +364,7 @@ static int pci_device_remove(struct devi
>  	}
>  
>  	/* Undo the runtime PM settings in local_pci_probe() */
> -	pm_runtime_disable(dev);
> -	pm_runtime_set_suspended(dev);
> -	pm_runtime_put_noidle(dev);
> +	pm_runtime_put_sync(dev);
>  
>  	/*
>  	 * If the device is still on, set the power state as "unknown",
> @@ -998,10 +991,14 @@ static int pci_pm_restore(struct device
>  static int pci_pm_runtime_suspend(struct device *dev)
>  {
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
> -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> +	const struct dev_pm_ops *pm;
>  	pci_power_t prev = pci_dev->current_state;
>  	int error;
>  
> +	if (!dev->driver)
> +		return 0;
> +
> +	pm = dev->driver->pm;
>  	if (!pm || !pm->runtime_suspend)
>  		return -ENOSYS;
>  
> @@ -1035,8 +1032,12 @@ static int pci_pm_runtime_resume(struct
>  {
>  	int rc;
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
> -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> +	const struct dev_pm_ops *pm;
> +
> +	if (!dev->driver)
> +		return 0;
>  
> +	pm = dev->driver->pm;
>  	if (!pm || !pm->runtime_resume)
>  		return -ENOSYS;
>  
> @@ -1054,8 +1055,12 @@ static int pci_pm_runtime_resume(struct
>  
>  static int pci_pm_runtime_idle(struct device *dev)
>  {
> -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> +	const struct dev_pm_ops *pm;
> +
> +	if (!dev->driver)
> +		goto out:
>  
> +	pm = dev->driver->pm;
>  	if (!pm)
>  		return -ENOSYS;
>  
> @@ -1065,8 +1070,8 @@ static int pci_pm_runtime_idle(struct de
>  			return ret;
>  	}
>  
> + out:
>  	pm_runtime_suspend(dev);
> -
>  	return 0;
>  }
>  
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -1868,6 +1868,8 @@ void pci_pm_init(struct pci_dev *dev)
>  	u16 pmc;
>  
>  	pm_runtime_forbid(&dev->dev);
> +	pm_runtime_set_active(dev);
> +	pm_runtime_enable(&dev->dev);
>  	device_enable_async_suspend(&dev->dev);
>  	dev->wakeup_prepared = false;
>  

I think the patch can fix the issue in a better way.

Do we still need to clarify state about disabled and forbidden?  When a
device is forbidden and the usage_count > 0, is it a good idea to allow
to set device state to SUSPENDED if the device is disabled?

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08  1:15                           ` Huang Ying
@ 2012-11-08  1:35                             ` Rafael J. Wysocki
  2012-11-08  2:04                               ` Huang Ying
  0 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-08  1:35 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Thursday, November 08, 2012 09:15:08 AM Huang Ying wrote:
> On Thu, 2012-11-08 at 00:09 +0100, Rafael J. Wysocki wrote:
> > On Wednesday, November 07, 2012 11:51:15 PM Rafael J. Wysocki wrote:
> > > On Wednesday, November 07, 2012 04:56:49 PM Alan Stern wrote:
> > > > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > > > 
> > > > > > Right.  The reasoning behind my proposal goes like this: When there's
> > > > > > no driver, the subsystem can let userspace directly control the
> > > > > > device's power level through the power/control attribute.
> > > > > 
> > > > > Well, we might as well just leave the runtime PM of PCI devices enabled, even
> > > > > if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> > > > > to ignore devices with no drivers.
> > > > > 
> > > > > IIRC the reason why we decided to disable runtime PM for PCI device with no
> > > > > drivers was that some of them refused to work again after being put by the
> > > > > core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> > > > > we'd avoid this problem without modifying the core's behavior.
> > > > 
> > > > It comes down to a question of the parent.  If a driverless PCI device
> > > > isn't being used, shouldn't its parent be allowed to go into runtime
> > > > suspend?  As things stand now, we do allow it.
> > > > 
> > > > The problem is that we don't disallow it when the driverless device
> > > > _is_ being used.
> > > 
> > > We can make it depend on what's there in the control file.  Let's say if that's
> > > "on" (ie. the devices usage counter is not zero), we won't allow the parent
> > > to be suspended.
> > > 
> > > So, as I said, why don't we keep the runtime PM of PCI devices always enabled,
> > > regardless of whether or not there is a driver, and arrange things in such a
> > > way that the device is automatically "suspended" if user space writes "auto"
> > > to the control file.  IOW, I suppose we can do something like this:
> > 
> > It probably is better to treat the "no driver" case in a special way, though:
> > 
> > ---
> >  drivers/pci/pci-driver.c |   45 +++++++++++++++++++++++++--------------------
> >  drivers/pci/pci.c        |    2 ++
> >  2 files changed, 27 insertions(+), 20 deletions(-)
> > 
> > Index: linux-pm/drivers/pci/pci-driver.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci-driver.c
> > +++ linux-pm/drivers/pci/pci-driver.c
> > @@ -263,22 +263,17 @@ static long local_pci_probe(void *_ddi)
> >  	/* The parent bridge must be in active state when probing */
> >  	if (parent)
> >  		pm_runtime_get_sync(parent);
> > -	/* Unbound PCI devices are always set to disabled and suspended.
> > -	 * During probe, the device is set to enabled and active and the
> > -	 * usage count is incremented.  If the driver supports runtime PM,
> > -	 * it should call pm_runtime_put_noidle() in its probe routine and
> > +	/*
> > +	 * During probe, the device is set to active and the usage count is
> > +	 * incremented.  If the driver supports runtime PM, it should call
> > +	 * pm_runtime_put_noidle() in its probe routine and
> >  	 * pm_runtime_get_noresume() in its remove routine.
> >  	 */
> > -	pm_runtime_get_noresume(dev);
> > -	pm_runtime_set_active(dev);
> > -	pm_runtime_enable(dev);
> > -
> > +	pm_runtime_get_sync(dev);
> >  	rc = ddi->drv->probe(ddi->dev, ddi->id);
> > -	if (rc) {
> > -		pm_runtime_disable(dev);
> > -		pm_runtime_set_suspended(dev);
> > -		pm_runtime_put_noidle(dev);
> > -	}
> > +	if (rc)
> > +		pm_runtime_put_sync(dev);
> > +
> >  	if (parent)
> >  		pm_runtime_put(parent);
> >  	return rc;
> > @@ -369,9 +364,7 @@ static int pci_device_remove(struct devi
> >  	}
> >  
> >  	/* Undo the runtime PM settings in local_pci_probe() */
> > -	pm_runtime_disable(dev);
> > -	pm_runtime_set_suspended(dev);
> > -	pm_runtime_put_noidle(dev);
> > +	pm_runtime_put_sync(dev);
> >  
> >  	/*
> >  	 * If the device is still on, set the power state as "unknown",
> > @@ -998,10 +991,14 @@ static int pci_pm_restore(struct device
> >  static int pci_pm_runtime_suspend(struct device *dev)
> >  {
> >  	struct pci_dev *pci_dev = to_pci_dev(dev);
> > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > +	const struct dev_pm_ops *pm;
> >  	pci_power_t prev = pci_dev->current_state;
> >  	int error;
> >  
> > +	if (!dev->driver)
> > +		return 0;
> > +
> > +	pm = dev->driver->pm;
> >  	if (!pm || !pm->runtime_suspend)
> >  		return -ENOSYS;
> >  
> > @@ -1035,8 +1032,12 @@ static int pci_pm_runtime_resume(struct
> >  {
> >  	int rc;
> >  	struct pci_dev *pci_dev = to_pci_dev(dev);
> > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > +	const struct dev_pm_ops *pm;
> > +
> > +	if (!dev->driver)
> > +		return 0;
> >  
> > +	pm = dev->driver->pm;
> >  	if (!pm || !pm->runtime_resume)
> >  		return -ENOSYS;
> >  
> > @@ -1054,8 +1055,12 @@ static int pci_pm_runtime_resume(struct
> >  
> >  static int pci_pm_runtime_idle(struct device *dev)
> >  {
> > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > +	const struct dev_pm_ops *pm;
> > +
> > +	if (!dev->driver)
> > +		goto out:
> >  
> > +	pm = dev->driver->pm;
> >  	if (!pm)
> >  		return -ENOSYS;
> >  
> > @@ -1065,8 +1070,8 @@ static int pci_pm_runtime_idle(struct de
> >  			return ret;
> >  	}
> >  
> > + out:
> >  	pm_runtime_suspend(dev);
> > -
> >  	return 0;
> >  }
> >  
> > Index: linux-pm/drivers/pci/pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci.c
> > +++ linux-pm/drivers/pci/pci.c
> > @@ -1868,6 +1868,8 @@ void pci_pm_init(struct pci_dev *dev)
> >  	u16 pmc;
> >  
> >  	pm_runtime_forbid(&dev->dev);
> > +	pm_runtime_set_active(dev);
> > +	pm_runtime_enable(&dev->dev);
> >  	device_enable_async_suspend(&dev->dev);
> >  	dev->wakeup_prepared = false;
> >  
> 
> I think the patch can fix the issue in a better way.

I'm not sure what you mean.

> Do we still need to clarify state about disabled and forbidden?  When a
> device is forbidden and the usage_count > 0,

"Forbidden" always means usage_count > 0.

> is it a good idea to allow to set device state to SUSPENDED if the device
> is disabled?

No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
However, in some cases we actually would like to change the status to
SUSPENDED when usage_count becomes equal to 0, because that means we can
suspend (I mean really suspend) the parents of the devices in question
(and we want to notify the parents in those cases).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08  1:35                             ` Rafael J. Wysocki
@ 2012-11-08  2:04                               ` Huang Ying
  2012-11-08  9:56                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-08  2:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Thu, 2012-11-08 at 02:35 +0100, Rafael J. Wysocki wrote:
> On Thursday, November 08, 2012 09:15:08 AM Huang Ying wrote:
> > On Thu, 2012-11-08 at 00:09 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, November 07, 2012 11:51:15 PM Rafael J. Wysocki wrote:
> > > > On Wednesday, November 07, 2012 04:56:49 PM Alan Stern wrote:
> > > > > On Wed, 7 Nov 2012, Rafael J. Wysocki wrote:
> > > > > 
> > > > > > > Right.  The reasoning behind my proposal goes like this: When there's
> > > > > > > no driver, the subsystem can let userspace directly control the
> > > > > > > device's power level through the power/control attribute.
> > > > > > 
> > > > > > Well, we might as well just leave the runtime PM of PCI devices enabled, even
> > > > > > if they have no drivers, but modify the PCI bus type's runtime PM callbacks
> > > > > > to ignore devices with no drivers.
> > > > > > 
> > > > > > IIRC the reason why we decided to disable runtime PM for PCI device with no
> > > > > > drivers was that some of them refused to work again after being put by the
> > > > > > core into D3.  By making the PCI bus type's runtime PM callbacks ignore them
> > > > > > we'd avoid this problem without modifying the core's behavior.
> > > > > 
> > > > > It comes down to a question of the parent.  If a driverless PCI device
> > > > > isn't being used, shouldn't its parent be allowed to go into runtime
> > > > > suspend?  As things stand now, we do allow it.
> > > > > 
> > > > > The problem is that we don't disallow it when the driverless device
> > > > > _is_ being used.
> > > > 
> > > > We can make it depend on what's there in the control file.  Let's say if that's
> > > > "on" (ie. the devices usage counter is not zero), we won't allow the parent
> > > > to be suspended.
> > > > 
> > > > So, as I said, why don't we keep the runtime PM of PCI devices always enabled,
> > > > regardless of whether or not there is a driver, and arrange things in such a
> > > > way that the device is automatically "suspended" if user space writes "auto"
> > > > to the control file.  IOW, I suppose we can do something like this:
> > > 
> > > It probably is better to treat the "no driver" case in a special way, though:
> > > 
> > > ---
> > >  drivers/pci/pci-driver.c |   45 +++++++++++++++++++++++++--------------------
> > >  drivers/pci/pci.c        |    2 ++
> > >  2 files changed, 27 insertions(+), 20 deletions(-)
> > > 
> > > Index: linux-pm/drivers/pci/pci-driver.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/pci/pci-driver.c
> > > +++ linux-pm/drivers/pci/pci-driver.c
> > > @@ -263,22 +263,17 @@ static long local_pci_probe(void *_ddi)
> > >  	/* The parent bridge must be in active state when probing */
> > >  	if (parent)
> > >  		pm_runtime_get_sync(parent);
> > > -	/* Unbound PCI devices are always set to disabled and suspended.
> > > -	 * During probe, the device is set to enabled and active and the
> > > -	 * usage count is incremented.  If the driver supports runtime PM,
> > > -	 * it should call pm_runtime_put_noidle() in its probe routine and
> > > +	/*
> > > +	 * During probe, the device is set to active and the usage count is
> > > +	 * incremented.  If the driver supports runtime PM, it should call
> > > +	 * pm_runtime_put_noidle() in its probe routine and
> > >  	 * pm_runtime_get_noresume() in its remove routine.
> > >  	 */
> > > -	pm_runtime_get_noresume(dev);
> > > -	pm_runtime_set_active(dev);
> > > -	pm_runtime_enable(dev);
> > > -
> > > +	pm_runtime_get_sync(dev);
> > >  	rc = ddi->drv->probe(ddi->dev, ddi->id);
> > > -	if (rc) {
> > > -		pm_runtime_disable(dev);
> > > -		pm_runtime_set_suspended(dev);
> > > -		pm_runtime_put_noidle(dev);
> > > -	}
> > > +	if (rc)
> > > +		pm_runtime_put_sync(dev);
> > > +
> > >  	if (parent)
> > >  		pm_runtime_put(parent);
> > >  	return rc;
> > > @@ -369,9 +364,7 @@ static int pci_device_remove(struct devi
> > >  	}
> > >  
> > >  	/* Undo the runtime PM settings in local_pci_probe() */
> > > -	pm_runtime_disable(dev);
> > > -	pm_runtime_set_suspended(dev);
> > > -	pm_runtime_put_noidle(dev);
> > > +	pm_runtime_put_sync(dev);
> > >  
> > >  	/*
> > >  	 * If the device is still on, set the power state as "unknown",
> > > @@ -998,10 +991,14 @@ static int pci_pm_restore(struct device
> > >  static int pci_pm_runtime_suspend(struct device *dev)
> > >  {
> > >  	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > > +	const struct dev_pm_ops *pm;
> > >  	pci_power_t prev = pci_dev->current_state;
> > >  	int error;
> > >  
> > > +	if (!dev->driver)
> > > +		return 0;
> > > +
> > > +	pm = dev->driver->pm;
> > >  	if (!pm || !pm->runtime_suspend)
> > >  		return -ENOSYS;
> > >  
> > > @@ -1035,8 +1032,12 @@ static int pci_pm_runtime_resume(struct
> > >  {
> > >  	int rc;
> > >  	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > > +	const struct dev_pm_ops *pm;
> > > +
> > > +	if (!dev->driver)
> > > +		return 0;
> > >  
> > > +	pm = dev->driver->pm;
> > >  	if (!pm || !pm->runtime_resume)
> > >  		return -ENOSYS;
> > >  
> > > @@ -1054,8 +1055,12 @@ static int pci_pm_runtime_resume(struct
> > >  
> > >  static int pci_pm_runtime_idle(struct device *dev)
> > >  {
> > > -	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
> > > +	const struct dev_pm_ops *pm;
> > > +
> > > +	if (!dev->driver)
> > > +		goto out:
> > >  
> > > +	pm = dev->driver->pm;
> > >  	if (!pm)
> > >  		return -ENOSYS;
> > >  
> > > @@ -1065,8 +1070,8 @@ static int pci_pm_runtime_idle(struct de
> > >  			return ret;
> > >  	}
> > >  
> > > + out:
> > >  	pm_runtime_suspend(dev);
> > > -
> > >  	return 0;
> > >  }
> > >  
> > > Index: linux-pm/drivers/pci/pci.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/pci/pci.c
> > > +++ linux-pm/drivers/pci/pci.c
> > > @@ -1868,6 +1868,8 @@ void pci_pm_init(struct pci_dev *dev)
> > >  	u16 pmc;
> > >  
> > >  	pm_runtime_forbid(&dev->dev);
> > > +	pm_runtime_set_active(dev);
> > > +	pm_runtime_enable(&dev->dev);
> > >  	device_enable_async_suspend(&dev->dev);
> > >  	dev->wakeup_prepared = false;
> > >  
> > 
> > I think the patch can fix the issue in a better way.
> 
> I'm not sure what you mean.

I mean your patch can fix the driver-less VGA issue.  And it is better
than my original patch.  The following discussion is not about this
specific issue.  Just about PM core logic.

> > Do we still need to clarify state about disabled and forbidden?  When a
> > device is forbidden and the usage_count > 0,
> 
> "Forbidden" always means usage_count > 0.

Yes.

> > is it a good idea to allow to set device state to SUSPENDED if the device
> > is disabled?
> 
> No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> However, in some cases we actually would like to change the status to
> SUSPENDED when usage_count becomes equal to 0, because that means we can
> suspend (I mean really suspend) the parents of the devices in question
> (and we want to notify the parents in those cases).

So do you think Alan Stern's suggestion about forbidden and disabled is
the right way to go?

>	Make pm_runtime_set_suspended() fail if runtime PM is 
> 	forbidden.
> 
> 	Make pm_runtime_forbid() call pm_runtime_set_active()
> 	(and do a runtime resume of the parent) if disable_depth > 0.
>
>	Make the PCI runtime-idle routine call 
> 	pm_runtime_set_suspended() if disable_depth > 0.  Or maybe
> 	do this for all devices, in the runtime PM core.

In this way, we do not really need to call pm_runtime_set_suspended() in
fact.  Because if disabled and usage_count=0, device will be set to
SUSPENDED state automatically.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08  2:04                               ` Huang Ying
@ 2012-11-08  9:56                                 ` Rafael J. Wysocki
  2012-11-08 17:07                                     ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-08  9:56 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Thursday, November 08, 2012 10:04:36 AM Huang Ying wrote:
> On Thu, 2012-11-08 at 02:35 +0100, Rafael J. Wysocki wrote:
> > On Thursday, November 08, 2012 09:15:08 AM Huang Ying wrote:
> > > On Thu, 2012-11-08 at 00:09 +0100, Rafael J. Wysocki wrote:

[...]

> > > I think the patch can fix the issue in a better way.
> > 
> > I'm not sure what you mean.
> 
> I mean your patch can fix the driver-less VGA issue.  And it is better
> than my original patch.  The following discussion is not about this
> specific issue.  Just about PM core logic.

OK

> > > Do we still need to clarify state about disabled and forbidden?  When a
> > > device is forbidden and the usage_count > 0,
> > 
> > "Forbidden" always means usage_count > 0.
> 
> Yes.
> 
> > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > is disabled?
> > 
> > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> > However, in some cases we actually would like to change the status to
> > SUSPENDED when usage_count becomes equal to 0, because that means we can
> > suspend (I mean really suspend) the parents of the devices in question
> > (and we want to notify the parents in those cases).
> 
> So do you think Alan Stern's suggestion about forbidden and disabled is
> the right way to go?

I'm not really sure about that.

My original idea was that the runtime PM status and usage counter would
only matter when runtime PM of a device was enabled.  That leads to
problems, though, when we enable runtime PM of a device whose usage
counter is greater from zero and status is SUSPENDED.  Also when the
device's status is ACTIVE, but its parent's child count is 0.

It's not very easy to fix this at the core level, though, because we
depend on the current behavior in some places.  I'm thinking that
perhaps pm_runtime_enable() should just WARN() if things are obviously
inconsistent (although there still may be problems, for example, if the
parent's child count is 2 when we enable runtime PM for its child, but that
child is the only one it actually has).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08  9:56                                 ` Rafael J. Wysocki
@ 2012-11-08 17:07                                     ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-08 17:07 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:

> > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > is disabled?
> > > 
> > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.

That isn't strictly true, because pm_runtime_get_noresume violates this
rule.  What the PM core actually does is prevent a transition from the
ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
_provided_ runtime PM is enabled.  There's no such restriction when it
is disabled.

BTW, do we need to think about what happens in the case where the
device _does_ have a driver and for some reason the driver has disabled
the device for runtime PM?  I would just as soon ignore the issue.

> > > However, in some cases we actually would like to change the status to
> > > SUSPENDED when usage_count becomes equal to 0, because that means we can
> > > suspend (I mean really suspend) the parents of the devices in question
> > > (and we want to notify the parents in those cases).
> > 
> > So do you think Alan Stern's suggestion about forbidden and disabled is
> > the right way to go?
> 
> I'm not really sure about that.
> 
> My original idea was that the runtime PM status and usage counter would
> only matter when runtime PM of a device was enabled.  That leads to
> problems, though, when we enable runtime PM of a device whose usage
> counter is greater from zero and status is SUSPENDED.

That doesn't seem to be a problem.  It can arise without disabling
runtime PM at all -- just call pm_runtime_get_noresume.

>  Also when the
> device's status is ACTIVE, but its parent's child count is 0.

__pm_runtime_set_status prevents this situation from arising.  When the 
device's status is set to ACTIVE, the parent's child count is 
incremented.  So this isn't a problem either.

> It's not very easy to fix this at the core level, though, because we
> depend on the current behavior in some places.  I'm thinking that
> perhaps pm_runtime_enable() should just WARN() if things are obviously
> inconsistent (although there still may be problems, for example, if the
> parent's child count is 2 when we enable runtime PM for its child, but that
> child is the only one it actually has).

I think we should continue the original strategy of ignoring the status
and usage counter when runtime PM is disabled.  This is definitely the
easiest and most straightforward approach.  Fixing the problem at hand
(VGA controllers) by changing the PCI subsystem seems like the simplest
solution.

Your revised patch does do the job, except for a few problems.  
Namely, while local_pci_probe() and pci_device_remove() are running,
the device _does_ have a driver.  This means that local_pci_probe()
should not call pm_runtime_get_sync(), for example.  Doing so would
invoke the driver's runtime_resume routine before calling the driver's
probe routine!

The USB subsystem solves this problem by carefully keeping track of the 
state of the device-driver binding:

	Originally the device is UNBOUND.

	At the start of the subsystem's probe routine, the state
	changes to BINDING.

	If the probe succeeds then it changes to BOUND; otherwise
	it goes back to UNBOUND.

	At the start of the subsystem's remove routine, the state
	changes to UNBINDING.  At the end it goes to UNBOUND.

When the state is anything other than BOUND, the subsystem's runtime PM 
routines act as though there is no driver.  This works because the 
subsystem makes sure that the device is ACTIVE with a nonzero usage 
count before calling the driver's probe or remove routine, so no 
runtime PM callbacks can occur at these awkward times.

If PCI adopted this strategy then your new patch would work okay.  I 
think -- I haven't checked it thoroughly.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-08 17:07                                     ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-08 17:07 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:

> > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > is disabled?
> > > 
> > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.

That isn't strictly true, because pm_runtime_get_noresume violates this
rule.  What the PM core actually does is prevent a transition from the
ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
_provided_ runtime PM is enabled.  There's no such restriction when it
is disabled.

BTW, do we need to think about what happens in the case where the
device _does_ have a driver and for some reason the driver has disabled
the device for runtime PM?  I would just as soon ignore the issue.

> > > However, in some cases we actually would like to change the status to
> > > SUSPENDED when usage_count becomes equal to 0, because that means we can
> > > suspend (I mean really suspend) the parents of the devices in question
> > > (and we want to notify the parents in those cases).
> > 
> > So do you think Alan Stern's suggestion about forbidden and disabled is
> > the right way to go?
> 
> I'm not really sure about that.
> 
> My original idea was that the runtime PM status and usage counter would
> only matter when runtime PM of a device was enabled.  That leads to
> problems, though, when we enable runtime PM of a device whose usage
> counter is greater from zero and status is SUSPENDED.

That doesn't seem to be a problem.  It can arise without disabling
runtime PM at all -- just call pm_runtime_get_noresume.

>  Also when the
> device's status is ACTIVE, but its parent's child count is 0.

__pm_runtime_set_status prevents this situation from arising.  When the 
device's status is set to ACTIVE, the parent's child count is 
incremented.  So this isn't a problem either.

> It's not very easy to fix this at the core level, though, because we
> depend on the current behavior in some places.  I'm thinking that
> perhaps pm_runtime_enable() should just WARN() if things are obviously
> inconsistent (although there still may be problems, for example, if the
> parent's child count is 2 when we enable runtime PM for its child, but that
> child is the only one it actually has).

I think we should continue the original strategy of ignoring the status
and usage counter when runtime PM is disabled.  This is definitely the
easiest and most straightforward approach.  Fixing the problem at hand
(VGA controllers) by changing the PCI subsystem seems like the simplest
solution.

Your revised patch does do the job, except for a few problems.  
Namely, while local_pci_probe() and pci_device_remove() are running,
the device _does_ have a driver.  This means that local_pci_probe()
should not call pm_runtime_get_sync(), for example.  Doing so would
invoke the driver's runtime_resume routine before calling the driver's
probe routine!

The USB subsystem solves this problem by carefully keeping track of the 
state of the device-driver binding:

	Originally the device is UNBOUND.

	At the start of the subsystem's probe routine, the state
	changes to BINDING.

	If the probe succeeds then it changes to BOUND; otherwise
	it goes back to UNBOUND.

	At the start of the subsystem's remove routine, the state
	changes to UNBINDING.  At the end it goes to UNBOUND.

When the state is anything other than BOUND, the subsystem's runtime PM 
routines act as though there is no driver.  This works because the 
subsystem makes sure that the device is ACTIVE with a nonzero usage 
count before calling the driver's probe or remove routine, so no 
runtime PM callbacks can occur at these awkward times.

If PCI adopted this strategy then your new patch would work okay.  I 
think -- I haven't checked it thoroughly.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08 17:07                                     ` Alan Stern
  (?)
@ 2012-11-09  2:36                                     ` Huang Ying
  2012-11-09 16:41                                         ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-09  2:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Thu, 2012-11-08 at 12:07 -0500, Alan Stern wrote:
> On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> 
> > > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > > is disabled?
> > > > 
> > > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> 
> That isn't strictly true, because pm_runtime_get_noresume violates this
> rule.  What the PM core actually does is prevent a transition from the
> ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
> _provided_ runtime PM is enabled.  There's no such restriction when it
> is disabled.

Usage count may be not a issue for the end user.  But "on" in "control"
sysfs file + SUSPENDED can be confusing for the end user.  Maybe we need
to check dev->power.runtime_auto in pm_runtime_set_suspended().

> BTW, do we need to think about what happens in the case where the
> device _does_ have a driver and for some reason the driver has disabled
> the device for runtime PM?  I would just as soon ignore the issue.
> 
> > > > However, in some cases we actually would like to change the status to
> > > > SUSPENDED when usage_count becomes equal to 0, because that means we can
> > > > suspend (I mean really suspend) the parents of the devices in question
> > > > (and we want to notify the parents in those cases).
> > > 
> > > So do you think Alan Stern's suggestion about forbidden and disabled is
> > > the right way to go?
> > 
> > I'm not really sure about that.
> > 
> > My original idea was that the runtime PM status and usage counter would
> > only matter when runtime PM of a device was enabled.  That leads to
> > problems, though, when we enable runtime PM of a device whose usage
> > counter is greater from zero and status is SUSPENDED.
> 
> That doesn't seem to be a problem.  It can arise without disabling
> runtime PM at all -- just call pm_runtime_get_noresume.

I think pm_runtime_get_noresume can not fix the issue.
pm_runtiem_set_active() should be invoked before pm_runtime_enable() if
necessary.  That is, the invoker should be responsible for the
consistence between usage_count and SUSPENDED/ACTIVE status.  And the
API may be a little low level and error-prone to the invoker (mainly bus
code).

Best Regards,
Huang Ying

> >  Also when the
> > device's status is ACTIVE, but its parent's child count is 0.
> 
> __pm_runtime_set_status prevents this situation from arising.  When the 
> device's status is set to ACTIVE, the parent's child count is 
> incremented.  So this isn't a problem either.
> 
> > It's not very easy to fix this at the core level, though, because we
> > depend on the current behavior in some places.  I'm thinking that
> > perhaps pm_runtime_enable() should just WARN() if things are obviously
> > inconsistent (although there still may be problems, for example, if the
> > parent's child count is 2 when we enable runtime PM for its child, but that
> > child is the only one it actually has).
> 
> I think we should continue the original strategy of ignoring the status
> and usage counter when runtime PM is disabled.  This is definitely the
> easiest and most straightforward approach.  Fixing the problem at hand
> (VGA controllers) by changing the PCI subsystem seems like the simplest
> solution.
> 
> Your revised patch does do the job, except for a few problems.  
> Namely, while local_pci_probe() and pci_device_remove() are running,
> the device _does_ have a driver.  This means that local_pci_probe()
> should not call pm_runtime_get_sync(), for example.  Doing so would
> invoke the driver's runtime_resume routine before calling the driver's
> probe routine!
> 
> The USB subsystem solves this problem by carefully keeping track of the 
> state of the device-driver binding:
> 
> 	Originally the device is UNBOUND.
> 
> 	At the start of the subsystem's probe routine, the state
> 	changes to BINDING.
> 
> 	If the probe succeeds then it changes to BOUND; otherwise
> 	it goes back to UNBOUND.
> 
> 	At the start of the subsystem's remove routine, the state
> 	changes to UNBINDING.  At the end it goes to UNBOUND.
> 
> When the state is anything other than BOUND, the subsystem's runtime PM 
> routines act as though there is no driver.  This works because the 
> subsystem makes sure that the device is ACTIVE with a nonzero usage 
> count before calling the driver's probe or remove routine, so no 
> runtime PM callbacks can occur at these awkward times.
> 
> If PCI adopted this strategy then your new patch would work okay.  I 
> think -- I haven't checked it thoroughly.
> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-09  2:36                                     ` Huang Ying
@ 2012-11-09 16:41                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-09 16:41 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Fri, 9 Nov 2012, Huang Ying wrote:

> On Thu, 2012-11-08 at 12:07 -0500, Alan Stern wrote:
> > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > > > is disabled?
> > > > > 
> > > > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> > 
> > That isn't strictly true, because pm_runtime_get_noresume violates this
> > rule.  What the PM core actually does is prevent a transition from the
> > ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
> > _provided_ runtime PM is enabled.  There's no such restriction when it
> > is disabled.
> 
> Usage count may be not a issue for the end user.  But "on" in "control"
> sysfs file + SUSPENDED can be confusing for the end user.  Maybe we need
> to check dev->power.runtime_auto in pm_runtime_set_suspended().

You are confusing the issue by raising two separate (though related)
questions.

The first question: How should the PCI subsystem prevent the parents of 
driverless VGA devices from being runtime suspended while userspace is 
accessing them?

The second question: Should the PM core allow devices that are disabled
for runtime PM to be in the SUSPENDED state when
dev->power.runtime_auto is clear?

Assuming we don't want to allow this, there's a third question: Should
pm_runtime_allow call pm_runtime_set_suspended if the device is
disabled?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-09 16:41                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-09 16:41 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Fri, 9 Nov 2012, Huang Ying wrote:

> On Thu, 2012-11-08 at 12:07 -0500, Alan Stern wrote:
> > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > > > is disabled?
> > > > > 
> > > > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> > 
> > That isn't strictly true, because pm_runtime_get_noresume violates this
> > rule.  What the PM core actually does is prevent a transition from the
> > ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
> > _provided_ runtime PM is enabled.  There's no such restriction when it
> > is disabled.
> 
> Usage count may be not a issue for the end user.  But "on" in "control"
> sysfs file + SUSPENDED can be confusing for the end user.  Maybe we need
> to check dev->power.runtime_auto in pm_runtime_set_suspended().

You are confusing the issue by raising two separate (though related)
questions.

The first question: How should the PCI subsystem prevent the parents of 
driverless VGA devices from being runtime suspended while userspace is 
accessing them?

The second question: Should the PM core allow devices that are disabled
for runtime PM to be in the SUSPENDED state when
dev->power.runtime_auto is clear?

Assuming we don't want to allow this, there's a third question: Should
pm_runtime_allow call pm_runtime_set_suspended if the device is
disabled?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-09 16:41                                         ` Alan Stern
  (?)
@ 2012-11-12  0:37                                         ` Huang Ying
  2012-11-12  2:36                                             ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-12  0:37 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Fri, 2012-11-09 at 11:41 -0500, Alan Stern wrote:
> On Fri, 9 Nov 2012, Huang Ying wrote:
> 
> > On Thu, 2012-11-08 at 12:07 -0500, Alan Stern wrote:
> > > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> > > 
> > > > > > > is it a good idea to allow to set device state to SUSPENDED if the device
> > > > > > > is disabled?
> > > > > > 
> > > > > > No, it is not.  The status should always be ACTIVE as long as usage_count > 0.
> > > 
> > > That isn't strictly true, because pm_runtime_get_noresume violates this
> > > rule.  What the PM core actually does is prevent a transition from the
> > > ACTIVE state to the SUSPENDING/SUSPENDED state if usage_count > 0,
> > > _provided_ runtime PM is enabled.  There's no such restriction when it
> > > is disabled.
> > 
> > Usage count may be not a issue for the end user.  But "on" in "control"
> > sysfs file + SUSPENDED can be confusing for the end user.  Maybe we need
> > to check dev->power.runtime_auto in pm_runtime_set_suspended().
> 
> You are confusing the issue by raising two separate (though related)
> questions.

Thanks for clarify.

> The first question: How should the PCI subsystem prevent the parents of 
> driverless VGA devices from being runtime suspended while userspace is 
> accessing them?

I think Rafael's patch is good for that.

> The second question: Should the PM core allow devices that are disabled
> for runtime PM to be in the SUSPENDED state when
> dev->power.runtime_auto is clear?

I think that should not be allowed.

> Assuming we don't want to allow this, there's a third question: Should
> pm_runtime_allow call pm_runtime_set_suspended if the device is
> disabled?

Is it absolute necessary to call pm_runtime_set_suspended?  If the
device is disabled, the transition to SUSPENDED state will not be
triggered even if the device is ACTIVE.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-12  0:37                                         ` Huang Ying
@ 2012-11-12  2:36                                             ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-12  2:36 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 12 Nov 2012, Huang Ying wrote:

> > The first question: How should the PCI subsystem prevent the parents of 
> > driverless VGA devices from being runtime suspended while userspace is 
> > accessing them?
> 
> I think Rafael's patch is good for that.

But his patch isn't needed if we make these other changes.

> > The second question: Should the PM core allow devices that are disabled
> > for runtime PM to be in the SUSPENDED state when
> > dev->power.runtime_auto is clear?
> 
> I think that should not be allowed.

Disallowing it is okay with me too.  But it will require several 
changes to the code, more than what your patch did.

> > Assuming we don't want to allow this, there's a third question: Should
> > pm_runtime_allow call pm_runtime_set_suspended if the device is
> > disabled?
> 
> Is it absolute necessary to call pm_runtime_set_suspended?  If the
> device is disabled, the transition to SUSPENDED state will not be
> triggered even if the device is ACTIVE.

It's not absolutely necessary to do this, but we ought to because it 
will allow the device's parent to be suspended.  If we leave the device 
in the ACTIVE state then the parent can't be suspended, even when the 
device is disabled.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-12  2:36                                             ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-12  2:36 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 12 Nov 2012, Huang Ying wrote:

> > The first question: How should the PCI subsystem prevent the parents of 
> > driverless VGA devices from being runtime suspended while userspace is 
> > accessing them?
> 
> I think Rafael's patch is good for that.

But his patch isn't needed if we make these other changes.

> > The second question: Should the PM core allow devices that are disabled
> > for runtime PM to be in the SUSPENDED state when
> > dev->power.runtime_auto is clear?
> 
> I think that should not be allowed.

Disallowing it is okay with me too.  But it will require several 
changes to the code, more than what your patch did.

> > Assuming we don't want to allow this, there's a third question: Should
> > pm_runtime_allow call pm_runtime_set_suspended if the device is
> > disabled?
> 
> Is it absolute necessary to call pm_runtime_set_suspended?  If the
> device is disabled, the transition to SUSPENDED state will not be
> triggered even if the device is ACTIVE.

It's not absolutely necessary to do this, but we ought to because it 
will allow the device's parent to be suspended.  If we leave the device 
in the ACTIVE state then the parent can't be suspended, even when the 
device is disabled.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-12  2:36                                             ` Alan Stern
  (?)
@ 2012-11-12  5:55                                             ` Huang Ying
  2012-11-12 16:32                                                 ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-12  5:55 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Sun, 2012-11-11 at 21:36 -0500, Alan Stern wrote:
> On Mon, 12 Nov 2012, Huang Ying wrote:
> 
> > > The first question: How should the PCI subsystem prevent the parents of 
> > > driverless VGA devices from being runtime suspended while userspace is 
> > > accessing them?
> > 
> > I think Rafael's patch is good for that.
> 
> But his patch isn't needed if we make these other changes.

Yes.

> > > The second question: Should the PM core allow devices that are disabled
> > > for runtime PM to be in the SUSPENDED state when
> > > dev->power.runtime_auto is clear?
> > 
> > I think that should not be allowed.
> 
> Disallowing it is okay with me too.  But it will require several 
> changes to the code, more than what your patch did.

Yes.  I think so too.

> > > Assuming we don't want to allow this, there's a third question: Should
> > > pm_runtime_allow call pm_runtime_set_suspended if the device is
> > > disabled?
> > 
> > Is it absolute necessary to call pm_runtime_set_suspended?  If the
> > device is disabled, the transition to SUSPENDED state will not be
> > triggered even if the device is ACTIVE.
> 
> It's not absolutely necessary to do this, but we ought to because it 
> will allow the device's parent to be suspended.  If we leave the device 
> in the ACTIVE state then the parent can't be suspended, even when the 
> device is disabled.

I think this is the hard part of the issue.  Now "disabled" and
SUSPENDED state is managed by hand.  IMHO, if we changed
pm_runtime_allow() as you said, we need to change the rule too.  Maybe
something as follow:

- remove pm_runtime_set_suspended/pm_runtime_set_active
- in pm_runtime_disable/pm_runtime_allow, put device into SUSPENDED
state if runtime PM is not forbidden.
- in pm_runtime_forbid/pm_runtime_enable, put device into ACTIVE state.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-12  5:55                                             ` Huang Ying
@ 2012-11-12 16:32                                                 ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-12 16:32 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 12 Nov 2012, Huang Ying wrote:

> > > Is it absolute necessary to call pm_runtime_set_suspended?  If the
> > > device is disabled, the transition to SUSPENDED state will not be
> > > triggered even if the device is ACTIVE.
> > 
> > It's not absolutely necessary to do this, but we ought to because it 
> > will allow the device's parent to be suspended.  If we leave the device 
> > in the ACTIVE state then the parent can't be suspended, even when the 
> > device is disabled.
> 
> I think this is the hard part of the issue.  Now "disabled" and
> SUSPENDED state is managed by hand.  IMHO, if we changed
> pm_runtime_allow() as you said, we need to change the rule too.  Maybe
> something as follow:
> 
> - remove pm_runtime_set_suspended/pm_runtime_set_active

We can't remove them entirely.  They are needed for situations where 
the device's physical state is different from what the PM core thinks; 
they tell the PM core what the actual state is.

> - in pm_runtime_disable/pm_runtime_allow, put device into SUSPENDED
> state if runtime PM is not forbidden.

That doesn't make sense.  Runtime PM is never forbidden after 
pm_runtime_allow is called; that's its purpose.

> - in pm_runtime_forbid/pm_runtime_enable, put device into ACTIVE state.

Why should pm_runtime_enable put the device into the ACTIVE state?

No, I think a better approach is simply to say that the device will
never be allowed to be in the SUSPENDED state if runtime PM is
forbidden.  We already enforce this when the device is enabled for 
runtime PM, so we would have to start enforcing it when the device is 
disabled.

This means:

	pm_runtime_set_suspended should fail if dev->power.runtime_auto
	is clear.

	pm_runtime_forbid should call pm_runtime_set_active if
	dev->power.disable_depth > 0.  (This would run into a problem
	if the parent is suspended and disabled.  Maybe 
	pm_runtime_forbid should fail when this happens.)

Finally, we probably should make a third change even though it isn't
strictly necessary:

	pm_runtime_allow should call pm_runtime_set_suspended if
	dev->power.disable_depth > 0.

Rafael, what do you think?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-12 16:32                                                 ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-12 16:32 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 12 Nov 2012, Huang Ying wrote:

> > > Is it absolute necessary to call pm_runtime_set_suspended?  If the
> > > device is disabled, the transition to SUSPENDED state will not be
> > > triggered even if the device is ACTIVE.
> > 
> > It's not absolutely necessary to do this, but we ought to because it 
> > will allow the device's parent to be suspended.  If we leave the device 
> > in the ACTIVE state then the parent can't be suspended, even when the 
> > device is disabled.
> 
> I think this is the hard part of the issue.  Now "disabled" and
> SUSPENDED state is managed by hand.  IMHO, if we changed
> pm_runtime_allow() as you said, we need to change the rule too.  Maybe
> something as follow:
> 
> - remove pm_runtime_set_suspended/pm_runtime_set_active

We can't remove them entirely.  They are needed for situations where 
the device's physical state is different from what the PM core thinks; 
they tell the PM core what the actual state is.

> - in pm_runtime_disable/pm_runtime_allow, put device into SUSPENDED
> state if runtime PM is not forbidden.

That doesn't make sense.  Runtime PM is never forbidden after 
pm_runtime_allow is called; that's its purpose.

> - in pm_runtime_forbid/pm_runtime_enable, put device into ACTIVE state.

Why should pm_runtime_enable put the device into the ACTIVE state?

No, I think a better approach is simply to say that the device will
never be allowed to be in the SUSPENDED state if runtime PM is
forbidden.  We already enforce this when the device is enabled for 
runtime PM, so we would have to start enforcing it when the device is 
disabled.

This means:

	pm_runtime_set_suspended should fail if dev->power.runtime_auto
	is clear.

	pm_runtime_forbid should call pm_runtime_set_active if
	dev->power.disable_depth > 0.  (This would run into a problem
	if the parent is suspended and disabled.  Maybe 
	pm_runtime_forbid should fail when this happens.)

Finally, we probably should make a third change even though it isn't
strictly necessary:

	pm_runtime_allow should call pm_runtime_set_suspended if
	dev->power.disable_depth > 0.

Rafael, what do you think?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-12 16:32                                                 ` Alan Stern
  (?)
@ 2012-11-13  1:19                                                 ` Huang Ying
  2012-11-13  2:32                                                     ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-13  1:19 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 2012-11-12 at 11:32 -0500, Alan Stern wrote:
> On Mon, 12 Nov 2012, Huang Ying wrote:
> 
> > > > Is it absolute necessary to call pm_runtime_set_suspended?  If the
> > > > device is disabled, the transition to SUSPENDED state will not be
> > > > triggered even if the device is ACTIVE.
> > > 
> > > It's not absolutely necessary to do this, but we ought to because it 
> > > will allow the device's parent to be suspended.  If we leave the device 
> > > in the ACTIVE state then the parent can't be suspended, even when the 
> > > device is disabled.
> > 
> > I think this is the hard part of the issue.  Now "disabled" and
> > SUSPENDED state is managed by hand.  IMHO, if we changed
> > pm_runtime_allow() as you said, we need to change the rule too.  Maybe
> > something as follow:
> > 
> > - remove pm_runtime_set_suspended/pm_runtime_set_active
> 
> We can't remove them entirely.  They are needed for situations where 
> the device's physical state is different from what the PM core thinks; 
> they tell the PM core what the actual state is.
> 
> > - in pm_runtime_disable/pm_runtime_allow, put device into SUSPENDED
> > state if runtime PM is not forbidden.
> 
> That doesn't make sense.  Runtime PM is never forbidden after 
> pm_runtime_allow is called; that's its purpose.

Sorry, my original idea is:

	pm_runtime_disable will put device into SUSPENDED state if
	dev->power.runtime_auto is clear.  pm_runtime_allow will put
	device into SUSPENDED state if dev->power.disable_depth > 0.

So in general, my original idea is to manage device runtime power state
automatically instead of manually, especially when device is in disabled
state.

	disabled + forbidden	-> ACTIVE
	disabled + !forbidden	-> SUSPENDED
	enabled + forbidden	-> ACTIVE
	enabled + !forbidden	-> auto

Why we can not do that?

> > - in pm_runtime_forbid/pm_runtime_enable, put device into ACTIVE state.
> 
> Why should pm_runtime_enable put the device into the ACTIVE state?
> 
> No, I think a better approach is simply to say that the device will
> never be allowed to be in the SUSPENDED state if runtime PM is
> forbidden.  We already enforce this when the device is enabled for 
> runtime PM, so we would have to start enforcing it when the device is 
> disabled.
> 
> This means:
> 
> 	pm_runtime_set_suspended should fail if dev->power.runtime_auto
> 	is clear.

I think we can WARN_ON() here.  Because the caller should responsible
for state consistence if they decide to manage runtime power state
manually.

> 	pm_runtime_forbid should call pm_runtime_set_active if
> 	dev->power.disable_depth > 0.  (This would run into a problem
> 	if the parent is suspended and disabled.  Maybe 
> 	pm_runtime_forbid should fail when this happens.)

pm_runtime_forbid() may be called via echo "on" > .../power/control.  I
think it is hard to refuse the request from user space to forbid runtime
PM.  Device can always work with full power.

> Finally, we probably should make a third change even though it isn't
> strictly necessary:
> 
> 	pm_runtime_allow should call pm_runtime_set_suspended if
> 	dev->power.disable_depth > 0.

I think this is something similar to manage device power state
automatically if disabled.

Best Regards,
Huang Ying

> Rafael, what do you think?
> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-13  1:19                                                 ` Huang Ying
@ 2012-11-13  2:32                                                     ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-13  2:32 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 13 Nov 2012, Huang Ying wrote:

> Sorry, my original idea is:
> 
> 	pm_runtime_disable will put device into SUSPENDED state if
> 	dev->power.runtime_auto is clear.  pm_runtime_allow will put
> 	device into SUSPENDED state if dev->power.disable_depth > 0.

That's close to what I suggested.

> So in general, my original idea is to manage device runtime power state
> automatically instead of manually, especially when device is in disabled
> state.
> 
> 	disabled + forbidden	-> ACTIVE
> 	disabled + !forbidden	-> SUSPENDED

This is not quite right.  Consider a device that is in runtime suspend 
when a system sleep starts.  When the system sleep ends, the device 
will be resumed but the PM core will still think its state is 
SUSPENDED.  The subsystem has to tell the PM core that the device is 
now ACTIVE.  Currently, subsystems do this by calling 
pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
your scheme this wouldn't work; the pm_runtime_set_active call would 
fail because the device was !forbidden.

> 	enabled + forbidden	-> ACTIVE
> 	enabled + !forbidden	-> auto
> 
> Why we can not do that?

See above.  What we can do instead is:

	disabled + forbidden	-> ACTIVE
	disabled + !forbidden	-> anything

which is basically what I proposed.

> > This means:
> > 
> > 	pm_runtime_set_suspended should fail if dev->power.runtime_auto
> > 	is clear.
> 
> I think we can WARN_ON() here.  Because the caller should responsible
> for state consistence if they decide to manage runtime power state
> manually.

No.  Drivers should not have to worry about whether runtime PM is 
forbidden.  Worrying about that is the PM core's job.

> > 	pm_runtime_forbid should call pm_runtime_set_active if
> > 	dev->power.disable_depth > 0.  (This would run into a problem
> > 	if the parent is suspended and disabled.  Maybe 
> > 	pm_runtime_forbid should fail when this happens.)
> 
> pm_runtime_forbid() may be called via echo "on" > .../power/control.  I
> think it is hard to refuse the request from user space to forbid runtime
> PM.  Device can always work with full power.

It can't if the parent is in SUSPEND.  If necessary, the user can write 
"on" to the parent's power/control attribute first.

> > Finally, we probably should make a third change even though it isn't
> > strictly necessary:
> > 
> > 	pm_runtime_allow should call pm_runtime_set_suspended if
> > 	dev->power.disable_depth > 0.
> 
> I think this is something similar to manage device power state
> automatically if disabled.

Yes, it is similar but not exactly the same as your proposal.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-13  2:32                                                     ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-13  2:32 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 13 Nov 2012, Huang Ying wrote:

> Sorry, my original idea is:
> 
> 	pm_runtime_disable will put device into SUSPENDED state if
> 	dev->power.runtime_auto is clear.  pm_runtime_allow will put
> 	device into SUSPENDED state if dev->power.disable_depth > 0.

That's close to what I suggested.

> So in general, my original idea is to manage device runtime power state
> automatically instead of manually, especially when device is in disabled
> state.
> 
> 	disabled + forbidden	-> ACTIVE
> 	disabled + !forbidden	-> SUSPENDED

This is not quite right.  Consider a device that is in runtime suspend 
when a system sleep starts.  When the system sleep ends, the device 
will be resumed but the PM core will still think its state is 
SUSPENDED.  The subsystem has to tell the PM core that the device is 
now ACTIVE.  Currently, subsystems do this by calling 
pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
your scheme this wouldn't work; the pm_runtime_set_active call would 
fail because the device was !forbidden.

> 	enabled + forbidden	-> ACTIVE
> 	enabled + !forbidden	-> auto
> 
> Why we can not do that?

See above.  What we can do instead is:

	disabled + forbidden	-> ACTIVE
	disabled + !forbidden	-> anything

which is basically what I proposed.

> > This means:
> > 
> > 	pm_runtime_set_suspended should fail if dev->power.runtime_auto
> > 	is clear.
> 
> I think we can WARN_ON() here.  Because the caller should responsible
> for state consistence if they decide to manage runtime power state
> manually.

No.  Drivers should not have to worry about whether runtime PM is 
forbidden.  Worrying about that is the PM core's job.

> > 	pm_runtime_forbid should call pm_runtime_set_active if
> > 	dev->power.disable_depth > 0.  (This would run into a problem
> > 	if the parent is suspended and disabled.  Maybe 
> > 	pm_runtime_forbid should fail when this happens.)
> 
> pm_runtime_forbid() may be called via echo "on" > .../power/control.  I
> think it is hard to refuse the request from user space to forbid runtime
> PM.  Device can always work with full power.

It can't if the parent is in SUSPEND.  If necessary, the user can write 
"on" to the parent's power/control attribute first.

> > Finally, we probably should make a third change even though it isn't
> > strictly necessary:
> > 
> > 	pm_runtime_allow should call pm_runtime_set_suspended if
> > 	dev->power.disable_depth > 0.
> 
> I think this is something similar to manage device power state
> automatically if disabled.

Yes, it is similar but not exactly the same as your proposal.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-13  2:32                                                     ` Alan Stern
  (?)
@ 2012-11-13  5:12                                                     ` Huang Ying
  2012-11-13 16:10                                                         ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-13  5:12 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Mon, 2012-11-12 at 21:32 -0500, Alan Stern wrote:
> On Tue, 13 Nov 2012, Huang Ying wrote:
> 
> > Sorry, my original idea is:
> > 
> > 	pm_runtime_disable will put device into SUSPENDED state if
> > 	dev->power.runtime_auto is clear.  pm_runtime_allow will put
> > 	device into SUSPENDED state if dev->power.disable_depth > 0.
> 
> That's close to what I suggested.
> 
> > So in general, my original idea is to manage device runtime power state
> > automatically instead of manually, especially when device is in disabled
> > state.
> > 
> > 	disabled + forbidden	-> ACTIVE
> > 	disabled + !forbidden	-> SUSPENDED
> 
> This is not quite right.  Consider a device that is in runtime suspend 
> when a system sleep starts.  When the system sleep ends, the device 
> will be resumed but the PM core will still think its state is 
> SUSPENDED.  The subsystem has to tell the PM core that the device is 
> now ACTIVE.  Currently, subsystems do this by calling 
> pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> your scheme this wouldn't work; the pm_runtime_set_active call would 
> fail because the device was !forbidden.

Thanks for your information.  For this specific situation, is it
possible to call pm_runtime_resume() or pm_request_resume() for the
device?

> > 	enabled + forbidden	-> ACTIVE
> > 	enabled + !forbidden	-> auto
> > 
> > Why we can not do that?
> 
> See above.  What we can do instead is:
> 
> 	disabled + forbidden	-> ACTIVE
> 	disabled + !forbidden	-> anything
> 
> which is basically what I proposed.
>
> > > This means:
> > > 
> > > 	pm_runtime_set_suspended should fail if dev->power.runtime_auto
> > > 	is clear.
> > 
> > I think we can WARN_ON() here.  Because the caller should responsible
> > for state consistence if they decide to manage runtime power state
> > manually.
> 
> No.  Drivers should not have to worry about whether runtime PM is 
> forbidden.  Worrying about that is the PM core's job.

En...  It appears that what caller can do is just do not call
pm_runtime_set_suspended() if forbidden.  So your method should be
better.

> > > 	pm_runtime_forbid should call pm_runtime_set_active if
> > > 	dev->power.disable_depth > 0.  (This would run into a problem
> > > 	if the parent is suspended and disabled.  Maybe 
> > > 	pm_runtime_forbid should fail when this happens.)
> > 
> > pm_runtime_forbid() may be called via echo "on" > .../power/control.  I
> > think it is hard to refuse the request from user space to forbid runtime
> > PM.  Device can always work with full power.
> 
> It can't if the parent is in SUSPEND.  If necessary, the user can write 
> "on" to the parent's power/control attribute first.

Is it possible to call pm_runtime_set_active() for the parent if the
parent is disabled and SUSPENDED.

> > > Finally, we probably should make a third change even though it isn't
> > > strictly necessary:
> > > 
> > > 	pm_runtime_allow should call pm_runtime_set_suspended if
> > > 	dev->power.disable_depth > 0.
> > 
> > I think this is something similar to manage device power state
> > automatically if disabled.
> 
> Yes, it is similar but not exactly the same as your proposal.

It appears that there is race condition between this and the
pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
you mentioned ealier.

thread 1			thread 2
pm_runtime_disable
pm_runtime_set_active
				pm_runtime_allow
				  pm_runtime_set_suspended
pm_runtime_enable

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-13  5:12                                                     ` Huang Ying
@ 2012-11-13 16:10                                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-13 16:10 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 13 Nov 2012, Huang Ying wrote:

> > This is not quite right.  Consider a device that is in runtime suspend 
> > when a system sleep starts.  When the system sleep ends, the device 
> > will be resumed but the PM core will still think its state is 
> > SUSPENDED.  The subsystem has to tell the PM core that the device is 
> > now ACTIVE.  Currently, subsystems do this by calling 
> > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> > your scheme this wouldn't work; the pm_runtime_set_active call would 
> > fail because the device was !forbidden.
> 
> Thanks for your information.  For this specific situation, is it
> possible to call pm_runtime_resume() or pm_request_resume() for the
> device?

No, because the device already is at full power.  The subsystem just
needs to tell the PM core that it is.

> > > PM.  Device can always work with full power.
> > 
> > It can't if the parent is in SUSPEND.  If necessary, the user can write 
> > "on" to the parent's power/control attribute first.
> 
> Is it possible to call pm_runtime_set_active() for the parent if the
> parent is disabled and SUSPENDED.

Doing that is possible, but it might not work.  The parent might
actually be at low power; calling pm_runtime_set_active wouldn't change
the physical power level.  Basically, it's not safe to assume anything
about devices that are disabled for runtime PM.

> It appears that there is race condition between this and the
> pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
> you mentioned ealier.
> 
> thread 1			thread 2
> pm_runtime_disable
> pm_runtime_set_active
> 				pm_runtime_allow
> 				  pm_runtime_set_suspended
> pm_runtime_enable

This can't happen in the situation I described earlier because during
system sleep transitions, no other user threads are allowed to run.  
All of them except the one actually carrying out the transition are
frozen.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-13 16:10                                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-13 16:10 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 13 Nov 2012, Huang Ying wrote:

> > This is not quite right.  Consider a device that is in runtime suspend 
> > when a system sleep starts.  When the system sleep ends, the device 
> > will be resumed but the PM core will still think its state is 
> > SUSPENDED.  The subsystem has to tell the PM core that the device is 
> > now ACTIVE.  Currently, subsystems do this by calling 
> > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> > your scheme this wouldn't work; the pm_runtime_set_active call would 
> > fail because the device was !forbidden.
> 
> Thanks for your information.  For this specific situation, is it
> possible to call pm_runtime_resume() or pm_request_resume() for the
> device?

No, because the device already is at full power.  The subsystem just
needs to tell the PM core that it is.

> > > PM.  Device can always work with full power.
> > 
> > It can't if the parent is in SUSPEND.  If necessary, the user can write 
> > "on" to the parent's power/control attribute first.
> 
> Is it possible to call pm_runtime_set_active() for the parent if the
> parent is disabled and SUSPENDED.

Doing that is possible, but it might not work.  The parent might
actually be at low power; calling pm_runtime_set_active wouldn't change
the physical power level.  Basically, it's not safe to assume anything
about devices that are disabled for runtime PM.

> It appears that there is race condition between this and the
> pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
> you mentioned ealier.
> 
> thread 1			thread 2
> pm_runtime_disable
> pm_runtime_set_active
> 				pm_runtime_allow
> 				  pm_runtime_set_suspended
> pm_runtime_enable

This can't happen in the situation I described earlier because during
system sleep transitions, no other user threads are allowed to run.  
All of them except the one actually carrying out the transition are
frozen.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-12 16:32                                                 ` Alan Stern
  (?)
  (?)
@ 2012-11-13 23:43                                                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-13 23:43 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Monday, November 12, 2012 11:32:26 AM Alan Stern wrote:
> On Mon, 12 Nov 2012, Huang Ying wrote:
> 
> > > > Is it absolute necessary to call pm_runtime_set_suspended?  If the
> > > > device is disabled, the transition to SUSPENDED state will not be
> > > > triggered even if the device is ACTIVE.
> > > 
> > > It's not absolutely necessary to do this, but we ought to because it 
> > > will allow the device's parent to be suspended.  If we leave the device 
> > > in the ACTIVE state then the parent can't be suspended, even when the 
> > > device is disabled.
> > 
> > I think this is the hard part of the issue.  Now "disabled" and
> > SUSPENDED state is managed by hand.  IMHO, if we changed
> > pm_runtime_allow() as you said, we need to change the rule too.  Maybe
> > something as follow:
> > 
> > - remove pm_runtime_set_suspended/pm_runtime_set_active
> 
> We can't remove them entirely.  They are needed for situations where 
> the device's physical state is different from what the PM core thinks; 
> they tell the PM core what the actual state is.
> 
> > - in pm_runtime_disable/pm_runtime_allow, put device into SUSPENDED
> > state if runtime PM is not forbidden.
> 
> That doesn't make sense.  Runtime PM is never forbidden after 
> pm_runtime_allow is called; that's its purpose.
> 
> > - in pm_runtime_forbid/pm_runtime_enable, put device into ACTIVE state.
> 
> Why should pm_runtime_enable put the device into the ACTIVE state?
> 
> No, I think a better approach is simply to say that the device will
> never be allowed to be in the SUSPENDED state if runtime PM is
> forbidden.  We already enforce this when the device is enabled for 
> runtime PM, so we would have to start enforcing it when the device is 
> disabled.

Sorry for the delay, I needed to take care of some ACPI changes urgently.

(Without reading the rest of the thread yet) I think that would be
a reasonable approach.

> This means:
> 
> 	pm_runtime_set_suspended should fail if dev->power.runtime_auto
> 	is clear.
> 
> 	pm_runtime_forbid should call pm_runtime_set_active if
> 	dev->power.disable_depth > 0.  (This would run into a problem
> 	if the parent is suspended and disabled.  Maybe 
> 	pm_runtime_forbid should fail when this happens.)
> 
> Finally, we probably should make a third change even though it isn't
> strictly necessary:
> 
> 	pm_runtime_allow should call pm_runtime_set_suspended if
> 	dev->power.disable_depth > 0.
> 
> Rafael, what do you think?

As I said, sounds reasonable.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-13 16:10                                                         ` Alan Stern
  (?)
@ 2012-11-14  1:08                                                         ` Huang Ying
  2012-11-14  9:52                                                           ` Rafael J. Wysocki
  -1 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-14  1:08 UTC (permalink / raw)
  To: Alan Stern; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Tue, 2012-11-13 at 11:10 -0500, Alan Stern wrote:
> On Tue, 13 Nov 2012, Huang Ying wrote:
> 
> > > This is not quite right.  Consider a device that is in runtime suspend 
> > > when a system sleep starts.  When the system sleep ends, the device 
> > > will be resumed but the PM core will still think its state is 
> > > SUSPENDED.  The subsystem has to tell the PM core that the device is 
> > > now ACTIVE.  Currently, subsystems do this by calling 
> > > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> > > your scheme this wouldn't work; the pm_runtime_set_active call would 
> > > fail because the device was !forbidden.
> > 
> > Thanks for your information.  For this specific situation, is it
> > possible to call pm_runtime_resume() or pm_request_resume() for the
> > device?
> 
> No, because the device already is at full power.  The subsystem just
> needs to tell the PM core that it is.
> 
> > > > PM.  Device can always work with full power.
> > > 
> > > It can't if the parent is in SUSPEND.  If necessary, the user can write 
> > > "on" to the parent's power/control attribute first.
> > 
> > Is it possible to call pm_runtime_set_active() for the parent if the
> > parent is disabled and SUSPENDED.
> 
> Doing that is possible, but it might not work.  The parent might
> actually be at low power; calling pm_runtime_set_active wouldn't change
> the physical power level.  Basically, it's not safe to assume anything
> about devices that are disabled for runtime PM.
> 
> > It appears that there is race condition between this and the
> > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
> > you mentioned ealier.
> > 
> > thread 1			thread 2
> > pm_runtime_disable
> > pm_runtime_set_active
> > 				pm_runtime_allow
> > 				  pm_runtime_set_suspended
> > pm_runtime_enable
> 
> This can't happen in the situation I described earlier because during
> system sleep transitions, no other user threads are allowed to run.  
> All of them except the one actually carrying out the transition are
> frozen.

Thanks for your kind explanation.

After talking with you, my feeling is that the disabled state is obscure
and error-prone.  So I suggest not to use it if possible.  Maybe we can

 - make changes suggested by Alan to make disabled state better.
 - use Rafael's solution to solve this specific issue, and avoid the
usage of disabled state here.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14  1:08                                                         ` Huang Ying
@ 2012-11-14  9:52                                                           ` Rafael J. Wysocki
  2012-11-14 13:35                                                             ` Huang Ying
  0 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-14  9:52 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Wednesday, November 14, 2012 09:08:28 AM Huang Ying wrote:
> On Tue, 2012-11-13 at 11:10 -0500, Alan Stern wrote:
> > On Tue, 13 Nov 2012, Huang Ying wrote:
> > 
> > > > This is not quite right.  Consider a device that is in runtime suspend 
> > > > when a system sleep starts.  When the system sleep ends, the device 
> > > > will be resumed but the PM core will still think its state is 
> > > > SUSPENDED.  The subsystem has to tell the PM core that the device is 
> > > > now ACTIVE.  Currently, subsystems do this by calling 
> > > > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> > > > your scheme this wouldn't work; the pm_runtime_set_active call would 
> > > > fail because the device was !forbidden.
> > > 
> > > Thanks for your information.  For this specific situation, is it
> > > possible to call pm_runtime_resume() or pm_request_resume() for the
> > > device?
> > 
> > No, because the device already is at full power.  The subsystem just
> > needs to tell the PM core that it is.
> > 
> > > > > PM.  Device can always work with full power.
> > > > 
> > > > It can't if the parent is in SUSPEND.  If necessary, the user can write 
> > > > "on" to the parent's power/control attribute first.
> > > 
> > > Is it possible to call pm_runtime_set_active() for the parent if the
> > > parent is disabled and SUSPENDED.
> > 
> > Doing that is possible, but it might not work.  The parent might
> > actually be at low power; calling pm_runtime_set_active wouldn't change
> > the physical power level.  Basically, it's not safe to assume anything
> > about devices that are disabled for runtime PM.
> > 
> > > It appears that there is race condition between this and the
> > > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
> > > you mentioned ealier.
> > > 
> > > thread 1			thread 2
> > > pm_runtime_disable
> > > pm_runtime_set_active
> > > 				pm_runtime_allow
> > > 				  pm_runtime_set_suspended
> > > pm_runtime_enable
> > 
> > This can't happen in the situation I described earlier because during
> > system sleep transitions, no other user threads are allowed to run.  
> > All of them except the one actually carrying out the transition are
> > frozen.
> 
> Thanks for your kind explanation.
> 
> After talking with you, my feeling is that the disabled state is obscure
> and error-prone.  So I suggest not to use it if possible.  Maybe we can
> 
>  - make changes suggested by Alan to make disabled state better.

What changes specifically do you mean to be precise?

>  - use Rafael's solution to solve this specific issue, and avoid the
> usage of disabled state here.

Well, I think that the PCI subsystem should just enable runtime PM for
all devices upfront and keep it enabled going forward.

My patch is incomplete, however, because it doesn't deal with probe/remove
correctly at this point (which Alan pointed out earlier in the thread).

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-08 17:07                                     ` Alan Stern
  (?)
  (?)
@ 2012-11-14 10:05                                     ` Rafael J. Wysocki
  2012-11-14 16:42                                         ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-14 10:05 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Thursday, November 08, 2012 12:07:54 PM Alan Stern wrote:
> On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:

[...]

I'd like to revisit this for a while if you don't mind.

> Your revised patch does do the job, except for a few problems.  
> Namely, while local_pci_probe() and pci_device_remove() are running,
> the device _does_ have a driver.

Right.

> This means that local_pci_probe() should not call pm_runtime_get_sync(),
> for example.  Doing so would invoke the driver's runtime_resume routine
> before calling the driver's probe routine!
> 
> The USB subsystem solves this problem by carefully keeping track of the 
> state of the device-driver binding:
> 
> 	Originally the device is UNBOUND.
> 
> 	At the start of the subsystem's probe routine, the state
> 	changes to BINDING.
> 
> 	If the probe succeeds then it changes to BOUND; otherwise
> 	it goes back to UNBOUND.
> 
> 	At the start of the subsystem's remove routine, the state
> 	changes to UNBINDING.  At the end it goes to UNBOUND.
> 
> When the state is anything other than BOUND, the subsystem's runtime PM 
> routines act as though there is no driver.

Well, that wouldn't help PCI, because some drivers want to use the
pm_runtime_* stuff in their .probe() routines and actually expect it to
work. :-)

Perhaps we can introduce something like

pm_runtime_get[_put]_skip_callbacks()

that would treat the device as though it had the power.no_callbacks flag
set and use that around the driver's .probe() in the PCI core?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14  9:52                                                           ` Rafael J. Wysocki
@ 2012-11-14 13:35                                                             ` Huang Ying
  2012-11-14 16:06                                                                 ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-14 13:35 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Wed, 2012-11-14 at 10:52 +0100, Rafael J. Wysocki wrote:
> On Wednesday, November 14, 2012 09:08:28 AM Huang Ying wrote:
> > On Tue, 2012-11-13 at 11:10 -0500, Alan Stern wrote:
> > > On Tue, 13 Nov 2012, Huang Ying wrote:
> > > 
> > > > > This is not quite right.  Consider a device that is in runtime suspend 
> > > > > when a system sleep starts.  When the system sleep ends, the device 
> > > > > will be resumed but the PM core will still think its state is 
> > > > > SUSPENDED.  The subsystem has to tell the PM core that the device is 
> > > > > now ACTIVE.  Currently, subsystems do this by calling 
> > > > > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable.  Under 
> > > > > your scheme this wouldn't work; the pm_runtime_set_active call would 
> > > > > fail because the device was !forbidden.
> > > > 
> > > > Thanks for your information.  For this specific situation, is it
> > > > possible to call pm_runtime_resume() or pm_request_resume() for the
> > > > device?
> > > 
> > > No, because the device already is at full power.  The subsystem just
> > > needs to tell the PM core that it is.
> > > 
> > > > > > PM.  Device can always work with full power.
> > > > > 
> > > > > It can't if the parent is in SUSPEND.  If necessary, the user can write 
> > > > > "on" to the parent's power/control attribute first.
> > > > 
> > > > Is it possible to call pm_runtime_set_active() for the parent if the
> > > > parent is disabled and SUSPENDED.
> > > 
> > > Doing that is possible, but it might not work.  The parent might
> > > actually be at low power; calling pm_runtime_set_active wouldn't change
> > > the physical power level.  Basically, it's not safe to assume anything
> > > about devices that are disabled for runtime PM.
> > > 
> > > > It appears that there is race condition between this and the
> > > > pm_runtime_disable, pm_runtime_set_active, pm_runtime_enable sequence
> > > > you mentioned ealier.
> > > > 
> > > > thread 1			thread 2
> > > > pm_runtime_disable
> > > > pm_runtime_set_active
> > > > 				pm_runtime_allow
> > > > 				  pm_runtime_set_suspended
> > > > pm_runtime_enable
> > > 
> > > This can't happen in the situation I described earlier because during
> > > system sleep transitions, no other user threads are allowed to run.  
> > > All of them except the one actually carrying out the transition are
> > > frozen.
> > 
> > Thanks for your kind explanation.
> > 
> > After talking with you, my feeling is that the disabled state is obscure
> > and error-prone.  So I suggest not to use it if possible.  Maybe we can
> > 
> >  - make changes suggested by Alan to make disabled state better.
> 
> What changes specifically do you mean to be precise?

I mean the following changes from Alan's email.

        pm_runtime_set_suspended should fail if dev->power.runtime_auto
        is clear.

        pm_runtime_forbid should call pm_runtime_set_active if
        dev->power.disable_depth > 0.  (This would run into a problem
        if the parent is suspended and disabled.  Maybe 
        pm_runtime_forbid should fail when this happens.)

For the second one, is it possible that the device is really in low
power state when pm_runtime_forbid is called?  That situation is hard to
deal with too.

> >  - use Rafael's solution to solve this specific issue, and avoid the
> > usage of disabled state here.
> 
> Well, I think that the PCI subsystem should just enable runtime PM for
> all devices upfront and keep it enabled going forward.
> 
> My patch is incomplete, however, because it doesn't deal with probe/remove
> correctly at this point (which Alan pointed out earlier in the thread).

Yes.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 13:35                                                             ` Huang Ying
@ 2012-11-14 16:06                                                                 ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 16:06 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Huang Ying wrote:

> > What changes specifically do you mean to be precise?
> 
> I mean the following changes from Alan's email.
> 
>         pm_runtime_set_suspended should fail if dev->power.runtime_auto
>         is clear.
> 
>         pm_runtime_forbid should call pm_runtime_set_active if
>         dev->power.disable_depth > 0.  (This would run into a problem
>         if the parent is suspended and disabled.  Maybe 
>         pm_runtime_forbid should fail when this happens.)
> 
> For the second one, is it possible that the device is really in low
> power state when pm_runtime_forbid is called?  That situation is hard to
> deal with too.

Yes, it is possible.  I don't see what we can do about it.  By
disabling the device, the driver has said that it doesn't want to 
handle any runtime PM callbacks.  Without the driver's help, there 
isn't any good way to bring the device back to full power.

On the other hand, the PM core doesn't know the device's actual power 
state.  All it knows is the value of dev->power.runtime_status.  So it 
doesn't have any way to detect when this problem occurs.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-14 16:06                                                                 ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 16:06 UTC (permalink / raw)
  To: Huang Ying; +Cc: Rafael J. Wysocki, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Huang Ying wrote:

> > What changes specifically do you mean to be precise?
> 
> I mean the following changes from Alan's email.
> 
>         pm_runtime_set_suspended should fail if dev->power.runtime_auto
>         is clear.
> 
>         pm_runtime_forbid should call pm_runtime_set_active if
>         dev->power.disable_depth > 0.  (This would run into a problem
>         if the parent is suspended and disabled.  Maybe 
>         pm_runtime_forbid should fail when this happens.)
> 
> For the second one, is it possible that the device is really in low
> power state when pm_runtime_forbid is called?  That situation is hard to
> deal with too.

Yes, it is possible.  I don't see what we can do about it.  By
disabling the device, the driver has said that it doesn't want to 
handle any runtime PM callbacks.  Without the driver's help, there 
isn't any good way to bring the device back to full power.

On the other hand, the PM core doesn't know the device's actual power 
state.  All it knows is the value of dev->power.runtime_status.  So it 
doesn't have any way to detect when this problem occurs.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 10:05                                     ` Rafael J. Wysocki
@ 2012-11-14 16:42                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 16:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:

> On Thursday, November 08, 2012 12:07:54 PM Alan Stern wrote:
> > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> 
> [...]
> 
> I'd like to revisit this for a while if you don't mind.

Not at all.

> > Your revised patch does do the job, except for a few problems.  
> > Namely, while local_pci_probe() and pci_device_remove() are running,
> > the device _does_ have a driver.
> 
> Right.
> 
> > This means that local_pci_probe() should not call pm_runtime_get_sync(),
> > for example.  Doing so would invoke the driver's runtime_resume routine
> > before calling the driver's probe routine!
> > 
> > The USB subsystem solves this problem by carefully keeping track of the 
> > state of the device-driver binding:
> > 
> > 	Originally the device is UNBOUND.
> > 
> > 	At the start of the subsystem's probe routine, the state
> > 	changes to BINDING.
> > 
> > 	If the probe succeeds then it changes to BOUND; otherwise
> > 	it goes back to UNBOUND.
> > 
> > 	At the start of the subsystem's remove routine, the state
> > 	changes to UNBINDING.  At the end it goes to UNBOUND.
> > 
> > When the state is anything other than BOUND, the subsystem's runtime PM 
> > routines act as though there is no driver.
> 
> Well, that wouldn't help PCI, because some drivers want to use the
> pm_runtime_* stuff in their .probe() routines and actually expect it to
> work. :-)

PCI could do something like this:

	local_pci_probe() calls pm_runtime_get_sync() twice before
	it changes the binding state to BINDING.  It then calls 
	pm_runtime_put_sync() after the state is BOUND.

	pci_device_remove() calls pm_runtime_get_sync() before it
	changes the binding state to UNBINDING.  It then calls
	pm_runtime_put_sync() twice after the state is UNBOUND.

(Obviously some of those calls could be _get_noresume() or
_put_noidle().)

This has the side effect that when a driver unbinds, it can't leave the 
device in a special low-power state.  The device will always end up in 
the generic low-power state supported by the PCI core.

> Perhaps we can introduce something like
> 
> pm_runtime_get[_put]_skip_callbacks()
> 
> that would treat the device as though it had the power.no_callbacks flag
> set and use that around the driver's .probe() in the PCI core?

That would prevent the PM core from invoking the PCI subsystem's own 
callback, not just the driver's callback.  So I don't think that's what 
you want.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-14 16:42                                         ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 16:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:

> On Thursday, November 08, 2012 12:07:54 PM Alan Stern wrote:
> > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> 
> [...]
> 
> I'd like to revisit this for a while if you don't mind.

Not at all.

> > Your revised patch does do the job, except for a few problems.  
> > Namely, while local_pci_probe() and pci_device_remove() are running,
> > the device _does_ have a driver.
> 
> Right.
> 
> > This means that local_pci_probe() should not call pm_runtime_get_sync(),
> > for example.  Doing so would invoke the driver's runtime_resume routine
> > before calling the driver's probe routine!
> > 
> > The USB subsystem solves this problem by carefully keeping track of the 
> > state of the device-driver binding:
> > 
> > 	Originally the device is UNBOUND.
> > 
> > 	At the start of the subsystem's probe routine, the state
> > 	changes to BINDING.
> > 
> > 	If the probe succeeds then it changes to BOUND; otherwise
> > 	it goes back to UNBOUND.
> > 
> > 	At the start of the subsystem's remove routine, the state
> > 	changes to UNBINDING.  At the end it goes to UNBOUND.
> > 
> > When the state is anything other than BOUND, the subsystem's runtime PM 
> > routines act as though there is no driver.
> 
> Well, that wouldn't help PCI, because some drivers want to use the
> pm_runtime_* stuff in their .probe() routines and actually expect it to
> work. :-)

PCI could do something like this:

	local_pci_probe() calls pm_runtime_get_sync() twice before
	it changes the binding state to BINDING.  It then calls 
	pm_runtime_put_sync() after the state is BOUND.

	pci_device_remove() calls pm_runtime_get_sync() before it
	changes the binding state to UNBINDING.  It then calls
	pm_runtime_put_sync() twice after the state is UNBOUND.

(Obviously some of those calls could be _get_noresume() or
_put_noidle().)

This has the side effect that when a driver unbinds, it can't leave the 
device in a special low-power state.  The device will always end up in 
the generic low-power state supported by the PCI core.

> Perhaps we can introduce something like
> 
> pm_runtime_get[_put]_skip_callbacks()
> 
> that would treat the device as though it had the power.no_callbacks flag
> set and use that around the driver's .probe() in the PCI core?

That would prevent the PM core from invoking the PCI subsystem's own 
callback, not just the driver's callback.  So I don't think that's what 
you want.

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 16:42                                         ` Alan Stern
  (?)
@ 2012-11-14 19:42                                         ` Rafael J. Wysocki
  2012-11-14 21:45                                             ` Alan Stern
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-14 19:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 14, 2012 11:42:33 AM Alan Stern wrote:
> On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> 
> > On Thursday, November 08, 2012 12:07:54 PM Alan Stern wrote:
> > > On Thu, 8 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > [...]
> > 
> > I'd like to revisit this for a while if you don't mind.
> 
> Not at all.
> 
> > > Your revised patch does do the job, except for a few problems.  
> > > Namely, while local_pci_probe() and pci_device_remove() are running,
> > > the device _does_ have a driver.
> > 
> > Right.
> > 
> > > This means that local_pci_probe() should not call pm_runtime_get_sync(),
> > > for example.  Doing so would invoke the driver's runtime_resume routine
> > > before calling the driver's probe routine!
> > > 
> > > The USB subsystem solves this problem by carefully keeping track of the 
> > > state of the device-driver binding:
> > > 
> > > 	Originally the device is UNBOUND.
> > > 
> > > 	At the start of the subsystem's probe routine, the state
> > > 	changes to BINDING.
> > > 
> > > 	If the probe succeeds then it changes to BOUND; otherwise
> > > 	it goes back to UNBOUND.
> > > 
> > > 	At the start of the subsystem's remove routine, the state
> > > 	changes to UNBINDING.  At the end it goes to UNBOUND.
> > > 
> > > When the state is anything other than BOUND, the subsystem's runtime PM 
> > > routines act as though there is no driver.
> > 
> > Well, that wouldn't help PCI, because some drivers want to use the
> > pm_runtime_* stuff in their .probe() routines and actually expect it to
> > work. :-)
> 
> PCI could do something like this:
> 
> 	local_pci_probe() calls pm_runtime_get_sync() twice before
> 	it changes the binding state to BINDING.  It then calls 
> 	pm_runtime_put_sync() after the state is BOUND.
> 
> 	pci_device_remove() calls pm_runtime_get_sync() before it
> 	changes the binding state to UNBINDING.  It then calls
> 	pm_runtime_put_sync() twice after the state is UNBOUND.
> 
> (Obviously some of those calls could be _get_noresume() or
> _put_noidle().)
> 
> This has the side effect that when a driver unbinds, it can't leave the 
> device in a special low-power state.  The device will always end up in 
> the generic low-power state supported by the PCI core.

Well, I'm not sure I'd like that.

Let's just go back even one step more and think what we'd like to have in
general terms and then how to implement it. :-)

Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
addition to what it does currently).  The runtime PM status of each device is
RPM_SUSPENDED at this point.  Then:

(1) We want to keep the current semantics during probe, i.e. the device should
    (a) be RPM_ACTIVE and (b) have usage_count == (user space usage_count + 1)
    right before ddi->drv->probe() is executed.

(2) We don't want the driver's PM callbacks to be run before ddi->drv->probe().
    There's a question if we want the bus type's PM callbacks to be run at
    that point, but they are not run currently and IMO we shouldn't change
    that.

(4) If ddi->drv->probe() fails, we want the device's status to change to
    RPM_SUSPENDED and it's usage_count to be equal to the user space part,
    so that the conditions are the same as before when probing is repeated.

(5) During ddi->drv->probe(), if the driver decrements the device's usage_count,
    which it is supposed to do if it supports runtime PM, then runtime PM
    should work for the device normally going forward (unless the .probe()
    eventually fails, but then the driver is supposed to do the cleanup).

(6) In pci_device_remove() we want the status to change to RPM_SUSPENDED and
    the device's usage_count to be equal to the user space part after
    drv->remove() has run.

(7) We want neither the driver's nor the PCI bus type's PM callbacks
    to be run after drv->remove() has returned (that's what happens now).

> > Perhaps we can introduce something like
> > 
> > pm_runtime_get[_put]_skip_callbacks()
> > 
> > that would treat the device as though it had the power.no_callbacks flag
> > set and use that around the driver's .probe() in the PCI core?
> 
> That would prevent the PM core from invoking the PCI subsystem's own 
> callback, not just the driver's callback.  So I don't think that's what 
> you want.

Actually, looking at the above, I think that's pretty much what I want. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 19:42                                         ` Rafael J. Wysocki
@ 2012-11-14 21:45                                             ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 21:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:

> > This has the side effect that when a driver unbinds, it can't leave the 
> > device in a special low-power state.  The device will always end up in 
> > the generic low-power state supported by the PCI core.
> 
> Well, I'm not sure I'd like that.
> 
> Let's just go back even one step more and think what we'd like to have in
> general terms and then how to implement it. :-)
> 
> Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> addition to what it does currently).  The runtime PM status of each device is
> RPM_SUSPENDED at this point.  Then:

Wait a moment.  When the device is detected and initialized, it is in
D0, right?  Currently we don't care much because the device starts out
disabled for runtime PM.  But now you are going to enable it.  While
the device is enabled, its runtime status should match the physical
power level.

This means the initialization routine would have to call
pm_runtime_set_active() before pm_runtime_enable().  If you then wanted
to change the status to RPM_SUSPENDED, you would actually have to put
the device into D3 by calling pm_runtime_suspend() (or maybe
pm_runtime_schedule_suspend() to give drivers some time to get loaded 
and bind).

> (1) We want to keep the current semantics during probe, i.e. the device should
>     (a) be RPM_ACTIVE and (b) have usage_count == (user space usage_count + 1)
>     right before ddi->drv->probe() is executed.

In theory the usage_count could be higher and then adjusted back after
the probe is finished, if that would make anything easier.

> (2) We don't want the driver's PM callbacks to be run before ddi->drv->probe().
>     There's a question if we want the bus type's PM callbacks to be run at
>     that point, but they are not run currently and IMO we shouldn't change
>     that.

The device is supposed to be in D0 when it is probed.  Since we are
assuming that initialization is now going to leave it in D3, there's no
choice -- you _have_ to invoke pci_pm_runtime_resume(), which would
invoke the driver's callback, which we don't want.

Therefore you need to figure out a way to tell pci_pm_runtime_resume() 
(and presumably pci_pm_runtime_suspend() as well) when not to invoke 
the driver's callback.  Add a flag to the pci_device structure, maybe.

> (4) If ddi->drv->probe() fails, we want the device's status to change to
>     RPM_SUSPENDED and it's usage_count to be equal to the user space part,
>     so that the conditions are the same as before when probing is repeated.
> 
> (5) During ddi->drv->probe(), if the driver decrements the device's usage_count,
>     which it is supposed to do if it supports runtime PM, then runtime PM
>     should work for the device normally going forward (unless the .probe()
>     eventually fails, but then the driver is supposed to do the cleanup).

It would be okay if the normal runtime PM doesn't kick in until after 
the probe routine returns.  For example, if the PCI core made an extra 
call to pm_runtime_get_noresume() before ddi->drv->probe() and a 
matching call to pm_runtime_put_sync() afterward.

> (6) In pci_device_remove() we want the status to change to RPM_SUSPENDED and
>     the device's usage_count to be equal to the user space part after
>     drv->remove() has run.

Basically, pci_device_remove() should undo the actions taken by 
local_pci_probe().

> (7) We want neither the driver's nor the PCI bus type's PM callbacks
>     to be run after drv->remove() has returned (that's what happens now).

What if the driver doesn't support runtime PM?  Then you have
contradictory requirements: The device is in D0 before drv->remove()  
is called, the driver's remove routine won't do any runtime PM, and
you don't want any PM callbacks after drv->remove() returns.  So how
can the device get put back into D3?

Are you suggesting that the unbound device should remain in D0, even
though runtime PM is enabled and the status is SUSPENDED?  I don't
think that would be a good idea.

> > > Perhaps we can introduce something like
> > > 
> > > pm_runtime_get[_put]_skip_callbacks()
> > > 
> > > that would treat the device as though it had the power.no_callbacks flag
> > > set and use that around the driver's .probe() in the PCI core?
> > 
> > That would prevent the PM core from invoking the PCI subsystem's own 
> > callback, not just the driver's callback.  So I don't think that's what 
> > you want.
> 
> Actually, looking at the above, I think that's pretty much what I want. :-)

I don't agree.  In my opinion it would be better to invoke the PCI
bus-type callbacks and tell them somehow to skip calling the driver's
callbacks before drv->probe() or after drv->remove().

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-14 21:45                                             ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-14 21:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:

> > This has the side effect that when a driver unbinds, it can't leave the 
> > device in a special low-power state.  The device will always end up in 
> > the generic low-power state supported by the PCI core.
> 
> Well, I'm not sure I'd like that.
> 
> Let's just go back even one step more and think what we'd like to have in
> general terms and then how to implement it. :-)
> 
> Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> addition to what it does currently).  The runtime PM status of each device is
> RPM_SUSPENDED at this point.  Then:

Wait a moment.  When the device is detected and initialized, it is in
D0, right?  Currently we don't care much because the device starts out
disabled for runtime PM.  But now you are going to enable it.  While
the device is enabled, its runtime status should match the physical
power level.

This means the initialization routine would have to call
pm_runtime_set_active() before pm_runtime_enable().  If you then wanted
to change the status to RPM_SUSPENDED, you would actually have to put
the device into D3 by calling pm_runtime_suspend() (or maybe
pm_runtime_schedule_suspend() to give drivers some time to get loaded 
and bind).

> (1) We want to keep the current semantics during probe, i.e. the device should
>     (a) be RPM_ACTIVE and (b) have usage_count == (user space usage_count + 1)
>     right before ddi->drv->probe() is executed.

In theory the usage_count could be higher and then adjusted back after
the probe is finished, if that would make anything easier.

> (2) We don't want the driver's PM callbacks to be run before ddi->drv->probe().
>     There's a question if we want the bus type's PM callbacks to be run at
>     that point, but they are not run currently and IMO we shouldn't change
>     that.

The device is supposed to be in D0 when it is probed.  Since we are
assuming that initialization is now going to leave it in D3, there's no
choice -- you _have_ to invoke pci_pm_runtime_resume(), which would
invoke the driver's callback, which we don't want.

Therefore you need to figure out a way to tell pci_pm_runtime_resume() 
(and presumably pci_pm_runtime_suspend() as well) when not to invoke 
the driver's callback.  Add a flag to the pci_device structure, maybe.

> (4) If ddi->drv->probe() fails, we want the device's status to change to
>     RPM_SUSPENDED and it's usage_count to be equal to the user space part,
>     so that the conditions are the same as before when probing is repeated.
> 
> (5) During ddi->drv->probe(), if the driver decrements the device's usage_count,
>     which it is supposed to do if it supports runtime PM, then runtime PM
>     should work for the device normally going forward (unless the .probe()
>     eventually fails, but then the driver is supposed to do the cleanup).

It would be okay if the normal runtime PM doesn't kick in until after 
the probe routine returns.  For example, if the PCI core made an extra 
call to pm_runtime_get_noresume() before ddi->drv->probe() and a 
matching call to pm_runtime_put_sync() afterward.

> (6) In pci_device_remove() we want the status to change to RPM_SUSPENDED and
>     the device's usage_count to be equal to the user space part after
>     drv->remove() has run.

Basically, pci_device_remove() should undo the actions taken by 
local_pci_probe().

> (7) We want neither the driver's nor the PCI bus type's PM callbacks
>     to be run after drv->remove() has returned (that's what happens now).

What if the driver doesn't support runtime PM?  Then you have
contradictory requirements: The device is in D0 before drv->remove()  
is called, the driver's remove routine won't do any runtime PM, and
you don't want any PM callbacks after drv->remove() returns.  So how
can the device get put back into D3?

Are you suggesting that the unbound device should remain in D0, even
though runtime PM is enabled and the status is SUSPENDED?  I don't
think that would be a good idea.

> > > Perhaps we can introduce something like
> > > 
> > > pm_runtime_get[_put]_skip_callbacks()
> > > 
> > > that would treat the device as though it had the power.no_callbacks flag
> > > set and use that around the driver's .probe() in the PCI core?
> > 
> > That would prevent the PM core from invoking the PCI subsystem's own 
> > callback, not just the driver's callback.  So I don't think that's what 
> > you want.
> 
> Actually, looking at the above, I think that's pretty much what I want. :-)

I don't agree.  In my opinion it would be better to invoke the PCI
bus-type callbacks and tell them somehow to skip calling the driver's
callbacks before drv->probe() or after drv->remove().

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 21:45                                             ` Alan Stern
  (?)
@ 2012-11-14 23:10                                             ` Rafael J. Wysocki
  2012-11-15  1:03                                               ` Huang Ying
  -1 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-14 23:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Huang Ying, linux-kernel, linux-pm

On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> 
> > > This has the side effect that when a driver unbinds, it can't leave the 
> > > device in a special low-power state.  The device will always end up in 
> > > the generic low-power state supported by the PCI core.
> > 
> > Well, I'm not sure I'd like that.
> > 
> > Let's just go back even one step more and think what we'd like to have in
> > general terms and then how to implement it. :-)
> > 
> > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > addition to what it does currently).  The runtime PM status of each device is
> > RPM_SUSPENDED at this point.  Then:
> 
> Wait a moment.  When the device is detected and initialized, it is in
> D0, right?  Currently we don't care much because the device starts out
> disabled for runtime PM.  But now you are going to enable it.  While
> the device is enabled, its runtime status should match the physical
> power level.

OK

> This means the initialization routine would have to call
> pm_runtime_set_active() before pm_runtime_enable().  If you then wanted
> to change the status to RPM_SUSPENDED, you would actually have to put
> the device into D3 by calling pm_runtime_suspend() (or maybe
> pm_runtime_schedule_suspend() to give drivers some time to get loaded 
> and bind).

No, I don't want that.  It may be RPM_ACTIVE all the time as long as the
device doesn't have a driver.  Which probably would even make things
simpler. :-)

> > (1) We want to keep the current semantics during probe, i.e. the device should
> >     (a) be RPM_ACTIVE and (b) have usage_count == (user space usage_count + 1)
> >     right before ddi->drv->probe() is executed.
> 
> In theory the usage_count could be higher and then adjusted back after
> the probe is finished, if that would make anything easier.

No, it wouldn't, because of (5).  Suppose that the driver wants to suspend
the device directly from .probe() and the user space doesn't mind.  We can't
prevent that from being doable.

> > (2) We don't want the driver's PM callbacks to be run before ddi->drv->probe().
> >     There's a question if we want the bus type's PM callbacks to be run at
> >     that point, but they are not run currently and IMO we shouldn't change
> >     that.
> 
> The device is supposed to be in D0 when it is probed.  Since we are
> assuming that initialization is now going to leave it in D3, there's no
> choice -- you _have_ to invoke pci_pm_runtime_resume(), which would
> invoke the driver's callback, which we don't want.

Let's say the device will stay in D0 after the initialization and then
we'll require that it be in D0 if .probe() fails or after .remove().

The only thing we'll need to do before .probe() in that case is to
bump up the usage counter and then to bump it down if .probe() fails
(and after .remove()).

The only problem we have in that case are buggy drivers that leave
devices in, say, D3cold after a failing .probe().  That doesn't
seem to be avoidable, though.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-14 23:10                                             ` Rafael J. Wysocki
@ 2012-11-15  1:03                                               ` Huang Ying
  2012-11-15  9:51                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-15  1:03 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > 
> > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > device in a special low-power state.  The device will always end up in 
> > > > the generic low-power state supported by the PCI core.
> > > 
> > > Well, I'm not sure I'd like that.
> > > 
> > > Let's just go back even one step more and think what we'd like to have in
> > > general terms and then how to implement it. :-)
> > > 
> > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > addition to what it does currently).  The runtime PM status of each device is
> > > RPM_SUSPENDED at this point.  Then:
> > 
> > Wait a moment.  When the device is detected and initialized, it is in
> > D0, right?  Currently we don't care much because the device starts out
> > disabled for runtime PM.  But now you are going to enable it.  While
> > the device is enabled, its runtime status should match the physical
> > power level.
> 
> OK

If my memory were correct, RPM_SUSPENDED just means device stop working,
but need not be put into low-power state.  So for RPM_ACTIVE, PCI
devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
power state.

Best Regards,
Huang Ying


> > This means the initialization routine would have to call
> > pm_runtime_set_active() before pm_runtime_enable().  If you then wanted
> > to change the status to RPM_SUSPENDED, you would actually have to put
> > the device into D3 by calling pm_runtime_suspend() (or maybe
> > pm_runtime_schedule_suspend() to give drivers some time to get loaded 
> > and bind).
> 
> No, I don't want that.  It may be RPM_ACTIVE all the time as long as the
> device doesn't have a driver.  Which probably would even make things
> simpler. :-)
> 
> > > (1) We want to keep the current semantics during probe, i.e. the device should
> > >     (a) be RPM_ACTIVE and (b) have usage_count == (user space usage_count + 1)
> > >     right before ddi->drv->probe() is executed.
> > 
> > In theory the usage_count could be higher and then adjusted back after
> > the probe is finished, if that would make anything easier.
> 
> No, it wouldn't, because of (5).  Suppose that the driver wants to suspend
> the device directly from .probe() and the user space doesn't mind.  We can't
> prevent that from being doable.
> 
> > > (2) We don't want the driver's PM callbacks to be run before ddi->drv->probe().
> > >     There's a question if we want the bus type's PM callbacks to be run at
> > >     that point, but they are not run currently and IMO we shouldn't change
> > >     that.
> > 
> > The device is supposed to be in D0 when it is probed.  Since we are
> > assuming that initialization is now going to leave it in D3, there's no
> > choice -- you _have_ to invoke pci_pm_runtime_resume(), which would
> > invoke the driver's callback, which we don't want.
> 
> Let's say the device will stay in D0 after the initialization and then
> we'll require that it be in D0 if .probe() fails or after .remove().
> 
> The only thing we'll need to do before .probe() in that case is to
> bump up the usage counter and then to bump it down if .probe() fails
> (and after .remove()).
> 
> The only problem we have in that case are buggy drivers that leave
> devices in, say, D3cold after a failing .probe().  That doesn't
> seem to be avoidable, though.
> 
> Thanks,
> Rafael
> 
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-15  1:03                                               ` Huang Ying
@ 2012-11-15  9:51                                                 ` Rafael J. Wysocki
  2012-11-15 10:09                                                   ` Rafael J. Wysocki
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-15  9:51 UTC (permalink / raw)
  To: Huang Ying, Alan Stern; +Cc: linux-kernel, linux-pm

On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > 
> > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > device in a special low-power state.  The device will always end up in 
> > > > > the generic low-power state supported by the PCI core.
> > > > 
> > > > Well, I'm not sure I'd like that.
> > > > 
> > > > Let's just go back even one step more and think what we'd like to have in
> > > > general terms and then how to implement it. :-)
> > > > 
> > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > addition to what it does currently).  The runtime PM status of each device is
> > > > RPM_SUSPENDED at this point.  Then:
> > > 
> > > Wait a moment.  When the device is detected and initialized, it is in
> > > D0, right?  Currently we don't care much because the device starts out
> > > disabled for runtime PM.  But now you are going to enable it.  While
> > > the device is enabled, its runtime status should match the physical
> > > power level.
> > 
> > OK
> 
> If my memory were correct, RPM_SUSPENDED just means device stop working,
> but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> power state.

Yes, that's correct and I was wrong when I thought we could require the
status to be RPM_ACTIVE all the time when there's no driver, because that
would prevent parents from being suspended.  And we want them to be able to
suspend for driverless children, _unless_ user space has its attribute set
to "on" (i.e. the default).

So it looks like what we want to do is:

(1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
    before, so that it is in agreement with the pm_runtime_forbid() we do in
    there.

(2) If user space switches its attribute to "off" later, but before any
    drivers are probed, we want the status to switch to RPM_SUSPENDED
    _without_ actually changing the devices power state.  For that,
    I think, we can make the PCI bus type's runtime PM callback ignore
    devices without drivers (i.e. return 0 for them).

(3) When local_pci_probe() starts, after we've resumed the parent,
    the device will be in D0 (it may be D0-uninitialized, though).
    If the user space's attribute is "on" at this point, the parent's
    resume doesn't change anything.  If it is "auto", the parent's
    resume may actually transition the device, although its status
    will still be RPM_SUSPENDED.  For consistency _and_ compatibility
    with the current code, the driver's .probe() routine needs to see
    the device RPM_ACTIVE and usage_count incremented, but we don't
    want to run its PM callbacks _before_ .probe() runs.  For that
    to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
    treating the device as though it had the power.no_callbacks flag set,
    right before calling ddi->drv->probe().

    If the device has been RPM_ACTIVE at that point (i.e. user space has
    had its attribute set to "on") it will just bump up usage_count (which
    is what we want).  If the device has been RPM_SUSPENDED, it will
    bump up usage_count _and_ change the status to RPM_ACTIVE without
    executing any callbacks (the device is in D0 anyway, right?), which
    is what we want too.

(4) If ddi->drv->probe() succeeds, we don't want to change anything, so
    as not to confuse the driver, which is now in control of the device.

(5) If ddi->drv->probe() fails, we need to restore the situation from
    before calling local_pci_probe(), but we want the pm_runtime_put(parent)
    at the end of it to actually suspend the parent if user space has
    its attribute (for the child!) set to "auto".

    Assume that the driver is not buggy and the failing ddi->drv->probe()
    leaves the device in the same configuration as it's received it in.
    Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
    For the parent's suspend to work, we need to transition it to
    RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
    run at this point.  Moreover, we don't want the PCI bus type's
    callbacks to run at this point, because dev->driver is still set.
    So again, doing something like pm_runtime_put_skip_callbacks(),
    treating the device as though it had power.no_callbacks set, seems
    to be appropriate.
   
    Namely, if the user space's attribute is "on", it will just drop
    usage_count by 1, which is what we want in that case.  If the user
    space's attribute is "auto", on the other hand, it will drop
    usage_count by 1 and change the status to RPM_SUSPENDED without
    running callbacks, which again is what we want.
    
(6) In drv->remove() the driver is supposed to bump up usage_count by 1,
    so as to restore the situation from before its .probe() routine
    was called.  It also should leave the device as RPM_ACTIVE, because
    that's what it's got in .probe().  Then, after drv->remove exits,
    (and also if drv was NULL to start with), we want to drop usage_count
    by 1.  Moreover, if the user space's attribute is "on", we don't
    want anything more to happen, _but_ if that's "auto", we want to
    suspend the parent.

    Note that dev->driver is still not NULL at this point (although
    pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
    It looks like, then, what we want to do here is
    pm_runtime_put_skip_callbacks() again, because if the user space's
    attribute is "on", it will just drop usage_count by 1, which is what
    we want, and if that's "auto", it will additionally change the status
    to RPM_SUSPENDED (without executing callbacks, which we want) _and_
    it will queue up the parent's suspend (which, again, is what we want).

Did I miss anything?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-15  9:51                                                 ` Rafael J. Wysocki
@ 2012-11-15 10:09                                                   ` Rafael J. Wysocki
  2012-11-15 15:27                                                       ` Alan Stern
  2012-11-16  0:36                                                   ` Huang Ying
  2012-11-16  3:11                                                   ` Huang Ying
  2 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-15 10:09 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Thursday, November 15, 2012 10:51:42 AM Rafael J. Wysocki wrote:
> On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> > On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > > 
> > > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > > device in a special low-power state.  The device will always end up in 
> > > > > > the generic low-power state supported by the PCI core.
> > > > > 
> > > > > Well, I'm not sure I'd like that.
> > > > > 
> > > > > Let's just go back even one step more and think what we'd like to have in
> > > > > general terms and then how to implement it. :-)
> > > > > 
> > > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > > addition to what it does currently).  The runtime PM status of each device is
> > > > > RPM_SUSPENDED at this point.  Then:
> > > > 
> > > > Wait a moment.  When the device is detected and initialized, it is in
> > > > D0, right?  Currently we don't care much because the device starts out
> > > > disabled for runtime PM.  But now you are going to enable it.  While
> > > > the device is enabled, its runtime status should match the physical
> > > > power level.
> > > 
> > > OK
> > 
> > If my memory were correct, RPM_SUSPENDED just means device stop working,
> > but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> > devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> > power state.
> 
> Yes, that's correct and I was wrong when I thought we could require the
> status to be RPM_ACTIVE all the time when there's no driver, because that
> would prevent parents from being suspended.  And we want them to be able to
> suspend for driverless children, _unless_ user space has its attribute set
> to "on" (i.e. the default).
> 
> So it looks like what we want to do is:
> 
> (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
>     before, so that it is in agreement with the pm_runtime_forbid() we do in
>     there.
> 
> (2) If user space switches its attribute to "off" later, but before any
>     drivers are probed, we want the status to switch to RPM_SUSPENDED
>     _without_ actually changing the devices power state.  For that,
>     I think, we can make the PCI bus type's runtime PM callback ignore
>     devices without drivers (i.e. return 0 for them).
> 
> (3) When local_pci_probe() starts, after we've resumed the parent,
>     the device will be in D0 (it may be D0-uninitialized, though).
>     If the user space's attribute is "on" at this point, the parent's
>     resume doesn't change anything.  If it is "auto", the parent's
>     resume may actually transition the device, although its status
>     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
>     with the current code, the driver's .probe() routine needs to see
>     the device RPM_ACTIVE and usage_count incremented, but we don't
>     want to run its PM callbacks _before_ .probe() runs.  For that
>     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
>     treating the device as though it had the power.no_callbacks flag set,
>     right before calling ddi->drv->probe().
> 
>     If the device has been RPM_ACTIVE at that point (i.e. user space has
>     had its attribute set to "on") it will just bump up usage_count (which
>     is what we want).  If the device has been RPM_SUSPENDED, it will
>     bump up usage_count _and_ change the status to RPM_ACTIVE without
>     executing any callbacks (the device is in D0 anyway, right?), which
>     is what we want too.
> 
> (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
>     as not to confuse the driver, which is now in control of the device.
> 
> (5) If ddi->drv->probe() fails, we need to restore the situation from
>     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
>     at the end of it to actually suspend the parent if user space has
>     its attribute (for the child!) set to "auto".
> 
>     Assume that the driver is not buggy and the failing ddi->drv->probe()
>     leaves the device in the same configuration as it's received it in.
>     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
>     For the parent's suspend to work, we need to transition it to
>     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
>     run at this point.  Moreover, we don't want the PCI bus type's
>     callbacks to run at this point, because dev->driver is still set.
>     So again, doing something like pm_runtime_put_skip_callbacks(),
>     treating the device as though it had power.no_callbacks set, seems
>     to be appropriate.
>    
>     Namely, if the user space's attribute is "on", it will just drop
>     usage_count by 1, which is what we want in that case.  If the user
>     space's attribute is "auto", on the other hand, it will drop
>     usage_count by 1 and change the status to RPM_SUSPENDED without
>     running callbacks, which again is what we want.
>     
> (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
>     so as to restore the situation from before its .probe() routine
>     was called.  It also should leave the device as RPM_ACTIVE, because
>     that's what it's got in .probe().  Then, after drv->remove exits,
>     (and also if drv was NULL to start with), we want to drop usage_count
>     by 1.  Moreover, if the user space's attribute is "on", we don't
>     want anything more to happen, _but_ if that's "auto", we want to
>     suspend the parent.
> 
>     Note that dev->driver is still not NULL at this point (although
>     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
>     It looks like, then, what we want to do here is
>     pm_runtime_put_skip_callbacks() again, because if the user space's
>     attribute is "on", it will just drop usage_count by 1, which is what
>     we want, and if that's "auto", it will additionally change the status
>     to RPM_SUSPENDED (without executing callbacks, which we want) _and_
>     it will queue up the parent's suspend (which, again, is what we want).
> 
> Did I miss anything?

Apparently, I did.  In (6), if drv is NULL to start with, we don't want
to do anything with runtime PM, except for checking if the parent can be
suspended, so we only need to do pm_request_idle(parent) in that case.
And we seem to have a bug in there right now, because we shouldn't
do the "Undo the runtime PM settings in local_pci_probe()" stuff in that
case I think.

That seems to be the only thing I missed, though, hopefully.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-15 10:09                                                   ` Rafael J. Wysocki
@ 2012-11-15 15:27                                                       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-15 15:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Thu, 15 Nov 2012, Rafael J. Wysocki wrote:

> > So it looks like what we want to do is:
> > 
> > (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
> >     before, so that it is in agreement with the pm_runtime_forbid() we do in
> >     there.
> > 
> > (2) If user space switches its attribute to "off" later, but before any
> >     drivers are probed, we want the status to switch to RPM_SUSPENDED
> >     _without_ actually changing the devices power state.  For that,
> >     I think, we can make the PCI bus type's runtime PM callback ignore
> >     devices without drivers (i.e. return 0 for them).

Okay, so driverless PCI devices will always be in D0.  But you do allow 
their parents to go to low power.  Is that going to cause any problems?

> > (3) When local_pci_probe() starts, after we've resumed the parent,
> >     the device will be in D0 (it may be D0-uninitialized, though).
> >     If the user space's attribute is "on" at this point, the parent's
> >     resume doesn't change anything.  If it is "auto", the parent's
> >     resume may actually transition the device, although its status
> >     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
> >     with the current code, the driver's .probe() routine needs to see
> >     the device RPM_ACTIVE and usage_count incremented, but we don't
> >     want to run its PM callbacks _before_ .probe() runs.  For that
> >     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
> >     treating the device as though it had the power.no_callbacks flag set,
> >     right before calling ddi->drv->probe().

Instead of changing the PM core, wouldn't it be simpler to test whether
or not pci_dev->driver is NULL at the start of the PCI runtime methods?  
The same test could be used for (2) above.

All that would be needed would be to move the line that sets
pci_dev->driver from where it is in __pci_device_probe() to
local_pci_probe(), just before ddi->drv->probe() is called and just 
after calling pm_runtime_get_sync().

> >     If the device has been RPM_ACTIVE at that point (i.e. user space has
> >     had its attribute set to "on") it will just bump up usage_count (which
> >     is what we want).  If the device has been RPM_SUSPENDED, it will
> >     bump up usage_count _and_ change the status to RPM_ACTIVE without
> >     executing any callbacks (the device is in D0 anyway, right?), which
> >     is what we want too.
> > 
> > (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
> >     as not to confuse the driver, which is now in control of the device.
> > 
> > (5) If ddi->drv->probe() fails, we need to restore the situation from
> >     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
> >     at the end of it to actually suspend the parent if user space has
> >     its attribute (for the child!) set to "auto".

If the probe fails, set pci_dev->driver back to NULL and then call
pm_runtime_put_sync() or pm_runtime_put().

> >     Assume that the driver is not buggy and the failing ddi->drv->probe()
> >     leaves the device in the same configuration as it's received it in.
> >     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
> >     For the parent's suspend to work, we need to transition it to
> >     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
> >     run at this point.  Moreover, we don't want the PCI bus type's
> >     callbacks to run at this point, because dev->driver is still set.
> >     So again, doing something like pm_runtime_put_skip_callbacks(),
> >     treating the device as though it had power.no_callbacks set, seems
> >     to be appropriate.
> >    
> >     Namely, if the user space's attribute is "on", it will just drop
> >     usage_count by 1, which is what we want in that case.  If the user
> >     space's attribute is "auto", on the other hand, it will drop
> >     usage_count by 1 and change the status to RPM_SUSPENDED without
> >     running callbacks, which again is what we want.
> >     
> > (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
> >     so as to restore the situation from before its .probe() routine
> >     was called.  It also should leave the device as RPM_ACTIVE, because
> >     that's what it's got in .probe().  Then, after drv->remove exits,
> >     (and also if drv was NULL to start with), we want to drop usage_count
> >     by 1.  Moreover, if the user space's attribute is "on", we don't
> >     want anything more to happen, _but_ if that's "auto", we want to
> >     suspend the parent.

Shouldn't pci_device_remove() end up doing essentially the same thing 
as the failure path in local_pci_probe()?

> >     Note that dev->driver is still not NULL at this point (although
> >     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
> >     It looks like, then, what we want to do here is
> >     pm_runtime_put_skip_callbacks() again, because if the user space's
> >     attribute is "on", it will just drop usage_count by 1, which is what
> >     we want, and if that's "auto", it will additionally change the status
> >     to RPM_SUSPENDED (without executing callbacks, which we want) _and_
> >     it will queue up the parent's suspend (which, again, is what we want).
> > 
> > Did I miss anything?
> 
> Apparently, I did.  In (6), if drv is NULL to start with, we don't want
> to do anything with runtime PM, except for checking if the parent can be
> suspended, so we only need to do pm_request_idle(parent) in that case.
> And we seem to have a bug in there right now, because we shouldn't
> do the "Undo the runtime PM settings in local_pci_probe()" stuff in that
> case I think.

How can pci_device_remove() ever get called with either dev->driver or
pci_dev->driver equal to NULL?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
@ 2012-11-15 15:27                                                       ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2012-11-15 15:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Huang Ying, linux-kernel, linux-pm

On Thu, 15 Nov 2012, Rafael J. Wysocki wrote:

> > So it looks like what we want to do is:
> > 
> > (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
> >     before, so that it is in agreement with the pm_runtime_forbid() we do in
> >     there.
> > 
> > (2) If user space switches its attribute to "off" later, but before any
> >     drivers are probed, we want the status to switch to RPM_SUSPENDED
> >     _without_ actually changing the devices power state.  For that,
> >     I think, we can make the PCI bus type's runtime PM callback ignore
> >     devices without drivers (i.e. return 0 for them).

Okay, so driverless PCI devices will always be in D0.  But you do allow 
their parents to go to low power.  Is that going to cause any problems?

> > (3) When local_pci_probe() starts, after we've resumed the parent,
> >     the device will be in D0 (it may be D0-uninitialized, though).
> >     If the user space's attribute is "on" at this point, the parent's
> >     resume doesn't change anything.  If it is "auto", the parent's
> >     resume may actually transition the device, although its status
> >     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
> >     with the current code, the driver's .probe() routine needs to see
> >     the device RPM_ACTIVE and usage_count incremented, but we don't
> >     want to run its PM callbacks _before_ .probe() runs.  For that
> >     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
> >     treating the device as though it had the power.no_callbacks flag set,
> >     right before calling ddi->drv->probe().

Instead of changing the PM core, wouldn't it be simpler to test whether
or not pci_dev->driver is NULL at the start of the PCI runtime methods?  
The same test could be used for (2) above.

All that would be needed would be to move the line that sets
pci_dev->driver from where it is in __pci_device_probe() to
local_pci_probe(), just before ddi->drv->probe() is called and just 
after calling pm_runtime_get_sync().

> >     If the device has been RPM_ACTIVE at that point (i.e. user space has
> >     had its attribute set to "on") it will just bump up usage_count (which
> >     is what we want).  If the device has been RPM_SUSPENDED, it will
> >     bump up usage_count _and_ change the status to RPM_ACTIVE without
> >     executing any callbacks (the device is in D0 anyway, right?), which
> >     is what we want too.
> > 
> > (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
> >     as not to confuse the driver, which is now in control of the device.
> > 
> > (5) If ddi->drv->probe() fails, we need to restore the situation from
> >     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
> >     at the end of it to actually suspend the parent if user space has
> >     its attribute (for the child!) set to "auto".

If the probe fails, set pci_dev->driver back to NULL and then call
pm_runtime_put_sync() or pm_runtime_put().

> >     Assume that the driver is not buggy and the failing ddi->drv->probe()
> >     leaves the device in the same configuration as it's received it in.
> >     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
> >     For the parent's suspend to work, we need to transition it to
> >     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
> >     run at this point.  Moreover, we don't want the PCI bus type's
> >     callbacks to run at this point, because dev->driver is still set.
> >     So again, doing something like pm_runtime_put_skip_callbacks(),
> >     treating the device as though it had power.no_callbacks set, seems
> >     to be appropriate.
> >    
> >     Namely, if the user space's attribute is "on", it will just drop
> >     usage_count by 1, which is what we want in that case.  If the user
> >     space's attribute is "auto", on the other hand, it will drop
> >     usage_count by 1 and change the status to RPM_SUSPENDED without
> >     running callbacks, which again is what we want.
> >     
> > (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
> >     so as to restore the situation from before its .probe() routine
> >     was called.  It also should leave the device as RPM_ACTIVE, because
> >     that's what it's got in .probe().  Then, after drv->remove exits,
> >     (and also if drv was NULL to start with), we want to drop usage_count
> >     by 1.  Moreover, if the user space's attribute is "on", we don't
> >     want anything more to happen, _but_ if that's "auto", we want to
> >     suspend the parent.

Shouldn't pci_device_remove() end up doing essentially the same thing 
as the failure path in local_pci_probe()?

> >     Note that dev->driver is still not NULL at this point (although
> >     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
> >     It looks like, then, what we want to do here is
> >     pm_runtime_put_skip_callbacks() again, because if the user space's
> >     attribute is "on", it will just drop usage_count by 1, which is what
> >     we want, and if that's "auto", it will additionally change the status
> >     to RPM_SUSPENDED (without executing callbacks, which we want) _and_
> >     it will queue up the parent's suspend (which, again, is what we want).
> > 
> > Did I miss anything?
> 
> Apparently, I did.  In (6), if drv is NULL to start with, we don't want
> to do anything with runtime PM, except for checking if the parent can be
> suspended, so we only need to do pm_request_idle(parent) in that case.
> And we seem to have a bug in there right now, because we shouldn't
> do the "Undo the runtime PM settings in local_pci_probe()" stuff in that
> case I think.

How can pci_device_remove() ever get called with either dev->driver or
pci_dev->driver equal to NULL?

Alan Stern


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-15  9:51                                                 ` Rafael J. Wysocki
  2012-11-15 10:09                                                   ` Rafael J. Wysocki
@ 2012-11-16  0:36                                                   ` Huang Ying
  2012-11-16  0:44                                                     ` Rafael J. Wysocki
  2012-11-16  3:11                                                   ` Huang Ying
  2 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-16  0:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> > On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > > 
> > > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > > device in a special low-power state.  The device will always end up in 
> > > > > > the generic low-power state supported by the PCI core.
> > > > > 
> > > > > Well, I'm not sure I'd like that.
> > > > > 
> > > > > Let's just go back even one step more and think what we'd like to have in
> > > > > general terms and then how to implement it. :-)
> > > > > 
> > > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > > addition to what it does currently).  The runtime PM status of each device is
> > > > > RPM_SUSPENDED at this point.  Then:
> > > > 
> > > > Wait a moment.  When the device is detected and initialized, it is in
> > > > D0, right?  Currently we don't care much because the device starts out
> > > > disabled for runtime PM.  But now you are going to enable it.  While
> > > > the device is enabled, its runtime status should match the physical
> > > > power level.
> > > 
> > > OK
> > 
> > If my memory were correct, RPM_SUSPENDED just means device stop working,
> > but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> > devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> > power state.
> 
> Yes, that's correct and I was wrong when I thought we could require the
> status to be RPM_ACTIVE all the time when there's no driver, because that
> would prevent parents from being suspended.  And we want them to be able to
> suspend for driverless children, _unless_ user space has its attribute set
> to "on" (i.e. the default).
> 
> So it looks like what we want to do is:
> 
> (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
>     before, so that it is in agreement with the pm_runtime_forbid() we do in
>     there.
> 
> (2) If user space switches its attribute to "off" later, but before any
>     drivers are probed, we want the status to switch to RPM_SUSPENDED
>     _without_ actually changing the devices power state.  For that,
>     I think, we can make the PCI bus type's runtime PM callback ignore
>     devices without drivers (i.e. return 0 for them).
> 
> (3) When local_pci_probe() starts, after we've resumed the parent,
>     the device will be in D0 (it may be D0-uninitialized, though).
>     If the user space's attribute is "on" at this point, the parent's
>     resume doesn't change anything.  If it is "auto", the parent's
>     resume may actually transition the device, although its status
>     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
>     with the current code, the driver's .probe() routine needs to see
>     the device RPM_ACTIVE and usage_count incremented, but we don't
>     want to run its PM callbacks _before_ .probe() runs.  For that
>     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
>     treating the device as though it had the power.no_callbacks flag set,
>     right before calling ddi->drv->probe().
> 
>     If the device has been RPM_ACTIVE at that point (i.e. user space has
>     had its attribute set to "on") it will just bump up usage_count (which
>     is what we want).  If the device has been RPM_SUSPENDED, it will
>     bump up usage_count _and_ change the status to RPM_ACTIVE without
>     executing any callbacks (the device is in D0 anyway, right?), which
>     is what we want too.
> 
> (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
>     as not to confuse the driver, which is now in control of the device.
> 
> (5) If ddi->drv->probe() fails, we need to restore the situation from
>     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
>     at the end of it to actually suspend the parent if user space has
>     its attribute (for the child!) set to "auto".
> 
>     Assume that the driver is not buggy and the failing ddi->drv->probe()
>     leaves the device in the same configuration as it's received it in.
>     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
>     For the parent's suspend to work, we need to transition it to
>     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
>     run at this point.  Moreover, we don't want the PCI bus type's
>     callbacks to run at this point, because dev->driver is still set.
>     So again, doing something like pm_runtime_put_skip_callbacks(),
>     treating the device as though it had power.no_callbacks set, seems
>     to be appropriate.
>    
>     Namely, if the user space's attribute is "on", it will just drop
>     usage_count by 1, which is what we want in that case.  If the user
>     space's attribute is "auto", on the other hand, it will drop
>     usage_count by 1 and change the status to RPM_SUSPENDED without
>     running callbacks, which again is what we want.
>     
> (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
>     so as to restore the situation from before its .probe() routine
>     was called.  It also should leave the device as RPM_ACTIVE, because
>     that's what it's got in .probe().  Then, after drv->remove exits,
>     (and also if drv was NULL to start with), we want to drop usage_count
>     by 1.  Moreover, if the user space's attribute is "on", we don't
>     want anything more to happen, _but_ if that's "auto", we want to
>     suspend the parent.
> 
>     Note that dev->driver is still not NULL at this point (although
>     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
>     It looks like, then, what we want to do here is
>     pm_runtime_put_skip_callbacks() again, because if the user space's
>     attribute is "on", it will just drop usage_count by 1, which is what
>     we want, 

For this situation, if user "echo auto > .../power/control" for the
device, the runtime PM callbacks of device will be called.  I think that
is not intended.  So I think it is better to use some kind of flag or
state for that.

Best Regards,
Huang Ying

>     and if that's "auto", it will additionally change the status
>     to RPM_SUSPENDED (without executing callbacks, which we want) _and_
>     it will queue up the parent's suspend (which, again, is what we want).



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  0:36                                                   ` Huang Ying
@ 2012-11-16  0:44                                                     ` Rafael J. Wysocki
  2012-11-16  0:48                                                       ` Huang Ying
  2012-11-16  0:55                                                       ` Rafael J. Wysocki
  0 siblings, 2 replies; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-16  0:44 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> > On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> > > On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > > > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > > > 
> > > > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > > > device in a special low-power state.  The device will always end up in 
> > > > > > > the generic low-power state supported by the PCI core.
> > > > > > 
> > > > > > Well, I'm not sure I'd like that.
> > > > > > 
> > > > > > Let's just go back even one step more and think what we'd like to have in
> > > > > > general terms and then how to implement it. :-)
> > > > > > 
> > > > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > > > addition to what it does currently).  The runtime PM status of each device is
> > > > > > RPM_SUSPENDED at this point.  Then:
> > > > > 
> > > > > Wait a moment.  When the device is detected and initialized, it is in
> > > > > D0, right?  Currently we don't care much because the device starts out
> > > > > disabled for runtime PM.  But now you are going to enable it.  While
> > > > > the device is enabled, its runtime status should match the physical
> > > > > power level.
> > > > 
> > > > OK
> > > 
> > > If my memory were correct, RPM_SUSPENDED just means device stop working,
> > > but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> > > devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> > > power state.
> > 
> > Yes, that's correct and I was wrong when I thought we could require the
> > status to be RPM_ACTIVE all the time when there's no driver, because that
> > would prevent parents from being suspended.  And we want them to be able to
> > suspend for driverless children, _unless_ user space has its attribute set
> > to "on" (i.e. the default).
> > 
> > So it looks like what we want to do is:
> > 
> > (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
> >     before, so that it is in agreement with the pm_runtime_forbid() we do in
> >     there.
> > 
> > (2) If user space switches its attribute to "off" later, but before any
> >     drivers are probed, we want the status to switch to RPM_SUSPENDED
> >     _without_ actually changing the devices power state.  For that,
> >     I think, we can make the PCI bus type's runtime PM callback ignore
> >     devices without drivers (i.e. return 0 for them).
> > 
> > (3) When local_pci_probe() starts, after we've resumed the parent,
> >     the device will be in D0 (it may be D0-uninitialized, though).
> >     If the user space's attribute is "on" at this point, the parent's
> >     resume doesn't change anything.  If it is "auto", the parent's
> >     resume may actually transition the device, although its status
> >     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
> >     with the current code, the driver's .probe() routine needs to see
> >     the device RPM_ACTIVE and usage_count incremented, but we don't
> >     want to run its PM callbacks _before_ .probe() runs.  For that
> >     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
> >     treating the device as though it had the power.no_callbacks flag set,
> >     right before calling ddi->drv->probe().
> > 
> >     If the device has been RPM_ACTIVE at that point (i.e. user space has
> >     had its attribute set to "on") it will just bump up usage_count (which
> >     is what we want).  If the device has been RPM_SUSPENDED, it will
> >     bump up usage_count _and_ change the status to RPM_ACTIVE without
> >     executing any callbacks (the device is in D0 anyway, right?), which
> >     is what we want too.
> > 
> > (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
> >     as not to confuse the driver, which is now in control of the device.
> > 
> > (5) If ddi->drv->probe() fails, we need to restore the situation from
> >     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
> >     at the end of it to actually suspend the parent if user space has
> >     its attribute (for the child!) set to "auto".
> > 
> >     Assume that the driver is not buggy and the failing ddi->drv->probe()
> >     leaves the device in the same configuration as it's received it in.
> >     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
> >     For the parent's suspend to work, we need to transition it to
> >     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
> >     run at this point.  Moreover, we don't want the PCI bus type's
> >     callbacks to run at this point, because dev->driver is still set.
> >     So again, doing something like pm_runtime_put_skip_callbacks(),
> >     treating the device as though it had power.no_callbacks set, seems
> >     to be appropriate.
> >    
> >     Namely, if the user space's attribute is "on", it will just drop
> >     usage_count by 1, which is what we want in that case.  If the user
> >     space's attribute is "auto", on the other hand, it will drop
> >     usage_count by 1 and change the status to RPM_SUSPENDED without
> >     running callbacks, which again is what we want.
> >     
> > (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
> >     so as to restore the situation from before its .probe() routine
> >     was called.  It also should leave the device as RPM_ACTIVE, because
> >     that's what it's got in .probe().  Then, after drv->remove exits,
> >     (and also if drv was NULL to start with), we want to drop usage_count
> >     by 1.  Moreover, if the user space's attribute is "on", we don't
> >     want anything more to happen, _but_ if that's "auto", we want to
> >     suspend the parent.
> > 
> >     Note that dev->driver is still not NULL at this point (although
> >     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
> >     It looks like, then, what we want to do here is
> >     pm_runtime_put_skip_callbacks() again, because if the user space's
> >     attribute is "on", it will just drop usage_count by 1, which is what
> >     we want, 
> 
> For this situation, if user "echo auto > .../power/control" for the
> device, the runtime PM callbacks of device will be called.  I think that
> is not intended.  So I think it is better to use some kind of flag or
> state for that.

I'm not sure what situation exactly you have in mind.  Care to give an
exact scenario?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  0:44                                                     ` Rafael J. Wysocki
@ 2012-11-16  0:48                                                       ` Huang Ying
  2012-11-16  0:55                                                       ` Rafael J. Wysocki
  1 sibling, 0 replies; 68+ messages in thread
From: Huang Ying @ 2012-11-16  0:48 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Fri, 2012-11-16 at 01:44 +0100, Rafael J. Wysocki wrote:
> On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> > > On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> > > > On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > > > > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > > > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > > > > 
> > > > > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > > > > device in a special low-power state.  The device will always end up in 
> > > > > > > > the generic low-power state supported by the PCI core.
> > > > > > > 
> > > > > > > Well, I'm not sure I'd like that.
> > > > > > > 
> > > > > > > Let's just go back even one step more and think what we'd like to have in
> > > > > > > general terms and then how to implement it. :-)
> > > > > > > 
> > > > > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > > > > addition to what it does currently).  The runtime PM status of each device is
> > > > > > > RPM_SUSPENDED at this point.  Then:
> > > > > > 
> > > > > > Wait a moment.  When the device is detected and initialized, it is in
> > > > > > D0, right?  Currently we don't care much because the device starts out
> > > > > > disabled for runtime PM.  But now you are going to enable it.  While
> > > > > > the device is enabled, its runtime status should match the physical
> > > > > > power level.
> > > > > 
> > > > > OK
> > > > 
> > > > If my memory were correct, RPM_SUSPENDED just means device stop working,
> > > > but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> > > > devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> > > > power state.
> > > 
> > > Yes, that's correct and I was wrong when I thought we could require the
> > > status to be RPM_ACTIVE all the time when there's no driver, because that
> > > would prevent parents from being suspended.  And we want them to be able to
> > > suspend for driverless children, _unless_ user space has its attribute set
> > > to "on" (i.e. the default).
> > > 
> > > So it looks like what we want to do is:
> > > 
> > > (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
> > >     before, so that it is in agreement with the pm_runtime_forbid() we do in
> > >     there.
> > > 
> > > (2) If user space switches its attribute to "off" later, but before any
> > >     drivers are probed, we want the status to switch to RPM_SUSPENDED
> > >     _without_ actually changing the devices power state.  For that,
> > >     I think, we can make the PCI bus type's runtime PM callback ignore
> > >     devices without drivers (i.e. return 0 for them).
> > > 
> > > (3) When local_pci_probe() starts, after we've resumed the parent,
> > >     the device will be in D0 (it may be D0-uninitialized, though).
> > >     If the user space's attribute is "on" at this point, the parent's
> > >     resume doesn't change anything.  If it is "auto", the parent's
> > >     resume may actually transition the device, although its status
> > >     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
> > >     with the current code, the driver's .probe() routine needs to see
> > >     the device RPM_ACTIVE and usage_count incremented, but we don't
> > >     want to run its PM callbacks _before_ .probe() runs.  For that
> > >     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
> > >     treating the device as though it had the power.no_callbacks flag set,
> > >     right before calling ddi->drv->probe().
> > > 
> > >     If the device has been RPM_ACTIVE at that point (i.e. user space has
> > >     had its attribute set to "on") it will just bump up usage_count (which
> > >     is what we want).  If the device has been RPM_SUSPENDED, it will
> > >     bump up usage_count _and_ change the status to RPM_ACTIVE without
> > >     executing any callbacks (the device is in D0 anyway, right?), which
> > >     is what we want too.
> > > 
> > > (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
> > >     as not to confuse the driver, which is now in control of the device.
> > > 
> > > (5) If ddi->drv->probe() fails, we need to restore the situation from
> > >     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
> > >     at the end of it to actually suspend the parent if user space has
> > >     its attribute (for the child!) set to "auto".
> > > 
> > >     Assume that the driver is not buggy and the failing ddi->drv->probe()
> > >     leaves the device in the same configuration as it's received it in.
> > >     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
> > >     For the parent's suspend to work, we need to transition it to
> > >     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
> > >     run at this point.  Moreover, we don't want the PCI bus type's
> > >     callbacks to run at this point, because dev->driver is still set.
> > >     So again, doing something like pm_runtime_put_skip_callbacks(),
> > >     treating the device as though it had power.no_callbacks set, seems
> > >     to be appropriate.
> > >    
> > >     Namely, if the user space's attribute is "on", it will just drop
> > >     usage_count by 1, which is what we want in that case.  If the user
> > >     space's attribute is "auto", on the other hand, it will drop
> > >     usage_count by 1 and change the status to RPM_SUSPENDED without
> > >     running callbacks, which again is what we want.
> > >     
> > > (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
> > >     so as to restore the situation from before its .probe() routine
> > >     was called.  It also should leave the device as RPM_ACTIVE, because
> > >     that's what it's got in .probe().  Then, after drv->remove exits,
> > >     (and also if drv was NULL to start with), we want to drop usage_count
> > >     by 1.  Moreover, if the user space's attribute is "on", we don't
> > >     want anything more to happen, _but_ if that's "auto", we want to
> > >     suspend the parent.
> > > 
> > >     Note that dev->driver is still not NULL at this point (although
> > >     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
> > >     It looks like, then, what we want to do here is
> > >     pm_runtime_put_skip_callbacks() again, because if the user space's
> > >     attribute is "on", it will just drop usage_count by 1, which is what
> > >     we want, 
> > 
> > For this situation, if user "echo auto > .../power/control" for the
> > device, the runtime PM callbacks of device will be called.  I think that
> > is not intended.  So I think it is better to use some kind of flag or
> > state for that.
> 
> I'm not sure what situation exactly you have in mind.  Care to give an
> exact scenario?

"control" is "on"

pcie_device_remove()
  pm_runtime_get_sync()
  drv->remove() /* usage count++ */
  pm_runtime_put()
  pm_runtime_put_skip_callbacks()
					pm_runtime_allow
					  rpm_idle	/* callback */

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  0:55                                                       ` Rafael J. Wysocki
@ 2012-11-16  0:54                                                         ` Huang Ying
  2012-11-16  1:29                                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-16  0:54 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Fri, 2012-11-16 at 01:55 +0100, Rafael J. Wysocki wrote:
> On Friday, November 16, 2012 01:44:00 AM Rafael J. Wysocki wrote:
> > On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> 
> [...]
> 
> > > 
> > > For this situation, if user "echo auto > .../power/control" for the
> > > device, the runtime PM callbacks of device will be called.  I think that
> > > is not intended.  So I think it is better to use some kind of flag or
> > > state for that.
> > 
> > I'm not sure what situation exactly you have in mind.  Care to give an
> > exact scenario?
> 
> Ah, I see.  When we've just called drv->remove(), there is a window in
> which user space may cause the driver's runtime PM callbacks to be
> executed by changing its attribute to "auto".
> 
> So perhaps we should check pci_dev->driver rather than pci_dev->dev.driver
> in the runtime PM callbacks?  With a few more changes that should allow us
> to close that race.

Yes.  And I think, with pci_dev->driver (after some changes suggested by
Alan), we need not to use pm_runtime_get/put_skip_callbacks().

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  0:44                                                     ` Rafael J. Wysocki
  2012-11-16  0:48                                                       ` Huang Ying
@ 2012-11-16  0:55                                                       ` Rafael J. Wysocki
  2012-11-16  0:54                                                         ` Huang Ying
  1 sibling, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-16  0:55 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Friday, November 16, 2012 01:44:00 AM Rafael J. Wysocki wrote:
> On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:

[...]

> > 
> > For this situation, if user "echo auto > .../power/control" for the
> > device, the runtime PM callbacks of device will be called.  I think that
> > is not intended.  So I think it is better to use some kind of flag or
> > state for that.
> 
> I'm not sure what situation exactly you have in mind.  Care to give an
> exact scenario?

Ah, I see.  When we've just called drv->remove(), there is a window in
which user space may cause the driver's runtime PM callbacks to be
executed by changing its attribute to "auto".

So perhaps we should check pci_dev->driver rather than pci_dev->dev.driver
in the runtime PM callbacks?  With a few more changes that should allow us
to close that race.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  1:29                                                           ` Rafael J. Wysocki
@ 2012-11-16  1:27                                                             ` Huang Ying
  2012-11-16 10:10                                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 68+ messages in thread
From: Huang Ying @ 2012-11-16  1:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Fri, 2012-11-16 at 02:29 +0100, Rafael J. Wysocki wrote:
> On Friday, November 16, 2012 08:54:56 AM Huang Ying wrote:
> > On Fri, 2012-11-16 at 01:55 +0100, Rafael J. Wysocki wrote:
> > > On Friday, November 16, 2012 01:44:00 AM Rafael J. Wysocki wrote:
> > > > On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > > > > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> > > 
> > > [...]
> > > 
> > > > > 
> > > > > For this situation, if user "echo auto > .../power/control" for the
> > > > > device, the runtime PM callbacks of device will be called.  I think that
> > > > > is not intended.  So I think it is better to use some kind of flag or
> > > > > state for that.
> > > > 
> > > > I'm not sure what situation exactly you have in mind.  Care to give an
> > > > exact scenario?
> > > 
> > > Ah, I see.  When we've just called drv->remove(), there is a window in
> > > which user space may cause the driver's runtime PM callbacks to be
> > > executed by changing its attribute to "auto".
> > > 
> > > So perhaps we should check pci_dev->driver rather than pci_dev->dev.driver
> > > in the runtime PM callbacks?  With a few more changes that should allow us
> > > to close that race.
> > 
> > Yes.  And I think, with pci_dev->driver (after some changes suggested by
> > Alan), we need not to use pm_runtime_get/put_skip_callbacks().
> 
> Good.  Can you please prepare a patch, then? :-)

Sure.

Best Regards,
Huang Ying




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  0:54                                                         ` Huang Ying
@ 2012-11-16  1:29                                                           ` Rafael J. Wysocki
  2012-11-16  1:27                                                             ` Huang Ying
  0 siblings, 1 reply; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-16  1:29 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Friday, November 16, 2012 08:54:56 AM Huang Ying wrote:
> On Fri, 2012-11-16 at 01:55 +0100, Rafael J. Wysocki wrote:
> > On Friday, November 16, 2012 01:44:00 AM Rafael J. Wysocki wrote:
> > > On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > > > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> > 
> > [...]
> > 
> > > > 
> > > > For this situation, if user "echo auto > .../power/control" for the
> > > > device, the runtime PM callbacks of device will be called.  I think that
> > > > is not intended.  So I think it is better to use some kind of flag or
> > > > state for that.
> > > 
> > > I'm not sure what situation exactly you have in mind.  Care to give an
> > > exact scenario?
> > 
> > Ah, I see.  When we've just called drv->remove(), there is a window in
> > which user space may cause the driver's runtime PM callbacks to be
> > executed by changing its attribute to "auto".
> > 
> > So perhaps we should check pci_dev->driver rather than pci_dev->dev.driver
> > in the runtime PM callbacks?  With a few more changes that should allow us
> > to close that race.
> 
> Yes.  And I think, with pci_dev->driver (after some changes suggested by
> Alan), we need not to use pm_runtime_get/put_skip_callbacks().

Good.  Can you please prepare a patch, then? :-)

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-15  9:51                                                 ` Rafael J. Wysocki
  2012-11-15 10:09                                                   ` Rafael J. Wysocki
  2012-11-16  0:36                                                   ` Huang Ying
@ 2012-11-16  3:11                                                   ` Huang Ying
  2 siblings, 0 replies; 68+ messages in thread
From: Huang Ying @ 2012-11-16  3:11 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> On Thursday, November 15, 2012 09:03:44 AM Huang Ying wrote:
> > On Thu, 2012-11-15 at 00:10 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, November 14, 2012 04:45:01 PM Alan Stern wrote:
> > > > On Wed, 14 Nov 2012, Rafael J. Wysocki wrote:
> > > > 
> > > > > > This has the side effect that when a driver unbinds, it can't leave the 
> > > > > > device in a special low-power state.  The device will always end up in 
> > > > > > the generic low-power state supported by the PCI core.
> > > > > 
> > > > > Well, I'm not sure I'd like that.
> > > > > 
> > > > > Let's just go back even one step more and think what we'd like to have in
> > > > > general terms and then how to implement it. :-)
> > > > > 
> > > > > Suppose that pci_pm_init() calls pm_runtime_enable() for all devices (in
> > > > > addition to what it does currently).  The runtime PM status of each device is
> > > > > RPM_SUSPENDED at this point.  Then:
> > > > 
> > > > Wait a moment.  When the device is detected and initialized, it is in
> > > > D0, right?  Currently we don't care much because the device starts out
> > > > disabled for runtime PM.  But now you are going to enable it.  While
> > > > the device is enabled, its runtime status should match the physical
> > > > power level.
> > > 
> > > OK
> > 
> > If my memory were correct, RPM_SUSPENDED just means device stop working,
> > but need not be put into low-power state.  So for RPM_ACTIVE, PCI
> > devices should be in D0, but for RPM_SUSPENDED, PCI devices can in any
> > power state.
> 
> Yes, that's correct and I was wrong when I thought we could require the
> status to be RPM_ACTIVE all the time when there's no driver, because that
> would prevent parents from being suspended.  And we want them to be able to
> suspend for driverless children, _unless_ user space has its attribute set
> to "on" (i.e. the default).
> 
> So it looks like what we want to do is:
> 
> (1) Enable runtime PM in pci_pm_init() and set the status to RPM_ACTIVE right
>     before, so that it is in agreement with the pm_runtime_forbid() we do in
>     there.
> 
> (2) If user space switches its attribute to "off" later, but before any
>     drivers are probed, we want the status to switch to RPM_SUSPENDED
>     _without_ actually changing the devices power state.  For that,
>     I think, we can make the PCI bus type's runtime PM callback ignore
>     devices without drivers (i.e. return 0 for them).
> 
> (3) When local_pci_probe() starts, after we've resumed the parent,
>     the device will be in D0 (it may be D0-uninitialized, though).

But the pci_dev->current_state may be PCI_UNKNOWN, although the real
state should be D0, because of commit:
2449e06a5696b7af1c8a369b04c97f3b139cf3bb.

Best Regards,
Huang Ying

>     If the user space's attribute is "on" at this point, the parent's
>     resume doesn't change anything.  If it is "auto", the parent's
>     resume may actually transition the device, although its status
>     will still be RPM_SUSPENDED.  For consistency _and_ compatibility
>     with the current code, the driver's .probe() routine needs to see
>     the device RPM_ACTIVE and usage_count incremented, but we don't
>     want to run its PM callbacks _before_ .probe() runs.  For that
>     to work, I think, we can do something like pm_runtime_get_skip_callbacks(),
>     treating the device as though it had the power.no_callbacks flag set,
>     right before calling ddi->drv->probe().
> 
>     If the device has been RPM_ACTIVE at that point (i.e. user space has
>     had its attribute set to "on") it will just bump up usage_count (which
>     is what we want).  If the device has been RPM_SUSPENDED, it will
>     bump up usage_count _and_ change the status to RPM_ACTIVE without
>     executing any callbacks (the device is in D0 anyway, right?), which
>     is what we want too.
> 
> (4) If ddi->drv->probe() succeeds, we don't want to change anything, so
>     as not to confuse the driver, which is now in control of the device.
> 
> (5) If ddi->drv->probe() fails, we need to restore the situation from
>     before calling local_pci_probe(), but we want the pm_runtime_put(parent)
>     at the end of it to actually suspend the parent if user space has
>     its attribute (for the child!) set to "auto".
> 
>     Assume that the driver is not buggy and the failing ddi->drv->probe()
>     leaves the device in the same configuration as it's received it in.
>     Then, the device is RPM_ACTIVE and in D0 (which may be uninitialized).
>     For the parent's suspend to work, we need to transition it to
>     RPM_SUSPENDED, but again we don't want the driver's PM callbacks to
>     run at this point.  Moreover, we don't want the PCI bus type's
>     callbacks to run at this point, because dev->driver is still set.
>     So again, doing something like pm_runtime_put_skip_callbacks(),
>     treating the device as though it had power.no_callbacks set, seems
>     to be appropriate.
>    
>     Namely, if the user space's attribute is "on", it will just drop
>     usage_count by 1, which is what we want in that case.  If the user
>     space's attribute is "auto", on the other hand, it will drop
>     usage_count by 1 and change the status to RPM_SUSPENDED without
>     running callbacks, which again is what we want.
>     
> (6) In drv->remove() the driver is supposed to bump up usage_count by 1,
>     so as to restore the situation from before its .probe() routine
>     was called.  It also should leave the device as RPM_ACTIVE, because
>     that's what it's got in .probe().  Then, after drv->remove exits,
>     (and also if drv was NULL to start with), we want to drop usage_count
>     by 1.  Moreover, if the user space's attribute is "on", we don't
>     want anything more to happen, _but_ if that's "auto", we want to
>     suspend the parent.
> 
>     Note that dev->driver is still not NULL at this point (although
>     pci_dev->driver is!) so again we can't run the PCI bus type's callbacks.
>     It looks like, then, what we want to do here is
>     pm_runtime_put_skip_callbacks() again, because if the user space's
>     attribute is "on", it will just drop usage_count by 1, which is what
>     we want, and if that's "auto", it will additionally change the status
>     to RPM_SUSPENDED (without executing callbacks, which we want) _and_
>     it will queue up the parent's suspend (which, again, is what we want).
> 
> Did I miss anything?
> 
> Rafael
> 
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [BUGFIX] PM: Fix active child counting when disabled and forbidden
  2012-11-16  1:27                                                             ` Huang Ying
@ 2012-11-16 10:10                                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 68+ messages in thread
From: Rafael J. Wysocki @ 2012-11-16 10:10 UTC (permalink / raw)
  To: Huang Ying; +Cc: Alan Stern, linux-kernel, linux-pm

On Friday, November 16, 2012 09:27:05 AM Huang Ying wrote:
> On Fri, 2012-11-16 at 02:29 +0100, Rafael J. Wysocki wrote:
> > On Friday, November 16, 2012 08:54:56 AM Huang Ying wrote:
> > > On Fri, 2012-11-16 at 01:55 +0100, Rafael J. Wysocki wrote:
> > > > On Friday, November 16, 2012 01:44:00 AM Rafael J. Wysocki wrote:
> > > > > On Friday, November 16, 2012 08:36:14 AM Huang Ying wrote:
> > > > > > On Thu, 2012-11-15 at 10:51 +0100, Rafael J. Wysocki wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > > 
> > > > > > For this situation, if user "echo auto > .../power/control" for the
> > > > > > device, the runtime PM callbacks of device will be called.  I think that
> > > > > > is not intended.  So I think it is better to use some kind of flag or
> > > > > > state for that.
> > > > > 
> > > > > I'm not sure what situation exactly you have in mind.  Care to give an
> > > > > exact scenario?
> > > > 
> > > > Ah, I see.  When we've just called drv->remove(), there is a window in
> > > > which user space may cause the driver's runtime PM callbacks to be
> > > > executed by changing its attribute to "auto".
> > > > 
> > > > So perhaps we should check pci_dev->driver rather than pci_dev->dev.driver
> > > > in the runtime PM callbacks?  With a few more changes that should allow us
> > > > to close that race.
> > > 
> > > Yes.  And I think, with pci_dev->driver (after some changes suggested by
> > > Alan), we need not to use pm_runtime_get/put_skip_callbacks().
> > 
> > Good.  Can you please prepare a patch, then? :-)
> 
> Sure.

Cool, thanks!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2012-11-16 10:06 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-05  1:17 [BUGFIX] PM: Fix active child counting when disabled and forbidden Huang Ying
2012-11-05  1:56 ` Alan Stern
2012-11-05  1:56   ` Alan Stern
2012-11-06  0:43   ` Huang Ying
2012-11-06 15:17     ` Alan Stern
2012-11-06 15:17       ` Alan Stern
2012-11-07  0:26       ` Huang Ying
2012-11-07 15:49         ` Alan Stern
2012-11-07 15:49           ` Alan Stern
2012-11-07 16:09           ` Rafael J. Wysocki
2012-11-07 17:17             ` Alan Stern
2012-11-07 17:17               ` Alan Stern
2012-11-07 20:21               ` Rafael J. Wysocki
2012-11-07 20:47                 ` Alan Stern
2012-11-07 20:47                   ` Alan Stern
2012-11-07 21:44                   ` Rafael J. Wysocki
2012-11-07 21:56                     ` Alan Stern
2012-11-07 21:56                       ` Alan Stern
2012-11-07 22:51                       ` Rafael J. Wysocki
2012-11-07 23:09                         ` Rafael J. Wysocki
2012-11-08  1:15                           ` Huang Ying
2012-11-08  1:35                             ` Rafael J. Wysocki
2012-11-08  2:04                               ` Huang Ying
2012-11-08  9:56                                 ` Rafael J. Wysocki
2012-11-08 17:07                                   ` Alan Stern
2012-11-08 17:07                                     ` Alan Stern
2012-11-09  2:36                                     ` Huang Ying
2012-11-09 16:41                                       ` Alan Stern
2012-11-09 16:41                                         ` Alan Stern
2012-11-12  0:37                                         ` Huang Ying
2012-11-12  2:36                                           ` Alan Stern
2012-11-12  2:36                                             ` Alan Stern
2012-11-12  5:55                                             ` Huang Ying
2012-11-12 16:32                                               ` Alan Stern
2012-11-12 16:32                                                 ` Alan Stern
2012-11-13  1:19                                                 ` Huang Ying
2012-11-13  2:32                                                   ` Alan Stern
2012-11-13  2:32                                                     ` Alan Stern
2012-11-13  5:12                                                     ` Huang Ying
2012-11-13 16:10                                                       ` Alan Stern
2012-11-13 16:10                                                         ` Alan Stern
2012-11-14  1:08                                                         ` Huang Ying
2012-11-14  9:52                                                           ` Rafael J. Wysocki
2012-11-14 13:35                                                             ` Huang Ying
2012-11-14 16:06                                                               ` Alan Stern
2012-11-14 16:06                                                                 ` Alan Stern
2012-11-13 23:43                                                 ` Rafael J. Wysocki
2012-11-14 10:05                                     ` Rafael J. Wysocki
2012-11-14 16:42                                       ` Alan Stern
2012-11-14 16:42                                         ` Alan Stern
2012-11-14 19:42                                         ` Rafael J. Wysocki
2012-11-14 21:45                                           ` Alan Stern
2012-11-14 21:45                                             ` Alan Stern
2012-11-14 23:10                                             ` Rafael J. Wysocki
2012-11-15  1:03                                               ` Huang Ying
2012-11-15  9:51                                                 ` Rafael J. Wysocki
2012-11-15 10:09                                                   ` Rafael J. Wysocki
2012-11-15 15:27                                                     ` Alan Stern
2012-11-15 15:27                                                       ` Alan Stern
2012-11-16  0:36                                                   ` Huang Ying
2012-11-16  0:44                                                     ` Rafael J. Wysocki
2012-11-16  0:48                                                       ` Huang Ying
2012-11-16  0:55                                                       ` Rafael J. Wysocki
2012-11-16  0:54                                                         ` Huang Ying
2012-11-16  1:29                                                           ` Rafael J. Wysocki
2012-11-16  1:27                                                             ` Huang Ying
2012-11-16 10:10                                                               ` Rafael J. Wysocki
2012-11-16  3:11                                                   ` Huang Ying

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.