All of lore.kernel.org
 help / color / mirror / Atom feed
* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 22:48 ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 22:48 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J. Wysocki
  Cc: Kevin Hilman, linux-pm, linux-omap, linux-arm-kernel

Hi,

Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind") broke PM on at least
omap3. It seems we now need to additionally also call
pm_runtime_dont_use_autosuspend() to get things working again?

The following fixes idling on omap3, but I'm wondering if we
should do something in pm_runtime_reinit() instead?

Regards,

Tony

8< ---------------------
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -2232,6 +2232,7 @@ err_irq:
 		dma_release_channel(host->tx_chan);
 	if (host->rx_chan)
 		dma_release_channel(host->rx_chan);
+	pm_runtime_dont_use_autosuspend(host->dev);
 	pm_runtime_put_sync(host->dev);
 	pm_runtime_disable(host->dev);
 	if (host->dbclk)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 22:48 ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 22:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind") broke PM on at least
omap3. It seems we now need to additionally also call
pm_runtime_dont_use_autosuspend() to get things working again?

The following fixes idling on omap3, but I'm wondering if we
should do something in pm_runtime_reinit() instead?

Regards,

Tony

8< ---------------------
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -2232,6 +2232,7 @@ err_irq:
 		dma_release_channel(host->tx_chan);
 	if (host->rx_chan)
 		dma_release_channel(host->rx_chan);
+	pm_runtime_dont_use_autosuspend(host->dev);
 	pm_runtime_put_sync(host->dev);
 	pm_runtime_disable(host->dev);
 	if (host->dbclk)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 22:48 ` Tony Lindgren
@ 2016-01-26 22:50   ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 22:50 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J. Wysocki
  Cc: Kevin Hilman, linux-pm, linux-omap, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160126 14:49]:
> Hi,
> 
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
> 
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Oh one more thing, this happens as we get -EPROBE_DEFER with
the regulators initially.

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>  		dma_release_channel(host->tx_chan);
>  	if (host->rx_chan)
>  		dma_release_channel(host->rx_chan);
> +	pm_runtime_dont_use_autosuspend(host->dev);
>  	pm_runtime_put_sync(host->dev);
>  	pm_runtime_disable(host->dev);
>  	if (host->dbclk)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 22:50   ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 22:50 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160126 14:49]:
> Hi,
> 
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
> 
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Oh one more thing, this happens as we get -EPROBE_DEFER with
the regulators initially.

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>  		dma_release_channel(host->tx_chan);
>  	if (host->rx_chan)
>  		dma_release_channel(host->rx_chan);
> +	pm_runtime_dont_use_autosuspend(host->dev);
>  	pm_runtime_put_sync(host->dev);
>  	pm_runtime_disable(host->dev);
>  	if (host->dbclk)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 22:48 ` Tony Lindgren
@ 2016-01-26 23:14   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-26 23:14 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
>
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Well, does it actually help if you add
pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>                 dma_release_channel(host->tx_chan);
>         if (host->rx_chan)
>                 dma_release_channel(host->rx_chan);
> +       pm_runtime_dont_use_autosuspend(host->dev);
>         pm_runtime_put_sync(host->dev);
>         pm_runtime_disable(host->dev);
>         if (host->dbclk)
> --


Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 23:14   ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-26 23:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind") broke PM on at least
> omap3. It seems we now need to additionally also call
> pm_runtime_dont_use_autosuspend() to get things working again?
>
> The following fixes idling on omap3, but I'm wondering if we
> should do something in pm_runtime_reinit() instead?

Well, does it actually help if you add
pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

> 8< ---------------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>                 dma_release_channel(host->tx_chan);
>         if (host->rx_chan)
>                 dma_release_channel(host->rx_chan);
> +       pm_runtime_dont_use_autosuspend(host->dev);
>         pm_runtime_put_sync(host->dev);
>         pm_runtime_disable(host->dev);
>         if (host->dbclk)
> --


Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 23:14   ` Rafael J. Wysocki
@ 2016-01-26 23:22     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 23:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > Hi,
> >
> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > PM states at probe error and driver unbind") broke PM on at least
> > omap3. It seems we now need to additionally also call
> > pm_runtime_dont_use_autosuspend() to get things working again?
> >
> > The following fixes idling on omap3, but I'm wondering if we
> > should do something in pm_runtime_reinit() instead?
> 
> Well, does it actually help if you add
> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

No adding it to pm_runtime_reinit() does not help.

Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
gives any clues.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 23:22     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 23:22 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > Hi,
> >
> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > PM states at probe error and driver unbind") broke PM on at least
> > omap3. It seems we now need to additionally also call
> > pm_runtime_dont_use_autosuspend() to get things working again?
> >
> > The following fixes idling on omap3, but I'm wondering if we
> > should do something in pm_runtime_reinit() instead?
> 
> Well, does it actually help if you add
> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?

No adding it to pm_runtime_reinit() does not help.

Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
gives any clues.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 23:22     ` Tony Lindgren
@ 2016-01-26 23:37       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-26 23:37 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > Hi,
>> >
>> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > PM states at probe error and driver unbind") broke PM on at least
>> > omap3. It seems we now need to additionally also call
>> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >
>> > The following fixes idling on omap3, but I'm wondering if we
>> > should do something in pm_runtime_reinit() instead?
>>
>> Well, does it actually help if you add
>> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>
> No adding it to pm_runtime_reinit() does not help.

Yes, I realized that it wouldn't help only after sending the previous
message, sorry about that.

The reason why it helps in the driver code seems to be that
autosuspend_delay happens to be negative, so update_autosuspend()
decrements the usage counter that would have stayed incremented
otherwise.  Or at least that's the only way it can help I see ATM.

> Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> gives any clues.

It looks like pm_runtime_reinit() should clear the usage counter too.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 23:37       ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-26 23:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > Hi,
>> >
>> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > PM states at probe error and driver unbind") broke PM on at least
>> > omap3. It seems we now need to additionally also call
>> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >
>> > The following fixes idling on omap3, but I'm wondering if we
>> > should do something in pm_runtime_reinit() instead?
>>
>> Well, does it actually help if you add
>> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>
> No adding it to pm_runtime_reinit() does not help.

Yes, I realized that it wouldn't help only after sending the previous
message, sorry about that.

The reason why it helps in the driver code seems to be that
autosuspend_delay happens to be negative, so update_autosuspend()
decrements the usage counter that would have stayed incremented
otherwise.  Or at least that's the only way it can help I see ATM.

> Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> gives any clues.

It looks like pm_runtime_reinit() should clear the usage counter too.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 23:37       ` Rafael J. Wysocki
@ 2016-01-26 23:52         ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 23:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > Hi,
> >> >
> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > PM states at probe error and driver unbind") broke PM on at least
> >> > omap3. It seems we now need to additionally also call
> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >
> >> > The following fixes idling on omap3, but I'm wondering if we
> >> > should do something in pm_runtime_reinit() instead?
> >>
> >> Well, does it actually help if you add
> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >
> > No adding it to pm_runtime_reinit() does not help.
> 
> Yes, I realized that it wouldn't help only after sending the previous
> message, sorry about that.
>
> The reason why it helps in the driver code seems to be that
> autosuspend_delay happens to be negative, so update_autosuspend()
> decrements the usage counter that would have stayed incremented
> otherwise.  Or at least that's the only way it can help I see ATM.

Oh OK.

> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > gives any clues.
> 
> It looks like pm_runtime_reinit() should clear the usage counter too.

Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
safe to pretty much reset everything?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-26 23:52         ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-26 23:52 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > Hi,
> >> >
> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > PM states at probe error and driver unbind") broke PM on at least
> >> > omap3. It seems we now need to additionally also call
> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >
> >> > The following fixes idling on omap3, but I'm wondering if we
> >> > should do something in pm_runtime_reinit() instead?
> >>
> >> Well, does it actually help if you add
> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >
> > No adding it to pm_runtime_reinit() does not help.
> 
> Yes, I realized that it wouldn't help only after sending the previous
> message, sorry about that.
>
> The reason why it helps in the driver code seems to be that
> autosuspend_delay happens to be negative, so update_autosuspend()
> decrements the usage counter that would have stayed incremented
> otherwise.  Or at least that's the only way it can help I see ATM.

Oh OK.

> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > gives any clues.
> 
> It looks like pm_runtime_reinit() should clear the usage counter too.

Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
safe to pretty much reset everything?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 23:52         ` Tony Lindgren
@ 2016-01-27  7:54           ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-27  7:54 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > >> > Hi,
> > >> >
> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > >> > PM states at probe error and driver unbind") broke PM on at least
> > >> > omap3. It seems we now need to additionally also call
> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> > >> >
> > >> > The following fixes idling on omap3, but I'm wondering if we
> > >> > should do something in pm_runtime_reinit() instead?
> > >>
> > >> Well, does it actually help if you add
> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> > >
> > > No adding it to pm_runtime_reinit() does not help.
> > 
> > Yes, I realized that it wouldn't help only after sending the previous
> > message, sorry about that.
> >
> > The reason why it helps in the driver code seems to be that
> > autosuspend_delay happens to be negative, so update_autosuspend()
> > decrements the usage counter that would have stayed incremented
> > otherwise.  Or at least that's the only way it can help I see ATM.
> 
> Oh OK.
> 
> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > > gives any clues.
> > 
> > It looks like pm_runtime_reinit() should clear the usage counter too.
> 
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?

Not only safe, but also a good idea apparently. :-)

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-27  7:54           ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-27  7:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> > >> > Hi,
> > >> >
> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> > >> > PM states at probe error and driver unbind") broke PM on at least
> > >> > omap3. It seems we now need to additionally also call
> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> > >> >
> > >> > The following fixes idling on omap3, but I'm wondering if we
> > >> > should do something in pm_runtime_reinit() instead?
> > >>
> > >> Well, does it actually help if you add
> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> > >
> > > No adding it to pm_runtime_reinit() does not help.
> > 
> > Yes, I realized that it wouldn't help only after sending the previous
> > message, sorry about that.
> >
> > The reason why it helps in the driver code seems to be that
> > autosuspend_delay happens to be negative, so update_autosuspend()
> > decrements the usage counter that would have stayed incremented
> > otherwise.  Or at least that's the only way it can help I see ATM.
> 
> Oh OK.
> 
> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> > > gives any clues.
> > 
> > It looks like pm_runtime_reinit() should clear the usage counter too.
> 
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?

Not only safe, but also a good idea apparently. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-27  7:54           ` Rafael J. Wysocki
@ 2016-01-27  8:17             ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-01-27  8:17 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
>> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > >> > Hi,
>> > >> >
>> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > >> > PM states at probe error and driver unbind") broke PM on at least
>> > >> > omap3. It seems we now need to additionally also call
>> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> > >> >
>> > >> > The following fixes idling on omap3, but I'm wondering if we
>> > >> > should do something in pm_runtime_reinit() instead?
>> > >>
>> > >> Well, does it actually help if you add
>> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> > >
>> > > No adding it to pm_runtime_reinit() does not help.
>> >
>> > Yes, I realized that it wouldn't help only after sending the previous
>> > message, sorry about that.
>> >
>> > The reason why it helps in the driver code seems to be that
>> > autosuspend_delay happens to be negative, so update_autosuspend()
>> > decrements the usage counter that would have stayed incremented
>> > otherwise.  Or at least that's the only way it can help I see ATM.
>>
>> Oh OK.
>>
>> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > > gives any clues.
>> >
>> > It looks like pm_runtime_reinit() should clear the usage counter too.
>>
>> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
>> safe to pretty much reset everything?
>
> Not only safe, but also a good idea apparently. :-)

I happy to send a patch, extending pm_runtime_reinit() with some more
data to be reset.

Or, you or Tony intend to send a patch?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-27  8:17             ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-01-27  8:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
>> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > >> > Hi,
>> > >> >
>> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> > >> > PM states at probe error and driver unbind") broke PM on at least
>> > >> > omap3. It seems we now need to additionally also call
>> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> > >> >
>> > >> > The following fixes idling on omap3, but I'm wondering if we
>> > >> > should do something in pm_runtime_reinit() instead?
>> > >>
>> > >> Well, does it actually help if you add
>> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> > >
>> > > No adding it to pm_runtime_reinit() does not help.
>> >
>> > Yes, I realized that it wouldn't help only after sending the previous
>> > message, sorry about that.
>> >
>> > The reason why it helps in the driver code seems to be that
>> > autosuspend_delay happens to be negative, so update_autosuspend()
>> > decrements the usage counter that would have stayed incremented
>> > otherwise.  Or at least that's the only way it can help I see ATM.
>>
>> Oh OK.
>>
>> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > > gives any clues.
>> >
>> > It looks like pm_runtime_reinit() should clear the usage counter too.
>>
>> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
>> safe to pretty much reset everything?
>
> Not only safe, but also a good idea apparently. :-)

I happy to send a patch, extending pm_runtime_reinit() with some more
data to be reset.

Or, you or Tony intend to send a patch?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-27  8:17             ` Ulf Hansson
@ 2016-01-27 15:19               ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-27 15:19 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-pm, Kevin Hilman, Rafael J. Wysocki, Rafael J. Wysocki,
	Rafael J. Wysocki, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160127 00:18]:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.
> 
> Or, you or Tony intend to send a patch?

Please do, I'll be glad to test it!

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-27 15:19               ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-27 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160127 00:18]:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.
> 
> Or, you or Tony intend to send a patch?

Please do, I'll be glad to test it!

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-27  8:17             ` Ulf Hansson
@ 2016-01-27 22:51               ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-27 22:51 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wednesday, January 27, 2016 09:17:45 AM Ulf Hansson wrote:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > >> > Hi,
> >> > >> >
> >> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > >> > PM states at probe error and driver unbind") broke PM on at least
> >> > >> > omap3. It seems we now need to additionally also call
> >> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> > >> >
> >> > >> > The following fixes idling on omap3, but I'm wondering if we
> >> > >> > should do something in pm_runtime_reinit() instead?
> >> > >>
> >> > >> Well, does it actually help if you add
> >> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >> > >
> >> > > No adding it to pm_runtime_reinit() does not help.
> >> >
> >> > Yes, I realized that it wouldn't help only after sending the previous
> >> > message, sorry about that.
> >> >
> >> > The reason why it helps in the driver code seems to be that
> >> > autosuspend_delay happens to be negative, so update_autosuspend()
> >> > decrements the usage counter that would have stayed incremented
> >> > otherwise.  Or at least that's the only way it can help I see ATM.
> >>
> >> Oh OK.
> >>
> >> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > > gives any clues.
> >> >
> >> > It looks like pm_runtime_reinit() should clear the usage counter too.
> >>
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.

Please do.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-27 22:51               ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-01-27 22:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday, January 27, 2016 09:17:45 AM Ulf Hansson wrote:
> On 27 January 2016 at 08:54, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, January 26, 2016 03:52:23 PM Tony Lindgren wrote:
> >> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> > On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> > >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > >> > Hi,
> >> > >> >
> >> > >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> > >> > PM states at probe error and driver unbind") broke PM on at least
> >> > >> > omap3. It seems we now need to additionally also call
> >> > >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> > >> >
> >> > >> > The following fixes idling on omap3, but I'm wondering if we
> >> > >> > should do something in pm_runtime_reinit() instead?
> >> > >>
> >> > >> Well, does it actually help if you add
> >> > >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
> >> > >
> >> > > No adding it to pm_runtime_reinit() does not help.
> >> >
> >> > Yes, I realized that it wouldn't help only after sending the previous
> >> > message, sorry about that.
> >> >
> >> > The reason why it helps in the driver code seems to be that
> >> > autosuspend_delay happens to be negative, so update_autosuspend()
> >> > decrements the usage counter that would have stayed incremented
> >> > otherwise.  Or at least that's the only way it can help I see ATM.
> >>
> >> Oh OK.
> >>
> >> > > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > > gives any clues.
> >> >
> >> > It looks like pm_runtime_reinit() should clear the usage counter too.
> >>
> >> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> >> safe to pretty much reset everything?
> >
> > Not only safe, but also a good idea apparently. :-)
> 
> I happy to send a patch, extending pm_runtime_reinit() with some more
> data to be reset.

Please do.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-26 23:52         ` Tony Lindgren
@ 2016-01-28 14:29           ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-01-28 14:29 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> > PM states at probe error and driver unbind") broke PM on at least

I need to understand the issue here in a bit more detail, could you
please try to fill out some of my gaps!?

In *what way* did it break PM?

Did the driver not probe successfully the second try? If so, what happened.

>> >> > omap3. It seems we now need to additionally also call
>> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >
>> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> > should do something in pm_runtime_reinit() instead?

I understand this as the second (or third, forth, whatever) probing
attempt actually succeeds, right!?

Is the issue you are seeing, that the device never becomes runtime
suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind")?

>> >>
>> >> Well, does it actually help if you add
>> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> >
>> > No adding it to pm_runtime_reinit() does not help.
>>
>> Yes, I realized that it wouldn't help only after sending the previous
>> message, sorry about that.
>>
>> The reason why it helps in the driver code seems to be that
>> autosuspend_delay happens to be negative, so update_autosuspend()
>> decrements the usage counter that would have stayed incremented
>> otherwise.  Or at least that's the only way it can help I see ATM.
>
> Oh OK.
>
>> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > gives any clues.

That's a good clue. Although, there could be several reasons to why.

Rafael has pointed out one valid potential case, but I want to be sure
that's really what happening here.

*If* the problem is that the device doesn't becomes runtime suspended,
that will anyway be prevented as long as the autosuspend_delay has
been set to a negative value. That's why I wonder whether this really
is the case here.

For omap3, I assume there's a PM domain (the so called hwmod) being
attached to the omap_hsmmc device at device registration point!?
In that path depending on a specific hwmod flag, the device will be
given the an initial *active* runtime PM status, via invoking
pm_runtime_set_active().

*If* that's the case, it affects the probing sequence, as the
pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
to be invoked in the first probe attempt.

Moreover, according to the data I received in this regression report
so far, it seems like probing the second time has *always* been done
with the device in runtime PM active state. Could that be the reason
to why it "happens" to work?

>>
>> It looks like pm_runtime_reinit() should clear the usage counter too.
>
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?
>
> Regards,
>
> Tony

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-28 14:29           ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-01-28 14:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> > PM states at probe error and driver unbind") broke PM on at least

I need to understand the issue here in a bit more detail, could you
please try to fill out some of my gaps!?

In *what way* did it break PM?

Did the driver not probe successfully the second try? If so, what happened.

>> >> > omap3. It seems we now need to additionally also call
>> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >
>> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> > should do something in pm_runtime_reinit() instead?

I understand this as the second (or third, forth, whatever) probing
attempt actually succeeds, right!?

Is the issue you are seeing, that the device never becomes runtime
suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind")?

>> >>
>> >> Well, does it actually help if you add
>> >> pm_runtime_dont_use_autosuspend(dev) to pm_runtime_reinit()?
>> >
>> > No adding it to pm_runtime_reinit() does not help.
>>
>> Yes, I realized that it wouldn't help only after sending the previous
>> message, sorry about that.
>>
>> The reason why it helps in the driver code seems to be that
>> autosuspend_delay happens to be negative, so update_autosuspend()
>> decrements the usage counter that would have stayed incremented
>> otherwise.  Or at least that's the only way it can help I see ATM.
>
> Oh OK.
>
>> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> > gives any clues.

That's a good clue. Although, there could be several reasons to why.

Rafael has pointed out one valid potential case, but I want to be sure
that's really what happening here.

*If* the problem is that the device doesn't becomes runtime suspended,
that will anyway be prevented as long as the autosuspend_delay has
been set to a negative value. That's why I wonder whether this really
is the case here.

For omap3, I assume there's a PM domain (the so called hwmod) being
attached to the omap_hsmmc device at device registration point!?
In that path depending on a specific hwmod flag, the device will be
given the an initial *active* runtime PM status, via invoking
pm_runtime_set_active().

*If* that's the case, it affects the probing sequence, as the
pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
to be invoked in the first probe attempt.

Moreover, according to the data I received in this regression report
so far, it seems like probing the second time has *always* been done
with the device in runtime PM active state. Could that be the reason
to why it "happens" to work?

>>
>> It looks like pm_runtime_reinit() should clear the usage counter too.
>
> Yeah if we do this when !pm_runtime_enabled(dev) it seems it's
> safe to pretty much reset everything?
>
> Regards,
>
> Tony

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-28 14:29           ` Ulf Hansson
@ 2016-01-28 16:58             ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-28 16:58 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> >> > PM states at probe error and driver unbind") broke PM on at least
> 
> I need to understand the issue here in a bit more detail, could you
> please try to fill out some of my gaps!?
> 
> In *what way* did it break PM?

The MMC hardware will not get idled properly any longer blocking any
deeper idle states.

> Did the driver not probe successfully the second try? If so, what happened.

It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
But the PM runtime usecounts are wrong.

> >> >> > omap3. It seems we now need to additionally also call
> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >> >
> >> >> > The following fixes idling on omap3, but I'm wondering if we
> >> >> > should do something in pm_runtime_reinit() instead?
> 
> I understand this as the second (or third, forth, whatever) probing
> attempt actually succeeds, right!?

Right.

> Is the issue you are seeing, that the device never becomes runtime
> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind")?

Correct.

> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > gives any clues.
> 
> That's a good clue. Although, there could be several reasons to why.
> 
> Rafael has pointed out one valid potential case, but I want to be sure
> that's really what happening here.
> 
> *If* the problem is that the device doesn't becomes runtime suspended,
> that will anyway be prevented as long as the autosuspend_delay has
> been set to a negative value. That's why I wonder whether this really
> is the case here.

That seems to be the case here. In device driver probe, commenting out
pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
also makes things work again.

So one of the failing cases seems to be:

1. Device driver probe sets pm_runtime_set_autosuspend_delay()

2. Device driver probe initially fails with -EPROBE_DEFER

3. PM runtime usecounts get messed up

> For omap3, I assume there's a PM domain (the so called hwmod) being
> attached to the omap_hsmmc device at device registration point!?

Correct. It uses the notifiers like bus code does in general.

FYI, most SoCs don't have hardware based autoidle signaling between
the interconnect and the interconnect targets. So the hwmod code is
still needed until we have converted it into a proper interconnect +
module target drivers.

> In that path depending on a specific hwmod flag, the device will be
> given the an initial *active* runtime PM status, via invoking
> pm_runtime_set_active().
>
> *If* that's the case, it affects the probing sequence, as the
> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
> to be invoked in the first probe attempt.

It has worked since pm runtime. And it works with MMC as as a loadable
module just fine when no -EPROBE_DEFER happens.

> Moreover, according to the data I received in this regression report
> so far, it seems like probing the second time has *always* been done
> with the device in runtime PM active state. Could that be the reason
> to why it "happens" to work?

Not correct. I think that speculation is not related to the $subject
regression at all.

BTW, do you have some hardware to test with that has PM runtime
implemnted and actually working with mainline kernel?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-01-28 16:58             ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-01-28 16:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> >> >> > PM states at probe error and driver unbind") broke PM on at least
> 
> I need to understand the issue here in a bit more detail, could you
> please try to fill out some of my gaps!?
> 
> In *what way* did it break PM?

The MMC hardware will not get idled properly any longer blocking any
deeper idle states.

> Did the driver not probe successfully the second try? If so, what happened.

It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
But the PM runtime usecounts are wrong.

> >> >> > omap3. It seems we now need to additionally also call
> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
> >> >> >
> >> >> > The following fixes idling on omap3, but I'm wondering if we
> >> >> > should do something in pm_runtime_reinit() instead?
> 
> I understand this as the second (or third, forth, whatever) probing
> attempt actually succeeds, right!?

Right.

> Is the issue you are seeing, that the device never becomes runtime
> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind")?

Correct.

> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
> >> > gives any clues.
> 
> That's a good clue. Although, there could be several reasons to why.
> 
> Rafael has pointed out one valid potential case, but I want to be sure
> that's really what happening here.
> 
> *If* the problem is that the device doesn't becomes runtime suspended,
> that will anyway be prevented as long as the autosuspend_delay has
> been set to a negative value. That's why I wonder whether this really
> is the case here.

That seems to be the case here. In device driver probe, commenting out
pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
also makes things work again.

So one of the failing cases seems to be:

1. Device driver probe sets pm_runtime_set_autosuspend_delay()

2. Device driver probe initially fails with -EPROBE_DEFER

3. PM runtime usecounts get messed up

> For omap3, I assume there's a PM domain (the so called hwmod) being
> attached to the omap_hsmmc device at device registration point!?

Correct. It uses the notifiers like bus code does in general.

FYI, most SoCs don't have hardware based autoidle signaling between
the interconnect and the interconnect targets. So the hwmod code is
still needed until we have converted it into a proper interconnect +
module target drivers.

> In that path depending on a specific hwmod flag, the device will be
> given the an initial *active* runtime PM status, via invoking
> pm_runtime_set_active().
>
> *If* that's the case, it affects the probing sequence, as the
> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
> to be invoked in the first probe attempt.

It has worked since pm runtime. And it works with MMC as as a loadable
module just fine when no -EPROBE_DEFER happens.

> Moreover, according to the data I received in this regression report
> so far, it seems like probing the second time has *always* been done
> with the device in runtime PM active state. Could that be the reason
> to why it "happens" to work?

Not correct. I think that speculation is not related to the $subject
regression at all.

BTW, do you have some hardware to test with that has PM runtime
implemnted and actually working with mainline kernel?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-01-28 16:58             ` Tony Lindgren
@ 2016-02-01 16:44               ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-01 16:44 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
>> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> >> > PM states at probe error and driver unbind") broke PM on at least
>>
>> I need to understand the issue here in a bit more detail, could you
>> please try to fill out some of my gaps!?
>>
>> In *what way* did it break PM?
>
> The MMC hardware will not get idled properly any longer blocking any
> deeper idle states.
>
>> Did the driver not probe successfully the second try? If so, what happened.
>
> It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> But the PM runtime usecounts are wrong.

Okay. How did you verify this?

I think the easiest way would be to make sure the usage count is
*one*, just before the omap_hsmmc calls mmc_add_host() in its
->probe() function.

If it's *two*, that confirms this theory.

>
>> >> >> > omap3. It seems we now need to additionally also call
>> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >> >
>> >> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> >> > should do something in pm_runtime_reinit() instead?
>>
>> I understand this as the second (or third, forth, whatever) probing
>> attempt actually succeeds, right!?
>
> Right.
>
>> Is the issue you are seeing, that the device never becomes runtime
>> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> PM states at probe error and driver unbind")?
>
> Correct.
>
>> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> >> > gives any clues.
>>
>> That's a good clue. Although, there could be several reasons to why.
>>
>> Rafael has pointed out one valid potential case, but I want to be sure
>> that's really what happening here.
>>
>> *If* the problem is that the device doesn't becomes runtime suspended,
>> that will anyway be prevented as long as the autosuspend_delay has
>> been set to a negative value. That's why I wonder whether this really
>> is the case here.
>
> That seems to be the case here. In device driver probe, commenting out
> pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
> also makes things work again.

Okay. Interesting. :-)

>
> So one of the failing cases seems to be:
>
> 1. Device driver probe sets pm_runtime_set_autosuspend_delay()

Only when "someone" in-before has set the autosuspend delay to a
negative value, this will affect the usage count.

I didn't find any in-kernel user that would set this for the
omap_hsmmc device, so it needs to be done from userspace via sysfs. It
would be really great if you could help to confirm this.

But more importantly, I fail to understand why commit 5de85b9d57ab
("PM / runtime: Re-init runtime PM states at probe error and driver
unbind"), is changing the behaviour in this regards.
Potentially we might have a different runtime PM status than we had
before when probing the second try, but how could that mess up the
usage count...?

>
> 2. Device driver probe initially fails with -EPROBE_DEFER

So that happened before as well.

Although, we now get a different runtime PM status (suspended) the
second probe time.

>
> 3. PM runtime usecounts get messed up

Perhaps, but let's try to validate that as it should be rather easy.

>
>> For omap3, I assume there's a PM domain (the so called hwmod) being
>> attached to the omap_hsmmc device at device registration point!?
>
> Correct. It uses the notifiers like bus code does in general.

Yes, there perfectly okay.

Although, I wondering whether it could be that it's the PM domain
that's preventing the omap_hsmmc device from becoming runtime
suspended.

Perhaps the PM domain returns an error code from its
->runtime_suspend() callback?

>
> FYI, most SoCs don't have hardware based autoidle signaling between
> the interconnect and the interconnect targets. So the hwmod code is
> still needed until we have converted it into a proper interconnect +
> module target drivers.
>
>> In that path depending on a specific hwmod flag, the device will be
>> given the an initial *active* runtime PM status, via invoking
>> pm_runtime_set_active().
>>
>> *If* that's the case, it affects the probing sequence, as the
>> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
>> to be invoked in the first probe attempt.
>
> It has worked since pm runtime. And it works with MMC as as a loadable
> module just fine when no -EPROBE_DEFER happens.

Okay, so we definitely seem to have an issue with the changed runtime
PM status (suspended) the second probe time.

That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind").

>
>> Moreover, according to the data I received in this regression report
>> so far, it seems like probing the second time has *always* been done
>> with the device in runtime PM active state. Could that be the reason
>> to why it "happens" to work?
>
> Not correct. I think that speculation is not related to the $subject
> regression at all.

I think it's important in this context, as it could affect how the PM
domain may treat the device. As I mentioned earlier.

For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind"), the ->runtime_resume()
callback seems now to be called twice in a row, without in-between
having the ->runtime_suspend() callback being invoked. Does really the
PM domain cope with that correctly?

>
> BTW, do you have some hardware to test with that has PM runtime
> implemnted and actually working with mainline kernel?

Oh, yes. Although I don't have an omap3, I wish I had.

Also, I have locally a "runtime PM test driver", which helps me to
test various runtime PM sequences.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 16:44               ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-01 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [160128 06:30]:
>> On 27 January 2016 at 00:52, Tony Lindgren <tony@atomide.com> wrote:
>> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:38]:
>> >> On Wed, Jan 27, 2016 at 12:22 AM, Tony Lindgren <tony@atomide.com> wrote:
>> >> > * Rafael J. Wysocki <rafael@kernel.org> [160126 15:15]:
>> >> >> On Tue, Jan 26, 2016 at 11:48 PM, Tony Lindgren <tony@atomide.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Looks like commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> >> >> > PM states at probe error and driver unbind") broke PM on at least
>>
>> I need to understand the issue here in a bit more detail, could you
>> please try to fill out some of my gaps!?
>>
>> In *what way* did it break PM?
>
> The MMC hardware will not get idled properly any longer blocking any
> deeper idle states.
>
>> Did the driver not probe successfully the second try? If so, what happened.
>
> It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> But the PM runtime usecounts are wrong.

Okay. How did you verify this?

I think the easiest way would be to make sure the usage count is
*one*, just before the omap_hsmmc calls mmc_add_host() in its
->probe() function.

If it's *two*, that confirms this theory.

>
>> >> >> > omap3. It seems we now need to additionally also call
>> >> >> > pm_runtime_dont_use_autosuspend() to get things working again?
>> >> >> >
>> >> >> > The following fixes idling on omap3, but I'm wondering if we
>> >> >> > should do something in pm_runtime_reinit() instead?
>>
>> I understand this as the second (or third, forth, whatever) probing
>> attempt actually succeeds, right!?
>
> Right.
>
>> Is the issue you are seeing, that the device never becomes runtime
>> suspended due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
>> PM states at probe error and driver unbind")?
>
> Correct.
>
>> >> > Looks like we have RPM_ACTIVE set in pm_runtime_reinit() if that
>> >> > gives any clues.
>>
>> That's a good clue. Although, there could be several reasons to why.
>>
>> Rafael has pointed out one valid potential case, but I want to be sure
>> that's really what happening here.
>>
>> *If* the problem is that the device doesn't becomes runtime suspended,
>> that will anyway be prevented as long as the autosuspend_delay has
>> been set to a negative value. That's why I wonder whether this really
>> is the case here.
>
> That seems to be the case here. In device driver probe, commenting out
> pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY)
> also makes things work again.

Okay. Interesting. :-)

>
> So one of the failing cases seems to be:
>
> 1. Device driver probe sets pm_runtime_set_autosuspend_delay()

Only when "someone" in-before has set the autosuspend delay to a
negative value, this will affect the usage count.

I didn't find any in-kernel user that would set this for the
omap_hsmmc device, so it needs to be done from userspace via sysfs. It
would be really great if you could help to confirm this.

But more importantly, I fail to understand why commit 5de85b9d57ab
("PM / runtime: Re-init runtime PM states at probe error and driver
unbind"), is changing the behaviour in this regards.
Potentially we might have a different runtime PM status than we had
before when probing the second try, but how could that mess up the
usage count...?

>
> 2. Device driver probe initially fails with -EPROBE_DEFER

So that happened before as well.

Although, we now get a different runtime PM status (suspended) the
second probe time.

>
> 3. PM runtime usecounts get messed up

Perhaps, but let's try to validate that as it should be rather easy.

>
>> For omap3, I assume there's a PM domain (the so called hwmod) being
>> attached to the omap_hsmmc device at device registration point!?
>
> Correct. It uses the notifiers like bus code does in general.

Yes, there perfectly okay.

Although, I wondering whether it could be that it's the PM domain
that's preventing the omap_hsmmc device from becoming runtime
suspended.

Perhaps the PM domain returns an error code from its
->runtime_suspend() callback?

>
> FYI, most SoCs don't have hardware based autoidle signaling between
> the interconnect and the interconnect targets. So the hwmod code is
> still needed until we have converted it into a proper interconnect +
> module target drivers.
>
>> In that path depending on a specific hwmod flag, the device will be
>> given the an initial *active* runtime PM status, via invoking
>> pm_runtime_set_active().
>>
>> *If* that's the case, it affects the probing sequence, as the
>> pm_runtime_get_sync() won't trigger the ->runtime_resume() callbacks
>> to be invoked in the first probe attempt.
>
> It has worked since pm runtime. And it works with MMC as as a loadable
> module just fine when no -EPROBE_DEFER happens.

Okay, so we definitely seem to have an issue with the changed runtime
PM status (suspended) the second probe time.

That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind").

>
>> Moreover, according to the data I received in this regression report
>> so far, it seems like probing the second time has *always* been done
>> with the device in runtime PM active state. Could that be the reason
>> to why it "happens" to work?
>
> Not correct. I think that speculation is not related to the $subject
> regression at all.

I think it's important in this context, as it could affect how the PM
domain may treat the device. As I mentioned earlier.

For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
PM states at probe error and driver unbind"), the ->runtime_resume()
callback seems now to be called twice in a row, without in-between
having the ->runtime_suspend() callback being invoked. Does really the
PM domain cope with that correctly?

>
> BTW, do you have some hardware to test with that has PM runtime
> implemnted and actually working with mainline kernel?

Oh, yes. Although I don't have an omap3, I wish I had.

Also, I have locally a "runtime PM test driver", which helps me to
test various runtime PM sequences.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 16:44               ` Ulf Hansson
@ 2016-02-01 18:11                 ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 18:11 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
> On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> >
> > The MMC hardware will not get idled properly any longer blocking any
> > deeper idle states.
> >
> >> Did the driver not probe successfully the second try? If so, what happened.
> >
> > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> > But the PM runtime usecounts are wrong.
> 
> Okay. How did you verify this?

Well that was just based on what I see in the dmesg:

omap_device: omap_device_enable() called from invalid state 1

> I think the easiest way would be to make sure the usage count is
> *one*, just before the omap_hsmmc calls mmc_add_host() in its
> ->probe() function.
> 
> If it's *two*, that confirms this theory.

Here's with use count dumped for one MMC:

omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: omap_device: omap_device_enable() called from invalid state 1
omap_hsmmc 4809c000.mmc: PM runtime use count: 0

So it seems you're right, it's the state not the count.

> > So one of the failing cases seems to be:
> >
> > 1. Device driver probe sets pm_runtime_set_autosuspend_delay()
> 
> Only when "someone" in-before has set the autosuspend delay to a
> negative value, this will affect the usage count.
> 
> I didn't find any in-kernel user that would set this for the
> omap_hsmmc device, so it needs to be done from userspace via sysfs. It
> would be really great if you could help to confirm this.

No nothing is setting that from userspace.

> But more importantly, I fail to understand why commit 5de85b9d57ab
> ("PM / runtime: Re-init runtime PM states at probe error and driver
> unbind"), is changing the behaviour in this regards.
> Potentially we might have a different runtime PM status than we had
> before when probing the second try, but how could that mess up the
> usage count...?

Yes I think you're right here, it's the state, not the count.

> Although, I wondering whether it could be that it's the PM domain
> that's preventing the omap_hsmmc device from becoming runtime
> suspended.
> 
> Perhaps the PM domain returns an error code from its
> ->runtime_suspend() callback?

We certainly see a warning there.

> Okay, so we definitely seem to have an issue with the changed runtime
> PM status (suspended) the second probe time.
> 
> That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind").

Yup.

> For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind"), the ->runtime_resume()
> callback seems now to be called twice in a row, without in-between
> having the ->runtime_suspend() callback being invoked. Does really the
> PM domain cope with that correctly?

That might explain the warning above.

> > BTW, do you have some hardware to test with that has PM runtime
> > implemnted and actually working with mainline kernel?
> 
> Oh, yes. Although I don't have an omap3, I wish I had.

OK good to hear. Anyways, getting an omap3 should be in tens of
whatever units if you need one to test with.

> Also, I have locally a "runtime PM test driver", which helps me to
> test various runtime PM sequences.

Now that would be good to have in the mainline kernel. Of course it
still is a very limited test.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 18:11                 ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 18:11 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
> On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> >
> > The MMC hardware will not get idled properly any longer blocking any
> > deeper idle states.
> >
> >> Did the driver not probe successfully the second try? If so, what happened.
> >
> > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> > But the PM runtime usecounts are wrong.
> 
> Okay. How did you verify this?

Well that was just based on what I see in the dmesg:

omap_device: omap_device_enable() called from invalid state 1

> I think the easiest way would be to make sure the usage count is
> *one*, just before the omap_hsmmc calls mmc_add_host() in its
> ->probe() function.
> 
> If it's *two*, that confirms this theory.

Here's with use count dumped for one MMC:

omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer cd
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: Got CD GPIO
omap_hsmmc 4809c000.mmc: GPIO lookup for consumer wp
omap_hsmmc 4809c000.mmc: using device tree for GPIO lookup
omap_hsmmc 4809c000.mmc: using lookup tables for GPIO lookup
omap_hsmmc 4809c000.mmc: lookup for GPIO wp failed
omap_hsmmc 4809c000.mmc: omap_device: omap_device_enable() called from invalid state 1
omap_hsmmc 4809c000.mmc: PM runtime use count: 0

So it seems you're right, it's the state not the count.

> > So one of the failing cases seems to be:
> >
> > 1. Device driver probe sets pm_runtime_set_autosuspend_delay()
> 
> Only when "someone" in-before has set the autosuspend delay to a
> negative value, this will affect the usage count.
> 
> I didn't find any in-kernel user that would set this for the
> omap_hsmmc device, so it needs to be done from userspace via sysfs. It
> would be really great if you could help to confirm this.

No nothing is setting that from userspace.

> But more importantly, I fail to understand why commit 5de85b9d57ab
> ("PM / runtime: Re-init runtime PM states at probe error and driver
> unbind"), is changing the behaviour in this regards.
> Potentially we might have a different runtime PM status than we had
> before when probing the second try, but how could that mess up the
> usage count...?

Yes I think you're right here, it's the state, not the count.

> Although, I wondering whether it could be that it's the PM domain
> that's preventing the omap_hsmmc device from becoming runtime
> suspended.
> 
> Perhaps the PM domain returns an error code from its
> ->runtime_suspend() callback?

We certainly see a warning there.

> Okay, so we definitely seem to have an issue with the changed runtime
> PM status (suspended) the second probe time.
> 
> That's indeed caused by 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind").

Yup.

> For example due to commit 5de85b9d57ab ("PM / runtime: Re-init runtime
> PM states at probe error and driver unbind"), the ->runtime_resume()
> callback seems now to be called twice in a row, without in-between
> having the ->runtime_suspend() callback being invoked. Does really the
> PM domain cope with that correctly?

That might explain the warning above.

> > BTW, do you have some hardware to test with that has PM runtime
> > implemnted and actually working with mainline kernel?
> 
> Oh, yes. Although I don't have an omap3, I wish I had.

OK good to hear. Anyways, getting an omap3 should be in tens of
whatever units if you need one to test with.

> Also, I have locally a "runtime PM test driver", which helps me to
> test various runtime PM sequences.

Now that would be good to have in the mainline kernel. Of course it
still is a very limited test.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 18:11                 ` Tony Lindgren
@ 2016-02-01 22:06                   ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 22:06 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160201 10:12]:
> * Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
> > On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> > >
> > > The MMC hardware will not get idled properly any longer blocking any
> > > deeper idle states.
> > >
> > >> Did the driver not probe successfully the second try? If so, what happened.
> > >
> > > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> > > But the PM runtime usecounts are wrong.
> > 
> > Okay. How did you verify this?
> 
> Well that was just based on what I see in the dmesg:
> 
> omap_device: omap_device_enable() called from invalid state 1

So we're now missing the idling of hardare after -EPROBE_DEFER..
Does the following patch work for you guys?

Regards,

Tony

8< -----------------------------
From: Tony Lindgren <tony@atomide.com>
Date: Mon, 1 Feb 2016 13:40:46 -0800
Subject: [PATCH] PM / runtime: Fix PM runtime reinit

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") added calls to reinit the PM runtime. This
however broke things for idling the hardware at least if the driver
probing has pm_runtime_set_autosuspend_delay() and -EPROBE_DEFER
happens.

Fix the problem by adding a check for configured autosuspend if
RPM_ACTIVE is set. Then reset the autosuspend, and suspend the
device to make sure the hardware gets idled.

Let's also cut down one level of nestedness and remove a negative
test by returning early if pm_runtime_enabled(dev) as there is
currently nothing for us to do in that case.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at
probe error and driver unbind")
Signed-off-by: Tony Lindgren <tony@atomide.com>

--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			__pm_runtime_use_autosuspend(dev, false);
+			pm_runtime_suspend(dev);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 22:06                   ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 22:06 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160201 10:12]:
> * Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
> > On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
> > >
> > > The MMC hardware will not get idled properly any longer blocking any
> > > deeper idle states.
> > >
> > >> Did the driver not probe successfully the second try? If so, what happened.
> > >
> > > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
> > > But the PM runtime usecounts are wrong.
> > 
> > Okay. How did you verify this?
> 
> Well that was just based on what I see in the dmesg:
> 
> omap_device: omap_device_enable() called from invalid state 1

So we're now missing the idling of hardare after -EPROBE_DEFER..
Does the following patch work for you guys?

Regards,

Tony

8< -----------------------------
From: Tony Lindgren <tony@atomide.com>
Date: Mon, 1 Feb 2016 13:40:46 -0800
Subject: [PATCH] PM / runtime: Fix PM runtime reinit

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") added calls to reinit the PM runtime. This
however broke things for idling the hardware at least if the driver
probing has pm_runtime_set_autosuspend_delay() and -EPROBE_DEFER
happens.

Fix the problem by adding a check for configured autosuspend if
RPM_ACTIVE is set. Then reset the autosuspend, and suspend the
device to make sure the hardware gets idled.

Let's also cut down one level of nestedness and remove a negative
test by returning early if pm_runtime_enabled(dev) as there is
currently nothing for us to do in that case.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at
probe error and driver unbind")
Signed-off-by: Tony Lindgren <tony@atomide.com>

--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			__pm_runtime_use_autosuspend(dev, false);
+			pm_runtime_suspend(dev);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 22:06                   ` Tony Lindgren
@ 2016-02-01 22:17                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 22:17 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Tony Lindgren <tony@atomide.com> [160201 10:12]:
>> * Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
>> > On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
>> > >
>> > > The MMC hardware will not get idled properly any longer blocking any
>> > > deeper idle states.
>> > >
>> > >> Did the driver not probe successfully the second try? If so, what happened.
>> > >
>> > > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
>> > > But the PM runtime usecounts are wrong.
>> >
>> > Okay. How did you verify this?
>>
>> Well that was just based on what I see in the dmesg:
>>
>> omap_device: omap_device_enable() called from invalid state 1
>
> So we're now missing the idling of hardare after -EPROBE_DEFER..
> Does the following patch work for you guys?
>
> Regards,
>
> Tony
>
> 8< -----------------------------
> From: Tony Lindgren <tony@atomide.com>
> Date: Mon, 1 Feb 2016 13:40:46 -0800
> Subject: [PATCH] PM / runtime: Fix PM runtime reinit
>
> Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
> error and driver unbind") added calls to reinit the PM runtime. This
> however broke things for idling the hardware at least if the driver
> probing has pm_runtime_set_autosuspend_delay() and -EPROBE_DEFER
> happens.
>
> Fix the problem by adding a check for configured autosuspend if
> RPM_ACTIVE is set. Then reset the autosuspend, and suspend the
> device to make sure the hardware gets idled.
>
> Let's also cut down one level of nestedness and remove a negative
> test by returning early if pm_runtime_enabled(dev) as there is
> currently nothing for us to do in that case.
>
> Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at
> probe error and driver unbind")
> Signed-off-by: Tony Lindgren <tony@atomide.com>
>
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -       if (!pm_runtime_enabled(dev)) {
> -               if (dev->power.runtime_status == RPM_ACTIVE)
> +       if (pm_runtime_enabled(dev))
> +               return;
> +
> +       if (dev->power.runtime_status == RPM_ACTIVE) {
> +               if (dev->power.use_autosuspend) {
> +                       __pm_runtime_use_autosuspend(dev, false);
> +                       pm_runtime_suspend(dev);

This won't work, because runtime PM is disabled at this point.

What about doing this instead:

               if (dev->power.use_autosuspend)
                       __pm_runtime_use_autosuspend(dev, false);

               pm_runtime_set_suspended(dev);


> +               } else {
>                         pm_runtime_set_suspended(dev);
> -               if (dev->power.irq_safe) {
> -                       spin_lock_irq(&dev->power.lock);
> -                       dev->power.irq_safe = 0;
> -                       spin_unlock_irq(&dev->power.lock);
> -                       if (dev->parent)
> -                               pm_runtime_put(dev->parent);
>                 }
>         }
> +
> +       if (dev->power.irq_safe) {
> +               spin_lock_irq(&dev->power.lock);
> +               dev->power.irq_safe = 0;
> +               spin_unlock_irq(&dev->power.lock);
> +               if (dev->parent)
> +                       pm_runtime_put(dev->parent);
> +       }
>  }
>
>  /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 22:17                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 22:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Tony Lindgren <tony@atomide.com> [160201 10:12]:
>> * Ulf Hansson <ulf.hansson@linaro.org> [160201 08:45]:
>> > On 28 January 2016 at 17:58, Tony Lindgren <tony@atomide.com> wrote:
>> > >
>> > > The MMC hardware will not get idled properly any longer blocking any
>> > > deeper idle states.
>> > >
>> > >> Did the driver not probe successfully the second try? If so, what happened.
>> > >
>> > > It probes fine after a -EPROBE_DEFER on the vmmc i2c regulator.
>> > > But the PM runtime usecounts are wrong.
>> >
>> > Okay. How did you verify this?
>>
>> Well that was just based on what I see in the dmesg:
>>
>> omap_device: omap_device_enable() called from invalid state 1
>
> So we're now missing the idling of hardare after -EPROBE_DEFER..
> Does the following patch work for you guys?
>
> Regards,
>
> Tony
>
> 8< -----------------------------
> From: Tony Lindgren <tony@atomide.com>
> Date: Mon, 1 Feb 2016 13:40:46 -0800
> Subject: [PATCH] PM / runtime: Fix PM runtime reinit
>
> Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
> error and driver unbind") added calls to reinit the PM runtime. This
> however broke things for idling the hardware at least if the driver
> probing has pm_runtime_set_autosuspend_delay() and -EPROBE_DEFER
> happens.
>
> Fix the problem by adding a check for configured autosuspend if
> RPM_ACTIVE is set. Then reset the autosuspend, and suspend the
> device to make sure the hardware gets idled.
>
> Let's also cut down one level of nestedness and remove a negative
> test by returning early if pm_runtime_enabled(dev) as there is
> currently nothing for us to do in that case.
>
> Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at
> probe error and driver unbind")
> Signed-off-by: Tony Lindgren <tony@atomide.com>
>
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -       if (!pm_runtime_enabled(dev)) {
> -               if (dev->power.runtime_status == RPM_ACTIVE)
> +       if (pm_runtime_enabled(dev))
> +               return;
> +
> +       if (dev->power.runtime_status == RPM_ACTIVE) {
> +               if (dev->power.use_autosuspend) {
> +                       __pm_runtime_use_autosuspend(dev, false);
> +                       pm_runtime_suspend(dev);

This won't work, because runtime PM is disabled at this point.

What about doing this instead:

               if (dev->power.use_autosuspend)
                       __pm_runtime_use_autosuspend(dev, false);

               pm_runtime_set_suspended(dev);


> +               } else {
>                         pm_runtime_set_suspended(dev);
> -               if (dev->power.irq_safe) {
> -                       spin_lock_irq(&dev->power.lock);
> -                       dev->power.irq_safe = 0;
> -                       spin_unlock_irq(&dev->power.lock);
> -                       if (dev->parent)
> -                               pm_runtime_put(dev->parent);
>                 }
>         }
> +
> +       if (dev->power.irq_safe) {
> +               spin_lock_irq(&dev->power.lock);
> +               dev->power.irq_safe = 0;
> +               spin_unlock_irq(&dev->power.lock);
> +               if (dev->parent)
> +                       pm_runtime_put(dev->parent);
> +       }
>  }
>
>  /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 22:17                     ` Rafael J. Wysocki
@ 2016-02-01 22:29                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 22:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
> >   */
> >  void pm_runtime_reinit(struct device *dev)
> >  {
> > -       if (!pm_runtime_enabled(dev)) {
> > -               if (dev->power.runtime_status == RPM_ACTIVE)
> > +       if (pm_runtime_enabled(dev))
> > +               return;
> > +
> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
> > +               if (dev->power.use_autosuspend) {
> > +                       __pm_runtime_use_autosuspend(dev, false);
> > +                       pm_runtime_suspend(dev);
> 
> This won't work, because runtime PM is disabled at this point.

Hmm right OK. It does work from idling the hardware point
of view though..

> What about doing this instead:
> 
>                if (dev->power.use_autosuspend)
>                        __pm_runtime_use_autosuspend(dev, false);
> 
>                pm_runtime_set_suspended(dev);

..while this does not work. The hardware is never idled
in this case.

What else does __pm_runtime_use_autosuspend() set initially
that changes things here?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 22:29                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 22:29 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
> >   */
> >  void pm_runtime_reinit(struct device *dev)
> >  {
> > -       if (!pm_runtime_enabled(dev)) {
> > -               if (dev->power.runtime_status == RPM_ACTIVE)
> > +       if (pm_runtime_enabled(dev))
> > +               return;
> > +
> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
> > +               if (dev->power.use_autosuspend) {
> > +                       __pm_runtime_use_autosuspend(dev, false);
> > +                       pm_runtime_suspend(dev);
> 
> This won't work, because runtime PM is disabled at this point.

Hmm right OK. It does work from idling the hardware point
of view though..

> What about doing this instead:
> 
>                if (dev->power.use_autosuspend)
>                        __pm_runtime_use_autosuspend(dev, false);
> 
>                pm_runtime_set_suspended(dev);

..while this does not work. The hardware is never idled
in this case.

What else does __pm_runtime_use_autosuspend() set initially
that changes things here?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 22:29                       ` Tony Lindgren
@ 2016-02-01 23:10                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 23:10 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Mon, Feb 1, 2016 at 11:29 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
>> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > --- a/drivers/base/power/runtime.c
>> > +++ b/drivers/base/power/runtime.c
>> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
>> >   */
>> >  void pm_runtime_reinit(struct device *dev)
>> >  {
>> > -       if (!pm_runtime_enabled(dev)) {
>> > -               if (dev->power.runtime_status == RPM_ACTIVE)
>> > +       if (pm_runtime_enabled(dev))
>> > +               return;
>> > +
>> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
>> > +               if (dev->power.use_autosuspend) {
>> > +                       __pm_runtime_use_autosuspend(dev, false);
>> > +                       pm_runtime_suspend(dev);
>>
>> This won't work, because runtime PM is disabled at this point.
>
> Hmm right OK. It does work from idling the hardware point
> of view though..

pm_runtime_suspend() with runtime PM disabled is a NOP.  It will do
nothing and return -EACCES.

>> What about doing this instead:
>>
>>                if (dev->power.use_autosuspend)
>>                        __pm_runtime_use_autosuspend(dev, false);
>>
>>                pm_runtime_set_suspended(dev);
>
> ..while this does not work. The hardware is never idled
> in this case.

I'm not sure what you mean.  pm_runtime_set_suspended() sets the
status to RPM_SUSPENDED for devices with runtime PM disabled.  It has
nothing to do with "idling" in principle.

> What else does __pm_runtime_use_autosuspend() set initially
> that changes things here?

The usage counter, if the delay is negative.

I'll look at this in detail, but not right now, sorry.  I'm working on
something else ATM and I was hoping that Ulf would be able to figure
out what's going on here.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 23:10                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 23:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 1, 2016 at 11:29 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
>> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
>> > --- a/drivers/base/power/runtime.c
>> > +++ b/drivers/base/power/runtime.c
>> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
>> >   */
>> >  void pm_runtime_reinit(struct device *dev)
>> >  {
>> > -       if (!pm_runtime_enabled(dev)) {
>> > -               if (dev->power.runtime_status == RPM_ACTIVE)
>> > +       if (pm_runtime_enabled(dev))
>> > +               return;
>> > +
>> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
>> > +               if (dev->power.use_autosuspend) {
>> > +                       __pm_runtime_use_autosuspend(dev, false);
>> > +                       pm_runtime_suspend(dev);
>>
>> This won't work, because runtime PM is disabled at this point.
>
> Hmm right OK. It does work from idling the hardware point
> of view though..

pm_runtime_suspend() with runtime PM disabled is a NOP.  It will do
nothing and return -EACCES.

>> What about doing this instead:
>>
>>                if (dev->power.use_autosuspend)
>>                        __pm_runtime_use_autosuspend(dev, false);
>>
>>                pm_runtime_set_suspended(dev);
>
> ..while this does not work. The hardware is never idled
> in this case.

I'm not sure what you mean.  pm_runtime_set_suspended() sets the
status to RPM_SUSPENDED for devices with runtime PM disabled.  It has
nothing to do with "idling" in principle.

> What else does __pm_runtime_use_autosuspend() set initially
> that changes things here?

The usage counter, if the delay is negative.

I'll look at this in detail, but not right now, sorry.  I'm working on
something else ATM and I was hoping that Ulf would be able to figure
out what's going on here.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 23:10                         ` Rafael J. Wysocki
@ 2016-02-01 23:28                           ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 23:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160201 15:11]:
> On Mon, Feb 1, 2016 at 11:29 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
> >> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > --- a/drivers/base/power/runtime.c
> >> > +++ b/drivers/base/power/runtime.c
> >> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
> >> >   */
> >> >  void pm_runtime_reinit(struct device *dev)
> >> >  {
> >> > -       if (!pm_runtime_enabled(dev)) {
> >> > -               if (dev->power.runtime_status == RPM_ACTIVE)
> >> > +       if (pm_runtime_enabled(dev))
> >> > +               return;
> >> > +
> >> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
> >> > +               if (dev->power.use_autosuspend) {
> >> > +                       __pm_runtime_use_autosuspend(dev, false);
> >> > +                       pm_runtime_suspend(dev);
> >>
> >> This won't work, because runtime PM is disabled at this point.
> >
> > Hmm right OK. It does work from idling the hardware point
> > of view though..
> 
> pm_runtime_suspend() with runtime PM disabled is a NOP.  It will do
> nothing and return -EACCES.

Hmm it  makes a difference here for sure :)

> >> What about doing this instead:
> >>
> >>                if (dev->power.use_autosuspend)
> >>                        __pm_runtime_use_autosuspend(dev, false);
> >>
> >>                pm_runtime_set_suspended(dev);
> >
> > ..while this does not work. The hardware is never idled
> > in this case.
> 
> I'm not sure what you mean.  pm_runtime_set_suspended() sets the
> status to RPM_SUSPENDED for devices with runtime PM disabled.  It has
> nothing to do with "idling" in principle.

Well looking at the update_autosuspend(), it seems we're now missing
rpm_idle() call that now never happens.

Does the patch below make more sense to you where we call rpm_idle?
That seems to make things behave here also.

> > What else does __pm_runtime_use_autosuspend() set initially
> > that changes things here?
> 
> The usage counter, if the delay is negative.

Yeah I don't see any difference with those.

> I'll look at this in detail, but not right now, sorry.  I'm working on
> something else ATM and I was hoping that Ulf would be able to figure
> out what's going on here.

Yeah we need to understand what's going on here. Having the PM runtime
framework out of sync with the hardare is not good.. If we can't
figure this out we should probably revert the patch until we understand
it.

Regards,

Tony

8< ------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	int (*callback)(struct device *);
+	int err;
+
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			__pm_runtime_use_autosuspend(dev, false);
+			rpm_idle(dev, RPM_AUTO);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 23:28                           ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 23:28 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160201 15:11]:
> On Mon, Feb 1, 2016 at 11:29 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Rafael J. Wysocki <rafael@kernel.org> [160201 14:18]:
> >> On Mon, Feb 1, 2016 at 11:06 PM, Tony Lindgren <tony@atomide.com> wrote:
> >> > --- a/drivers/base/power/runtime.c
> >> > +++ b/drivers/base/power/runtime.c
> >> > @@ -1419,17 +1419,25 @@ void pm_runtime_init(struct device *dev)
> >> >   */
> >> >  void pm_runtime_reinit(struct device *dev)
> >> >  {
> >> > -       if (!pm_runtime_enabled(dev)) {
> >> > -               if (dev->power.runtime_status == RPM_ACTIVE)
> >> > +       if (pm_runtime_enabled(dev))
> >> > +               return;
> >> > +
> >> > +       if (dev->power.runtime_status == RPM_ACTIVE) {
> >> > +               if (dev->power.use_autosuspend) {
> >> > +                       __pm_runtime_use_autosuspend(dev, false);
> >> > +                       pm_runtime_suspend(dev);
> >>
> >> This won't work, because runtime PM is disabled at this point.
> >
> > Hmm right OK. It does work from idling the hardware point
> > of view though..
> 
> pm_runtime_suspend() with runtime PM disabled is a NOP.  It will do
> nothing and return -EACCES.

Hmm it  makes a difference here for sure :)

> >> What about doing this instead:
> >>
> >>                if (dev->power.use_autosuspend)
> >>                        __pm_runtime_use_autosuspend(dev, false);
> >>
> >>                pm_runtime_set_suspended(dev);
> >
> > ..while this does not work. The hardware is never idled
> > in this case.
> 
> I'm not sure what you mean.  pm_runtime_set_suspended() sets the
> status to RPM_SUSPENDED for devices with runtime PM disabled.  It has
> nothing to do with "idling" in principle.

Well looking at the update_autosuspend(), it seems we're now missing
rpm_idle() call that now never happens.

Does the patch below make more sense to you where we call rpm_idle?
That seems to make things behave here also.

> > What else does __pm_runtime_use_autosuspend() set initially
> > that changes things here?
> 
> The usage counter, if the delay is negative.

Yeah I don't see any difference with those.

> I'll look at this in detail, but not right now, sorry.  I'm working on
> something else ATM and I was hoping that Ulf would be able to figure
> out what's going on here.

Yeah we need to understand what's going on here. Having the PM runtime
framework out of sync with the hardare is not good.. If we can't
figure this out we should probably revert the patch until we understand
it.

Regards,

Tony

8< ------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	int (*callback)(struct device *);
+	int err;
+
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			__pm_runtime_use_autosuspend(dev, false);
+			rpm_idle(dev, RPM_AUTO);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 23:28                           ` Tony Lindgren
@ 2016-02-01 23:44                             ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 23:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160201 15:29]:
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -	if (!pm_runtime_enabled(dev)) {
> -		if (dev->power.runtime_status == RPM_ACTIVE)
> +	int (*callback)(struct device *);
> +	int err;

The callback and err are not needed here FYI, forgot to remove
them..

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 23:44                             ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-01 23:44 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160201 15:29]:
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -	if (!pm_runtime_enabled(dev)) {
> -		if (dev->power.runtime_status == RPM_ACTIVE)
> +	int (*callback)(struct device *);
> +	int err;

The callback and err are not needed here FYI, forgot to remove
them..

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 23:28                           ` Tony Lindgren
@ 2016-02-01 23:49                             ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-01 23:49 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Mon, 1 Feb 2016, Tony Lindgren wrote:

> Does the patch below make more sense to you where we call rpm_idle?
> That seems to make things behave here also.
> 
> > > What else does __pm_runtime_use_autosuspend() set initially
> > > that changes things here?
> > 
> > The usage counter, if the delay is negative.
> 
> Yeah I don't see any difference with those.
> 
> > I'll look at this in detail, but not right now, sorry.  I'm working on
> > something else ATM and I was hoping that Ulf would be able to figure
> > out what's going on here.
> 
> Yeah we need to understand what's going on here. Having the PM runtime
> framework out of sync with the hardare is not good.. If we can't
> figure this out we should probably revert the patch until we understand
> it.
> 
> Regards,
> 
> Tony
> 
> 8< ------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -	if (!pm_runtime_enabled(dev)) {
> -		if (dev->power.runtime_status == RPM_ACTIVE)
> +	int (*callback)(struct device *);
> +	int err;
> +
> +	if (pm_runtime_enabled(dev))
> +		return;
> +
> +	if (dev->power.runtime_status == RPM_ACTIVE) {
> +		if (dev->power.use_autosuspend) {
> +			__pm_runtime_use_autosuspend(dev, false);
> +			rpm_idle(dev, RPM_AUTO);

You get here only if runtime PM is disabled, right?  So the rpm_idle 
call won't do anything -- "disabled" means don't make any callbacks.

Tony, exactly what are you trying to do here?  Do you want this to 
invoke a runtime-PM callback in the subsystem, power domain, or driver?  
(Is there even a driver bound to the device when this function runs?)

The function's name suggests that it merely resets the data stored in 
dev->power without actually touching the hardware.  Is that what you 
really want?

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-01 23:49                             ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-01 23:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 1 Feb 2016, Tony Lindgren wrote:

> Does the patch below make more sense to you where we call rpm_idle?
> That seems to make things behave here also.
> 
> > > What else does __pm_runtime_use_autosuspend() set initially
> > > that changes things here?
> > 
> > The usage counter, if the delay is negative.
> 
> Yeah I don't see any difference with those.
> 
> > I'll look at this in detail, but not right now, sorry.  I'm working on
> > something else ATM and I was hoping that Ulf would be able to figure
> > out what's going on here.
> 
> Yeah we need to understand what's going on here. Having the PM runtime
> framework out of sync with the hardare is not good.. If we can't
> figure this out we should probably revert the patch until we understand
> it.
> 
> Regards,
> 
> Tony
> 
> 8< ------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1419,17 +1419,28 @@ void pm_runtime_init(struct device *dev)
>   */
>  void pm_runtime_reinit(struct device *dev)
>  {
> -	if (!pm_runtime_enabled(dev)) {
> -		if (dev->power.runtime_status == RPM_ACTIVE)
> +	int (*callback)(struct device *);
> +	int err;
> +
> +	if (pm_runtime_enabled(dev))
> +		return;
> +
> +	if (dev->power.runtime_status == RPM_ACTIVE) {
> +		if (dev->power.use_autosuspend) {
> +			__pm_runtime_use_autosuspend(dev, false);
> +			rpm_idle(dev, RPM_AUTO);

You get here only if runtime PM is disabled, right?  So the rpm_idle 
call won't do anything -- "disabled" means don't make any callbacks.

Tony, exactly what are you trying to do here?  Do you want this to 
invoke a runtime-PM callback in the subsystem, power domain, or driver?  
(Is there even a driver bound to the device when this function runs?)

The function's name suggests that it merely resets the data stored in 
dev->power without actually touching the hardware.  Is that what you 
really want?

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-01 23:49                             ` Alan Stern
@ 2016-02-02  3:05                               ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02  3:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

Hi,

* Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
> 
> You get here only if runtime PM is disabled, right?  So the rpm_idle 
> call won't do anything -- "disabled" means don't make any callbacks.

Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
just has a side effect that makes a difference here.

> Tony, exactly what are you trying to do here?  Do you want this to 
> invoke a runtime-PM callback in the subsystem, power domain, or driver?  
> (Is there even a driver bound to the device when this function runs?)

I guess I need to add more printks to figure out what's going on here.
But yeah, I'm not seeing the callback happening at the interconnect
level so hardware and PM runtime states won't match on the following
probe after -EPROBE_DEFER.

> The function's name suggests that it merely resets the data stored in 
> dev->power without actually touching the hardware.  Is that what you 
> really want?

I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
where we now set pm_runtime_set_suspended() between failed device
probes and the device is still active in hardware.

The patch below also helps with the problem and leaves out the
rpm_suspend() call from loop so it might give more hints.

The difference here from what Rafael suggested earlier is calling
__pm_runtime_use_autosuspend() and then not calling
pm_runtime_set_suspended().

However, it seems the below patch keeps hardware active in the
autoidle case though, so chances are there is more that needs to
be done here. Anyways, I'll try to debug it more tomorrow.

Regards,

Tony

8< ------------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1425,17 +1425,25 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			pm_runtime_set_autosuspend_delay(dev, 0);
+			__pm_runtime_use_autosuspend(dev, false);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02  3:05                               ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02  3:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

* Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
> 
> You get here only if runtime PM is disabled, right?  So the rpm_idle 
> call won't do anything -- "disabled" means don't make any callbacks.

Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
just has a side effect that makes a difference here.

> Tony, exactly what are you trying to do here?  Do you want this to 
> invoke a runtime-PM callback in the subsystem, power domain, or driver?  
> (Is there even a driver bound to the device when this function runs?)

I guess I need to add more printks to figure out what's going on here.
But yeah, I'm not seeing the callback happening at the interconnect
level so hardware and PM runtime states won't match on the following
probe after -EPROBE_DEFER.

> The function's name suggests that it merely resets the data stored in 
> dev->power without actually touching the hardware.  Is that what you 
> really want?

I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
where we now set pm_runtime_set_suspended() between failed device
probes and the device is still active in hardware.

The patch below also helps with the problem and leaves out the
rpm_suspend() call from loop so it might give more hints.

The difference here from what Rafael suggested earlier is calling
__pm_runtime_use_autosuspend() and then not calling
pm_runtime_set_suspended().

However, it seems the below patch keeps hardware active in the
autoidle case though, so chances are there is more that needs to
be done here. Anyways, I'll try to debug it more tomorrow.

Regards,

Tony

8< ------------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1425,17 +1425,25 @@ void pm_runtime_init(struct device *dev)
  */
 void pm_runtime_reinit(struct device *dev)
 {
-	if (!pm_runtime_enabled(dev)) {
-		if (dev->power.runtime_status == RPM_ACTIVE)
+	if (pm_runtime_enabled(dev))
+		return;
+
+	if (dev->power.runtime_status == RPM_ACTIVE) {
+		if (dev->power.use_autosuspend) {
+			pm_runtime_set_autosuspend_delay(dev, 0);
+			__pm_runtime_use_autosuspend(dev, false);
+		} else {
 			pm_runtime_set_suspended(dev);
-		if (dev->power.irq_safe) {
-			spin_lock_irq(&dev->power.lock);
-			dev->power.irq_safe = 0;
-			spin_unlock_irq(&dev->power.lock);
-			if (dev->parent)
-				pm_runtime_put(dev->parent);
 		}
 	}
+
+	if (dev->power.irq_safe) {
+		spin_lock_irq(&dev->power.lock);
+		dev->power.irq_safe = 0;
+		spin_unlock_irq(&dev->power.lock);
+		if (dev->parent)
+			pm_runtime_put(dev->parent);
+	}
 }
 
 /**

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02  3:05                               ` Tony Lindgren
@ 2016-02-02 10:07                                 ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 10:07 UTC (permalink / raw)
  To: Tony Lindgren, Alan Stern, Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

On 2 February 2016 at 04:05, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
>>
>> You get here only if runtime PM is disabled, right?  So the rpm_idle
>> call won't do anything -- "disabled" means don't make any callbacks.
>
> Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
> just has a side effect that makes a difference here.
>
>> Tony, exactly what are you trying to do here?  Do you want this to
>> invoke a runtime-PM callback in the subsystem, power domain, or driver?
>> (Is there even a driver bound to the device when this function runs?)
>
> I guess I need to add more printks to figure out what's going on here.
> But yeah, I'm not seeing the callback happening at the interconnect
> level so hardware and PM runtime states won't match on the following
> probe after -EPROBE_DEFER.
>
>> The function's name suggests that it merely resets the data stored in
>> dev->power without actually touching the hardware.  Is that what you
>> really want?
>
> I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
> where we now set pm_runtime_set_suspended() between failed device
> probes and the device is still active in hardware.
>
> The patch below also helps with the problem and leaves out the
> rpm_suspend() call from loop so it might give more hints.
>
> The difference here from what Rafael suggested earlier is calling
> __pm_runtime_use_autosuspend() and then not calling
> pm_runtime_set_suspended().
>
> However, it seems the below patch keeps hardware active in the
> autoidle case though, so chances are there is more that needs to
> be done here. Anyways, I'll try to debug it more tomorrow.
>

Your observations is correct. The hardware will be kept active
in-between the probe attempts (and thus also if probing fails).
Although, that's not a regression as that's the behaviour you get from
runtime PM, when drivers are implemented like omap_hsmmc.

Instead of the suggested approaches, I think the regression should be
fixed at the PM domain level (omap hwmod). I have attached a patch
below, please give it try as it's untested.

To solve the other problem (allowing devices to become inactive
in-between at probe failures), I see two options (not treated as
regressions).
1)
Change the behaviour of pm_runtime_put_sync(), to *not* respect the
autosuspend mode.
I think I prefer this option, as it seems like autosuspend should be
respected only via the asynchronous runtime PM APIs.

2)
Change the failing drivers, to before calling pm_runtime_put_sync()
also invoke pm_runtime_dont_use_autosusend(). Or other similar
approach.

Kind regards
Uffe

[...]

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Tue, 2 Feb 2016 10:05:39 +0100
Subject: [PATCH] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") may re-initialize the runtime PM status of the
device to RPM_SUSPENDED at driver probe failures.

For the omap_hsmmc and likely also other omap drivers, which needs more
than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
causes a regression at the PM domain level (omap hwmod).

The reason is that the drivers don't put back the device into low power
state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
status from RPM_ACTIVE to RPM_SUSPENDED.

The next ->probe() attempt then triggers the ->runtime_resume() callback
to be invoked, which means this happens two times in a row. At the PM
domain level (omap hwmod) this is being treated as an error and thus the
runtime PM status of the device isn't correctly synchronized with the
runtime PM core.

In the end, ->probe() anyway succeeds (as the driver don't checks the
error code from the runtime PM APIs), but results in that the PM domain
always stays powered on. This because of the runtime PM core believes the
device is RPM_SUSPENDED.

Fix this regression by allowing devices to be pre-enabled when the PM
domain's (omap hwmod) ->runtime_resume() callback is requested to enable
the device. In such cases, return zero to synchronize the runtime PM
status with the runtime PM core.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 arch/arm/mach-omap2/omap_device.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_device.c
b/arch/arm/mach-omap2/omap_device.c
index 0437537..851767f 100644
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -717,7 +717,7 @@ int omap_device_register(struct platform_device *pdev)
  * to be accessible and ready to operate.  This generally involves
  * enabling clocks, setting SYSCONFIG registers; and in the future may
  * involve remuxing pins.  Device drivers should call this function
- * indirectly via pm_runtime_get*().  Returns -EINVAL if called when
+ * indirectly via pm_runtime_get*().  Returns zero if called when
  * the omap_device is already enabled, or passes along the return
  * value of _omap_device_enable_hwmods().
  */
@@ -728,12 +728,13 @@ int omap_device_enable(struct platform_device *pdev)

        od = to_omap_device(pdev);

-       if (od->_state == OMAP_DEVICE_STATE_ENABLED) {
-               dev_warn(&pdev->dev,
-                        "omap_device: %s() called from invalid state %d\n",
-                        __func__, od->_state);
-               return -EINVAL;
-       }
+       /*
+        * From the PM domain perspective the device may already be enabled.
+        * In such case, let's return zero to synchronize the state with the
+        * runtime PM core.
+       */
+       if (od->_state == OMAP_DEVICE_STATE_ENABLED)
+               return 0;

        ret = _omap_device_enable_hwmods(od);

-- 
1.9.1

^ permalink raw reply related	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 10:07                                 ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 10:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 February 2016 at 04:05, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
>>
>> You get here only if runtime PM is disabled, right?  So the rpm_idle
>> call won't do anything -- "disabled" means don't make any callbacks.
>
> Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
> just has a side effect that makes a difference here.
>
>> Tony, exactly what are you trying to do here?  Do you want this to
>> invoke a runtime-PM callback in the subsystem, power domain, or driver?
>> (Is there even a driver bound to the device when this function runs?)
>
> I guess I need to add more printks to figure out what's going on here.
> But yeah, I'm not seeing the callback happening at the interconnect
> level so hardware and PM runtime states won't match on the following
> probe after -EPROBE_DEFER.
>
>> The function's name suggests that it merely resets the data stored in
>> dev->power without actually touching the hardware.  Is that what you
>> really want?
>
> I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
> where we now set pm_runtime_set_suspended() between failed device
> probes and the device is still active in hardware.
>
> The patch below also helps with the problem and leaves out the
> rpm_suspend() call from loop so it might give more hints.
>
> The difference here from what Rafael suggested earlier is calling
> __pm_runtime_use_autosuspend() and then not calling
> pm_runtime_set_suspended().
>
> However, it seems the below patch keeps hardware active in the
> autoidle case though, so chances are there is more that needs to
> be done here. Anyways, I'll try to debug it more tomorrow.
>

Your observations is correct. The hardware will be kept active
in-between the probe attempts (and thus also if probing fails).
Although, that's not a regression as that's the behaviour you get from
runtime PM, when drivers are implemented like omap_hsmmc.

Instead of the suggested approaches, I think the regression should be
fixed at the PM domain level (omap hwmod). I have attached a patch
below, please give it try as it's untested.

To solve the other problem (allowing devices to become inactive
in-between at probe failures), I see two options (not treated as
regressions).
1)
Change the behaviour of pm_runtime_put_sync(), to *not* respect the
autosuspend mode.
I think I prefer this option, as it seems like autosuspend should be
respected only via the asynchronous runtime PM APIs.

2)
Change the failing drivers, to before calling pm_runtime_put_sync()
also invoke pm_runtime_dont_use_autosusend(). Or other similar
approach.

Kind regards
Uffe

[...]

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Tue, 2 Feb 2016 10:05:39 +0100
Subject: [PATCH] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") may re-initialize the runtime PM status of the
device to RPM_SUSPENDED at driver probe failures.

For the omap_hsmmc and likely also other omap drivers, which needs more
than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
causes a regression at the PM domain level (omap hwmod).

The reason is that the drivers don't put back the device into low power
state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
status from RPM_ACTIVE to RPM_SUSPENDED.

The next ->probe() attempt then triggers the ->runtime_resume() callback
to be invoked, which means this happens two times in a row. At the PM
domain level (omap hwmod) this is being treated as an error and thus the
runtime PM status of the device isn't correctly synchronized with the
runtime PM core.

In the end, ->probe() anyway succeeds (as the driver don't checks the
error code from the runtime PM APIs), but results in that the PM domain
always stays powered on. This because of the runtime PM core believes the
device is RPM_SUSPENDED.

Fix this regression by allowing devices to be pre-enabled when the PM
domain's (omap hwmod) ->runtime_resume() callback is requested to enable
the device. In such cases, return zero to synchronize the runtime PM
status with the runtime PM core.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 arch/arm/mach-omap2/omap_device.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_device.c
b/arch/arm/mach-omap2/omap_device.c
index 0437537..851767f 100644
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -717,7 +717,7 @@ int omap_device_register(struct platform_device *pdev)
  * to be accessible and ready to operate.  This generally involves
  * enabling clocks, setting SYSCONFIG registers; and in the future may
  * involve remuxing pins.  Device drivers should call this function
- * indirectly via pm_runtime_get*().  Returns -EINVAL if called when
+ * indirectly via pm_runtime_get*().  Returns zero if called when
  * the omap_device is already enabled, or passes along the return
  * value of _omap_device_enable_hwmods().
  */
@@ -728,12 +728,13 @@ int omap_device_enable(struct platform_device *pdev)

        od = to_omap_device(pdev);

-       if (od->_state == OMAP_DEVICE_STATE_ENABLED) {
-               dev_warn(&pdev->dev,
-                        "omap_device: %s() called from invalid state %d\n",
-                        __func__, od->_state);
-               return -EINVAL;
-       }
+       /*
+        * From the PM domain perspective the device may already be enabled.
+        * In such case, let's return zero to synchronize the state with the
+        * runtime PM core.
+       */
+       if (od->_state == OMAP_DEVICE_STATE_ENABLED)
+               return 0;

        ret = _omap_device_enable_hwmods(od);

-- 
1.9.1

^ permalink raw reply related	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 10:07                                 ` Ulf Hansson
@ 2016-02-02 10:42                                   ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 10:42 UTC (permalink / raw)
  To: Tony Lindgren, Alan Stern, Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

>
> Instead of the suggested approaches, I think the regression should be
> fixed at the PM domain level (omap hwmod). I have attached a patch
> below, please give it try as it's untested.

Realized that version 1 would actually *only* make the PM domain code
to deal with pre-enabled devices. It would still invoke the driver's
->runtime_resume() callbacks (via pm_generic_runtime_resume())  for
these scenarios.

That's not what we want. So I moved the checks to the actual
->runtime_resume() callback in the PM domain, instead of in the
omap_device_enable() function. A version 2 is attached. Please give it
at try.

[...]

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Tue, 2 Feb 2016 10:05:39 +0100
Subject: [PATCH V2] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") may re-initialize the runtime PM status of the
device to RPM_SUSPENDED at driver probe failures.

For the omap_hsmmc and likely also other omap drivers, which needs more
than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
causes a regression at the PM domain level (omap hwmod).

The reason is that the drivers don't put back the device into low power
state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
status from RPM_ACTIVE to RPM_SUSPENDED.

The next ->probe() attempt then triggers the ->runtime_resume() callback
to be invoked, which means this happens two times in a row. At the PM
domain level (omap hwmod) this is being treated as an error and thus the
runtime PM status of the device isn't correctly synchronized with the
runtime PM core.

In the end, ->probe() anyway succeeds (as the driver don't checks the
error code from the runtime PM APIs), but results in that the PM domain
always stays powered on. This because of the runtime PM core believes the
device is RPM_SUSPENDED.

Fix this regression by allowing devices to be pre-enabled when the PM
domain's (omap hwmod) ->runtime_resume() callback is requested to enable
the device. In such cases, return zero to synchronize the runtime PM
status with the runtime PM core.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
        -Prevent a driver's ->runtime_resume() callbacks from being invoked for
        a pre-enabled device.

---
 arch/arm/mach-omap2/omap_device.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_device.c
b/arch/arm/mach-omap2/omap_device.c
index 0437537..1ad390b 100644
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -599,8 +599,17 @@ static int _od_runtime_suspend(struct device *dev)
 static int _od_runtime_resume(struct device *dev)
 {
        struct platform_device *pdev = to_platform_device(dev);
+       struct omap_device *od = to_omap_device(pdev);
        int ret;

+       /*
+        * From the PM domain perspective the device may already be enabled.
+        * In such case, let's return zero to synchronize the state with the
+        * runtime PM core.
+       */
+       if (od->_state == OMAP_DEVICE_STATE_ENABLED)
+               return 0;
+
        ret = omap_device_enable(pdev);
        if (ret)
                return ret;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 10:42                                   ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

>
> Instead of the suggested approaches, I think the regression should be
> fixed at the PM domain level (omap hwmod). I have attached a patch
> below, please give it try as it's untested.

Realized that version 1 would actually *only* make the PM domain code
to deal with pre-enabled devices. It would still invoke the driver's
->runtime_resume() callbacks (via pm_generic_runtime_resume())  for
these scenarios.

That's not what we want. So I moved the checks to the actual
->runtime_resume() callback in the PM domain, instead of in the
omap_device_enable() function. A version 2 is attached. Please give it
at try.

[...]

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Tue, 2 Feb 2016 10:05:39 +0100
Subject: [PATCH V2] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled

Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind") may re-initialize the runtime PM status of the
device to RPM_SUSPENDED at driver probe failures.

For the omap_hsmmc and likely also other omap drivers, which needs more
than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
causes a regression at the PM domain level (omap hwmod).

The reason is that the drivers don't put back the device into low power
state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
status from RPM_ACTIVE to RPM_SUSPENDED.

The next ->probe() attempt then triggers the ->runtime_resume() callback
to be invoked, which means this happens two times in a row. At the PM
domain level (omap hwmod) this is being treated as an error and thus the
runtime PM status of the device isn't correctly synchronized with the
runtime PM core.

In the end, ->probe() anyway succeeds (as the driver don't checks the
error code from the runtime PM APIs), but results in that the PM domain
always stays powered on. This because of the runtime PM core believes the
device is RPM_SUSPENDED.

Fix this regression by allowing devices to be pre-enabled when the PM
domain's (omap hwmod) ->runtime_resume() callback is requested to enable
the device. In such cases, return zero to synchronize the runtime PM
status with the runtime PM core.

Fixes: 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
error and driver unbind")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
        -Prevent a driver's ->runtime_resume() callbacks from being invoked for
        a pre-enabled device.

---
 arch/arm/mach-omap2/omap_device.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_device.c
b/arch/arm/mach-omap2/omap_device.c
index 0437537..1ad390b 100644
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -599,8 +599,17 @@ static int _od_runtime_suspend(struct device *dev)
 static int _od_runtime_resume(struct device *dev)
 {
        struct platform_device *pdev = to_platform_device(dev);
+       struct omap_device *od = to_omap_device(pdev);
        int ret;

+       /*
+        * From the PM domain perspective the device may already be enabled.
+        * In such case, let's return zero to synchronize the state with the
+        * runtime PM core.
+       */
+       if (od->_state == OMAP_DEVICE_STATE_ENABLED)
+               return 0;
+
        ret = omap_device_enable(pdev);
        if (ret)
                return ret;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 10:07                                 ` Ulf Hansson
@ 2016-02-02 16:15                                   ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 16:15 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> On 2 February 2016 at 04:05, Tony Lindgren <tony@atomide.com> wrote:
> > Hi,
> >
> > * Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
> >>
> >> You get here only if runtime PM is disabled, right?  So the rpm_idle
> >> call won't do anything -- "disabled" means don't make any callbacks.
> >
> > Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
> > just has a side effect that makes a difference here.
> >
> >> Tony, exactly what are you trying to do here?  Do you want this to
> >> invoke a runtime-PM callback in the subsystem, power domain, or driver?
> >> (Is there even a driver bound to the device when this function runs?)
> >
> > I guess I need to add more printks to figure out what's going on here.
> > But yeah, I'm not seeing the callback happening at the interconnect
> > level so hardware and PM runtime states won't match on the following
> > probe after -EPROBE_DEFER.
> >
> >> The function's name suggests that it merely resets the data stored in
> >> dev->power without actually touching the hardware.  Is that what you
> >> really want?
> >
> > I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
> > where we now set pm_runtime_set_suspended() between failed device
> > probes and the device is still active in hardware.
> >
> > The patch below also helps with the problem and leaves out the
> > rpm_suspend() call from loop so it might give more hints.
> >
> > The difference here from what Rafael suggested earlier is calling
> > __pm_runtime_use_autosuspend() and then not calling
> > pm_runtime_set_suspended().
> >
> > However, it seems the below patch keeps hardware active in the
> > autoidle case though, so chances are there is more that needs to
> > be done here. Anyways, I'll try to debug it more tomorrow.
> >
> 
> Your observations is correct. The hardware will be kept active
> in-between the probe attempts (and thus also if probing fails).
> Although, that's not a regression as that's the behaviour you get from
> runtime PM, when drivers are implemented like omap_hsmmc.

Perhaps this is what we should do.  If probing gets deferred a few 
times, doing runtime suspends and resumes in between will be a waste of 
time.

> Instead of the suggested approaches, I think the regression should be
> fixed at the PM domain level (omap hwmod). I have attached a patch
> below, please give it try as it's untested.
> 
> To solve the other problem (allowing devices to become inactive
> in-between at probe failures), I see two options (not treated as
> regressions).
> 1)
> Change the behaviour of pm_runtime_put_sync(), to *not* respect the
> autosuspend mode.
> I think I prefer this option, as it seems like autosuspend should be
> respected only via the asynchronous runtime PM APIs.

?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
mode.  If you want to respect it, you have to call 
pm_runtime_put_sync_autosuspend() instead.

> 2)
> Change the failing drivers, to before calling pm_runtime_put_sync()
> also invoke pm_runtime_dont_use_autosusend(). Or other similar
> approach.

Given the above, this seems unnecessary.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 16:15                                   ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> On 2 February 2016 at 04:05, Tony Lindgren <tony@atomide.com> wrote:
> > Hi,
> >
> > * Alan Stern <stern@rowland.harvard.edu> [160201 15:50]:
> >>
> >> You get here only if runtime PM is disabled, right?  So the rpm_idle
> >> call won't do anything -- "disabled" means don't make any callbacks.
> >
> > Hmm sorry yes I'm confused again. Yeah it looks like calling rpm_idle
> > just has a side effect that makes a difference here.
> >
> >> Tony, exactly what are you trying to do here?  Do you want this to
> >> invoke a runtime-PM callback in the subsystem, power domain, or driver?
> >> (Is there even a driver bound to the device when this function runs?)
> >
> > I guess I need to add more printks to figure out what's going on here.
> > But yeah, I'm not seeing the callback happening at the interconnect
> > level so hardware and PM runtime states won't match on the following
> > probe after -EPROBE_DEFER.
> >
> >> The function's name suggests that it merely resets the data stored in
> >> dev->power without actually touching the hardware.  Is that what you
> >> really want?
> >
> > I guess you mean pm_runtime_set_suspended() above? I'm seeing a state
> > where we now set pm_runtime_set_suspended() between failed device
> > probes and the device is still active in hardware.
> >
> > The patch below also helps with the problem and leaves out the
> > rpm_suspend() call from loop so it might give more hints.
> >
> > The difference here from what Rafael suggested earlier is calling
> > __pm_runtime_use_autosuspend() and then not calling
> > pm_runtime_set_suspended().
> >
> > However, it seems the below patch keeps hardware active in the
> > autoidle case though, so chances are there is more that needs to
> > be done here. Anyways, I'll try to debug it more tomorrow.
> >
> 
> Your observations is correct. The hardware will be kept active
> in-between the probe attempts (and thus also if probing fails).
> Although, that's not a regression as that's the behaviour you get from
> runtime PM, when drivers are implemented like omap_hsmmc.

Perhaps this is what we should do.  If probing gets deferred a few 
times, doing runtime suspends and resumes in between will be a waste of 
time.

> Instead of the suggested approaches, I think the regression should be
> fixed at the PM domain level (omap hwmod). I have attached a patch
> below, please give it try as it's untested.
> 
> To solve the other problem (allowing devices to become inactive
> in-between at probe failures), I see two options (not treated as
> regressions).
> 1)
> Change the behaviour of pm_runtime_put_sync(), to *not* respect the
> autosuspend mode.
> I think I prefer this option, as it seems like autosuspend should be
> respected only via the asynchronous runtime PM APIs.

?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
mode.  If you want to respect it, you have to call 
pm_runtime_put_sync_autosuspend() instead.

> 2)
> Change the failing drivers, to before calling pm_runtime_put_sync()
> also invoke pm_runtime_dont_use_autosusend(). Or other similar
> approach.

Given the above, this seems unnecessary.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 10:42                                   ` Ulf Hansson
@ 2016-02-02 16:23                                     ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 16:23 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> That's not what we want. So I moved the checks to the actual
> ->runtime_resume() callback in the PM domain, instead of in the
> omap_device_enable() function. A version 2 is attached. Please give it
> at try.
> 
> [...]
> 
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Tue, 2 Feb 2016 10:05:39 +0100
> Subject: [PATCH V2] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled
> 
> Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
> error and driver unbind") may re-initialize the runtime PM status of the
> device to RPM_SUSPENDED at driver probe failures.
> 
> For the omap_hsmmc and likely also other omap drivers, which needs more
> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
> causes a regression at the PM domain level (omap hwmod).
> 
> The reason is that the drivers don't put back the device into low power
> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
> status from RPM_ACTIVE to RPM_SUSPENDED.

But isn't that okay?  When the system starts up, the PM core doesn't 
know the actual state of any device.  It just marks each device as 
RPM_SUSPENDED and disabled as it is registered.  It's the job of 
subsystems and drivers to make sure the stored state agrees with the 
actual physical state.

So why shouldn't pm_runtime_reinit() leave the structure the same as it 
would be if the device had just been registered?

> The next ->probe() attempt then triggers the ->runtime_resume() callback
> to be invoked, which means this happens two times in a row. At the PM
> domain level (omap hwmod) this is being treated as an error and thus the
> runtime PM status of the device isn't correctly synchronized with the
> runtime PM core.

Why doesn't the same problem arise the first time the device is probed?  
Or to put it another, why should a re-probe be any different from the 
initial probe?

> In the end, ->probe() anyway succeeds (as the driver don't checks the
> error code from the runtime PM APIs), but results in that the PM domain
> always stays powered on. This because of the runtime PM core believes the
> device is RPM_SUSPENDED.
> 
> Fix this regression by allowing devices to be pre-enabled when the PM
> domain's (omap hwmod) ->runtime_resume() callback is requested to enable
> the device. In such cases, return zero to synchronize the runtime PM
> status with the runtime PM core.

Is this really the right way to fix the problem?

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 16:23                                     ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> That's not what we want. So I moved the checks to the actual
> ->runtime_resume() callback in the PM domain, instead of in the
> omap_device_enable() function. A version 2 is attached. Please give it
> at try.
> 
> [...]
> 
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Tue, 2 Feb 2016 10:05:39 +0100
> Subject: [PATCH V2] ARM: OMAP2+: omap-device: Allow devices to be pre-enabled
> 
> Commit 5de85b9d57ab ("PM / runtime: Re-init runtime PM states at probe
> error and driver unbind") may re-initialize the runtime PM status of the
> device to RPM_SUSPENDED at driver probe failures.
> 
> For the omap_hsmmc and likely also other omap drivers, which needs more
> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
> causes a regression at the PM domain level (omap hwmod).
> 
> The reason is that the drivers don't put back the device into low power
> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
> status from RPM_ACTIVE to RPM_SUSPENDED.

But isn't that okay?  When the system starts up, the PM core doesn't 
know the actual state of any device.  It just marks each device as 
RPM_SUSPENDED and disabled as it is registered.  It's the job of 
subsystems and drivers to make sure the stored state agrees with the 
actual physical state.

So why shouldn't pm_runtime_reinit() leave the structure the same as it 
would be if the device had just been registered?

> The next ->probe() attempt then triggers the ->runtime_resume() callback
> to be invoked, which means this happens two times in a row. At the PM
> domain level (omap hwmod) this is being treated as an error and thus the
> runtime PM status of the device isn't correctly synchronized with the
> runtime PM core.

Why doesn't the same problem arise the first time the device is probed?  
Or to put it another, why should a re-probe be any different from the 
initial probe?

> In the end, ->probe() anyway succeeds (as the driver don't checks the
> error code from the runtime PM APIs), but results in that the PM domain
> always stays powered on. This because of the runtime PM core believes the
> device is RPM_SUSPENDED.
> 
> Fix this regression by allowing devices to be pre-enabled when the PM
> domain's (omap hwmod) ->runtime_resume() callback is requested to enable
> the device. In such cases, return zero to synchronize the runtime PM
> status with the runtime PM core.

Is this really the right way to fix the problem?

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 10:42                                   ` Ulf Hansson
@ 2016-02-02 16:35                                     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 16:35 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [160202 02:43]:
> 
> For the omap_hsmmc and likely also other omap drivers, which needs more
> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
> causes a regression at the PM domain level (omap hwmod).
> 
> The reason is that the drivers don't put back the device into low power
> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
> status from RPM_ACTIVE to RPM_SUSPENDED.

Yup, that's the bug here. It seems that we never call the runtime_suspend
callback at the end of a first failed device driver probe if the driver
has set pm_runtime_use_autosuspend. Only rpm_idle runtime_idle callback
gets called. So the device stays on.

This does not happen if pm_runtime_dont_use_autosuspend() is added to
the end of the device driver probe before pm_runtime_put_sync().

> The next ->probe() attempt then triggers the ->runtime_resume() callback
> to be invoked, which means this happens two times in a row. At the PM
> domain level (omap hwmod) this is being treated as an error and thus the
> runtime PM status of the device isn't correctly synchronized with the
> runtime PM core.

That's a valid error though, let's not remove it. The reason why we
call runtime_resume() twice is because runtime_suspend callback never
gets called like I explain above.

> In the end, ->probe() anyway succeeds (as the driver don't checks the
> error code from the runtime PM APIs), but results in that the PM domain
> always stays powered on. This because of the runtime PM core believes the
> device is RPM_SUSPENDED.

FYI, the following allows runtime_suspend callback to get called at the
end of a failed driver probe so the hardware state matches the PM runtime
state. Need to debug more.

Regards,

Tony

8< ------------
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -2232,6 +2232,7 @@ err_irq:
 		dma_release_channel(host->tx_chan);
 	if (host->rx_chan)
 		dma_release_channel(host->rx_chan);
+	pm_runtime_dont_use_autosuspend(host->dev);
 	pm_runtime_put_sync(host->dev);
 	pm_runtime_disable(host->dev);
 	if (host->dbclk)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 16:35                                     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 16:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

* Ulf Hansson <ulf.hansson@linaro.org> [160202 02:43]:
> 
> For the omap_hsmmc and likely also other omap drivers, which needs more
> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
> causes a regression at the PM domain level (omap hwmod).
> 
> The reason is that the drivers don't put back the device into low power
> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
> status from RPM_ACTIVE to RPM_SUSPENDED.

Yup, that's the bug here. It seems that we never call the runtime_suspend
callback at the end of a first failed device driver probe if the driver
has set pm_runtime_use_autosuspend. Only rpm_idle runtime_idle callback
gets called. So the device stays on.

This does not happen if pm_runtime_dont_use_autosuspend() is added to
the end of the device driver probe before pm_runtime_put_sync().

> The next ->probe() attempt then triggers the ->runtime_resume() callback
> to be invoked, which means this happens two times in a row. At the PM
> domain level (omap hwmod) this is being treated as an error and thus the
> runtime PM status of the device isn't correctly synchronized with the
> runtime PM core.

That's a valid error though, let's not remove it. The reason why we
call runtime_resume() twice is because runtime_suspend callback never
gets called like I explain above.

> In the end, ->probe() anyway succeeds (as the driver don't checks the
> error code from the runtime PM APIs), but results in that the PM domain
> always stays powered on. This because of the runtime PM core believes the
> device is RPM_SUSPENDED.

FYI, the following allows runtime_suspend callback to get called at the
end of a failed driver probe so the hardware state matches the PM runtime
state. Need to debug more.

Regards,

Tony

8< ------------
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -2232,6 +2232,7 @@ err_irq:
 		dma_release_channel(host->tx_chan);
 	if (host->rx_chan)
 		dma_release_channel(host->rx_chan);
+	pm_runtime_dont_use_autosuspend(host->dev);
 	pm_runtime_put_sync(host->dev);
 	pm_runtime_disable(host->dev);
 	if (host->dbclk)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 16:15                                   ` Alan Stern
@ 2016-02-02 16:49                                     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 16:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> On Tue, 2 Feb 2016, Ulf Hansson wrote:
> > 
> > Your observations is correct. The hardware will be kept active
> > in-between the probe attempts (and thus also if probing fails).
> > Although, that's not a regression as that's the behaviour you get from
> > runtime PM, when drivers are implemented like omap_hsmmc.
> 
> Perhaps this is what we should do.  If probing gets deferred a few 
> times, doing runtime suspends and resumes in between will be a waste of 
> time.

That sounds like an optimization though. We'd still have to disable
the deviec somehow after deferred probe gives up.

> ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> mode.  If you want to respect it, you have to call 
> pm_runtime_put_sync_autosuspend() instead.

I think you found the real bug there. So the right fix is to
call pm_runtime_put_sync_autosuspend() at the end of failed
probe in omap_hsmmc. Let me give that a try here.

Can we add some warning to pm_runtime_put_sync() about that?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 16:49                                     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> On Tue, 2 Feb 2016, Ulf Hansson wrote:
> > 
> > Your observations is correct. The hardware will be kept active
> > in-between the probe attempts (and thus also if probing fails).
> > Although, that's not a regression as that's the behaviour you get from
> > runtime PM, when drivers are implemented like omap_hsmmc.
> 
> Perhaps this is what we should do.  If probing gets deferred a few 
> times, doing runtime suspends and resumes in between will be a waste of 
> time.

That sounds like an optimization though. We'd still have to disable
the deviec somehow after deferred probe gives up.

> ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> mode.  If you want to respect it, you have to call 
> pm_runtime_put_sync_autosuspend() instead.

I think you found the real bug there. So the right fix is to
call pm_runtime_put_sync_autosuspend() at the end of failed
probe in omap_hsmmc. Let me give that a try here.

Can we add some warning to pm_runtime_put_sync() about that?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 16:49                                     ` Tony Lindgren
@ 2016-02-02 18:05                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160202 08:50]:
> * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> 
> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > mode.  If you want to respect it, you have to call 
> > pm_runtime_put_sync_autosuspend() instead.
> 
> I think you found the real bug there. So the right fix is to
> call pm_runtime_put_sync_autosuspend() at the end of failed
> probe in omap_hsmmc. Let me give that a try here.

Nope that's not it but getting closer.

The following seems to make things behave for me. Now the
question is.. Does it have some undesired side effects?

> Can we add some warning to pm_runtime_put_sync() about that?

Probably no need for that, I misunderstood the meaning of
pm_runtime_put_sync_autosuspend().

Regards,

Tony

8< --------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -435,7 +435,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
 		goto out;
 
 	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
-	if ((rpmflags & RPM_AUTO)
+	if (((rpmflags & (RPM_ASYNC | RPM_AUTO)) == ((RPM_ASYNC | RPM_AUTO)))
 	    && dev->power.runtime_status != RPM_SUSPENDING) {
 		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
 

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 18:05                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:05 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160202 08:50]:
> * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> 
> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > mode.  If you want to respect it, you have to call 
> > pm_runtime_put_sync_autosuspend() instead.
> 
> I think you found the real bug there. So the right fix is to
> call pm_runtime_put_sync_autosuspend() at the end of failed
> probe in omap_hsmmc. Let me give that a try here.

Nope that's not it but getting closer.

The following seems to make things behave for me. Now the
question is.. Does it have some undesired side effects?

> Can we add some warning to pm_runtime_put_sync() about that?

Probably no need for that, I misunderstood the meaning of
pm_runtime_put_sync_autosuspend().

Regards,

Tony

8< --------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -435,7 +435,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
 		goto out;
 
 	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
-	if ((rpmflags & RPM_AUTO)
+	if (((rpmflags & (RPM_ASYNC | RPM_AUTO)) == ((RPM_ASYNC | RPM_AUTO)))
 	    && dev->power.runtime_status != RPM_SUSPENDING) {
 		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
 

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 18:05                                       ` Tony Lindgren
@ 2016-02-02 18:43                                         ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 18:43 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> * Tony Lindgren <tony@atomide.com> [160202 08:50]:
> > * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> > 
> > > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > > mode.  If you want to respect it, you have to call 
> > > pm_runtime_put_sync_autosuspend() instead.
> > 
> > I think you found the real bug there. So the right fix is to
> > call pm_runtime_put_sync_autosuspend() at the end of failed
> > probe in omap_hsmmc. Let me give that a try here.
> 
> Nope that's not it but getting closer.
> 
> The following seems to make things behave for me. Now the
> question is.. Does it have some undesired side effects?

Yes, it does.

I'm still not clear on what you want to accomplish.  It sounds like you 
want to perform a runtime suspend following the last probe (if the 
probe fails), and in between probes you don't really care (although it 
would be preferable to avoid suspending).

Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
it does, then perhaps you can get what you want by having the probe 
routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
return an error -- particularly -EDEFER.

> > Can we add some warning to pm_runtime_put_sync() about that?
> 
> Probably no need for that, I misunderstood the meaning of
> pm_runtime_put_sync_autosuspend().
> 
> Regards,
> 
> Tony
> 
> 8< --------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -435,7 +435,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>  		goto out;
>  
>  	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
> -	if ((rpmflags & RPM_AUTO)
> +	if (((rpmflags & (RPM_ASYNC | RPM_AUTO)) == ((RPM_ASYNC | RPM_AUTO)))
>  	    && dev->power.runtime_status != RPM_SUSPENDING) {
>  		unsigned long expires = pm_runtime_autosuspend_expiration(dev);

This would prevent pm_runtime_autosuspend() from working correctly.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 18:43                                         ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> * Tony Lindgren <tony@atomide.com> [160202 08:50]:
> > * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> > 
> > > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > > mode.  If you want to respect it, you have to call 
> > > pm_runtime_put_sync_autosuspend() instead.
> > 
> > I think you found the real bug there. So the right fix is to
> > call pm_runtime_put_sync_autosuspend() at the end of failed
> > probe in omap_hsmmc. Let me give that a try here.
> 
> Nope that's not it but getting closer.
> 
> The following seems to make things behave for me. Now the
> question is.. Does it have some undesired side effects?

Yes, it does.

I'm still not clear on what you want to accomplish.  It sounds like you 
want to perform a runtime suspend following the last probe (if the 
probe fails), and in between probes you don't really care (although it 
would be preferable to avoid suspending).

Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
it does, then perhaps you can get what you want by having the probe 
routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
return an error -- particularly -EDEFER.

> > Can we add some warning to pm_runtime_put_sync() about that?
> 
> Probably no need for that, I misunderstood the meaning of
> pm_runtime_put_sync_autosuspend().
> 
> Regards,
> 
> Tony
> 
> 8< --------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -435,7 +435,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>  		goto out;
>  
>  	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
> -	if ((rpmflags & RPM_AUTO)
> +	if (((rpmflags & (RPM_ASYNC | RPM_AUTO)) == ((RPM_ASYNC | RPM_AUTO)))
>  	    && dev->power.runtime_status != RPM_SUSPENDING) {
>  		unsigned long expires = pm_runtime_autosuspend_expiration(dev);

This would prevent pm_runtime_autosuspend() from working correctly.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 18:05                                       ` Tony Lindgren
@ 2016-02-02 18:47                                         ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:47 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160202 10:07]:
> 
> The following seems to make things behave for me. Now the
> question is.. Does it have some undesired side effects?

The side effect is that it breaks PM runtime for devices
during runtime..

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 18:47                                         ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160202 10:07]:
> 
> The following seems to make things behave for me. Now the
> question is.. Does it have some undesired side effects?

The side effect is that it breaks PM runtime for devices
during runtime..

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 18:43                                         ` Alan Stern
@ 2016-02-02 18:54                                           ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:54 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 10:44]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > * Tony Lindgren <tony@atomide.com> [160202 08:50]:
> > > * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> > > 
> > > > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > > > mode.  If you want to respect it, you have to call 
> > > > pm_runtime_put_sync_autosuspend() instead.
> > > 
> > > I think you found the real bug there. So the right fix is to
> > > call pm_runtime_put_sync_autosuspend() at the end of failed
> > > probe in omap_hsmmc. Let me give that a try here.
> > 
> > Nope that's not it but getting closer.
> > 
> > The following seems to make things behave for me. Now the
> > question is.. Does it have some undesired side effects?
> 
> Yes, it does.

Yeah noticed..

> I'm still not clear on what you want to accomplish.  It sounds like you 
> want to perform a runtime suspend following the last probe (if the 
> probe fails), and in between probes you don't really care (although it 
> would be preferable to avoid suspending).

I'd like to have pm_runtime_put_sync() disable the hardware after
the initial failed probe. Currently that does not happen unless
pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().

> Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> it does, then perhaps you can get what you want by having the probe 
> routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> return an error -- particularly -EDEFER.

Yes so far that's the only fix that seems to work like I posted
earlier. But is that the right fix though?

If we wanted to have some generic fix, it seems we would have to pass
a new flag in pm_runtime_put_sync() to ignore any autosuspend
configuration. But I don't know if that's what we want to or should
do though?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 18:54                                           ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 18:54 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 10:44]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > * Tony Lindgren <tony@atomide.com> [160202 08:50]:
> > > * Alan Stern <stern@rowland.harvard.edu> [160202 08:17]:
> > > 
> > > > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend 
> > > > mode.  If you want to respect it, you have to call 
> > > > pm_runtime_put_sync_autosuspend() instead.
> > > 
> > > I think you found the real bug there. So the right fix is to
> > > call pm_runtime_put_sync_autosuspend() at the end of failed
> > > probe in omap_hsmmc. Let me give that a try here.
> > 
> > Nope that's not it but getting closer.
> > 
> > The following seems to make things behave for me. Now the
> > question is.. Does it have some undesired side effects?
> 
> Yes, it does.

Yeah noticed..

> I'm still not clear on what you want to accomplish.  It sounds like you 
> want to perform a runtime suspend following the last probe (if the 
> probe fails), and in between probes you don't really care (although it 
> would be preferable to avoid suspending).

I'd like to have pm_runtime_put_sync() disable the hardware after
the initial failed probe. Currently that does not happen unless
pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().

> Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> it does, then perhaps you can get what you want by having the probe 
> routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> return an error -- particularly -EDEFER.

Yes so far that's the only fix that seems to work like I posted
earlier. But is that the right fix though?

If we wanted to have some generic fix, it seems we would have to pass
a new flag in pm_runtime_put_sync() to ignore any autosuspend
configuration. But I don't know if that's what we want to or should
do though?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 18:54                                           ` Tony Lindgren
@ 2016-02-02 19:16                                             ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 19:16 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> > I'm still not clear on what you want to accomplish.  It sounds like you 
> > want to perform a runtime suspend following the last probe (if the 
> > probe fails), and in between probes you don't really care (although it 
> > would be preferable to avoid suspending).
> 
> I'd like to have pm_runtime_put_sync() disable the hardware after
> the initial failed probe. Currently that does not happen unless
> pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().

pm_runtime_put_sync() doesn't do anything to the hardware if the usage
count was > 1, because after the decrement it's still nonzero.  Where
is the particular call of pm_runtime_put_sync() that you're interested
in, and what is the usage count when it runs?  It's not at all unusual
for the usage count to be > 1 during a probe.

Also, what is autosuspend_delay set to for your device?  And is 
runtime_auto set?

> > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > it does, then perhaps you can get what you want by having the probe 
> > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > return an error -- particularly -EDEFER.
> 
> Yes so far that's the only fix that seems to work like I posted
> earlier. But is that the right fix though?

No, not really.  Ideally you would leave autosuspend turned on.  The 
delay would be long enough to that after -EDEFER, another probe would 
start before the delay expired.  But shortly after the last probe 
attempt, the delay would expire and the device would then be put in low 
power.

> If we wanted to have some generic fix, it seems we would have to pass
> a new flag in pm_runtime_put_sync() to ignore any autosuspend
> configuration. But I don't know if that's what we want to or should
> do though?

I don't think so.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 19:16                                             ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 19:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> > I'm still not clear on what you want to accomplish.  It sounds like you 
> > want to perform a runtime suspend following the last probe (if the 
> > probe fails), and in between probes you don't really care (although it 
> > would be preferable to avoid suspending).
> 
> I'd like to have pm_runtime_put_sync() disable the hardware after
> the initial failed probe. Currently that does not happen unless
> pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().

pm_runtime_put_sync() doesn't do anything to the hardware if the usage
count was > 1, because after the decrement it's still nonzero.  Where
is the particular call of pm_runtime_put_sync() that you're interested
in, and what is the usage count when it runs?  It's not at all unusual
for the usage count to be > 1 during a probe.

Also, what is autosuspend_delay set to for your device?  And is 
runtime_auto set?

> > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > it does, then perhaps you can get what you want by having the probe 
> > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > return an error -- particularly -EDEFER.
> 
> Yes so far that's the only fix that seems to work like I posted
> earlier. But is that the right fix though?

No, not really.  Ideally you would leave autosuspend turned on.  The 
delay would be long enough to that after -EDEFER, another probe would 
start before the delay expired.  But shortly after the last probe 
attempt, the delay would expire and the device would then be put in low 
power.

> If we wanted to have some generic fix, it seems we would have to pass
> a new flag in pm_runtime_put_sync() to ignore any autosuspend
> configuration. But I don't know if that's what we want to or should
> do though?

I don't think so.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 16:15                                   ` Alan Stern
@ 2016-02-02 20:24                                     ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 20:24 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

[...]

>>
>> Your observations is correct. The hardware will be kept active
>> in-between the probe attempts (and thus also if probing fails).
>> Although, that's not a regression as that's the behaviour you get from
>> runtime PM, when drivers are implemented like omap_hsmmc.
>
> Perhaps this is what we should do.  If probing gets deferred a few
> times, doing runtime suspends and resumes in between will be a waste of
> time.

Then you will have to distinguish between -EPROBE_DEFER and other
errors, as I think leaving the device fully powered from a permanent
failed probe isn't very good.

Anyway, for sure it's doable by the driver, but let's try to focus on
the regression here for now.

>
>> Instead of the suggested approaches, I think the regression should be
>> fixed at the PM domain level (omap hwmod). I have attached a patch
>> below, please give it try as it's untested.
>>
>> To solve the other problem (allowing devices to become inactive
>> in-between at probe failures), I see two options (not treated as
>> regressions).
>> 1)
>> Change the behaviour of pm_runtime_put_sync(), to *not* respect the
>> autosuspend mode.
>> I think I prefer this option, as it seems like autosuspend should be
>> respected only via the asynchronous runtime PM APIs.
>
> ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> mode.  If you want to respect it, you have to call
> pm_runtime_put_sync_autosuspend() instead.

Then there's a bug in the runtime PM core.

>From Tony's regression report and from mine own local runtime PM test
driver, I can see that the device doesn't get RPM_SUSPENDED (the
->runtime_suspend() callback isn't called), even when the usage count
is zero - when pm_runtime_put_sync() is called.

To find the sequence of runtime PM commands, go ahead an have look in
the omap_hsmmc driver. The problem occurs when the driver bails out in
probe, when it receives -EPROBE_DEFER when fetching regulators.

I have some more data to share on this problem from my runtime PM test
driver. I will try my best to share it tomorrow.

>
>> 2)
>> Change the failing drivers, to before calling pm_runtime_put_sync()
>> also invoke pm_runtime_dont_use_autosusend(). Or other similar
>> approach.
>
> Given the above, this seems unnecessary.

Okay, so you are saying that the pm_runtime_put_sync() should idle the
device even if autosuspend is in use. That seems reasonable, I will
look into this problem.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 20:24                                     ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 20:24 UTC (permalink / raw)
  To: linux-arm-kernel

[...]

>>
>> Your observations is correct. The hardware will be kept active
>> in-between the probe attempts (and thus also if probing fails).
>> Although, that's not a regression as that's the behaviour you get from
>> runtime PM, when drivers are implemented like omap_hsmmc.
>
> Perhaps this is what we should do.  If probing gets deferred a few
> times, doing runtime suspends and resumes in between will be a waste of
> time.

Then you will have to distinguish between -EPROBE_DEFER and other
errors, as I think leaving the device fully powered from a permanent
failed probe isn't very good.

Anyway, for sure it's doable by the driver, but let's try to focus on
the regression here for now.

>
>> Instead of the suggested approaches, I think the regression should be
>> fixed at the PM domain level (omap hwmod). I have attached a patch
>> below, please give it try as it's untested.
>>
>> To solve the other problem (allowing devices to become inactive
>> in-between at probe failures), I see two options (not treated as
>> regressions).
>> 1)
>> Change the behaviour of pm_runtime_put_sync(), to *not* respect the
>> autosuspend mode.
>> I think I prefer this option, as it seems like autosuspend should be
>> respected only via the asynchronous runtime PM APIs.
>
> ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> mode.  If you want to respect it, you have to call
> pm_runtime_put_sync_autosuspend() instead.

Then there's a bug in the runtime PM core.

>From Tony's regression report and from mine own local runtime PM test
driver, I can see that the device doesn't get RPM_SUSPENDED (the
->runtime_suspend() callback isn't called), even when the usage count
is zero - when pm_runtime_put_sync() is called.

To find the sequence of runtime PM commands, go ahead an have look in
the omap_hsmmc driver. The problem occurs when the driver bails out in
probe, when it receives -EPROBE_DEFER when fetching regulators.

I have some more data to share on this problem from my runtime PM test
driver. I will try my best to share it tomorrow.

>
>> 2)
>> Change the failing drivers, to before calling pm_runtime_put_sync()
>> also invoke pm_runtime_dont_use_autosusend(). Or other similar
>> approach.
>
> Given the above, this seems unnecessary.

Okay, so you are saying that the pm_runtime_put_sync() should idle the
device even if autosuspend is in use. That seems reasonable, I will
look into this problem.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 16:35                                     ` Tony Lindgren
@ 2016-02-02 20:47                                       ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 20:47 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 02:43]:
>>
>> For the omap_hsmmc and likely also other omap drivers, which needs more
>> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
>> causes a regression at the PM domain level (omap hwmod).
>>
>> The reason is that the drivers don't put back the device into low power
>> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
>> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
>> status from RPM_ACTIVE to RPM_SUSPENDED.
>
> Yup, that's the bug here. It seems that we never call the runtime_suspend
> callback at the end of a first failed device driver probe if the driver
> has set pm_runtime_use_autosuspend. Only rpm_idle runtime_idle callback
> gets called. So the device stays on.
>
> This does not happen if pm_runtime_dont_use_autosuspend() is added to
> the end of the device driver probe before pm_runtime_put_sync().

Thanks! It then confirms the second option I proposed.

>
>> The next ->probe() attempt then triggers the ->runtime_resume() callback
>> to be invoked, which means this happens two times in a row. At the PM
>> domain level (omap hwmod) this is being treated as an error and thus the
>> runtime PM status of the device isn't correctly synchronized with the
>> runtime PM core.
>
> That's a valid error though, let's not remove it. The reason why we
> call runtime_resume() twice is because runtime_suspend callback never
> gets called like I explain above.

This isn't an error, it's just a hickup in the synchronization of the
runtime PM status.

Very similar what happens at the first probe, when the driver core has
initialized the runtime PM status to RPM_SUSPENDED at the device
registration.

To me, it's the responsible of the PM domain to *help* with the
synchronization, not prevent it as it currently does.

>
>> In the end, ->probe() anyway succeeds (as the driver don't checks the
>> error code from the runtime PM APIs), but results in that the PM domain
>> always stays powered on. This because of the runtime PM core believes the
>> device is RPM_SUSPENDED.
>
> FYI, the following allows runtime_suspend callback to get called at the
> end of a failed driver probe so the hardware state matches the PM runtime
> state. Need to debug more.
>
> Regards,
>
> Tony
>
> 8< ------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>                 dma_release_channel(host->tx_chan);
>         if (host->rx_chan)
>                 dma_release_channel(host->rx_chan);
> +       pm_runtime_dont_use_autosuspend(host->dev);

It's good know this works, although do you intend to fix this sequence
for all omap drivers/devices that's part of the hwmod PM domain?

I haven't checked the number of drivers this would affect, but I
imagine there could be quite many with similar behaviour and thus may
suffer from the same issue.

Of course the regression is only noticed for those returning
-EPROBE_DEFER, which might not be that many, but it seems fragile to
rely on this when going forward. All related drivers then needs to be
fixed.

>         pm_runtime_put_sync(host->dev);
>         pm_runtime_disable(host->dev);
>         if (host->dbclk)

Could you please test my version 2 of the patch I attached earlier. I
still believe it's the best way to solve the regression, if it works
of course. :-)

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 20:47                                       ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-02 20:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
> Hi,
>
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 02:43]:
>>
>> For the omap_hsmmc and likely also other omap drivers, which needs more
>> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit
>> causes a regression at the PM domain level (omap hwmod).
>>
>> The reason is that the drivers don't put back the device into low power
>> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to
>> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM
>> status from RPM_ACTIVE to RPM_SUSPENDED.
>
> Yup, that's the bug here. It seems that we never call the runtime_suspend
> callback at the end of a first failed device driver probe if the driver
> has set pm_runtime_use_autosuspend. Only rpm_idle runtime_idle callback
> gets called. So the device stays on.
>
> This does not happen if pm_runtime_dont_use_autosuspend() is added to
> the end of the device driver probe before pm_runtime_put_sync().

Thanks! It then confirms the second option I proposed.

>
>> The next ->probe() attempt then triggers the ->runtime_resume() callback
>> to be invoked, which means this happens two times in a row. At the PM
>> domain level (omap hwmod) this is being treated as an error and thus the
>> runtime PM status of the device isn't correctly synchronized with the
>> runtime PM core.
>
> That's a valid error though, let's not remove it. The reason why we
> call runtime_resume() twice is because runtime_suspend callback never
> gets called like I explain above.

This isn't an error, it's just a hickup in the synchronization of the
runtime PM status.

Very similar what happens at the first probe, when the driver core has
initialized the runtime PM status to RPM_SUSPENDED at the device
registration.

To me, it's the responsible of the PM domain to *help* with the
synchronization, not prevent it as it currently does.

>
>> In the end, ->probe() anyway succeeds (as the driver don't checks the
>> error code from the runtime PM APIs), but results in that the PM domain
>> always stays powered on. This because of the runtime PM core believes the
>> device is RPM_SUSPENDED.
>
> FYI, the following allows runtime_suspend callback to get called at the
> end of a failed driver probe so the hardware state matches the PM runtime
> state. Need to debug more.
>
> Regards,
>
> Tony
>
> 8< ------------
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -2232,6 +2232,7 @@ err_irq:
>                 dma_release_channel(host->tx_chan);
>         if (host->rx_chan)
>                 dma_release_channel(host->rx_chan);
> +       pm_runtime_dont_use_autosuspend(host->dev);

It's good know this works, although do you intend to fix this sequence
for all omap drivers/devices that's part of the hwmod PM domain?

I haven't checked the number of drivers this would affect, but I
imagine there could be quite many with similar behaviour and thus may
suffer from the same issue.

Of course the regression is only noticed for those returning
-EPROBE_DEFER, which might not be that many, but it seems fragile to
rely on this when going forward. All related drivers then needs to be
fixed.

>         pm_runtime_put_sync(host->dev);
>         pm_runtime_disable(host->dev);
>         if (host->dbclk)

Could you please test my version 2 of the patch I attached earlier. I
still believe it's the best way to solve the regression, if it works
of course. :-)

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 19:16                                             ` Alan Stern
@ 2016-02-02 21:03                                               ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 21:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 11:17]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > I'd like to have pm_runtime_put_sync() disable the hardware after
> > the initial failed probe. Currently that does not happen unless
> > pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().
> 
> pm_runtime_put_sync() doesn't do anything to the hardware if the usage
> count was > 1, because after the decrement it's still nonzero.  Where
> is the particular call of pm_runtime_put_sync() that you're interested
> in, and what is the usage count when it runs?  It's not at all unusual
> for the usage count to be > 1 during a probe.

The usage count is 0 at that point, it seems the be the RPM_AUTO
causing the issues that we set at the end of rpm_idle().

> Also, what is autosuspend_delay set to for your device?  And is 
> runtime_auto set?

It's 100 at that point, see the commented snippet below from
omap_hsmmc_probe():

	pm_runtime_enable(host->dev);
	pm_runtime_get_sync(host->dev);
	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
	pm_runtime_use_autosuspend(host->dev);
	...
	/* gets -EPROBE_DEFER */
err_irq:
	...
	pm_runtime_put_sync(host->dev);
        pm_runtime_disable(host->dev);
	/* NOTE: suspend callback never gets called unless
	 * pm_runtime_dont_use_autosuspend() is called
	 * before pm_runtime_put_sync() above.
	 */
	 ...

> > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > it does, then perhaps you can get what you want by having the probe 
> > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > return an error -- particularly -EDEFER.
> > 
> > Yes so far that's the only fix that seems to work like I posted
> > earlier. But is that the right fix though?
> 
> No, not really.  Ideally you would leave autosuspend turned on.  The 
> delay would be long enough to that after -EDEFER, another probe would 
> start before the delay expired.  But shortly after the last probe 
> attempt, the delay would expire and the device would then be put in low 
> power.

But then what about the new reinit function? To me it seems that
we should not attempt to maintain a state from the earlier failed
probe. Or are you thinking we just skip the reinit if autosuspend
is set?

> > If we wanted to have some generic fix, it seems we would have to pass
> > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > configuration. But I don't know if that's what we want to or should
> > do though?
> 
> I don't think so.

So should we just establish a policy that pm_runtime_use_autosuspend()
needs to be paired with pm_runtime_dont_use_autosuspend() for
pm_runtime_put_sync() to work?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 21:03                                               ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 21:03 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 11:17]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > I'd like to have pm_runtime_put_sync() disable the hardware after
> > the initial failed probe. Currently that does not happen unless
> > pm_runtime_dont_use_autosuspend() is called before pm_runtime_put_sync().
> 
> pm_runtime_put_sync() doesn't do anything to the hardware if the usage
> count was > 1, because after the decrement it's still nonzero.  Where
> is the particular call of pm_runtime_put_sync() that you're interested
> in, and what is the usage count when it runs?  It's not at all unusual
> for the usage count to be > 1 during a probe.

The usage count is 0 at that point, it seems the be the RPM_AUTO
causing the issues that we set at the end of rpm_idle().

> Also, what is autosuspend_delay set to for your device?  And is 
> runtime_auto set?

It's 100 at that point, see the commented snippet below from
omap_hsmmc_probe():

	pm_runtime_enable(host->dev);
	pm_runtime_get_sync(host->dev);
	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
	pm_runtime_use_autosuspend(host->dev);
	...
	/* gets -EPROBE_DEFER */
err_irq:
	...
	pm_runtime_put_sync(host->dev);
        pm_runtime_disable(host->dev);
	/* NOTE: suspend callback never gets called unless
	 * pm_runtime_dont_use_autosuspend() is called
	 * before pm_runtime_put_sync() above.
	 */
	 ...

> > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > it does, then perhaps you can get what you want by having the probe 
> > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > return an error -- particularly -EDEFER.
> > 
> > Yes so far that's the only fix that seems to work like I posted
> > earlier. But is that the right fix though?
> 
> No, not really.  Ideally you would leave autosuspend turned on.  The 
> delay would be long enough to that after -EDEFER, another probe would 
> start before the delay expired.  But shortly after the last probe 
> attempt, the delay would expire and the device would then be put in low 
> power.

But then what about the new reinit function? To me it seems that
we should not attempt to maintain a state from the earlier failed
probe. Or are you thinking we just skip the reinit if autosuspend
is set?

> > If we wanted to have some generic fix, it seems we would have to pass
> > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > configuration. But I don't know if that's what we want to or should
> > do though?
> 
> I don't think so.

So should we just establish a policy that pm_runtime_use_autosuspend()
needs to be paired with pm_runtime_dont_use_autosuspend() for
pm_runtime_put_sync() to work?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 20:24                                     ` Ulf Hansson
@ 2016-02-02 21:24                                       ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 21:24 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Tony Lindgren, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> > mode.  If you want to respect it, you have to call
> > pm_runtime_put_sync_autosuspend() instead.
> 
> Then there's a bug in the runtime PM core.
> 
> From Tony's regression report and from mine own local runtime PM test
> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> ->runtime_suspend() callback isn't called), even when the usage count
> is zero - when pm_runtime_put_sync() is called.

Ah, yes -- I see what's going on.  pm_runtime_put_sync() calls 
__pm_runtime_idle(), rather than __pm_runtime_suspend().  And the idle 
routine always forces RPM_AUTO on.

So what I said was wrong.  pm_runtime_put_sync() respects the driver's 
settings for autosuspend, whereas pm_runtime_put_sync_suspend() and 
pm_runtime_put_sync_autosuspend() override the settings.

> To find the sequence of runtime PM commands, go ahead an have look in
> the omap_hsmmc driver. The problem occurs when the driver bails out in
> probe, when it receives -EPROBE_DEFER when fetching regulators.
> 
> I have some more data to share on this problem from my runtime PM test
> driver. I will try my best to share it tomorrow.

> >> 2)
> >> Change the failing drivers, to before calling pm_runtime_put_sync()
> >> also invoke pm_runtime_dont_use_autosusend(). Or other similar
> >> approach.
> >
> > Given the above, this seems unnecessary.
> 
> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> device even if autosuspend is in use. That seems reasonable, I will
> look into this problem.

Heh -- you are the person who did it.  :-)  See commit d66e6db28df3 
("PM / Runtime: Respect autosuspend when idle triggers suspend").

I guess the intention was that if the driver wants to specify whether 
autosuspend should be respected, it should implement an rpm_idle 
callback routine.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 21:24                                       ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 21:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Ulf Hansson wrote:

> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> > mode.  If you want to respect it, you have to call
> > pm_runtime_put_sync_autosuspend() instead.
> 
> Then there's a bug in the runtime PM core.
> 
> From Tony's regression report and from mine own local runtime PM test
> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> ->runtime_suspend() callback isn't called), even when the usage count
> is zero - when pm_runtime_put_sync() is called.

Ah, yes -- I see what's going on.  pm_runtime_put_sync() calls 
__pm_runtime_idle(), rather than __pm_runtime_suspend().  And the idle 
routine always forces RPM_AUTO on.

So what I said was wrong.  pm_runtime_put_sync() respects the driver's 
settings for autosuspend, whereas pm_runtime_put_sync_suspend() and 
pm_runtime_put_sync_autosuspend() override the settings.

> To find the sequence of runtime PM commands, go ahead an have look in
> the omap_hsmmc driver. The problem occurs when the driver bails out in
> probe, when it receives -EPROBE_DEFER when fetching regulators.
> 
> I have some more data to share on this problem from my runtime PM test
> driver. I will try my best to share it tomorrow.

> >> 2)
> >> Change the failing drivers, to before calling pm_runtime_put_sync()
> >> also invoke pm_runtime_dont_use_autosusend(). Or other similar
> >> approach.
> >
> > Given the above, this seems unnecessary.
> 
> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> device even if autosuspend is in use. That seems reasonable, I will
> look into this problem.

Heh -- you are the person who did it.  :-)  See commit d66e6db28df3 
("PM / Runtime: Respect autosuspend when idle triggers suspend").

I guess the intention was that if the driver wants to specify whether 
autosuspend should be respected, it should implement an rpm_idle 
callback routine.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 20:24                                     ` Ulf Hansson
@ 2016-02-02 21:39                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 21:39 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
> >
> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> > mode.  If you want to respect it, you have to call
> > pm_runtime_put_sync_autosuspend() instead.
> 
> Then there's a bug in the runtime PM core.
> 
> From Tony's regression report and from mine own local runtime PM test
> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> ->runtime_suspend() callback isn't called), even when the usage count
> is zero - when pm_runtime_put_sync() is called.
...
> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> device even if autosuspend is in use. That seems reasonable, I will
> look into this problem.

The patch below fixes pm_runtime_put_sync() to not respect the
autosuspend mode to match what Alan is saying above. Seems to also
fixes the $subject issue for me. And seems to behave for PM runtime
for other devices during runtime too based on light testing here.

Regards,

Tony

8< ---------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
 
  out:
 	trace_rpm_return_int(dev, _THIS_IP_, retval);
-	return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
+	if (!(rpmflags & RPM_IGNORE_AUTO))
+		rpmflags |= RPM_AUTO;
+	return retval ? retval : rpm_suspend(dev, rpmflags);
 }
 
 /**
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -23,6 +23,8 @@
 					    usage_count */
 #define RPM_AUTO		0x08	/* Use autosuspend_delay */
 
+#define RPM_IGNORE_AUTO		0x10	/* Ignore autosuspend */
+
 #ifdef CONFIG_PM
 extern struct workqueue_struct *pm_wq;
 
@@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
 
 static inline int pm_runtime_put_sync(struct device *dev)
 {
-	return __pm_runtime_idle(dev, RPM_GET_PUT);
+	return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
 }
 
 static inline int pm_runtime_put_sync_suspend(struct device *dev)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 21:39                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 21:39 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
> >
> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> > mode.  If you want to respect it, you have to call
> > pm_runtime_put_sync_autosuspend() instead.
> 
> Then there's a bug in the runtime PM core.
> 
> From Tony's regression report and from mine own local runtime PM test
> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> ->runtime_suspend() callback isn't called), even when the usage count
> is zero - when pm_runtime_put_sync() is called.
...
> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> device even if autosuspend is in use. That seems reasonable, I will
> look into this problem.

The patch below fixes pm_runtime_put_sync() to not respect the
autosuspend mode to match what Alan is saying above. Seems to also
fixes the $subject issue for me. And seems to behave for PM runtime
for other devices during runtime too based on light testing here.

Regards,

Tony

8< ---------------
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
 
  out:
 	trace_rpm_return_int(dev, _THIS_IP_, retval);
-	return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
+	if (!(rpmflags & RPM_IGNORE_AUTO))
+		rpmflags |= RPM_AUTO;
+	return retval ? retval : rpm_suspend(dev, rpmflags);
 }
 
 /**
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -23,6 +23,8 @@
 					    usage_count */
 #define RPM_AUTO		0x08	/* Use autosuspend_delay */
 
+#define RPM_IGNORE_AUTO		0x10	/* Ignore autosuspend */
+
 #ifdef CONFIG_PM
 extern struct workqueue_struct *pm_wq;
 
@@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
 
 static inline int pm_runtime_put_sync(struct device *dev)
 {
-	return __pm_runtime_idle(dev, RPM_GET_PUT);
+	return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
 }
 
 static inline int pm_runtime_put_sync_suspend(struct device *dev)

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 21:03                                               ` Tony Lindgren
@ 2016-02-02 21:45                                                 ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 21:45 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> > Also, what is autosuspend_delay set to for your device?  And is 
> > runtime_auto set?
> 
> It's 100 at that point, see the commented snippet below from
> omap_hsmmc_probe():
> 
> 	pm_runtime_enable(host->dev);
> 	pm_runtime_get_sync(host->dev);
> 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> 	pm_runtime_use_autosuspend(host->dev);
> 	...
> 	/* gets -EPROBE_DEFER */
> err_irq:
> 	...
> 	pm_runtime_put_sync(host->dev);

You could try changing this to pm_runtime_put_sync_suspend().  But
putting pm_runtime_dont_use_autosuspend() before the put_sync seems
like a perfectly reasonable thing to do, especially if you feel you
should reverse all the changes you made at the start.

>         pm_runtime_disable(host->dev);
> 	/* NOTE: suspend callback never gets called unless
> 	 * pm_runtime_dont_use_autosuspend() is called
> 	 * before pm_runtime_put_sync() above.
> 	 */
> 	 ...
> 
> > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > > it does, then perhaps you can get what you want by having the probe 
> > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > > return an error -- particularly -EDEFER.
> > > 
> > > Yes so far that's the only fix that seems to work like I posted
> > > earlier. But is that the right fix though?
> > 
> > No, not really.  Ideally you would leave autosuspend turned on.  The 
> > delay would be long enough to that after -EDEFER, another probe would 
> > start before the delay expired.  But shortly after the last probe 
> > attempt, the delay would expire and the device would then be put in low 
> > power.
> 
> But then what about the new reinit function? To me it seems that
> we should not attempt to maintain a state from the earlier failed
> probe. Or are you thinking we just skip the reinit if autosuspend
> is set?

The reinit function gets called too late to do what you want -- namely, 
put the hardware in a low-power state.

That _is_ what you want, isn't it?  The alternative is to leave
dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual 
state.

Given that the reinit function is supposed to restore the initial 
settings, it probably ought to call pm_runtime_dont_use_autosuspend().  
But that won't be enough to fix your problem.

> > > If we wanted to have some generic fix, it seems we would have to pass
> > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > > configuration. But I don't know if that's what we want to or should
> > > do though?
> > 
> > I don't think so.
> 
> So should we just establish a policy that pm_runtime_use_autosuspend()
> needs to be paired with pm_runtime_dont_use_autosuspend() for
> pm_runtime_put_sync() to work?

Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
work -- that is, it does what it's supposed to do.  The difficulty is
that what it's supposed to do doesn't match what you think.

pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
invokes the driver's runtime_idle callback if there is one, and the
callback routine can start a suspend or an autosuspend.  If there is no
callback, it will use whatever autosuspend setting the driver has set
up.  If you want to override the autosuspend setting, use
pm_runtime_put_sync_suspend() instead.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 21:45                                                 ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-02 21:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> > Also, what is autosuspend_delay set to for your device?  And is 
> > runtime_auto set?
> 
> It's 100 at that point, see the commented snippet below from
> omap_hsmmc_probe():
> 
> 	pm_runtime_enable(host->dev);
> 	pm_runtime_get_sync(host->dev);
> 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> 	pm_runtime_use_autosuspend(host->dev);
> 	...
> 	/* gets -EPROBE_DEFER */
> err_irq:
> 	...
> 	pm_runtime_put_sync(host->dev);

You could try changing this to pm_runtime_put_sync_suspend().  But
putting pm_runtime_dont_use_autosuspend() before the put_sync seems
like a perfectly reasonable thing to do, especially if you feel you
should reverse all the changes you made at the start.

>         pm_runtime_disable(host->dev);
> 	/* NOTE: suspend callback never gets called unless
> 	 * pm_runtime_dont_use_autosuspend() is called
> 	 * before pm_runtime_put_sync() above.
> 	 */
> 	 ...
> 
> > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > > it does, then perhaps you can get what you want by having the probe 
> > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > > return an error -- particularly -EDEFER.
> > > 
> > > Yes so far that's the only fix that seems to work like I posted
> > > earlier. But is that the right fix though?
> > 
> > No, not really.  Ideally you would leave autosuspend turned on.  The 
> > delay would be long enough to that after -EDEFER, another probe would 
> > start before the delay expired.  But shortly after the last probe 
> > attempt, the delay would expire and the device would then be put in low 
> > power.
> 
> But then what about the new reinit function? To me it seems that
> we should not attempt to maintain a state from the earlier failed
> probe. Or are you thinking we just skip the reinit if autosuspend
> is set?

The reinit function gets called too late to do what you want -- namely, 
put the hardware in a low-power state.

That _is_ what you want, isn't it?  The alternative is to leave
dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual 
state.

Given that the reinit function is supposed to restore the initial 
settings, it probably ought to call pm_runtime_dont_use_autosuspend().  
But that won't be enough to fix your problem.

> > > If we wanted to have some generic fix, it seems we would have to pass
> > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > > configuration. But I don't know if that's what we want to or should
> > > do though?
> > 
> > I don't think so.
> 
> So should we just establish a policy that pm_runtime_use_autosuspend()
> needs to be paired with pm_runtime_dont_use_autosuspend() for
> pm_runtime_put_sync() to work?

Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
work -- that is, it does what it's supposed to do.  The difficulty is
that what it's supposed to do doesn't match what you think.

pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
invokes the driver's runtime_idle callback if there is one, and the
callback routine can start a suspend or an autosuspend.  If there is no
callback, it will use whatever autosuspend setting the driver has set
up.  If you want to override the autosuspend setting, use
pm_runtime_put_sync_suspend() instead.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 20:47                                       ` Ulf Hansson
@ 2016-02-02 23:41                                         ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 23:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160202 12:48]:
> On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
> > That's a valid error though, let's not remove it. The reason why we
> > call runtime_resume() twice is because runtime_suspend callback never
> > gets called like I explain above.
> 
> This isn't an error, it's just a hickup in the synchronization of the
> runtime PM status.

I'd rather not get the hardware state out of sync with PM runtime..

> Very similar what happens at the first probe, when the driver core has
> initialized the runtime PM status to RPM_SUSPENDED at the device
> registration.

Well we actually pretty much have devices in that state to start
with.

> To me, it's the responsible of the PM domain to *help* with the
> synchronization, not prevent it as it currently does.

The problem is that the hardware state gets out of sync with
PM runtime. And that's going to be a pain to debug later on.

> > --- a/drivers/mmc/host/omap_hsmmc.c
> > +++ b/drivers/mmc/host/omap_hsmmc.c
> > @@ -2232,6 +2232,7 @@ err_irq:
> >                 dma_release_channel(host->tx_chan);
> >         if (host->rx_chan)
> >                 dma_release_channel(host->rx_chan);
> > +       pm_runtime_dont_use_autosuspend(host->dev);
> 
> It's good know this works, although do you intend to fix this sequence
> for all omap drivers/devices that's part of the hwmod PM domain?
> 
> I haven't checked the number of drivers this would affect, but I
> imagine there could be quite many with similar behaviour and thus may
> suffer from the same issue.

Yeah not sure what the right fix is. But I'd rather patch the
few drivers using autosuspend if we come to the conclusion
that there is no bug in PM runtime.

> Could you please test my version 2 of the patch I attached earlier. I
> still believe it's the best way to solve the regression, if it works
> of course. :-)

And I don't like it because of the reasons above :) But yeah
gave it a quick try and that too works as expected.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 23:41                                         ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 23:41 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160202 12:48]:
> On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
> > That's a valid error though, let's not remove it. The reason why we
> > call runtime_resume() twice is because runtime_suspend callback never
> > gets called like I explain above.
> 
> This isn't an error, it's just a hickup in the synchronization of the
> runtime PM status.

I'd rather not get the hardware state out of sync with PM runtime..

> Very similar what happens at the first probe, when the driver core has
> initialized the runtime PM status to RPM_SUSPENDED at the device
> registration.

Well we actually pretty much have devices in that state to start
with.

> To me, it's the responsible of the PM domain to *help* with the
> synchronization, not prevent it as it currently does.

The problem is that the hardware state gets out of sync with
PM runtime. And that's going to be a pain to debug later on.

> > --- a/drivers/mmc/host/omap_hsmmc.c
> > +++ b/drivers/mmc/host/omap_hsmmc.c
> > @@ -2232,6 +2232,7 @@ err_irq:
> >                 dma_release_channel(host->tx_chan);
> >         if (host->rx_chan)
> >                 dma_release_channel(host->rx_chan);
> > +       pm_runtime_dont_use_autosuspend(host->dev);
> 
> It's good know this works, although do you intend to fix this sequence
> for all omap drivers/devices that's part of the hwmod PM domain?
> 
> I haven't checked the number of drivers this would affect, but I
> imagine there could be quite many with similar behaviour and thus may
> suffer from the same issue.

Yeah not sure what the right fix is. But I'd rather patch the
few drivers using autosuspend if we come to the conclusion
that there is no bug in PM runtime.

> Could you please test my version 2 of the patch I attached earlier. I
> still believe it's the best way to solve the regression, if it works
> of course. :-)

And I don't like it because of the reasons above :) But yeah
gave it a quick try and that too works as expected.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 21:45                                                 ` Alan Stern
@ 2016-02-02 23:46                                                   ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 23:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > > Also, what is autosuspend_delay set to for your device?  And is 
> > > runtime_auto set?
> > 
> > It's 100 at that point, see the commented snippet below from
> > omap_hsmmc_probe():
> > 
> > 	pm_runtime_enable(host->dev);
> > 	pm_runtime_get_sync(host->dev);
> > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > 	pm_runtime_use_autosuspend(host->dev);
> > 	...
> > 	/* gets -EPROBE_DEFER */
> > err_irq:
> > 	...
> > 	pm_runtime_put_sync(host->dev);
> 
> You could try changing this to pm_runtime_put_sync_suspend().  But
> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> like a perfectly reasonable thing to do, especially if you feel you
> should reverse all the changes you made at the start.

They both seem to fix the problem.

> >         pm_runtime_disable(host->dev);
> > 	/* NOTE: suspend callback never gets called unless
> > 	 * pm_runtime_dont_use_autosuspend() is called
> > 	 * before pm_runtime_put_sync() above.
> > 	 */
> > 	 ...
> > 
> > > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > > > it does, then perhaps you can get what you want by having the probe 
> > > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > > > return an error -- particularly -EDEFER.
> > > > 
> > > > Yes so far that's the only fix that seems to work like I posted
> > > > earlier. But is that the right fix though?
> > > 
> > > No, not really.  Ideally you would leave autosuspend turned on.  The 
> > > delay would be long enough to that after -EDEFER, another probe would 
> > > start before the delay expired.  But shortly after the last probe 
> > > attempt, the delay would expire and the device would then be put in low 
> > > power.
> > 
> > But then what about the new reinit function? To me it seems that
> > we should not attempt to maintain a state from the earlier failed
> > probe. Or are you thinking we just skip the reinit if autosuspend
> > is set?
> 
> The reinit function gets called too late to do what you want -- namely, 
> put the hardware in a low-power state.

Right, the problem is the lack of suspend on the first probe. But
for having autosuspend timeout long enough for the next probe
would mean that we can't reset the PM runtime state in between.

> That _is_ what you want, isn't it?  The alternative is to leave
> dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual 
> state.

As far as I can tell things work just fine if the failed probe
suspend the device at the end of the failed probe.

> Given that the reinit function is supposed to restore the initial 
> settings, it probably ought to call pm_runtime_dont_use_autosuspend().  
> But that won't be enough to fix your problem.
> 
> > > > If we wanted to have some generic fix, it seems we would have to pass
> > > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > > > configuration. But I don't know if that's what we want to or should
> > > > do though?
> > > 
> > > I don't think so.
> > 
> > So should we just establish a policy that pm_runtime_use_autosuspend()
> > needs to be paired with pm_runtime_dont_use_autosuspend() for
> > pm_runtime_put_sync() to work?
> 
> Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
> work -- that is, it does what it's supposed to do.  The difficulty is
> that what it's supposed to do doesn't match what you think.
> 
> pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> invokes the driver's runtime_idle callback if there is one, and the
> callback routine can start a suspend or an autosuspend.  If there is no
> callback, it will use whatever autosuspend setting the driver has set
> up.  If you want to override the autosuspend setting, use
> pm_runtime_put_sync_suspend() instead.

Yes.. That works too. I guess the thing to consider is if we should
make pm_runtime_put_sync() always sync along the lines of a patch
I posted earlier today. That could avoid quite a bit of confusion
as already seen in this thread :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-02 23:46                                                   ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-02 23:46 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > > Also, what is autosuspend_delay set to for your device?  And is 
> > > runtime_auto set?
> > 
> > It's 100 at that point, see the commented snippet below from
> > omap_hsmmc_probe():
> > 
> > 	pm_runtime_enable(host->dev);
> > 	pm_runtime_get_sync(host->dev);
> > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > 	pm_runtime_use_autosuspend(host->dev);
> > 	...
> > 	/* gets -EPROBE_DEFER */
> > err_irq:
> > 	...
> > 	pm_runtime_put_sync(host->dev);
> 
> You could try changing this to pm_runtime_put_sync_suspend().  But
> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> like a perfectly reasonable thing to do, especially if you feel you
> should reverse all the changes you made at the start.

They both seem to fix the problem.

> >         pm_runtime_disable(host->dev);
> > 	/* NOTE: suspend callback never gets called unless
> > 	 * pm_runtime_dont_use_autosuspend() is called
> > 	 * before pm_runtime_put_sync() above.
> > 	 */
> > 	 ...
> > 
> > > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If 
> > > > > it does, then perhaps you can get what you want by having the probe 
> > > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to 
> > > > > return an error -- particularly -EDEFER.
> > > > 
> > > > Yes so far that's the only fix that seems to work like I posted
> > > > earlier. But is that the right fix though?
> > > 
> > > No, not really.  Ideally you would leave autosuspend turned on.  The 
> > > delay would be long enough to that after -EDEFER, another probe would 
> > > start before the delay expired.  But shortly after the last probe 
> > > attempt, the delay would expire and the device would then be put in low 
> > > power.
> > 
> > But then what about the new reinit function? To me it seems that
> > we should not attempt to maintain a state from the earlier failed
> > probe. Or are you thinking we just skip the reinit if autosuspend
> > is set?
> 
> The reinit function gets called too late to do what you want -- namely, 
> put the hardware in a low-power state.

Right, the problem is the lack of suspend on the first probe. But
for having autosuspend timeout long enough for the next probe
would mean that we can't reset the PM runtime state in between.

> That _is_ what you want, isn't it?  The alternative is to leave
> dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual 
> state.

As far as I can tell things work just fine if the failed probe
suspend the device at the end of the failed probe.

> Given that the reinit function is supposed to restore the initial 
> settings, it probably ought to call pm_runtime_dont_use_autosuspend().  
> But that won't be enough to fix your problem.
> 
> > > > If we wanted to have some generic fix, it seems we would have to pass
> > > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
> > > > configuration. But I don't know if that's what we want to or should
> > > > do though?
> > > 
> > > I don't think so.
> > 
> > So should we just establish a policy that pm_runtime_use_autosuspend()
> > needs to be paired with pm_runtime_dont_use_autosuspend() for
> > pm_runtime_put_sync() to work?
> 
> Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
> work -- that is, it does what it's supposed to do.  The difficulty is
> that what it's supposed to do doesn't match what you think.
> 
> pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> invokes the driver's runtime_idle callback if there is one, and the
> callback routine can start a suspend or an autosuspend.  If there is no
> callback, it will use whatever autosuspend setting the driver has set
> up.  If you want to override the autosuspend setting, use
> pm_runtime_put_sync_suspend() instead.

Yes.. That works too. I guess the thing to consider is if we should
make pm_runtime_put_sync() always sync along the lines of a patch
I posted earlier today. That could avoid quite a bit of confusion
as already seen in this thread :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 23:41                                         ` Tony Lindgren
@ 2016-02-03 10:23                                           ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 10:23 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 00:41, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:48]:
>> On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
>> > That's a valid error though, let's not remove it. The reason why we
>> > call runtime_resume() twice is because runtime_suspend callback never
>> > gets called like I explain above.
>>
>> This isn't an error, it's just a hickup in the synchronization of the
>> runtime PM status.
>
> I'd rather not get the hardware state out of sync with PM runtime..
>
>> Very similar what happens at the first probe, when the driver core has
>> initialized the runtime PM status to RPM_SUSPENDED at the device
>> registration.
>
> Well we actually pretty much have devices in that state to start
> with.

That's not true, not even for omap.

There are definitely devices which HW state is reflected by RPM_ACTIVE
at boot. It's the responsible of the subsystem/driver (including PM
domains) to make sure the runtime PM status is updated accordingly, to
reflect the real HW state.

For omap hwmod, at device registration in omap_device_build_from_dt()
it may actually invoke pm_runtime_set_active() if the device is
enabled. To my understanding, that's done to synchronize the real HW
state with the runtime PM status, right?

>
>> To me, it's the responsible of the PM domain to *help* with the
>> synchronization, not prevent it as it currently does.
>
> The problem is that the hardware state gets out of sync with
> PM runtime. And that's going to be a pain to debug later on.

I don't see the problem, but of course you know omap for better than I do.

So if you are concerned about this, perhaps adding a dev_dbg print
when the omap hwmod's ->runtime_suspend() callback returns zero could
be a way forward?

>
>> > --- a/drivers/mmc/host/omap_hsmmc.c
>> > +++ b/drivers/mmc/host/omap_hsmmc.c
>> > @@ -2232,6 +2232,7 @@ err_irq:
>> >                 dma_release_channel(host->tx_chan);
>> >         if (host->rx_chan)
>> >                 dma_release_channel(host->rx_chan);
>> > +       pm_runtime_dont_use_autosuspend(host->dev);
>>
>> It's good know this works, although do you intend to fix this sequence
>> for all omap drivers/devices that's part of the hwmod PM domain?
>>
>> I haven't checked the number of drivers this would affect, but I
>> imagine there could be quite many with similar behaviour and thus may
>> suffer from the same issue.
>
> Yeah not sure what the right fix is. But I'd rather patch the
> few drivers using autosuspend if we come to the conclusion
> that there is no bug in PM runtime.
>
>> Could you please test my version 2 of the patch I attached earlier. I
>> still believe it's the best way to solve the regression, if it works
>> of course. :-)
>
> And I don't like it because of the reasons above :) But yeah
> gave it a quick try and that too works as expected.

Okay, thanks for testing!

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 10:23                                           ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 10:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 00:41, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:48]:
>> On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote:
>> > That's a valid error though, let's not remove it. The reason why we
>> > call runtime_resume() twice is because runtime_suspend callback never
>> > gets called like I explain above.
>>
>> This isn't an error, it's just a hickup in the synchronization of the
>> runtime PM status.
>
> I'd rather not get the hardware state out of sync with PM runtime..
>
>> Very similar what happens at the first probe, when the driver core has
>> initialized the runtime PM status to RPM_SUSPENDED at the device
>> registration.
>
> Well we actually pretty much have devices in that state to start
> with.

That's not true, not even for omap.

There are definitely devices which HW state is reflected by RPM_ACTIVE
at boot. It's the responsible of the subsystem/driver (including PM
domains) to make sure the runtime PM status is updated accordingly, to
reflect the real HW state.

For omap hwmod, at device registration in omap_device_build_from_dt()
it may actually invoke pm_runtime_set_active() if the device is
enabled. To my understanding, that's done to synchronize the real HW
state with the runtime PM status, right?

>
>> To me, it's the responsible of the PM domain to *help* with the
>> synchronization, not prevent it as it currently does.
>
> The problem is that the hardware state gets out of sync with
> PM runtime. And that's going to be a pain to debug later on.

I don't see the problem, but of course you know omap for better than I do.

So if you are concerned about this, perhaps adding a dev_dbg print
when the omap hwmod's ->runtime_suspend() callback returns zero could
be a way forward?

>
>> > --- a/drivers/mmc/host/omap_hsmmc.c
>> > +++ b/drivers/mmc/host/omap_hsmmc.c
>> > @@ -2232,6 +2232,7 @@ err_irq:
>> >                 dma_release_channel(host->tx_chan);
>> >         if (host->rx_chan)
>> >                 dma_release_channel(host->rx_chan);
>> > +       pm_runtime_dont_use_autosuspend(host->dev);
>>
>> It's good know this works, although do you intend to fix this sequence
>> for all omap drivers/devices that's part of the hwmod PM domain?
>>
>> I haven't checked the number of drivers this would affect, but I
>> imagine there could be quite many with similar behaviour and thus may
>> suffer from the same issue.
>
> Yeah not sure what the right fix is. But I'd rather patch the
> few drivers using autosuspend if we come to the conclusion
> that there is no bug in PM runtime.
>
>> Could you please test my version 2 of the patch I attached earlier. I
>> still believe it's the best way to solve the regression, if it works
>> of course. :-)
>
> And I don't like it because of the reasons above :) But yeah
> gave it a quick try and that too works as expected.

Okay, thanks for testing!

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 10:23                                           ` Ulf Hansson
@ 2016-02-03 10:25                                             ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 10:25 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

>
> I don't see the problem, but of course you know omap for better than I do.
>
> So if you are concerned about this, perhaps adding a dev_dbg print
> when the omap hwmod's ->runtime_suspend() callback returns zero could

/s/runtime_suspend/runtime_resume

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 10:25                                             ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 10:25 UTC (permalink / raw)
  To: linux-arm-kernel

>
> I don't see the problem, but of course you know omap for better than I do.
>
> So if you are concerned about this, perhaps adding a dev_dbg print
> when the omap hwmod's ->runtime_suspend() callback returns zero could

/s/runtime_suspend/runtime_resume

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 10:25                                             ` Ulf Hansson
@ 2016-02-03 12:18                                               ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 12:18 UTC (permalink / raw)
  To: Ulf Hansson, Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Wed, Feb 3, 2016 at 11:25 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>
>> I don't see the problem, but of course you know omap for better than I do.
>>
>> So if you are concerned about this, perhaps adding a dev_dbg print
>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>
> /s/runtime_suspend/runtime_resume

OK

Let me summarize my understanding of this thread so far.

It looks like the omap3 code initializes hardware in ->probe() and
then it may return -EPROBE_DEFER due to some unmet dependencies.  In
that case the hardware is not reset to the previous state and the
runtime PM framework is left in the state that corresponds to the
current hardware state.  Before we had pm_runtime_reinit(), everything
worked as expected on the second ->probe() call, because things were
in sync then.

The introduction of pm_runtime_reinit() changed the situation and now
effectively the hardware is required to be reset to the initial state
if -EPROBE_DEFER is to be returned too.

Is that correct?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 12:18                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 12:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 3, 2016 at 11:25 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>
>> I don't see the problem, but of course you know omap for better than I do.
>>
>> So if you are concerned about this, perhaps adding a dev_dbg print
>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>
> /s/runtime_suspend/runtime_resume

OK

Let me summarize my understanding of this thread so far.

It looks like the omap3 code initializes hardware in ->probe() and
then it may return -EPROBE_DEFER due to some unmet dependencies.  In
that case the hardware is not reset to the previous state and the
runtime PM framework is left in the state that corresponds to the
current hardware state.  Before we had pm_runtime_reinit(), everything
worked as expected on the second ->probe() call, because things were
in sync then.

The introduction of pm_runtime_reinit() changed the situation and now
effectively the hardware is required to be reset to the initial state
if -EPROBE_DEFER is to be returned too.

Is that correct?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 21:39                                       ` Tony Lindgren
@ 2016-02-03 13:03                                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 13:03 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Tue, Feb 2, 2016 at 10:39 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
>> >
>> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
>> > mode.  If you want to respect it, you have to call
>> > pm_runtime_put_sync_autosuspend() instead.
>>
>> Then there's a bug in the runtime PM core.
>>
>> From Tony's regression report and from mine own local runtime PM test
>> driver, I can see that the device doesn't get RPM_SUSPENDED (the
>> ->runtime_suspend() callback isn't called), even when the usage count
>> is zero - when pm_runtime_put_sync() is called.
> ...
>> Okay, so you are saying that the pm_runtime_put_sync() should idle the
>> device even if autosuspend is in use. That seems reasonable, I will
>> look into this problem.
>
> The patch below fixes pm_runtime_put_sync() to not respect the
> autosuspend mode to match what Alan is saying above. Seems to also
> fixes the $subject issue for me. And seems to behave for PM runtime
> for other devices during runtime too based on light testing here.
>
> Regards,
>
> Tony
>
> 8< ---------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
>
>   out:
>         trace_rpm_return_int(dev, _THIS_IP_, retval);
> -       return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
> +       if (!(rpmflags & RPM_IGNORE_AUTO))
> +               rpmflags |= RPM_AUTO;
> +       return retval ? retval : rpm_suspend(dev, rpmflags);
>  }
>
>  /**
> --- a/include/linux/pm_runtime.h
> +++ b/include/linux/pm_runtime.h
> @@ -23,6 +23,8 @@
>                                             usage_count */
>  #define RPM_AUTO               0x08    /* Use autosuspend_delay */
>
> +#define RPM_IGNORE_AUTO                0x10    /* Ignore autosuspend */
> +
>  #ifdef CONFIG_PM
>  extern struct workqueue_struct *pm_wq;
>
> @@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
>
>  static inline int pm_runtime_put_sync(struct device *dev)
>  {
> -       return __pm_runtime_idle(dev, RPM_GET_PUT);
> +       return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
>  }
>
>  static inline int pm_runtime_put_sync_suspend(struct device *dev)

This changes a well-documented behavior that someone may be relying on.

Not the safest thing to do I have to say.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 13:03                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 13:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 2, 2016 at 10:39 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
>> >
>> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
>> > mode.  If you want to respect it, you have to call
>> > pm_runtime_put_sync_autosuspend() instead.
>>
>> Then there's a bug in the runtime PM core.
>>
>> From Tony's regression report and from mine own local runtime PM test
>> driver, I can see that the device doesn't get RPM_SUSPENDED (the
>> ->runtime_suspend() callback isn't called), even when the usage count
>> is zero - when pm_runtime_put_sync() is called.
> ...
>> Okay, so you are saying that the pm_runtime_put_sync() should idle the
>> device even if autosuspend is in use. That seems reasonable, I will
>> look into this problem.
>
> The patch below fixes pm_runtime_put_sync() to not respect the
> autosuspend mode to match what Alan is saying above. Seems to also
> fixes the $subject issue for me. And seems to behave for PM runtime
> for other devices during runtime too based on light testing here.
>
> Regards,
>
> Tony
>
> 8< ---------------
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
>
>   out:
>         trace_rpm_return_int(dev, _THIS_IP_, retval);
> -       return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
> +       if (!(rpmflags & RPM_IGNORE_AUTO))
> +               rpmflags |= RPM_AUTO;
> +       return retval ? retval : rpm_suspend(dev, rpmflags);
>  }
>
>  /**
> --- a/include/linux/pm_runtime.h
> +++ b/include/linux/pm_runtime.h
> @@ -23,6 +23,8 @@
>                                             usage_count */
>  #define RPM_AUTO               0x08    /* Use autosuspend_delay */
>
> +#define RPM_IGNORE_AUTO                0x10    /* Ignore autosuspend */
> +
>  #ifdef CONFIG_PM
>  extern struct workqueue_struct *pm_wq;
>
> @@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
>
>  static inline int pm_runtime_put_sync(struct device *dev)
>  {
> -       return __pm_runtime_idle(dev, RPM_GET_PUT);
> +       return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
>  }
>
>  static inline int pm_runtime_put_sync_suspend(struct device *dev)

This changes a well-documented behavior that someone may be relying on.

Not the safest thing to do I have to say.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 23:46                                                   ` Tony Lindgren
@ 2016-02-03 13:06                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 13:06 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>
>> > > Also, what is autosuspend_delay set to for your device?  And is
>> > > runtime_auto set?
>> >
>> > It's 100 at that point, see the commented snippet below from
>> > omap_hsmmc_probe():
>> >
>> >     pm_runtime_enable(host->dev);
>> >     pm_runtime_get_sync(host->dev);
>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >     pm_runtime_use_autosuspend(host->dev);
>> >     ...
>> >     /* gets -EPROBE_DEFER */
>> > err_irq:
>> >     ...
>> >     pm_runtime_put_sync(host->dev);
>>
>> You could try changing this to pm_runtime_put_sync_suspend().  But
>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> like a perfectly reasonable thing to do, especially if you feel you
>> should reverse all the changes you made at the start.
>
> They both seem to fix the problem.
>
>> >         pm_runtime_disable(host->dev);
>> >     /* NOTE: suspend callback never gets called unless
>> >      * pm_runtime_dont_use_autosuspend() is called
>> >      * before pm_runtime_put_sync() above.
>> >      */
>> >      ...
>> >
>> > > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If
>> > > > > it does, then perhaps you can get what you want by having the probe
>> > > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to
>> > > > > return an error -- particularly -EDEFER.
>> > > >
>> > > > Yes so far that's the only fix that seems to work like I posted
>> > > > earlier. But is that the right fix though?
>> > >
>> > > No, not really.  Ideally you would leave autosuspend turned on.  The
>> > > delay would be long enough to that after -EDEFER, another probe would
>> > > start before the delay expired.  But shortly after the last probe
>> > > attempt, the delay would expire and the device would then be put in low
>> > > power.
>> >
>> > But then what about the new reinit function? To me it seems that
>> > we should not attempt to maintain a state from the earlier failed
>> > probe. Or are you thinking we just skip the reinit if autosuspend
>> > is set?
>>
>> The reinit function gets called too late to do what you want -- namely,
>> put the hardware in a low-power state.
>
> Right, the problem is the lack of suspend on the first probe. But
> for having autosuspend timeout long enough for the next probe
> would mean that we can't reset the PM runtime state in between.
>
>> That _is_ what you want, isn't it?  The alternative is to leave
>> dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual
>> state.
>
> As far as I can tell things work just fine if the failed probe
> suspend the device at the end of the failed probe.
>
>> Given that the reinit function is supposed to restore the initial
>> settings, it probably ought to call pm_runtime_dont_use_autosuspend().
>> But that won't be enough to fix your problem.
>>
>> > > > If we wanted to have some generic fix, it seems we would have to pass
>> > > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
>> > > > configuration. But I don't know if that's what we want to or should
>> > > > do though?
>> > >
>> > > I don't think so.
>> >
>> > So should we just establish a policy that pm_runtime_use_autosuspend()
>> > needs to be paired with pm_runtime_dont_use_autosuspend() for
>> > pm_runtime_put_sync() to work?
>>
>> Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
>> work -- that is, it does what it's supposed to do.  The difficulty is
>> that what it's supposed to do doesn't match what you think.
>>
>> pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
>> invokes the driver's runtime_idle callback if there is one, and the
>> callback routine can start a suspend or an autosuspend.  If there is no
>> callback, it will use whatever autosuspend setting the driver has set
>> up.  If you want to override the autosuspend setting, use
>> pm_runtime_put_sync_suspend() instead.
>
> Yes.. That works too. I guess the thing to consider is if we should
> make pm_runtime_put_sync() always sync along the lines of a patch
> I posted earlier today. That could avoid quite a bit of confusion
> as already seen in this thread :)

No, we shouldn't.

As I said, the current behavior is actually well documented (in
kerneldoc comments and elsewhere).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 13:06                                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 13:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>
>> > > Also, what is autosuspend_delay set to for your device?  And is
>> > > runtime_auto set?
>> >
>> > It's 100 at that point, see the commented snippet below from
>> > omap_hsmmc_probe():
>> >
>> >     pm_runtime_enable(host->dev);
>> >     pm_runtime_get_sync(host->dev);
>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >     pm_runtime_use_autosuspend(host->dev);
>> >     ...
>> >     /* gets -EPROBE_DEFER */
>> > err_irq:
>> >     ...
>> >     pm_runtime_put_sync(host->dev);
>>
>> You could try changing this to pm_runtime_put_sync_suspend().  But
>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> like a perfectly reasonable thing to do, especially if you feel you
>> should reverse all the changes you made at the start.
>
> They both seem to fix the problem.
>
>> >         pm_runtime_disable(host->dev);
>> >     /* NOTE: suspend callback never gets called unless
>> >      * pm_runtime_dont_use_autosuspend() is called
>> >      * before pm_runtime_put_sync() above.
>> >      */
>> >      ...
>> >
>> > > > > Does pm_runtime_use_autosuspend() get called by the probe routine?  If
>> > > > > it does, then perhaps you can get what you want by having the probe
>> > > > > routine call pm_runtime_dont_use_autosuspend() whenever it's about to
>> > > > > return an error -- particularly -EDEFER.
>> > > >
>> > > > Yes so far that's the only fix that seems to work like I posted
>> > > > earlier. But is that the right fix though?
>> > >
>> > > No, not really.  Ideally you would leave autosuspend turned on.  The
>> > > delay would be long enough to that after -EDEFER, another probe would
>> > > start before the delay expired.  But shortly after the last probe
>> > > attempt, the delay would expire and the device would then be put in low
>> > > power.
>> >
>> > But then what about the new reinit function? To me it seems that
>> > we should not attempt to maintain a state from the earlier failed
>> > probe. Or are you thinking we just skip the reinit if autosuspend
>> > is set?
>>
>> The reinit function gets called too late to do what you want -- namely,
>> put the hardware in a low-power state.
>
> Right, the problem is the lack of suspend on the first probe. But
> for having autosuspend timeout long enough for the next probe
> would mean that we can't reset the PM runtime state in between.
>
>> That _is_ what you want, isn't it?  The alternative is to leave
>> dev->power.rpm_status set to RPM_ACTIVE, to match the hardware's actual
>> state.
>
> As far as I can tell things work just fine if the failed probe
> suspend the device at the end of the failed probe.
>
>> Given that the reinit function is supposed to restore the initial
>> settings, it probably ought to call pm_runtime_dont_use_autosuspend().
>> But that won't be enough to fix your problem.
>>
>> > > > If we wanted to have some generic fix, it seems we would have to pass
>> > > > a new flag in pm_runtime_put_sync() to ignore any autosuspend
>> > > > configuration. But I don't know if that's what we want to or should
>> > > > do though?
>> > >
>> > > I don't think so.
>> >
>> > So should we just establish a policy that pm_runtime_use_autosuspend()
>> > needs to be paired with pm_runtime_dont_use_autosuspend() for
>> > pm_runtime_put_sync() to work?
>>
>> Your understanding is slightly wrong.  pm_runtime_put_sync() _does_
>> work -- that is, it does what it's supposed to do.  The difficulty is
>> that what it's supposed to do doesn't match what you think.
>>
>> pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
>> invokes the driver's runtime_idle callback if there is one, and the
>> callback routine can start a suspend or an autosuspend.  If there is no
>> callback, it will use whatever autosuspend setting the driver has set
>> up.  If you want to override the autosuspend setting, use
>> pm_runtime_put_sync_suspend() instead.
>
> Yes.. That works too. I guess the thing to consider is if we should
> make pm_runtime_put_sync() always sync along the lines of a patch
> I posted earlier today. That could avoid quite a bit of confusion
> as already seen in this thread :)

No, we shouldn't.

As I said, the current behavior is actually well documented (in
kerneldoc comments and elsewhere).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 12:18                                               ` Rafael J. Wysocki
@ 2016-02-03 14:58                                                 ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 14:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tony Lindgren, Alan Stern, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 13:18, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Feb 3, 2016 at 11:25 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>
>>> I don't see the problem, but of course you know omap for better than I do.
>>>
>>> So if you are concerned about this, perhaps adding a dev_dbg print
>>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>>
>> /s/runtime_suspend/runtime_resume
>
> OK
>
> Let me summarize my understanding of this thread so far.
>
> It looks like the omap3 code initializes hardware in ->probe() and
> then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> that case the hardware is not reset to the previous state and the
> runtime PM framework is left in the state that corresponds to the
> current hardware state.  Before we had pm_runtime_reinit(), everything
> worked as expected on the second ->probe() call, because things were
> in sync then.

Correct!

Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
worked fine because the HW state and the runtime PM status was in
sync. The device was powered and the runtime PM status was RPM_ACTIVE.

>
> The introduction of pm_runtime_reinit() changed the situation and now
> effectively the hardware is required to be reset to the initial state
> if -EPROBE_DEFER is to be returned too.

Not correct. The hardware doesn't need a reset as it stays powered
after a failed probe.

Instead, only the runtime PM status needs to be synchronized with the
HW state the next probe attempt.

In other words, when the PM domain's ->runtime_resume() callbacks gets
called due to a pm_runtime_get_sync() in the second probe attempt, it
may find that the HW is already powered and can return zero instead of
resetting it.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 14:58                                                 ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 14:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 13:18, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Feb 3, 2016 at 11:25 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>
>>> I don't see the problem, but of course you know omap for better than I do.
>>>
>>> So if you are concerned about this, perhaps adding a dev_dbg print
>>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>>
>> /s/runtime_suspend/runtime_resume
>
> OK
>
> Let me summarize my understanding of this thread so far.
>
> It looks like the omap3 code initializes hardware in ->probe() and
> then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> that case the hardware is not reset to the previous state and the
> runtime PM framework is left in the state that corresponds to the
> current hardware state.  Before we had pm_runtime_reinit(), everything
> worked as expected on the second ->probe() call, because things were
> in sync then.

Correct!

Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
worked fine because the HW state and the runtime PM status was in
sync. The device was powered and the runtime PM status was RPM_ACTIVE.

>
> The introduction of pm_runtime_reinit() changed the situation and now
> effectively the hardware is required to be reset to the initial state
> if -EPROBE_DEFER is to be returned too.

Not correct. The hardware doesn't need a reset as it stays powered
after a failed probe.

Instead, only the runtime PM status needs to be synchronized with the
HW state the next probe attempt.

In other words, when the PM domain's ->runtime_resume() callbacks gets
called due to a pm_runtime_get_sync() in the second probe attempt, it
may find that the HW is already powered and can return zero instead of
resetting it.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 14:58                                                 ` Ulf Hansson
@ 2016-02-03 15:45                                                   ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-03 15:45 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Tony Lindgren, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wed, 3 Feb 2016, Ulf Hansson wrote:

> > Let me summarize my understanding of this thread so far.
> >
> > It looks like the omap3 code initializes hardware in ->probe() and
> > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> > that case the hardware is not reset to the previous state and the
> > runtime PM framework is left in the state that corresponds to the
> > current hardware state.  Before we had pm_runtime_reinit(), everything
> > worked as expected on the second ->probe() call, because things were
> > in sync then.
> 
> Correct!
> 
> Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
> worked fine because the HW state and the runtime PM status was in
> sync. The device was powered and the runtime PM status was RPM_ACTIVE.
> 
> >
> > The introduction of pm_runtime_reinit() changed the situation and now
> > effectively the hardware is required to be reset to the initial state
> > if -EPROBE_DEFER is to be returned too.
> 
> Not correct. The hardware doesn't need a reset as it stays powered
> after a failed probe.
> 
> Instead, only the runtime PM status needs to be synchronized with the
> HW state the next probe attempt.

In other words, the probe routine assumes the actual state is the same
as the PM status.  This may have been true before pm_runtime_reinit()
came along, but it's not true now.

> In other words, when the PM domain's ->runtime_resume() callbacks gets
> called due to a pm_runtime_get_sync() in the second probe attempt, it
> may find that the HW is already powered and can return zero instead of
> resetting it.

What's wrong with going ahead and resetting the hardware anyway?

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 15:45                                                   ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-03 15:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 3 Feb 2016, Ulf Hansson wrote:

> > Let me summarize my understanding of this thread so far.
> >
> > It looks like the omap3 code initializes hardware in ->probe() and
> > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> > that case the hardware is not reset to the previous state and the
> > runtime PM framework is left in the state that corresponds to the
> > current hardware state.  Before we had pm_runtime_reinit(), everything
> > worked as expected on the second ->probe() call, because things were
> > in sync then.
> 
> Correct!
> 
> Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
> worked fine because the HW state and the runtime PM status was in
> sync. The device was powered and the runtime PM status was RPM_ACTIVE.
> 
> >
> > The introduction of pm_runtime_reinit() changed the situation and now
> > effectively the hardware is required to be reset to the initial state
> > if -EPROBE_DEFER is to be returned too.
> 
> Not correct. The hardware doesn't need a reset as it stays powered
> after a failed probe.
> 
> Instead, only the runtime PM status needs to be synchronized with the
> HW state the next probe attempt.

In other words, the probe routine assumes the actual state is the same
as the PM status.  This may have been true before pm_runtime_reinit()
came along, but it's not true now.

> In other words, when the PM domain's ->runtime_resume() callbacks gets
> called due to a pm_runtime_get_sync() in the second probe attempt, it
> may find that the HW is already powered and can return zero instead of
> resetting it.

What's wrong with going ahead and resetting the hardware anyway?

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 23:46                                                   ` Tony Lindgren
@ 2016-02-03 15:48                                                     ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-03 15:48 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> > On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > 
> > > > Also, what is autosuspend_delay set to for your device?  And is 
> > > > runtime_auto set?
> > > 
> > > It's 100 at that point, see the commented snippet below from
> > > omap_hsmmc_probe():
> > > 
> > > 	pm_runtime_enable(host->dev);
> > > 	pm_runtime_get_sync(host->dev);
> > > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > > 	pm_runtime_use_autosuspend(host->dev);
> > > 	...
> > > 	/* gets -EPROBE_DEFER */
> > > err_irq:
> > > 	...
> > > 	pm_runtime_put_sync(host->dev);
> > 
> > You could try changing this to pm_runtime_put_sync_suspend().  But
> > putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> > like a perfectly reasonable thing to do, especially if you feel you
> > should reverse all the changes you made at the start.
> 
> They both seem to fix the problem.

So you could use either one.  In my opinion, the 
pm_runtime_dont_use_autosuspend() solution is a little cleaner.

> > The reinit function gets called too late to do what you want -- namely, 
> > put the hardware in a low-power state.
> 
> Right, the problem is the lack of suspend on the first probe. But
> for having autosuspend timeout long enough for the next probe
> would mean that we can't reset the PM runtime state in between.

That's one way to look at it.  But in principle you don't need to
suspend the device after an unsuccessful probe.  You can just leave it
at high power.  If this causes problems for a second probe, it's the 
second probe's own fault for assuming the actual state matches the PM 
status.

> > pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> > invokes the driver's runtime_idle callback if there is one, and the
> > callback routine can start a suspend or an autosuspend.  If there is no
> > callback, it will use whatever autosuspend setting the driver has set
> > up.  If you want to override the autosuspend setting, use
> > pm_runtime_put_sync_suspend() instead.
> 
> Yes.. That works too. I guess the thing to consider is if we should
> make pm_runtime_put_sync() always sync along the lines of a patch
> I posted earlier today. That could avoid quite a bit of confusion
> as already seen in this thread :)

As Rafael pointed out, pm_runtime_put_sync() has well-documented 
behavior.  It shouldn't be changed.  I don't see how changing the 
behavior would reduce anybody's confusion.  At least, anybody who reads 
the documentation carefully.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 15:48                                                     ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-03 15:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2 Feb 2016, Tony Lindgren wrote:

> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> > On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > 
> > > > Also, what is autosuspend_delay set to for your device?  And is 
> > > > runtime_auto set?
> > > 
> > > It's 100 at that point, see the commented snippet below from
> > > omap_hsmmc_probe():
> > > 
> > > 	pm_runtime_enable(host->dev);
> > > 	pm_runtime_get_sync(host->dev);
> > > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > > 	pm_runtime_use_autosuspend(host->dev);
> > > 	...
> > > 	/* gets -EPROBE_DEFER */
> > > err_irq:
> > > 	...
> > > 	pm_runtime_put_sync(host->dev);
> > 
> > You could try changing this to pm_runtime_put_sync_suspend().  But
> > putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> > like a perfectly reasonable thing to do, especially if you feel you
> > should reverse all the changes you made at the start.
> 
> They both seem to fix the problem.

So you could use either one.  In my opinion, the 
pm_runtime_dont_use_autosuspend() solution is a little cleaner.

> > The reinit function gets called too late to do what you want -- namely, 
> > put the hardware in a low-power state.
> 
> Right, the problem is the lack of suspend on the first probe. But
> for having autosuspend timeout long enough for the next probe
> would mean that we can't reset the PM runtime state in between.

That's one way to look at it.  But in principle you don't need to
suspend the device after an unsuccessful probe.  You can just leave it
at high power.  If this causes problems for a second probe, it's the 
second probe's own fault for assuming the actual state matches the PM 
status.

> > pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> > invokes the driver's runtime_idle callback if there is one, and the
> > callback routine can start a suspend or an autosuspend.  If there is no
> > callback, it will use whatever autosuspend setting the driver has set
> > up.  If you want to override the autosuspend setting, use
> > pm_runtime_put_sync_suspend() instead.
> 
> Yes.. That works too. I guess the thing to consider is if we should
> make pm_runtime_put_sync() always sync along the lines of a patch
> I posted earlier today. That could avoid quite a bit of confusion
> as already seen in this thread :)

As Rafael pointed out, pm_runtime_put_sync() has well-documented 
behavior.  It shouldn't be changed.  I don't see how changing the 
behavior would reduce anybody's confusion.  At least, anybody who reads 
the documentation carefully.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 15:45                                                   ` Alan Stern
@ 2016-02-03 16:09                                                     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
> On Wed, 3 Feb 2016, Ulf Hansson wrote:
> 
> > > Let me summarize my understanding of this thread so far.
> > >
> > > It looks like the omap3 code initializes hardware in ->probe() and
> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> > > that case the hardware is not reset to the previous state and the
> > > runtime PM framework is left in the state that corresponds to the
> > > current hardware state.  Before we had pm_runtime_reinit(), everything
> > > worked as expected on the second ->probe() call, because things were
> > > in sync then.
> > 
> > Correct!

Well not quite correct. After failed probe PM runtime is set to reset
state while hardware is still enabled.

> > Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
> > worked fine because the HW state and the runtime PM status was in
> > sync. The device was powered and the runtime PM status was RPM_ACTIVE.
> > 
> > >
> > > The introduction of pm_runtime_reinit() changed the situation and now
> > > effectively the hardware is required to be reset to the initial state
> > > if -EPROBE_DEFER is to be returned too.
> > 
> > Not correct. The hardware doesn't need a reset as it stays powered
> > after a failed probe.

It is really best to disable the hardware after a failed probe
like we do with pm_runtime_put_sync_(suspend)() and pm_runtime_disable().

This is because there may never be another probe attempt and we want
to have unclaimed devices shut off (or idled) for PM.

> > Instead, only the runtime PM status needs to be synchronized with the
> > HW state the next probe attempt.
>
> In other words, the probe routine assumes the actual state is the same
> as the PM status.  This may have been true before pm_runtime_reinit()
> came along, but it's not true now.
> 
> > In other words, when the PM domain's ->runtime_resume() callbacks gets
> > called due to a pm_runtime_get_sync() in the second probe attempt, it
> > may find that the HW is already powered and can return zero instead of
> > resetting it.

Certainly that should at least produce a warning in the hardware
specific implementation. It's a state mismatch between PM runtime
and the hardware specific implementation.

And as discussed, the problem does not exist as long as we understand
that we need to use pm_runtime_put_sync_suspend() if the driver has
set pm_runtime_use_autoidle(). Or else use the combination of
pm_runtime_dont_use_autoidle() and pm_runtime_put_sync().

> What's wrong with going ahead and resetting the hardware anyway?

Nothing, unless a state needs to be maintained for things like GPIO
pins power memory :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:09                                                     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:09 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
> On Wed, 3 Feb 2016, Ulf Hansson wrote:
> 
> > > Let me summarize my understanding of this thread so far.
> > >
> > > It looks like the omap3 code initializes hardware in ->probe() and
> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
> > > that case the hardware is not reset to the previous state and the
> > > runtime PM framework is left in the state that corresponds to the
> > > current hardware state.  Before we had pm_runtime_reinit(), everything
> > > worked as expected on the second ->probe() call, because things were
> > > in sync then.
> > 
> > Correct!

Well not quite correct. After failed probe PM runtime is set to reset
state while hardware is still enabled.

> > Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
> > worked fine because the HW state and the runtime PM status was in
> > sync. The device was powered and the runtime PM status was RPM_ACTIVE.
> > 
> > >
> > > The introduction of pm_runtime_reinit() changed the situation and now
> > > effectively the hardware is required to be reset to the initial state
> > > if -EPROBE_DEFER is to be returned too.
> > 
> > Not correct. The hardware doesn't need a reset as it stays powered
> > after a failed probe.

It is really best to disable the hardware after a failed probe
like we do with pm_runtime_put_sync_(suspend)() and pm_runtime_disable().

This is because there may never be another probe attempt and we want
to have unclaimed devices shut off (or idled) for PM.

> > Instead, only the runtime PM status needs to be synchronized with the
> > HW state the next probe attempt.
>
> In other words, the probe routine assumes the actual state is the same
> as the PM status.  This may have been true before pm_runtime_reinit()
> came along, but it's not true now.
> 
> > In other words, when the PM domain's ->runtime_resume() callbacks gets
> > called due to a pm_runtime_get_sync() in the second probe attempt, it
> > may find that the HW is already powered and can return zero instead of
> > resetting it.

Certainly that should at least produce a warning in the hardware
specific implementation. It's a state mismatch between PM runtime
and the hardware specific implementation.

And as discussed, the problem does not exist as long as we understand
that we need to use pm_runtime_put_sync_suspend() if the driver has
set pm_runtime_use_autoidle(). Or else use the combination of
pm_runtime_dont_use_autoidle() and pm_runtime_put_sync().

> What's wrong with going ahead and resetting the hardware anyway?

Nothing, unless a state needs to be maintained for things like GPIO
pins power memory :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 16:09                                                     ` Tony Lindgren
@ 2016-02-03 16:24                                                       ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 16:24 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: linux-pm, Kevin Hilman, Rafael J. Wysocki, Alan Stern,
	Rafael J. Wysocki, Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 17:09, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
>> On Wed, 3 Feb 2016, Ulf Hansson wrote:
>>
>> > > Let me summarize my understanding of this thread so far.
>> > >
>> > > It looks like the omap3 code initializes hardware in ->probe() and
>> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
>> > > that case the hardware is not reset to the previous state and the
>> > > runtime PM framework is left in the state that corresponds to the
>> > > current hardware state.  Before we had pm_runtime_reinit(), everything
>> > > worked as expected on the second ->probe() call, because things were
>> > > in sync then.
>> >
>> > Correct!
>
> Well not quite correct. After failed probe PM runtime is set to reset
> state while hardware is still enabled.

Yes, but that's *after* pm_runtime_reinit() was added.

I think Rafael was thinking about how it worked *before*.

>
>> > Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
>> > worked fine because the HW state and the runtime PM status was in
>> > sync. The device was powered and the runtime PM status was RPM_ACTIVE.
>> >
>> > >
>> > > The introduction of pm_runtime_reinit() changed the situation and now
>> > > effectively the hardware is required to be reset to the initial state
>> > > if -EPROBE_DEFER is to be returned too.
>> >
>> > Not correct. The hardware doesn't need a reset as it stays powered
>> > after a failed probe.
>
> It is really best to disable the hardware after a failed probe
> like we do with pm_runtime_put_sync_(suspend)() and pm_runtime_disable().
>
> This is because there may never be another probe attempt and we want
> to have unclaimed devices shut off (or idled) for PM.

I totally agree!

>
>> > Instead, only the runtime PM status needs to be synchronized with the
>> > HW state the next probe attempt.
>>
>> In other words, the probe routine assumes the actual state is the same
>> as the PM status.  This may have been true before pm_runtime_reinit()
>> came along, but it's not true now.
>>
>> > In other words, when the PM domain's ->runtime_resume() callbacks gets
>> > called due to a pm_runtime_get_sync() in the second probe attempt, it
>> > may find that the HW is already powered and can return zero instead of
>> > resetting it.
>
> Certainly that should at least produce a warning in the hardware
> specific implementation. It's a state mismatch between PM runtime
> and the hardware specific implementation.
>
> And as discussed, the problem does not exist as long as we understand
> that we need to use pm_runtime_put_sync_suspend() if the driver has
> set pm_runtime_use_autoidle(). Or else use the combination of
> pm_runtime_dont_use_autoidle() and pm_runtime_put_sync().

Okay, so I understand that you decided to not pick up the omap hwmod
patch I posted.

If you want some further help in fixing the omap drivers, please just
tell me I am at your service. :-)

Also, we must not forget to also update their runtime PM calls in
their ->remove() callbacks, as those seems to suffer from the same
problem as in the -EPROBE_DEFER case.

>
>> What's wrong with going ahead and resetting the hardware anyway?
>
> Nothing, unless a state needs to be maintained for things like GPIO
> pins power memory :)
>
> Regards,
>
> Tony

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:24                                                       ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 16:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 17:09, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
>> On Wed, 3 Feb 2016, Ulf Hansson wrote:
>>
>> > > Let me summarize my understanding of this thread so far.
>> > >
>> > > It looks like the omap3 code initializes hardware in ->probe() and
>> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
>> > > that case the hardware is not reset to the previous state and the
>> > > runtime PM framework is left in the state that corresponds to the
>> > > current hardware state.  Before we had pm_runtime_reinit(), everything
>> > > worked as expected on the second ->probe() call, because things were
>> > > in sync then.
>> >
>> > Correct!
>
> Well not quite correct. After failed probe PM runtime is set to reset
> state while hardware is still enabled.

Yes, but that's *after* pm_runtime_reinit() was added.

I think Rafael was thinking about how it worked *before*.

>
>> > Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER)
>> > worked fine because the HW state and the runtime PM status was in
>> > sync. The device was powered and the runtime PM status was RPM_ACTIVE.
>> >
>> > >
>> > > The introduction of pm_runtime_reinit() changed the situation and now
>> > > effectively the hardware is required to be reset to the initial state
>> > > if -EPROBE_DEFER is to be returned too.
>> >
>> > Not correct. The hardware doesn't need a reset as it stays powered
>> > after a failed probe.
>
> It is really best to disable the hardware after a failed probe
> like we do with pm_runtime_put_sync_(suspend)() and pm_runtime_disable().
>
> This is because there may never be another probe attempt and we want
> to have unclaimed devices shut off (or idled) for PM.

I totally agree!

>
>> > Instead, only the runtime PM status needs to be synchronized with the
>> > HW state the next probe attempt.
>>
>> In other words, the probe routine assumes the actual state is the same
>> as the PM status.  This may have been true before pm_runtime_reinit()
>> came along, but it's not true now.
>>
>> > In other words, when the PM domain's ->runtime_resume() callbacks gets
>> > called due to a pm_runtime_get_sync() in the second probe attempt, it
>> > may find that the HW is already powered and can return zero instead of
>> > resetting it.
>
> Certainly that should at least produce a warning in the hardware
> specific implementation. It's a state mismatch between PM runtime
> and the hardware specific implementation.
>
> And as discussed, the problem does not exist as long as we understand
> that we need to use pm_runtime_put_sync_suspend() if the driver has
> set pm_runtime_use_autoidle(). Or else use the combination of
> pm_runtime_dont_use_autoidle() and pm_runtime_put_sync().

Okay, so I understand that you decided to not pick up the omap hwmod
patch I posted.

If you want some further help in fixing the omap drivers, please just
tell me I am at your service. :-)

Also, we must not forget to also update their runtime PM calls in
their ->remove() callbacks, as those seems to suffer from the same
problem as in the -EPROBE_DEFER case.

>
>> What's wrong with going ahead and resetting the hardware anyway?
>
> Nothing, unless a state needs to be maintained for things like GPIO
> pins power memory :)
>
> Regards,
>
> Tony

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 10:23                                           ` Ulf Hansson
@ 2016-02-03 16:27                                             ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:27 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 02:24]:
> On 3 February 2016 at 00:41, Tony Lindgren <tony@atomide.com> wrote:
>
> > Well we actually pretty much have devices in that state to start
> > with.
> 
> That's not true, not even for omap.
> 
> There are definitely devices which HW state is reflected by RPM_ACTIVE
> at boot. It's the responsible of the subsystem/driver (including PM
> domains) to make sure the runtime PM status is updated accordingly, to
> reflect the real HW state.
> 
> For omap hwmod, at device registration in omap_device_build_from_dt()
> it may actually invoke pm_runtime_set_active() if the device is
> enabled. To my understanding, that's done to synchronize the real HW
> state with the runtime PM status, right?

Sure we do have cases where the bootloader state needs to be
preserved for some devices. Like GPIO pin state to power memory on
some devices! :)

> >> To me, it's the responsible of the PM domain to *help* with the
> >> synchronization, not prevent it as it currently does.
> >
> > The problem is that the hardware state gets out of sync with
> > PM runtime. And that's going to be a pain to debug later on.
> 
> I don't see the problem, but of course you know omap for better than I do.

Well there's also the long term maintenance aspect at least I
need to consider.

> So if you are concerned about this, perhaps adding a dev_dbg print
> when the omap hwmod's ->runtime_suspend() callback returns zero could
> be a way forward?

If we downgrade it to a debug statement or a warning, we'll soon end
up with even more driver specific warnings than we already have.
And I don't want to be chasing people around to fix their drivers
for eavery new driver that gets submitted.

Also, without this error I would not even originally have noticed we
have a problem :) So I suggest the following:

1. I'll do a series of patches to fix up the handful of omap
   specific drivers with pm_runtime_use_autosuspend() that depend
   on omap_device

2. I'll also do a patch to improve the omap_device error message
   so new drivers are easy to fix. Something like:

  "%() called from invalid state %d, use pm_runtime_put_sync_suspend()?"

Does that sounds OK to you?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:27                                             ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 02:24]:
> On 3 February 2016 at 00:41, Tony Lindgren <tony@atomide.com> wrote:
>
> > Well we actually pretty much have devices in that state to start
> > with.
> 
> That's not true, not even for omap.
> 
> There are definitely devices which HW state is reflected by RPM_ACTIVE
> at boot. It's the responsible of the subsystem/driver (including PM
> domains) to make sure the runtime PM status is updated accordingly, to
> reflect the real HW state.
> 
> For omap hwmod, at device registration in omap_device_build_from_dt()
> it may actually invoke pm_runtime_set_active() if the device is
> enabled. To my understanding, that's done to synchronize the real HW
> state with the runtime PM status, right?

Sure we do have cases where the bootloader state needs to be
preserved for some devices. Like GPIO pin state to power memory on
some devices! :)

> >> To me, it's the responsible of the PM domain to *help* with the
> >> synchronization, not prevent it as it currently does.
> >
> > The problem is that the hardware state gets out of sync with
> > PM runtime. And that's going to be a pain to debug later on.
> 
> I don't see the problem, but of course you know omap for better than I do.

Well there's also the long term maintenance aspect at least I
need to consider.

> So if you are concerned about this, perhaps adding a dev_dbg print
> when the omap hwmod's ->runtime_suspend() callback returns zero could
> be a way forward?

If we downgrade it to a debug statement or a warning, we'll soon end
up with even more driver specific warnings than we already have.
And I don't want to be chasing people around to fix their drivers
for eavery new driver that gets submitted.

Also, without this error I would not even originally have noticed we
have a problem :) So I suggest the following:

1. I'll do a series of patches to fix up the handful of omap
   specific drivers with pm_runtime_use_autosuspend() that depend
   on omap_device

2. I'll also do a patch to improve the omap_device error message
   so new drivers are easy to fix. Something like:

  "%() called from invalid state %d, use pm_runtime_put_sync_suspend()?"

Does that sounds OK to you?

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 13:06                                                     ` Rafael J. Wysocki
@ 2016-02-03 16:36                                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 05:07]:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> >
> > Yes.. That works too. I guess the thing to consider is if we should
> > make pm_runtime_put_sync() always sync along the lines of a patch
> > I posted earlier today. That could avoid quite a bit of confusion
> > as already seen in this thread :)
> 
> No, we shouldn't.
> 
> As I said, the current behavior is actually well documented (in
> kerneldoc comments and elsewhere).

OK. I'll do a series of fixes to the drivers using omap_device.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:36                                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:36 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 05:07]:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> >
> > Yes.. That works too. I guess the thing to consider is if we should
> > make pm_runtime_put_sync() always sync along the lines of a patch
> > I posted earlier today. That could avoid quite a bit of confusion
> > as already seen in this thread :)
> 
> No, we shouldn't.
> 
> As I said, the current behavior is actually well documented (in
> kerneldoc comments and elsewhere).

OK. I'll do a series of fixes to the drivers using omap_device.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 15:48                                                     ` Alan Stern
@ 2016-02-03 16:37                                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160203 07:49]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> > > On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > > 
> > > > > Also, what is autosuspend_delay set to for your device?  And is 
> > > > > runtime_auto set?
> > > > 
> > > > It's 100 at that point, see the commented snippet below from
> > > > omap_hsmmc_probe():
> > > > 
> > > > 	pm_runtime_enable(host->dev);
> > > > 	pm_runtime_get_sync(host->dev);
> > > > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > > > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > > > 	pm_runtime_use_autosuspend(host->dev);
> > > > 	...
> > > > 	/* gets -EPROBE_DEFER */
> > > > err_irq:
> > > > 	...
> > > > 	pm_runtime_put_sync(host->dev);
> > > 
> > > You could try changing this to pm_runtime_put_sync_suspend().  But
> > > putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> > > like a perfectly reasonable thing to do, especially if you feel you
> > > should reverse all the changes you made at the start.
> > 
> > They both seem to fix the problem.
> 
> So you could use either one.  In my opinion, the 
> pm_runtime_dont_use_autosuspend() solution is a little cleaner.
> 
> > > The reinit function gets called too late to do what you want -- namely, 
> > > put the hardware in a low-power state.
> > 
> > Right, the problem is the lack of suspend on the first probe. But
> > for having autosuspend timeout long enough for the next probe
> > would mean that we can't reset the PM runtime state in between.
> 
> That's one way to look at it.  But in principle you don't need to
> suspend the device after an unsuccessful probe.  You can just leave it
> at high power.  If this causes problems for a second probe, it's the 
> second probe's own fault for assuming the actual state matches the PM 
> status.

Yes. And I can improve the warning for omap_device so the authors
for new drivers can easily fix the issue.

> > > pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> > > invokes the driver's runtime_idle callback if there is one, and the
> > > callback routine can start a suspend or an autosuspend.  If there is no
> > > callback, it will use whatever autosuspend setting the driver has set
> > > up.  If you want to override the autosuspend setting, use
> > > pm_runtime_put_sync_suspend() instead.
> > 
> > Yes.. That works too. I guess the thing to consider is if we should
> > make pm_runtime_put_sync() always sync along the lines of a patch
> > I posted earlier today. That could avoid quite a bit of confusion
> > as already seen in this thread :)
> 
> As Rafael pointed out, pm_runtime_put_sync() has well-documented 
> behavior.  It shouldn't be changed.  I don't see how changing the 
> behavior would reduce anybody's confusion.  At least, anybody who reads 
> the documentation carefully.

Right :)

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:37                                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:37 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160203 07:49]:
> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> 
> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> > > On Tue, 2 Feb 2016, Tony Lindgren wrote:
> > > 
> > > > > Also, what is autosuspend_delay set to for your device?  And is 
> > > > > runtime_auto set?
> > > > 
> > > > It's 100 at that point, see the commented snippet below from
> > > > omap_hsmmc_probe():
> > > > 
> > > > 	pm_runtime_enable(host->dev);
> > > > 	pm_runtime_get_sync(host->dev);
> > > > 	pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> > > > 	/* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> > > > 	pm_runtime_use_autosuspend(host->dev);
> > > > 	...
> > > > 	/* gets -EPROBE_DEFER */
> > > > err_irq:
> > > > 	...
> > > > 	pm_runtime_put_sync(host->dev);
> > > 
> > > You could try changing this to pm_runtime_put_sync_suspend().  But
> > > putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> > > like a perfectly reasonable thing to do, especially if you feel you
> > > should reverse all the changes you made at the start.
> > 
> > They both seem to fix the problem.
> 
> So you could use either one.  In my opinion, the 
> pm_runtime_dont_use_autosuspend() solution is a little cleaner.
> 
> > > The reinit function gets called too late to do what you want -- namely, 
> > > put the hardware in a low-power state.
> > 
> > Right, the problem is the lack of suspend on the first probe. But
> > for having autosuspend timeout long enough for the next probe
> > would mean that we can't reset the PM runtime state in between.
> 
> That's one way to look at it.  But in principle you don't need to
> suspend the device after an unsuccessful probe.  You can just leave it
> at high power.  If this causes problems for a second probe, it's the 
> second probe's own fault for assuming the actual state matches the PM 
> status.

Yes. And I can improve the warning for omap_device so the authors
for new drivers can easily fix the issue.

> > > pm_runtime_put_sync() is supposed to follow the driver's wishes.  It
> > > invokes the driver's runtime_idle callback if there is one, and the
> > > callback routine can start a suspend or an autosuspend.  If there is no
> > > callback, it will use whatever autosuspend setting the driver has set
> > > up.  If you want to override the autosuspend setting, use
> > > pm_runtime_put_sync_suspend() instead.
> > 
> > Yes.. That works too. I guess the thing to consider is if we should
> > make pm_runtime_put_sync() always sync along the lines of a patch
> > I posted earlier today. That could avoid quite a bit of confusion
> > as already seen in this thread :)
> 
> As Rafael pointed out, pm_runtime_put_sync() has well-documented 
> behavior.  It shouldn't be changed.  I don't see how changing the 
> behavior would reduce anybody's confusion.  At least, anybody who reads 
> the documentation carefully.

Right :)

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 13:03                                         ` Rafael J. Wysocki
@ 2016-02-03 16:49                                           ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ulf Hansson, Alan Stern, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 05:04]:
> On Tue, Feb 2, 2016 at 10:39 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
> >> >
> >> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> >> > mode.  If you want to respect it, you have to call
> >> > pm_runtime_put_sync_autosuspend() instead.
> >>
> >> Then there's a bug in the runtime PM core.
> >>
> >> From Tony's regression report and from mine own local runtime PM test
> >> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> >> ->runtime_suspend() callback isn't called), even when the usage count
> >> is zero - when pm_runtime_put_sync() is called.
> > ...
> >> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> >> device even if autosuspend is in use. That seems reasonable, I will
> >> look into this problem.
> >
> > The patch below fixes pm_runtime_put_sync() to not respect the
> > autosuspend mode to match what Alan is saying above. Seems to also
> > fixes the $subject issue for me. And seems to behave for PM runtime
> > for other devices during runtime too based on light testing here.
> >
> > Regards,
> >
> > Tony
> >
> > 8< ---------------
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
> >
> >   out:
> >         trace_rpm_return_int(dev, _THIS_IP_, retval);
> > -       return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
> > +       if (!(rpmflags & RPM_IGNORE_AUTO))
> > +               rpmflags |= RPM_AUTO;
> > +       return retval ? retval : rpm_suspend(dev, rpmflags);
> >  }
> >
> >  /**
> > --- a/include/linux/pm_runtime.h
> > +++ b/include/linux/pm_runtime.h
> > @@ -23,6 +23,8 @@
> >                                             usage_count */
> >  #define RPM_AUTO               0x08    /* Use autosuspend_delay */
> >
> > +#define RPM_IGNORE_AUTO                0x10    /* Ignore autosuspend */
> > +
> >  #ifdef CONFIG_PM
> >  extern struct workqueue_struct *pm_wq;
> >
> > @@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
> >
> >  static inline int pm_runtime_put_sync(struct device *dev)
> >  {
> > -       return __pm_runtime_idle(dev, RPM_GET_PUT);
> > +       return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
> >  }
> >
> >  static inline int pm_runtime_put_sync_suspend(struct device *dev)
> 
> This changes a well-documented behavior that someone may be relying on.
> 
> Not the safest thing to do I have to say.

Yup fine with me.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 16:49                                           ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 16:49 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 05:04]:
> On Tue, Feb 2, 2016 at 10:39 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160202 12:25]:
> >> >
> >> > ?  pm_runtime_put_sync() _already_ does not respect the autosuspend
> >> > mode.  If you want to respect it, you have to call
> >> > pm_runtime_put_sync_autosuspend() instead.
> >>
> >> Then there's a bug in the runtime PM core.
> >>
> >> From Tony's regression report and from mine own local runtime PM test
> >> driver, I can see that the device doesn't get RPM_SUSPENDED (the
> >> ->runtime_suspend() callback isn't called), even when the usage count
> >> is zero - when pm_runtime_put_sync() is called.
> > ...
> >> Okay, so you are saying that the pm_runtime_put_sync() should idle the
> >> device even if autosuspend is in use. That seems reasonable, I will
> >> look into this problem.
> >
> > The patch below fixes pm_runtime_put_sync() to not respect the
> > autosuspend mode to match what Alan is saying above. Seems to also
> > fixes the $subject issue for me. And seems to behave for PM runtime
> > for other devices during runtime too based on light testing here.
> >
> > Regards,
> >
> > Tony
> >
> > 8< ---------------
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -353,7 +353,9 @@ static int rpm_idle(struct device *dev, int rpmflags)
> >
> >   out:
> >         trace_rpm_return_int(dev, _THIS_IP_, retval);
> > -       return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
> > +       if (!(rpmflags & RPM_IGNORE_AUTO))
> > +               rpmflags |= RPM_AUTO;
> > +       return retval ? retval : rpm_suspend(dev, rpmflags);
> >  }
> >
> >  /**
> > --- a/include/linux/pm_runtime.h
> > +++ b/include/linux/pm_runtime.h
> > @@ -23,6 +23,8 @@
> >                                             usage_count */
> >  #define RPM_AUTO               0x08    /* Use autosuspend_delay */
> >
> > +#define RPM_IGNORE_AUTO                0x10    /* Ignore autosuspend */
> > +
> >  #ifdef CONFIG_PM
> >  extern struct workqueue_struct *pm_wq;
> >
> > @@ -241,7 +243,7 @@ static inline int pm_runtime_put_autosuspend(struct device *dev)
> >
> >  static inline int pm_runtime_put_sync(struct device *dev)
> >  {
> > -       return __pm_runtime_idle(dev, RPM_GET_PUT);
> > +       return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_IGNORE_AUTO);
> >  }
> >
> >  static inline int pm_runtime_put_sync_suspend(struct device *dev)
> 
> This changes a well-documented behavior that someone may be relying on.
> 
> Not the safest thing to do I have to say.

Yup fine with me.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 16:24                                                       ` Ulf Hansson
@ 2016-02-03 17:01                                                         ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 17:01 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 08:25]:
> Okay, so I understand that you decided to not pick up the omap hwmod
> patch I posted.

I just suggested a solution of patching both drivers and omap_device
to make it easy for people to fix new drivers, so assuming that sounds
OK to you..

> If you want some further help in fixing the omap drivers, please just
> tell me I am at your service. :-)

Heh me too :) It's probably worth checking the non omap_device drivers
thouhg.

I can check and fix up the omap_device using drivers. So roughly it's
these ones:

$ git grep pm_runtime_use_autosuspend drivers | cut -d: -f1 | \
	sort | uniq | grep omap
drivers/i2c/busses/i2c-omap.c
drivers/mmc/host/omap_hsmmc.c
drivers/spi/spi-omap2-mcspi.c
drivers/tty/serial/8250/8250_omap.c
drivers/tty/serial/omap-serial.c

> Also, we must not forget to also update their runtime PM calls in
> their ->remove() callbacks, as those seems to suffer from the same
> problem as in the -EPROBE_DEFER case.

Yes and any sections within the driver code where clocks really
should be disabled we must use pm_runtime_put_sync_suspend() if
pm_runtime_set_autoidle() is set.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 17:01                                                         ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 17:01 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 08:25]:
> Okay, so I understand that you decided to not pick up the omap hwmod
> patch I posted.

I just suggested a solution of patching both drivers and omap_device
to make it easy for people to fix new drivers, so assuming that sounds
OK to you..

> If you want some further help in fixing the omap drivers, please just
> tell me I am at your service. :-)

Heh me too :) It's probably worth checking the non omap_device drivers
thouhg.

I can check and fix up the omap_device using drivers. So roughly it's
these ones:

$ git grep pm_runtime_use_autosuspend drivers | cut -d: -f1 | \
	sort | uniq | grep omap
drivers/i2c/busses/i2c-omap.c
drivers/mmc/host/omap_hsmmc.c
drivers/spi/spi-omap2-mcspi.c
drivers/tty/serial/8250/8250_omap.c
drivers/tty/serial/omap-serial.c

> Also, we must not forget to also update their runtime PM calls in
> their ->remove() callbacks, as those seems to suffer from the same
> problem as in the -EPROBE_DEFER case.

Yes and any sections within the driver code where clocks really
should be disabled we must use pm_runtime_put_sync_suspend() if
pm_runtime_set_autoidle() is set.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 16:24                                                       ` Ulf Hansson
@ 2016-02-03 17:16                                                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:16 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Tony Lindgren, Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wed, Feb 3, 2016 at 5:24 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 3 February 2016 at 17:09, Tony Lindgren <tony@atomide.com> wrote:
>> * Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
>>> On Wed, 3 Feb 2016, Ulf Hansson wrote:
>>>
>>> > > Let me summarize my understanding of this thread so far.
>>> > >
>>> > > It looks like the omap3 code initializes hardware in ->probe() and
>>> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
>>> > > that case the hardware is not reset to the previous state and the
>>> > > runtime PM framework is left in the state that corresponds to the
>>> > > current hardware state.  Before we had pm_runtime_reinit(), everything
>>> > > worked as expected on the second ->probe() call, because things were
>>> > > in sync then.
>>> >
>>> > Correct!
>>
>> Well not quite correct. After failed probe PM runtime is set to reset
>> state while hardware is still enabled.
>
> Yes, but that's *after* pm_runtime_reinit() was added.
>
> I think Rafael was thinking about how it worked *before*.

Right.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 17:16                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 3, 2016 at 5:24 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 3 February 2016 at 17:09, Tony Lindgren <tony@atomide.com> wrote:
>> * Alan Stern <stern@rowland.harvard.edu> [160203 07:46]:
>>> On Wed, 3 Feb 2016, Ulf Hansson wrote:
>>>
>>> > > Let me summarize my understanding of this thread so far.
>>> > >
>>> > > It looks like the omap3 code initializes hardware in ->probe() and
>>> > > then it may return -EPROBE_DEFER due to some unmet dependencies.  In
>>> > > that case the hardware is not reset to the previous state and the
>>> > > runtime PM framework is left in the state that corresponds to the
>>> > > current hardware state.  Before we had pm_runtime_reinit(), everything
>>> > > worked as expected on the second ->probe() call, because things were
>>> > > in sync then.
>>> >
>>> > Correct!
>>
>> Well not quite correct. After failed probe PM runtime is set to reset
>> state while hardware is still enabled.
>
> Yes, but that's *after* pm_runtime_reinit() was added.
>
> I think Rafael was thinking about how it worked *before*.

Right.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-02 23:46                                                   ` Tony Lindgren
@ 2016-02-03 17:18                                                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:18 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>
>> > > Also, what is autosuspend_delay set to for your device?  And is
>> > > runtime_auto set?
>> >
>> > It's 100 at that point, see the commented snippet below from
>> > omap_hsmmc_probe():
>> >
>> >     pm_runtime_enable(host->dev);
>> >     pm_runtime_get_sync(host->dev);
>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >     pm_runtime_use_autosuspend(host->dev);
>> >     ...
>> >     /* gets -EPROBE_DEFER */
>> > err_irq:
>> >     ...
>> >     pm_runtime_put_sync(host->dev);
>>
>> You could try changing this to pm_runtime_put_sync_suspend().  But
>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> like a perfectly reasonable thing to do, especially if you feel you
>> should reverse all the changes you made at the start.

FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().

After all, the driver doesn't want to use autosuspend going forward,
so stating that explicitly looks like the right thing to do.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 17:18                                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>
>> > > Also, what is autosuspend_delay set to for your device?  And is
>> > > runtime_auto set?
>> >
>> > It's 100 at that point, see the commented snippet below from
>> > omap_hsmmc_probe():
>> >
>> >     pm_runtime_enable(host->dev);
>> >     pm_runtime_get_sync(host->dev);
>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >     pm_runtime_use_autosuspend(host->dev);
>> >     ...
>> >     /* gets -EPROBE_DEFER */
>> > err_irq:
>> >     ...
>> >     pm_runtime_put_sync(host->dev);
>>
>> You could try changing this to pm_runtime_put_sync_suspend().  But
>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> like a perfectly reasonable thing to do, especially if you feel you
>> should reverse all the changes you made at the start.

FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().

After all, the driver doesn't want to use autosuspend going forward,
so stating that explicitly looks like the right thing to do.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 17:18                                                     ` Rafael J. Wysocki
@ 2016-02-03 17:22                                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 17:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Ulf Hansson, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 09:19]:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> >> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> >>
> >> > > Also, what is autosuspend_delay set to for your device?  And is
> >> > > runtime_auto set?
> >> >
> >> > It's 100 at that point, see the commented snippet below from
> >> > omap_hsmmc_probe():
> >> >
> >> >     pm_runtime_enable(host->dev);
> >> >     pm_runtime_get_sync(host->dev);
> >> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> >> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> >> >     pm_runtime_use_autosuspend(host->dev);
> >> >     ...
> >> >     /* gets -EPROBE_DEFER */
> >> > err_irq:
> >> >     ...
> >> >     pm_runtime_put_sync(host->dev);
> >>
> >> You could try changing this to pm_runtime_put_sync_suspend().  But
> >> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> >> like a perfectly reasonable thing to do, especially if you feel you
> >> should reverse all the changes you made at the start.
> 
> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
> 
> After all, the driver doesn't want to use autosuspend going forward,
> so stating that explicitly looks like the right thing to do.

Yeah agreed. FYI, this is what I typed up here into a commit message:

1. For sections of code that needs the device disabled, use
   pm_runtime_put_sync_suspend() if pm_runtime_set_autosuspend() has
   been set.

2. For driver exit code, use pm_runtime_dont_use_autosuspend() before
   pm_runtime_put_sync() if pm_runtime_use_autosuspend() has been
   set.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 17:22                                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 17:22 UTC (permalink / raw)
  To: linux-arm-kernel

* Rafael J. Wysocki <rafael@kernel.org> [160203 09:19]:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
> >> On Tue, 2 Feb 2016, Tony Lindgren wrote:
> >>
> >> > > Also, what is autosuspend_delay set to for your device?  And is
> >> > > runtime_auto set?
> >> >
> >> > It's 100 at that point, see the commented snippet below from
> >> > omap_hsmmc_probe():
> >> >
> >> >     pm_runtime_enable(host->dev);
> >> >     pm_runtime_get_sync(host->dev);
> >> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
> >> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
> >> >     pm_runtime_use_autosuspend(host->dev);
> >> >     ...
> >> >     /* gets -EPROBE_DEFER */
> >> > err_irq:
> >> >     ...
> >> >     pm_runtime_put_sync(host->dev);
> >>
> >> You could try changing this to pm_runtime_put_sync_suspend().  But
> >> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
> >> like a perfectly reasonable thing to do, especially if you feel you
> >> should reverse all the changes you made at the start.
> 
> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
> 
> After all, the driver doesn't want to use autosuspend going forward,
> so stating that explicitly looks like the right thing to do.

Yeah agreed. FYI, this is what I typed up here into a commit message:

1. For sections of code that needs the device disabled, use
   pm_runtime_put_sync_suspend() if pm_runtime_set_autosuspend() has
   been set.

2. For driver exit code, use pm_runtime_dont_use_autosuspend() before
   pm_runtime_put_sync() if pm_runtime_use_autosuspend() has been
   set.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 17:22                                                       ` Tony Lindgren
@ 2016-02-03 17:27                                                         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:27 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Rafael J. Wysocki, Alan Stern, Ulf Hansson, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Wed, Feb 3, 2016 at 6:22 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160203 09:19]:
>> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> >> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>> >>
>> >> > > Also, what is autosuspend_delay set to for your device?  And is
>> >> > > runtime_auto set?
>> >> >
>> >> > It's 100 at that point, see the commented snippet below from
>> >> > omap_hsmmc_probe():
>> >> >
>> >> >     pm_runtime_enable(host->dev);
>> >> >     pm_runtime_get_sync(host->dev);
>> >> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >> >     pm_runtime_use_autosuspend(host->dev);
>> >> >     ...
>> >> >     /* gets -EPROBE_DEFER */
>> >> > err_irq:
>> >> >     ...
>> >> >     pm_runtime_put_sync(host->dev);
>> >>
>> >> You could try changing this to pm_runtime_put_sync_suspend().  But
>> >> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> >> like a perfectly reasonable thing to do, especially if you feel you
>> >> should reverse all the changes you made at the start.
>>
>> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
>>
>> After all, the driver doesn't want to use autosuspend going forward,
>> so stating that explicitly looks like the right thing to do.
>
> Yeah agreed. FYI, this is what I typed up here into a commit message:
>
> 1. For sections of code that needs the device disabled, use
>    pm_runtime_put_sync_suspend() if pm_runtime_set_autosuspend() has
>    been set.
>
> 2. For driver exit code, use pm_runtime_dont_use_autosuspend() before
>    pm_runtime_put_sync() if pm_runtime_use_autosuspend() has been
>    set.

Sounds reasonable to me.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 17:27                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 148+ messages in thread
From: Rafael J. Wysocki @ 2016-02-03 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 3, 2016 at 6:22 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160203 09:19]:
>> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>> >> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>> >>
>> >> > > Also, what is autosuspend_delay set to for your device?  And is
>> >> > > runtime_auto set?
>> >> >
>> >> > It's 100 at that point, see the commented snippet below from
>> >> > omap_hsmmc_probe():
>> >> >
>> >> >     pm_runtime_enable(host->dev);
>> >> >     pm_runtime_get_sync(host->dev);
>> >> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>> >> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>> >> >     pm_runtime_use_autosuspend(host->dev);
>> >> >     ...
>> >> >     /* gets -EPROBE_DEFER */
>> >> > err_irq:
>> >> >     ...
>> >> >     pm_runtime_put_sync(host->dev);
>> >>
>> >> You could try changing this to pm_runtime_put_sync_suspend().  But
>> >> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>> >> like a perfectly reasonable thing to do, especially if you feel you
>> >> should reverse all the changes you made at the start.
>>
>> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
>>
>> After all, the driver doesn't want to use autosuspend going forward,
>> so stating that explicitly looks like the right thing to do.
>
> Yeah agreed. FYI, this is what I typed up here into a commit message:
>
> 1. For sections of code that needs the device disabled, use
>    pm_runtime_put_sync_suspend() if pm_runtime_set_autosuspend() has
>    been set.
>
> 2. For driver exit code, use pm_runtime_dont_use_autosuspend() before
>    pm_runtime_put_sync() if pm_runtime_use_autosuspend() has been
>    set.

Sounds reasonable to me.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 16:27                                             ` Tony Lindgren
@ 2016-02-03 18:02                                               ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 18:02 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 17:27, Tony Lindgren <tony@atomide.com> wrote:
>
>> >> To me, it's the responsible of the PM domain to *help* with the
>> >> synchronization, not prevent it as it currently does.
>> >
>> > The problem is that the hardware state gets out of sync with
>> > PM runtime. And that's going to be a pain to debug later on.
>>
>> I don't see the problem, but of course you know omap for better than I do.
>
> Well there's also the long term maintenance aspect at least I
> need to consider.
>
>> So if you are concerned about this, perhaps adding a dev_dbg print
>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>> be a way forward?
>
> If we downgrade it to a debug statement or a warning, we'll soon end
> up with even more driver specific warnings than we already have.
> And I don't want to be chasing people around to fix their drivers
> for eavery new driver that gets submitted.
>
> Also, without this error I would not even originally have noticed we
> have a problem :) So I suggest the following:

Well, I am actually not removing that existing warning in the
omap_device_enable(), as it's being called from other places as well.

It's just the runtime PM path that's being changed.

>
> 1. I'll do a series of patches to fix up the handful of omap
>    specific drivers with pm_runtime_use_autosuspend() that depend
>    on omap_device
>
> 2. I'll also do a patch to improve the omap_device error message
>    so new drivers are easy to fix. Something like:
>
>   "%() called from invalid state %d, use pm_runtime_put_sync_suspend()?"
>
> Does that sounds OK to you?

Sure.

>
> Regards,
>
> Tony

One more thing though. I just realized that you have yet another issue
to consider going for the approach fixing *only* drivers.

Let me summarize it here:

If userspace has prevented runtime PM (pm_runtime_forbid()) when a
driver becomes unbound, the driver will not be able to suspend the
device by using any of the pm_runtime_suspend() APIs, as the usage
count is isn't zero.

As pm_runtime_reinit() is invoked as part of the driver unbind
sequence, the runtime PM status goes out of sync. A following driver
rebind will then trigger the warning when the PM domain's
->runtime_resume() callback gets invoked. Again, forever preventing
the device from being runtime suspended.

How do you intend to solve this case?
I guess there are two options, pick up the patch I posted for omap
hwmod or make use of pm_runtime_force_suspend() in the driver.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 18:02                                               ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 17:27, Tony Lindgren <tony@atomide.com> wrote:
>
>> >> To me, it's the responsible of the PM domain to *help* with the
>> >> synchronization, not prevent it as it currently does.
>> >
>> > The problem is that the hardware state gets out of sync with
>> > PM runtime. And that's going to be a pain to debug later on.
>>
>> I don't see the problem, but of course you know omap for better than I do.
>
> Well there's also the long term maintenance aspect at least I
> need to consider.
>
>> So if you are concerned about this, perhaps adding a dev_dbg print
>> when the omap hwmod's ->runtime_suspend() callback returns zero could
>> be a way forward?
>
> If we downgrade it to a debug statement or a warning, we'll soon end
> up with even more driver specific warnings than we already have.
> And I don't want to be chasing people around to fix their drivers
> for eavery new driver that gets submitted.
>
> Also, without this error I would not even originally have noticed we
> have a problem :) So I suggest the following:

Well, I am actually not removing that existing warning in the
omap_device_enable(), as it's being called from other places as well.

It's just the runtime PM path that's being changed.

>
> 1. I'll do a series of patches to fix up the handful of omap
>    specific drivers with pm_runtime_use_autosuspend() that depend
>    on omap_device
>
> 2. I'll also do a patch to improve the omap_device error message
>    so new drivers are easy to fix. Something like:
>
>   "%() called from invalid state %d, use pm_runtime_put_sync_suspend()?"
>
> Does that sounds OK to you?

Sure.

>
> Regards,
>
> Tony

One more thing though. I just realized that you have yet another issue
to consider going for the approach fixing *only* drivers.

Let me summarize it here:

If userspace has prevented runtime PM (pm_runtime_forbid()) when a
driver becomes unbound, the driver will not be able to suspend the
device by using any of the pm_runtime_suspend() APIs, as the usage
count is isn't zero.

As pm_runtime_reinit() is invoked as part of the driver unbind
sequence, the runtime PM status goes out of sync. A following driver
rebind will then trigger the warning when the PM domain's
->runtime_resume() callback gets invoked. Again, forever preventing
the device from being runtime suspended.

How do you intend to solve this case?
I guess there are two options, pick up the patch I posted for omap
hwmod or make use of pm_runtime_force_suspend() in the driver.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 18:02                                               ` Ulf Hansson
@ 2016-02-03 18:28                                                 ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 18:28 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> 
> One more thing though. I just realized that you have yet another issue
> to consider going for the approach fixing *only* drivers.
> 
> Let me summarize it here:
> 
> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> driver becomes unbound, the driver will not be able to suspend the
> device by using any of the pm_runtime_suspend() APIs, as the usage
> count is isn't zero.
> 
> As pm_runtime_reinit() is invoked as part of the driver unbind
> sequence, the runtime PM status goes out of sync. A following driver
> rebind will then trigger the warning when the PM domain's
> ->runtime_resume() callback gets invoked. Again, forever preventing
> the device from being runtime suspended.

Hmm yeah that's a good point.

> How do you intend to solve this case?
> I guess there are two options, pick up the patch I posted for omap
> hwmod or make use of pm_runtime_force_suspend() in the driver.

My gut feeling right now is we should just have
BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
automatically as it's unused after the driver has unloaded :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 18:28                                                 ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 18:28 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> 
> One more thing though. I just realized that you have yet another issue
> to consider going for the approach fixing *only* drivers.
> 
> Let me summarize it here:
> 
> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> driver becomes unbound, the driver will not be able to suspend the
> device by using any of the pm_runtime_suspend() APIs, as the usage
> count is isn't zero.
> 
> As pm_runtime_reinit() is invoked as part of the driver unbind
> sequence, the runtime PM status goes out of sync. A following driver
> rebind will then trigger the warning when the PM domain's
> ->runtime_resume() callback gets invoked. Again, forever preventing
> the device from being runtime suspended.

Hmm yeah that's a good point.

> How do you intend to solve this case?
> I guess there are two options, pick up the patch I posted for omap
> hwmod or make use of pm_runtime_force_suspend() in the driver.

My gut feeling right now is we should just have
BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
automatically as it's unused after the driver has unloaded :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 18:28                                                 ` Tony Lindgren
@ 2016-02-03 18:37                                                   ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 18:37 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
>>
>> One more thing though. I just realized that you have yet another issue
>> to consider going for the approach fixing *only* drivers.
>>
>> Let me summarize it here:
>>
>> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
>> driver becomes unbound, the driver will not be able to suspend the
>> device by using any of the pm_runtime_suspend() APIs, as the usage
>> count is isn't zero.
>>
>> As pm_runtime_reinit() is invoked as part of the driver unbind
>> sequence, the runtime PM status goes out of sync. A following driver
>> rebind will then trigger the warning when the PM domain's
>> ->runtime_resume() callback gets invoked. Again, forever preventing
>> the device from being runtime suspended.
>
> Hmm yeah that's a good point.
>
>> How do you intend to solve this case?
>> I guess there are two options, pick up the patch I posted for omap
>> hwmod or make use of pm_runtime_force_suspend() in the driver.
>
> My gut feeling right now is we should just have
> BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> automatically as it's unused after the driver has unloaded :)

BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
invoked from driver core.
So if the driver requires to do a pm_runtime_get_sync() during
->remove() callback, this won't work.

BUS_NOTIFY_UNBOUND_DRIVER may work though.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 18:37                                                   ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-03 18:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
>>
>> One more thing though. I just realized that you have yet another issue
>> to consider going for the approach fixing *only* drivers.
>>
>> Let me summarize it here:
>>
>> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
>> driver becomes unbound, the driver will not be able to suspend the
>> device by using any of the pm_runtime_suspend() APIs, as the usage
>> count is isn't zero.
>>
>> As pm_runtime_reinit() is invoked as part of the driver unbind
>> sequence, the runtime PM status goes out of sync. A following driver
>> rebind will then trigger the warning when the PM domain's
>> ->runtime_resume() callback gets invoked. Again, forever preventing
>> the device from being runtime suspended.
>
> Hmm yeah that's a good point.
>
>> How do you intend to solve this case?
>> I guess there are two options, pick up the patch I posted for omap
>> hwmod or make use of pm_runtime_force_suspend() in the driver.
>
> My gut feeling right now is we should just have
> BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> automatically as it's unused after the driver has unloaded :)

BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
invoked from driver core.
So if the driver requires to do a pm_runtime_get_sync() during
->remove() callback, this won't work.

BUS_NOTIFY_UNBOUND_DRIVER may work though.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 18:37                                                   ` Ulf Hansson
@ 2016-02-03 18:45                                                     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 18:45 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 10:38]:
> On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> >>
> >> One more thing though. I just realized that you have yet another issue
> >> to consider going for the approach fixing *only* drivers.
> >>
> >> Let me summarize it here:
> >>
> >> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> >> driver becomes unbound, the driver will not be able to suspend the
> >> device by using any of the pm_runtime_suspend() APIs, as the usage
> >> count is isn't zero.
> >>
> >> As pm_runtime_reinit() is invoked as part of the driver unbind
> >> sequence, the runtime PM status goes out of sync. A following driver
> >> rebind will then trigger the warning when the PM domain's
> >> ->runtime_resume() callback gets invoked. Again, forever preventing
> >> the device from being runtime suspended.
> >
> > Hmm yeah that's a good point.
> >
> >> How do you intend to solve this case?
> >> I guess there are two options, pick up the patch I posted for omap
> >> hwmod or make use of pm_runtime_force_suspend() in the driver.
> >
> > My gut feeling right now is we should just have
> > BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> > automatically as it's unused after the driver has unloaded :)
> 
> BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
> invoked from driver core.
> So if the driver requires to do a pm_runtime_get_sync() during
> ->remove() callback, this won't work.
> 
> BUS_NOTIFY_UNBOUND_DRIVER may work though.

Right sorry that's what I meant. Naturally we can't do it before
remove :)

I'll take a look.

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 18:45                                                     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 18:45 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160203 10:38]:
> On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> >>
> >> One more thing though. I just realized that you have yet another issue
> >> to consider going for the approach fixing *only* drivers.
> >>
> >> Let me summarize it here:
> >>
> >> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> >> driver becomes unbound, the driver will not be able to suspend the
> >> device by using any of the pm_runtime_suspend() APIs, as the usage
> >> count is isn't zero.
> >>
> >> As pm_runtime_reinit() is invoked as part of the driver unbind
> >> sequence, the runtime PM status goes out of sync. A following driver
> >> rebind will then trigger the warning when the PM domain's
> >> ->runtime_resume() callback gets invoked. Again, forever preventing
> >> the device from being runtime suspended.
> >
> > Hmm yeah that's a good point.
> >
> >> How do you intend to solve this case?
> >> I guess there are two options, pick up the patch I posted for omap
> >> hwmod or make use of pm_runtime_force_suspend() in the driver.
> >
> > My gut feeling right now is we should just have
> > BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> > automatically as it's unused after the driver has unloaded :)
> 
> BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
> invoked from driver core.
> So if the driver requires to do a pm_runtime_get_sync() during
> ->remove() callback, this won't work.
> 
> BUS_NOTIFY_UNBOUND_DRIVER may work though.

Right sorry that's what I meant. Naturally we can't do it before
remove :)

I'll take a look.

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 18:45                                                     ` Tony Lindgren
@ 2016-02-03 21:51                                                       ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 21:51 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160203 10:46]:
> * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:38]:
> > On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> > > * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> > >>
> > >> One more thing though. I just realized that you have yet another issue
> > >> to consider going for the approach fixing *only* drivers.
> > >>
> > >> Let me summarize it here:
> > >>
> > >> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> > >> driver becomes unbound, the driver will not be able to suspend the
> > >> device by using any of the pm_runtime_suspend() APIs, as the usage
> > >> count is isn't zero.
> > >>
> > >> As pm_runtime_reinit() is invoked as part of the driver unbind
> > >> sequence, the runtime PM status goes out of sync. A following driver
> > >> rebind will then trigger the warning when the PM domain's
> > >> ->runtime_resume() callback gets invoked. Again, forever preventing
> > >> the device from being runtime suspended.
> > >
> > > Hmm yeah that's a good point.
> > >
> > >> How do you intend to solve this case?
> > >> I guess there are two options, pick up the patch I posted for omap
> > >> hwmod or make use of pm_runtime_force_suspend() in the driver.
> > >
> > > My gut feeling right now is we should just have
> > > BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> > > automatically as it's unused after the driver has unloaded :)
> > 
> > BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
> > invoked from driver core.
> > So if the driver requires to do a pm_runtime_get_sync() during
> > ->remove() callback, this won't work.
> > 
> > BUS_NOTIFY_UNBOUND_DRIVER may work though.
> 
> Right sorry that's what I meant. Naturally we can't do it before
> remove :)
> 
> I'll take a look.

This patch below seems to fix this issue. I'll do some more
testing here and send the driver fixes and this with proper
commit messages. Thanks for letting me know about that one!

Regards,

Tony

8< -------------------
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -191,12 +191,22 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
 {
 	struct platform_device *pdev = to_platform_device(dev);
 	struct omap_device *od;
+	int err;
 
 	switch (event) {
 	case BUS_NOTIFY_DEL_DEVICE:
 		if (pdev->archdata.od)
 			omap_device_delete(pdev->archdata.od);
 		break;
+	case BUS_NOTIFY_UNBOUND_DRIVER:
+		od = to_omap_device(pdev);
+		if (od && (od->_state == OMAP_DEVICE_STATE_ENABLED)) {
+			dev_info(&pdev->dev, "enabled after unload, idling\n");
+			err = omap_device_idle(pdev);
+			if (err)
+				dev_err(&pdev->dev, "failed to idle\n");
+		}
+		break;
 	case BUS_NOTIFY_ADD_DEVICE:
 		if (pdev->dev.of_node)
 			omap_device_build_from_dt(pdev);

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-03 21:51                                                       ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-03 21:51 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [160203 10:46]:
> * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:38]:
> > On 3 February 2016 at 19:28, Tony Lindgren <tony@atomide.com> wrote:
> > > * Ulf Hansson <ulf.hansson@linaro.org> [160203 10:03]:
> > >>
> > >> One more thing though. I just realized that you have yet another issue
> > >> to consider going for the approach fixing *only* drivers.
> > >>
> > >> Let me summarize it here:
> > >>
> > >> If userspace has prevented runtime PM (pm_runtime_forbid()) when a
> > >> driver becomes unbound, the driver will not be able to suspend the
> > >> device by using any of the pm_runtime_suspend() APIs, as the usage
> > >> count is isn't zero.
> > >>
> > >> As pm_runtime_reinit() is invoked as part of the driver unbind
> > >> sequence, the runtime PM status goes out of sync. A following driver
> > >> rebind will then trigger the warning when the PM domain's
> > >> ->runtime_resume() callback gets invoked. Again, forever preventing
> > >> the device from being runtime suspended.
> > >
> > > Hmm yeah that's a good point.
> > >
> > >> How do you intend to solve this case?
> > >> I guess there are two options, pick up the patch I posted for omap
> > >> hwmod or make use of pm_runtime_force_suspend() in the driver.
> > >
> > > My gut feeling right now is we should just have
> > > BUS_NOTIFY_UNBIND_DRIVER shut down the device on the interconnect
> > > automatically as it's unused after the driver has unloaded :)
> > 
> > BUS_NOTIFY_UNBIND_DRIVER is sent prior the ->remove() callbacks is
> > invoked from driver core.
> > So if the driver requires to do a pm_runtime_get_sync() during
> > ->remove() callback, this won't work.
> > 
> > BUS_NOTIFY_UNBOUND_DRIVER may work though.
> 
> Right sorry that's what I meant. Naturally we can't do it before
> remove :)
> 
> I'll take a look.

This patch below seems to fix this issue. I'll do some more
testing here and send the driver fixes and this with proper
commit messages. Thanks for letting me know about that one!

Regards,

Tony

8< -------------------
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -191,12 +191,22 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
 {
 	struct platform_device *pdev = to_platform_device(dev);
 	struct omap_device *od;
+	int err;
 
 	switch (event) {
 	case BUS_NOTIFY_DEL_DEVICE:
 		if (pdev->archdata.od)
 			omap_device_delete(pdev->archdata.od);
 		break;
+	case BUS_NOTIFY_UNBOUND_DRIVER:
+		od = to_omap_device(pdev);
+		if (od && (od->_state == OMAP_DEVICE_STATE_ENABLED)) {
+			dev_info(&pdev->dev, "enabled after unload, idling\n");
+			err = omap_device_idle(pdev);
+			if (err)
+				dev_err(&pdev->dev, "failed to idle\n");
+		}
+		break;
 	case BUS_NOTIFY_ADD_DEVICE:
 		if (pdev->dev.of_node)
 			omap_device_build_from_dt(pdev);

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-03 17:18                                                     ` Rafael J. Wysocki
@ 2016-02-04 10:20                                                       ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 10:20 UTC (permalink / raw)
  To: Rafael J. Wysocki, Tony Lindgren, Alan Stern
  Cc: Rafael J. Wysocki, Kevin Hilman, linux-pm,
	Linux OMAP Mailing List, linux-arm-kernel

On 3 February 2016 at 18:18, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
>> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>>
>>> > > Also, what is autosuspend_delay set to for your device?  And is
>>> > > runtime_auto set?
>>> >
>>> > It's 100 at that point, see the commented snippet below from
>>> > omap_hsmmc_probe():
>>> >
>>> >     pm_runtime_enable(host->dev);
>>> >     pm_runtime_get_sync(host->dev);
>>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>>> >     pm_runtime_use_autosuspend(host->dev);
>>> >     ...
>>> >     /* gets -EPROBE_DEFER */
>>> > err_irq:
>>> >     ...
>>> >     pm_runtime_put_sync(host->dev);
>>>
>>> You could try changing this to pm_runtime_put_sync_suspend().  But
>>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>>> like a perfectly reasonable thing to do, especially if you feel you
>>> should reverse all the changes you made at the start.
>
> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
>
> After all, the driver doesn't want to use autosuspend going forward,
> so stating that explicitly looks like the right thing to do.

Just wanted to add yet some new findings while I was looking into this
regression.

So, the reason why pm_runtime_put_sync() doesn't idle the device in
these cases, is because autosuspend is respected and for some reason
the autosuspend timer hasn't expired.
I was wondering *why* the timer isn't considered as expired, because
the driver don't call pm_runtime_mark_last_busy() in the failing probe
case...

Then I realized the following commit was merged a few releases ago,
which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
for us.

commit 56f487c78015936097474fd89b2ccb229d500d0f
Author: Tony Lindgren <tony@atomide.com>
Date:   Wed May 13 16:36:32 2015 -0700
PM / Runtime: Update last_busy in rpm_resume

So, this commit actually causes the devices to stay active after a
failed probe attempt.

I think it's a good idea to revert this change, but what to you think?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 10:20                                                       ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 10:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 3 February 2016 at 18:18, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Feb 3, 2016 at 12:46 AM, Tony Lindgren <tony@atomide.com> wrote:
>> * Alan Stern <stern@rowland.harvard.edu> [160202 13:46]:
>>> On Tue, 2 Feb 2016, Tony Lindgren wrote:
>>>
>>> > > Also, what is autosuspend_delay set to for your device?  And is
>>> > > runtime_auto set?
>>> >
>>> > It's 100 at that point, see the commented snippet below from
>>> > omap_hsmmc_probe():
>>> >
>>> >     pm_runtime_enable(host->dev);
>>> >     pm_runtime_get_sync(host->dev);
>>> >     pm_runtime_set_autosuspend_delay(host->dev, MMC_AUTOSUSPEND_DELAY);
>>> >     /* NOTE: pm_runtime_dont_use_autosuspend(host->dev) needed here? */
>>> >     pm_runtime_use_autosuspend(host->dev);
>>> >     ...
>>> >     /* gets -EPROBE_DEFER */
>>> > err_irq:
>>> >     ...
>>> >     pm_runtime_put_sync(host->dev);
>>>
>>> You could try changing this to pm_runtime_put_sync_suspend().  But
>>> putting pm_runtime_dont_use_autosuspend() before the put_sync seems
>>> like a perfectly reasonable thing to do, especially if you feel you
>>> should reverse all the changes you made at the start.
>
> FWIW, I'd call pm_runtime_dont_use_autosuspend() before put_sync().
>
> After all, the driver doesn't want to use autosuspend going forward,
> so stating that explicitly looks like the right thing to do.

Just wanted to add yet some new findings while I was looking into this
regression.

So, the reason why pm_runtime_put_sync() doesn't idle the device in
these cases, is because autosuspend is respected and for some reason
the autosuspend timer hasn't expired.
I was wondering *why* the timer isn't considered as expired, because
the driver don't call pm_runtime_mark_last_busy() in the failing probe
case...

Then I realized the following commit was merged a few releases ago,
which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
for us.

commit 56f487c78015936097474fd89b2ccb229d500d0f
Author: Tony Lindgren <tony@atomide.com>
Date:   Wed May 13 16:36:32 2015 -0700
PM / Runtime: Update last_busy in rpm_resume

So, this commit actually causes the devices to stay active after a
failed probe attempt.

I think it's a good idea to revert this change, but what to you think?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 10:20                                                       ` Ulf Hansson
@ 2016-02-04 16:04                                                         ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-04 16:04 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Tony Lindgren, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Thu, 4 Feb 2016, Ulf Hansson wrote:

> Just wanted to add yet some new findings while I was looking into this
> regression.
> 
> So, the reason why pm_runtime_put_sync() doesn't idle the device in
> these cases, is because autosuspend is respected and for some reason
> the autosuspend timer hasn't expired.

No doubt the autosuspend delay is longer than the time it takes for a 
probe to fail.

> I was wondering *why* the timer isn't considered as expired, because
> the driver don't call pm_runtime_mark_last_busy() in the failing probe
> case...
> 
> Then I realized the following commit was merged a few releases ago,
> which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
> for us.
> 
> commit 56f487c78015936097474fd89b2ccb229d500d0f
> Author: Tony Lindgren <tony@atomide.com>
> Date:   Wed May 13 16:36:32 2015 -0700
> PM / Runtime: Update last_busy in rpm_resume
> 
> So, this commit actually causes the devices to stay active after a
> failed probe attempt.
> 
> I think it's a good idea to revert this change, but what to you think?

I disagree.  The whole idea of autosuspend is to prevent the device's
power state from bouncing up and down too rapidly.  This implies that 
whenever the device gets resumed, we should wait at least the minimum 
autosuspend delay before allowing the next autosuspend.

Perhaps you think that it's silly to behave that way in this case,
because the device wasn't accessed at all during the time it was at
full power.  That's a valid objection, but the proper solution is not
to revert the 56f487c78015 commit.  Rather, change the driver to avoid
doing a pm_runtime_resume_sync until you _know_ that the device will be
accessed soon.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 16:04                                                         ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-04 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 4 Feb 2016, Ulf Hansson wrote:

> Just wanted to add yet some new findings while I was looking into this
> regression.
> 
> So, the reason why pm_runtime_put_sync() doesn't idle the device in
> these cases, is because autosuspend is respected and for some reason
> the autosuspend timer hasn't expired.

No doubt the autosuspend delay is longer than the time it takes for a 
probe to fail.

> I was wondering *why* the timer isn't considered as expired, because
> the driver don't call pm_runtime_mark_last_busy() in the failing probe
> case...
> 
> Then I realized the following commit was merged a few releases ago,
> which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
> for us.
> 
> commit 56f487c78015936097474fd89b2ccb229d500d0f
> Author: Tony Lindgren <tony@atomide.com>
> Date:   Wed May 13 16:36:32 2015 -0700
> PM / Runtime: Update last_busy in rpm_resume
> 
> So, this commit actually causes the devices to stay active after a
> failed probe attempt.
> 
> I think it's a good idea to revert this change, but what to you think?

I disagree.  The whole idea of autosuspend is to prevent the device's
power state from bouncing up and down too rapidly.  This implies that 
whenever the device gets resumed, we should wait at least the minimum 
autosuspend delay before allowing the next autosuspend.

Perhaps you think that it's silly to behave that way in this case,
because the device wasn't accessed at all during the time it was at
full power.  That's a valid objection, but the proper solution is not
to revert the 56f487c78015 commit.  Rather, change the driver to avoid
doing a pm_runtime_resume_sync until you _know_ that the device will be
accessed soon.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 16:04                                                         ` Alan Stern
@ 2016-02-04 17:20                                                           ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-04 17:20 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160204 08:05]:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
> 
> > Just wanted to add yet some new findings while I was looking into this
> > regression.
> > 
> > So, the reason why pm_runtime_put_sync() doesn't idle the device in
> > these cases, is because autosuspend is respected and for some reason
> > the autosuspend timer hasn't expired.
> 
> No doubt the autosuspend delay is longer than the time it takes for a 
> probe to fail.
> 
> > I was wondering *why* the timer isn't considered as expired, because
> > the driver don't call pm_runtime_mark_last_busy() in the failing probe
> > case...
> > 
> > Then I realized the following commit was merged a few releases ago,
> > which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
> > for us.
> > 
> > commit 56f487c78015936097474fd89b2ccb229d500d0f
> > Author: Tony Lindgren <tony@atomide.com>
> > Date:   Wed May 13 16:36:32 2015 -0700
> > PM / Runtime: Update last_busy in rpm_resume
> > 
> > So, this commit actually causes the devices to stay active after a
> > failed probe attempt.
> > 
> > I think it's a good idea to revert this change, but what to you think?
> 
> I disagree.  The whole idea of autosuspend is to prevent the device's
> power state from bouncing up and down too rapidly.  This implies that 
> whenever the device gets resumed, we should wait at least the minimum 
> autosuspend delay before allowing the next autosuspend.

Yeah let's not revert 56f487c78015. Without that we have devices
falling right back to sleep.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 17:20                                                           ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-04 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

* Alan Stern <stern@rowland.harvard.edu> [160204 08:05]:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
> 
> > Just wanted to add yet some new findings while I was looking into this
> > regression.
> > 
> > So, the reason why pm_runtime_put_sync() doesn't idle the device in
> > these cases, is because autosuspend is respected and for some reason
> > the autosuspend timer hasn't expired.
> 
> No doubt the autosuspend delay is longer than the time it takes for a 
> probe to fail.
> 
> > I was wondering *why* the timer isn't considered as expired, because
> > the driver don't call pm_runtime_mark_last_busy() in the failing probe
> > case...
> > 
> > Then I realized the following commit was merged a few releases ago,
> > which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
> > for us.
> > 
> > commit 56f487c78015936097474fd89b2ccb229d500d0f
> > Author: Tony Lindgren <tony@atomide.com>
> > Date:   Wed May 13 16:36:32 2015 -0700
> > PM / Runtime: Update last_busy in rpm_resume
> > 
> > So, this commit actually causes the devices to stay active after a
> > failed probe attempt.
> > 
> > I think it's a good idea to revert this change, but what to you think?
> 
> I disagree.  The whole idea of autosuspend is to prevent the device's
> power state from bouncing up and down too rapidly.  This implies that 
> whenever the device gets resumed, we should wait at least the minimum 
> autosuspend delay before allowing the next autosuspend.

Yeah let's not revert 56f487c78015. Without that we have devices
falling right back to sleep.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 16:04                                                         ` Alan Stern
@ 2016-02-04 21:11                                                           ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 21:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Tony Lindgren, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On 4 February 2016 at 17:04, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
>
>> Just wanted to add yet some new findings while I was looking into this
>> regression.
>>
>> So, the reason why pm_runtime_put_sync() doesn't idle the device in
>> these cases, is because autosuspend is respected and for some reason
>> the autosuspend timer hasn't expired.
>
> No doubt the autosuspend delay is longer than the time it takes for a
> probe to fail.
>
>> I was wondering *why* the timer isn't considered as expired, because
>> the driver don't call pm_runtime_mark_last_busy() in the failing probe
>> case...
>>
>> Then I realized the following commit was merged a few releases ago,
>> which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
>> for us.
>>
>> commit 56f487c78015936097474fd89b2ccb229d500d0f
>> Author: Tony Lindgren <tony@atomide.com>
>> Date:   Wed May 13 16:36:32 2015 -0700
>> PM / Runtime: Update last_busy in rpm_resume
>>
>> So, this commit actually causes the devices to stay active after a
>> failed probe attempt.
>>
>> I think it's a good idea to revert this change, but what to you think?
>
> I disagree.  The whole idea of autosuspend is to prevent the device's
> power state from bouncing up and down too rapidly.  This implies that
> whenever the device gets resumed, we should wait at least the minimum
> autosuspend delay before allowing the next autosuspend.

I am really not questioning the autosuspend feature at all, it's a
really great feature!

Now, I question the minor benefit we actually gain from having the
runtime PM core to update the mark in rpm_resume().

To me, the best decision when to update the mark is know by the
driver/subsystem for the device and not the core.

In most cases the mark will be updated after a request has been
completed, which leads to one unnecessary update at rpm_resume().

In this path (the resume), you really want to keep latencies to a
minimum and for sure not do unnecessary things.

>
> Perhaps you think that it's silly to behave that way in this case,
> because the device wasn't accessed at all during the time it was at
> full power.  That's a valid objection, but the proper solution is not
> to revert the 56f487c78015 commit.  Rather, change the driver to avoid
> doing a pm_runtime_resume_sync until you _know_ that the device will be
> accessed soon.

That's not always going to work.

Sometimes you need to access the device when trying to probe. Failing
later in probe, shows just *one* case where it doesn't make sense to
update the last busy mark. I suspect there may be other cases as well.

Of course one can always use runtime PM APIs which overrides the
autosuspend mode, so it's not a big deal.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 21:11                                                           ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 21:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 February 2016 at 17:04, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
>
>> Just wanted to add yet some new findings while I was looking into this
>> regression.
>>
>> So, the reason why pm_runtime_put_sync() doesn't idle the device in
>> these cases, is because autosuspend is respected and for some reason
>> the autosuspend timer hasn't expired.
>
> No doubt the autosuspend delay is longer than the time it takes for a
> probe to fail.
>
>> I was wondering *why* the timer isn't considered as expired, because
>> the driver don't call pm_runtime_mark_last_busy() in the failing probe
>> case...
>>
>> Then I realized the following commit was merged a few releases ago,
>> which makes the runtime PM core to invoke pm_runtime_mark_last_busy()
>> for us.
>>
>> commit 56f487c78015936097474fd89b2ccb229d500d0f
>> Author: Tony Lindgren <tony@atomide.com>
>> Date:   Wed May 13 16:36:32 2015 -0700
>> PM / Runtime: Update last_busy in rpm_resume
>>
>> So, this commit actually causes the devices to stay active after a
>> failed probe attempt.
>>
>> I think it's a good idea to revert this change, but what to you think?
>
> I disagree.  The whole idea of autosuspend is to prevent the device's
> power state from bouncing up and down too rapidly.  This implies that
> whenever the device gets resumed, we should wait at least the minimum
> autosuspend delay before allowing the next autosuspend.

I am really not questioning the autosuspend feature at all, it's a
really great feature!

Now, I question the minor benefit we actually gain from having the
runtime PM core to update the mark in rpm_resume().

To me, the best decision when to update the mark is know by the
driver/subsystem for the device and not the core.

In most cases the mark will be updated after a request has been
completed, which leads to one unnecessary update at rpm_resume().

In this path (the resume), you really want to keep latencies to a
minimum and for sure not do unnecessary things.

>
> Perhaps you think that it's silly to behave that way in this case,
> because the device wasn't accessed at all during the time it was at
> full power.  That's a valid objection, but the proper solution is not
> to revert the 56f487c78015 commit.  Rather, change the driver to avoid
> doing a pm_runtime_resume_sync until you _know_ that the device will be
> accessed soon.

That's not always going to work.

Sometimes you need to access the device when trying to probe. Failing
later in probe, shows just *one* case where it doesn't make sense to
update the last busy mark. I suspect there may be other cases as well.

Of course one can always use runtime PM APIs which overrides the
autosuspend mode, so it's not a big deal.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 21:11                                                           ` Ulf Hansson
@ 2016-02-04 22:09                                                             ` Alan Stern
  -1 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-04 22:09 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Tony Lindgren, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On Thu, 4 Feb 2016, Ulf Hansson wrote:

> I am really not questioning the autosuspend feature at all, it's a
> really great feature!
> 
> Now, I question the minor benefit we actually gain from having the
> runtime PM core to update the mark in rpm_resume().

As Tony pointed out, it prevents some devices from going to sleep right 
away.

> To me, the best decision when to update the mark is know by the
> driver/subsystem for the device and not the core.
> 
> In most cases the mark will be updated after a request has been
> completed, which leads to one unnecessary update at rpm_resume().

Sure, but that update is a simple assignment statement.  It's about as 
cheap as you can get, short of doing nothing at all.

> In this path (the resume), you really want to keep latencies to a
> minimum and for sure not do unnecessary things.
> 
> >
> > Perhaps you think that it's silly to behave that way in this case,
> > because the device wasn't accessed at all during the time it was at
> > full power.  That's a valid objection, but the proper solution is not
> > to revert the 56f487c78015 commit.  Rather, change the driver to avoid
> > doing a pm_runtime_resume_sync until you _know_ that the device will be
> > accessed soon.
> 
> That's not always going to work.
> 
> Sometimes you need to access the device when trying to probe. Failing
> later in probe, shows just *one* case where it doesn't make sense to
> update the last busy mark. I suspect there may be other cases as well.

I don't follow your reasoning.  If you don't update the last_busy mark 
then the probe fails, the device goes to sleep immediately, and then 
wakes up again a fraction of a second later for another probe attempt 
(if the error was -EDEFER).  Thus you get an unnecessary suspend 
followed by an unnecessary resume.

If the error was something other than -EDEFER and there will be no more 
probes, then yes -- the device remains at full power for longer than 
necessary.  But how often does that happen?  In general, people have 
drivers that _do_ work with their devices.

> Of course one can always use runtime PM APIs which overrides the
> autosuspend mode, so it's not a big deal.

Or turn autosuspend off completely when you know you're not going to
want it any more.  True.

Alan Stern


^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 22:09                                                             ` Alan Stern
  0 siblings, 0 replies; 148+ messages in thread
From: Alan Stern @ 2016-02-04 22:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 4 Feb 2016, Ulf Hansson wrote:

> I am really not questioning the autosuspend feature at all, it's a
> really great feature!
> 
> Now, I question the minor benefit we actually gain from having the
> runtime PM core to update the mark in rpm_resume().

As Tony pointed out, it prevents some devices from going to sleep right 
away.

> To me, the best decision when to update the mark is know by the
> driver/subsystem for the device and not the core.
> 
> In most cases the mark will be updated after a request has been
> completed, which leads to one unnecessary update at rpm_resume().

Sure, but that update is a simple assignment statement.  It's about as 
cheap as you can get, short of doing nothing at all.

> In this path (the resume), you really want to keep latencies to a
> minimum and for sure not do unnecessary things.
> 
> >
> > Perhaps you think that it's silly to behave that way in this case,
> > because the device wasn't accessed at all during the time it was at
> > full power.  That's a valid objection, but the proper solution is not
> > to revert the 56f487c78015 commit.  Rather, change the driver to avoid
> > doing a pm_runtime_resume_sync until you _know_ that the device will be
> > accessed soon.
> 
> That's not always going to work.
> 
> Sometimes you need to access the device when trying to probe. Failing
> later in probe, shows just *one* case where it doesn't make sense to
> update the last busy mark. I suspect there may be other cases as well.

I don't follow your reasoning.  If you don't update the last_busy mark 
then the probe fails, the device goes to sleep immediately, and then 
wakes up again a fraction of a second later for another probe attempt 
(if the error was -EDEFER).  Thus you get an unnecessary suspend 
followed by an unnecessary resume.

If the error was something other than -EDEFER and there will be no more 
probes, then yes -- the device remains at full power for longer than 
necessary.  But how often does that happen?  In general, people have 
drivers that _do_ work with their devices.

> Of course one can always use runtime PM APIs which overrides the
> autosuspend mode, so it's not a big deal.

Or turn autosuspend off completely when you know you're not going to
want it any more.  True.

Alan Stern

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 22:09                                                             ` Alan Stern
@ 2016-02-04 22:34                                                               ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 22:34 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Tony Lindgren, Rafael J. Wysocki,
	Kevin Hilman, linux-pm, Linux OMAP Mailing List,
	linux-arm-kernel

On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
>
>> I am really not questioning the autosuspend feature at all, it's a
>> really great feature!
>>
>> Now, I question the minor benefit we actually gain from having the
>> runtime PM core to update the mark in rpm_resume().
>
> As Tony pointed out, it prevents some devices from going to sleep right
> away.

Because their drivers don't care to update the last busy mark!?

>
>> To me, the best decision when to update the mark is know by the
>> driver/subsystem for the device and not the core.
>>
>> In most cases the mark will be updated after a request has been
>> completed, which leads to one unnecessary update at rpm_resume().
>
> Sure, but that update is a simple assignment statement.  It's about as
> cheap as you can get, short of doing nothing at all.

Valid point. I rest my case. :-)

Thanks for the discussion.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-04 22:34                                                               ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-04 22:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 4 Feb 2016, Ulf Hansson wrote:
>
>> I am really not questioning the autosuspend feature at all, it's a
>> really great feature!
>>
>> Now, I question the minor benefit we actually gain from having the
>> runtime PM core to update the mark in rpm_resume().
>
> As Tony pointed out, it prevents some devices from going to sleep right
> away.

Because their drivers don't care to update the last busy mark!?

>
>> To me, the best decision when to update the mark is know by the
>> driver/subsystem for the device and not the core.
>>
>> In most cases the mark will be updated after a request has been
>> completed, which leads to one unnecessary update at rpm_resume().
>
> Sure, but that update is a simple assignment statement.  It's about as
> cheap as you can get, short of doing nothing at all.

Valid point. I rest my case. :-)

Thanks for the discussion.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-04 22:34                                                               ` Ulf Hansson
@ 2016-02-05  1:08                                                                 ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-05  1:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
> >
> >> I am really not questioning the autosuspend feature at all, it's a
> >> really great feature!
> >>
> >> Now, I question the minor benefit we actually gain from having the
> >> runtime PM core to update the mark in rpm_resume().
> >
> > As Tony pointed out, it prevents some devices from going to sleep right
> > away.
> 
> Because their drivers don't care to update the last busy mark!?

Nope. Without that devices may never resume at all so the drivers
can't do anything about it.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-05  1:08                                                                 ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-05  1:08 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
> >
> >> I am really not questioning the autosuspend feature at all, it's a
> >> really great feature!
> >>
> >> Now, I question the minor benefit we actually gain from having the
> >> runtime PM core to update the mark in rpm_resume().
> >
> > As Tony pointed out, it prevents some devices from going to sleep right
> > away.
> 
> Because their drivers don't care to update the last busy mark!?

Nope. Without that devices may never resume at all so the drivers
can't do anything about it.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-05  1:08                                                                 ` Tony Lindgren
@ 2016-02-05  6:54                                                                   ` Ulf Hansson
  -1 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-05  6:54 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

On 5 February 2016 at 02:08, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
>> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
>> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
>> >
>> >> I am really not questioning the autosuspend feature at all, it's a
>> >> really great feature!
>> >>
>> >> Now, I question the minor benefit we actually gain from having the
>> >> runtime PM core to update the mark in rpm_resume().
>> >
>> > As Tony pointed out, it prevents some devices from going to sleep right
>> > away.
>>
>> Because their drivers don't care to update the last busy mark!?
>
> Nope. Without that devices may never resume at all so the drivers
> can't do anything about it.

I don't get it. Why not? Because of another abuse of the runtime PM API?

Or we should probably continue to focus on fixing the regression. :-)

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-05  6:54                                                                   ` Ulf Hansson
  0 siblings, 0 replies; 148+ messages in thread
From: Ulf Hansson @ 2016-02-05  6:54 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 February 2016 at 02:08, Tony Lindgren <tony@atomide.com> wrote:
> * Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
>> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
>> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
>> >
>> >> I am really not questioning the autosuspend feature at all, it's a
>> >> really great feature!
>> >>
>> >> Now, I question the minor benefit we actually gain from having the
>> >> runtime PM core to update the mark in rpm_resume().
>> >
>> > As Tony pointed out, it prevents some devices from going to sleep right
>> > away.
>>
>> Because their drivers don't care to update the last busy mark!?
>
> Nope. Without that devices may never resume at all so the drivers
> can't do anything about it.

I don't get it. Why not? Because of another abuse of the runtime PM API?

Or we should probably continue to focus on fixing the regression. :-)

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 148+ messages in thread

* Re: PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
  2016-02-05  6:54                                                                   ` Ulf Hansson
@ 2016-02-05 19:10                                                                     ` Tony Lindgren
  -1 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-05 19:10 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Alan Stern, Rafael J. Wysocki, Rafael J. Wysocki, Kevin Hilman,
	linux-pm, Linux OMAP Mailing List, linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160204 22:55]:
> On 5 February 2016 at 02:08, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
> >> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> >> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
> >> >
> >> >> I am really not questioning the autosuspend feature at all, it's a
> >> >> really great feature!
> >> >>
> >> >> Now, I question the minor benefit we actually gain from having the
> >> >> runtime PM core to update the mark in rpm_resume().
> >> >
> >> > As Tony pointed out, it prevents some devices from going to sleep right
> >> > away.
> >>
> >> Because their drivers don't care to update the last busy mark!?
> >
> > Nope. Without that devices may never resume at all so the drivers
> > can't do anything about it.
> 
> I don't get it. Why not? Because of another abuse of the runtime PM API?

I think you should be able to test this case in your test driver by
calling pm_runtime_resume() for your test driver after your
test drive has autosuspended. Probably you need some delayed_work to
do this in your test driver unless you have some test bus to go with it.

> Or we should probably continue to focus on fixing the regression. :-)

Naturally we should fix up things yeah :) Having a test driver
that works on any architecture sure makes things easier to verify.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

* PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1
@ 2016-02-05 19:10                                                                     ` Tony Lindgren
  0 siblings, 0 replies; 148+ messages in thread
From: Tony Lindgren @ 2016-02-05 19:10 UTC (permalink / raw)
  To: linux-arm-kernel

* Ulf Hansson <ulf.hansson@linaro.org> [160204 22:55]:
> On 5 February 2016 at 02:08, Tony Lindgren <tony@atomide.com> wrote:
> > * Ulf Hansson <ulf.hansson@linaro.org> [160204 14:35]:
> >> On 4 February 2016 at 23:09, Alan Stern <stern@rowland.harvard.edu> wrote:
> >> > On Thu, 4 Feb 2016, Ulf Hansson wrote:
> >> >
> >> >> I am really not questioning the autosuspend feature at all, it's a
> >> >> really great feature!
> >> >>
> >> >> Now, I question the minor benefit we actually gain from having the
> >> >> runtime PM core to update the mark in rpm_resume().
> >> >
> >> > As Tony pointed out, it prevents some devices from going to sleep right
> >> > away.
> >>
> >> Because their drivers don't care to update the last busy mark!?
> >
> > Nope. Without that devices may never resume at all so the drivers
> > can't do anything about it.
> 
> I don't get it. Why not? Because of another abuse of the runtime PM API?

I think you should be able to test this case in your test driver by
calling pm_runtime_resume() for your test driver after your
test drive has autosuspended. Probably you need some delayed_work to
do this in your test driver unless you have some test bus to go with it.

> Or we should probably continue to focus on fixing the regression. :-)

Naturally we should fix up things yeah :) Having a test driver
that works on any architecture sure makes things easier to verify.

Regards,

Tony

^ permalink raw reply	[flat|nested] 148+ messages in thread

end of thread, other threads:[~2016-02-05 19:10 UTC | newest]

Thread overview: 148+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-26 22:48 PM regression with commit 5de85b9d57ab PM runtime re-init in v4.5-rc1 Tony Lindgren
2016-01-26 22:48 ` Tony Lindgren
2016-01-26 22:50 ` Tony Lindgren
2016-01-26 22:50   ` Tony Lindgren
2016-01-26 23:14 ` Rafael J. Wysocki
2016-01-26 23:14   ` Rafael J. Wysocki
2016-01-26 23:22   ` Tony Lindgren
2016-01-26 23:22     ` Tony Lindgren
2016-01-26 23:37     ` Rafael J. Wysocki
2016-01-26 23:37       ` Rafael J. Wysocki
2016-01-26 23:52       ` Tony Lindgren
2016-01-26 23:52         ` Tony Lindgren
2016-01-27  7:54         ` Rafael J. Wysocki
2016-01-27  7:54           ` Rafael J. Wysocki
2016-01-27  8:17           ` Ulf Hansson
2016-01-27  8:17             ` Ulf Hansson
2016-01-27 15:19             ` Tony Lindgren
2016-01-27 15:19               ` Tony Lindgren
2016-01-27 22:51             ` Rafael J. Wysocki
2016-01-27 22:51               ` Rafael J. Wysocki
2016-01-28 14:29         ` Ulf Hansson
2016-01-28 14:29           ` Ulf Hansson
2016-01-28 16:58           ` Tony Lindgren
2016-01-28 16:58             ` Tony Lindgren
2016-02-01 16:44             ` Ulf Hansson
2016-02-01 16:44               ` Ulf Hansson
2016-02-01 18:11               ` Tony Lindgren
2016-02-01 18:11                 ` Tony Lindgren
2016-02-01 22:06                 ` Tony Lindgren
2016-02-01 22:06                   ` Tony Lindgren
2016-02-01 22:17                   ` Rafael J. Wysocki
2016-02-01 22:17                     ` Rafael J. Wysocki
2016-02-01 22:29                     ` Tony Lindgren
2016-02-01 22:29                       ` Tony Lindgren
2016-02-01 23:10                       ` Rafael J. Wysocki
2016-02-01 23:10                         ` Rafael J. Wysocki
2016-02-01 23:28                         ` Tony Lindgren
2016-02-01 23:28                           ` Tony Lindgren
2016-02-01 23:44                           ` Tony Lindgren
2016-02-01 23:44                             ` Tony Lindgren
2016-02-01 23:49                           ` Alan Stern
2016-02-01 23:49                             ` Alan Stern
2016-02-02  3:05                             ` Tony Lindgren
2016-02-02  3:05                               ` Tony Lindgren
2016-02-02 10:07                               ` Ulf Hansson
2016-02-02 10:07                                 ` Ulf Hansson
2016-02-02 10:42                                 ` Ulf Hansson
2016-02-02 10:42                                   ` Ulf Hansson
2016-02-02 16:23                                   ` Alan Stern
2016-02-02 16:23                                     ` Alan Stern
2016-02-02 16:35                                   ` Tony Lindgren
2016-02-02 16:35                                     ` Tony Lindgren
2016-02-02 20:47                                     ` Ulf Hansson
2016-02-02 20:47                                       ` Ulf Hansson
2016-02-02 23:41                                       ` Tony Lindgren
2016-02-02 23:41                                         ` Tony Lindgren
2016-02-03 10:23                                         ` Ulf Hansson
2016-02-03 10:23                                           ` Ulf Hansson
2016-02-03 10:25                                           ` Ulf Hansson
2016-02-03 10:25                                             ` Ulf Hansson
2016-02-03 12:18                                             ` Rafael J. Wysocki
2016-02-03 12:18                                               ` Rafael J. Wysocki
2016-02-03 14:58                                               ` Ulf Hansson
2016-02-03 14:58                                                 ` Ulf Hansson
2016-02-03 15:45                                                 ` Alan Stern
2016-02-03 15:45                                                   ` Alan Stern
2016-02-03 16:09                                                   ` Tony Lindgren
2016-02-03 16:09                                                     ` Tony Lindgren
2016-02-03 16:24                                                     ` Ulf Hansson
2016-02-03 16:24                                                       ` Ulf Hansson
2016-02-03 17:01                                                       ` Tony Lindgren
2016-02-03 17:01                                                         ` Tony Lindgren
2016-02-03 17:16                                                       ` Rafael J. Wysocki
2016-02-03 17:16                                                         ` Rafael J. Wysocki
2016-02-03 16:27                                           ` Tony Lindgren
2016-02-03 16:27                                             ` Tony Lindgren
2016-02-03 18:02                                             ` Ulf Hansson
2016-02-03 18:02                                               ` Ulf Hansson
2016-02-03 18:28                                               ` Tony Lindgren
2016-02-03 18:28                                                 ` Tony Lindgren
2016-02-03 18:37                                                 ` Ulf Hansson
2016-02-03 18:37                                                   ` Ulf Hansson
2016-02-03 18:45                                                   ` Tony Lindgren
2016-02-03 18:45                                                     ` Tony Lindgren
2016-02-03 21:51                                                     ` Tony Lindgren
2016-02-03 21:51                                                       ` Tony Lindgren
2016-02-02 16:15                                 ` Alan Stern
2016-02-02 16:15                                   ` Alan Stern
2016-02-02 16:49                                   ` Tony Lindgren
2016-02-02 16:49                                     ` Tony Lindgren
2016-02-02 18:05                                     ` Tony Lindgren
2016-02-02 18:05                                       ` Tony Lindgren
2016-02-02 18:43                                       ` Alan Stern
2016-02-02 18:43                                         ` Alan Stern
2016-02-02 18:54                                         ` Tony Lindgren
2016-02-02 18:54                                           ` Tony Lindgren
2016-02-02 19:16                                           ` Alan Stern
2016-02-02 19:16                                             ` Alan Stern
2016-02-02 21:03                                             ` Tony Lindgren
2016-02-02 21:03                                               ` Tony Lindgren
2016-02-02 21:45                                               ` Alan Stern
2016-02-02 21:45                                                 ` Alan Stern
2016-02-02 23:46                                                 ` Tony Lindgren
2016-02-02 23:46                                                   ` Tony Lindgren
2016-02-03 13:06                                                   ` Rafael J. Wysocki
2016-02-03 13:06                                                     ` Rafael J. Wysocki
2016-02-03 16:36                                                     ` Tony Lindgren
2016-02-03 16:36                                                       ` Tony Lindgren
2016-02-03 15:48                                                   ` Alan Stern
2016-02-03 15:48                                                     ` Alan Stern
2016-02-03 16:37                                                     ` Tony Lindgren
2016-02-03 16:37                                                       ` Tony Lindgren
2016-02-03 17:18                                                   ` Rafael J. Wysocki
2016-02-03 17:18                                                     ` Rafael J. Wysocki
2016-02-03 17:22                                                     ` Tony Lindgren
2016-02-03 17:22                                                       ` Tony Lindgren
2016-02-03 17:27                                                       ` Rafael J. Wysocki
2016-02-03 17:27                                                         ` Rafael J. Wysocki
2016-02-04 10:20                                                     ` Ulf Hansson
2016-02-04 10:20                                                       ` Ulf Hansson
2016-02-04 16:04                                                       ` Alan Stern
2016-02-04 16:04                                                         ` Alan Stern
2016-02-04 17:20                                                         ` Tony Lindgren
2016-02-04 17:20                                                           ` Tony Lindgren
2016-02-04 21:11                                                         ` Ulf Hansson
2016-02-04 21:11                                                           ` Ulf Hansson
2016-02-04 22:09                                                           ` Alan Stern
2016-02-04 22:09                                                             ` Alan Stern
2016-02-04 22:34                                                             ` Ulf Hansson
2016-02-04 22:34                                                               ` Ulf Hansson
2016-02-05  1:08                                                               ` Tony Lindgren
2016-02-05  1:08                                                                 ` Tony Lindgren
2016-02-05  6:54                                                                 ` Ulf Hansson
2016-02-05  6:54                                                                   ` Ulf Hansson
2016-02-05 19:10                                                                   ` Tony Lindgren
2016-02-05 19:10                                                                     ` Tony Lindgren
2016-02-02 18:47                                       ` Tony Lindgren
2016-02-02 18:47                                         ` Tony Lindgren
2016-02-02 20:24                                   ` Ulf Hansson
2016-02-02 20:24                                     ` Ulf Hansson
2016-02-02 21:24                                     ` Alan Stern
2016-02-02 21:24                                       ` Alan Stern
2016-02-02 21:39                                     ` Tony Lindgren
2016-02-02 21:39                                       ` Tony Lindgren
2016-02-03 13:03                                       ` Rafael J. Wysocki
2016-02-03 13:03                                         ` Rafael J. Wysocki
2016-02-03 16:49                                         ` Tony Lindgren
2016-02-03 16:49                                           ` Tony Lindgren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.