linux-kernel.vger.kernel.org archive mirror
* [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Grygorii Strashko @ 2016-05-13 18:03 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek
  Cc: Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson, linux-kernel,
	Grygorii Strashko

PM runtime is left disabled for a device if its .suspend_late()
callback fails and async suspend is not allowed for that device. In
this case the device is not added to dpm_late_early_list, so
dpm_resume_early() ignores it and, as a result, PM runtime stays
disabled for it forever (side effect: after 8 such failures for the
same device, PM runtime is re-enabled due to disable_depth
overflow).

Hence, re-enable PM runtime in __device_suspend_late() if the
.suspend_late() callback fails and async suspend is not allowed for
this device.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/base/power/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 6e7c3cc..9b266e5 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1207,10 +1207,13 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
+	if (!error) {
 		dev->power.is_late_suspended = true;
-	else
+	} else {
 		async_error = error;
+		if (!is_async(dev))
+			pm_runtime_enable(dev);
+	}
 
 Complete:
 	TRACE_SUSPEND(error);
-- 
2.8.2
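
The "8 failures" side effect from the changelog above can be illustrated with a small standalone model. This is only a sketch: it assumes disable_depth is a 3-bit unsigned bitfield (which is what the figure of 8 implies), and the struct and helper names are hypothetical, not kernel code.

```c
/* Hypothetical stand-in for the relevant part of struct dev_pm_info. */
struct model_pm_info {
	unsigned int disable_depth:3;	/* assumed 3 bits wide: holds 0..7 */
};

/* Models an unbalanced disable: the counter goes up and never comes down. */
static void model_runtime_disable(struct model_pm_info *p)
{
	p->disable_depth++;	/* an unsigned bitfield wraps from 7 back to 0 */
}

/* Returns the counter value after 'failures' unbalanced disables. */
static unsigned int model_depth_after(unsigned int failures)
{
	struct model_pm_info p = { .disable_depth = 0 };
	unsigned int i;

	for (i = 0; i < failures; i++)
		model_runtime_disable(&p);
	return p.disable_depth;
}
```

In this model, model_depth_after(7) is still non-zero (runtime PM disabled), while model_depth_after(8) wraps back to 0, which would look like "enabled" again.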

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Rafael J. Wysocki @ 2016-05-19 13:38 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On Fri, May 13, 2016 at 8:03 PM, Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
>         error = dpm_run_callback(callback, dev, state, info);
> -       if (!error)
> +       if (!error) {
>                 dev->power.is_late_suspended = true;
> -       else
> +       } else {
>                 async_error = error;
> +               if (!is_async(dev))

Why is the is_async() check necessary here?

> +                       pm_runtime_enable(dev);
> +       }
>
>  Complete:
>         TRACE_SUSPEND(error);
> --

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Grygorii Strashko @ 2016-05-19 17:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On 05/19/2016 04:38 PM, Rafael J. Wysocki wrote:
> On Fri, May 13, 2016 at 8:03 PM, Grygorii Strashko
> <grygorii.strashko@ti.com> wrote:
>>          error = dpm_run_callback(callback, dev, state, info);
>> -       if (!error)
>> +       if (!error) {
>>                  dev->power.is_late_suspended = true;
>> -       else
>> +       } else {
		Point [1]
>>                  async_error = error;
>> +               if (!is_async(dev))
> 
> Why is the is_async() check necessary here?
			
A: deviceX is suspended *async* and reaches point [1]. In this case:
- deviceX has already been added to dpm_late_early_list
- dpm_suspend_late() will detect async_error and call dpm_resume_early()
- dpm_resume_early() will call device_resume_early() for deviceX
- device_resume_early() will re-enable PM runtime:
{
...
	if (!dev->power.is_late_suspended)
		goto Out;

	...
 Out:
	TRACE_RESUME(error);

	pm_runtime_enable(dev);
^^^^^^^^^^^^
	complete_all(&dev->power.completion);
	return error;
}	
	

B: deviceX is suspended *sync* and reaches point [1]. In this case:
- deviceX has not been added to dpm_late_early_list yet
- dpm_suspend_late() will detect the error and call dpm_resume_early()
- dpm_resume_early() will ignore deviceX

If I do not check for !is_async(dev), then with this patch
pm_runtime_enable(dev) will be called twice for deviceX in case A.
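
Cases A and B can be checked with a toy model of the enable/disable balance. This is a rough sketch under stated assumptions, not the kernel code: the struct, the function and its parameters are hypothetical, an async device is assumed to be on dpm_late_early_list before its callback fails, and a signed counter is used so that a double enable shows up as -1.

```c
#include <stdbool.h>

/* Hypothetical model of one device; not the kernel's struct device. */
struct model_dev {
	int disable_depth;		/* > 0: runtime PM disabled, < 0: enabled twice */
	bool async;			/* async suspend allowed for this device */
	bool on_late_early_list;	/* only listed devices are resumed early */
};

/*
 * Models a failing .suspend_late() followed by dpm_resume_early().
 * 'patched' adds the pm_runtime_enable() at point [1]; 'check_async'
 * guards it with !is_async(dev). Returns the final balance (0 = balanced).
 */
static int model_failed_suspend_late(bool async, bool patched, bool check_async)
{
	struct model_dev d = { .disable_depth = 0, .async = async };

	/* Assumption: an async device is already on the list at point [1]. */
	d.on_late_early_list = d.async;

	d.disable_depth++;		/* runtime PM disabled for late suspend */

	/* The callback fails here: point [1]. */
	if (patched && (!check_async || !d.async))
		d.disable_depth--;	/* enable added by the patch */

	if (d.on_late_early_list)
		d.disable_depth--;	/* enable in device_resume_early() */

	return d.disable_depth;
}
```

In this model, model_failed_suspend_late(false, false, false) leaves the counter at 1 (the reported bug), the patched variant with the check is balanced for both sync and async devices, and dropping the check gives -1 for an async device.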

> 
>> +                       pm_runtime_enable(dev);
>> +       }
>>
>>   Complete:
>>          TRACE_SUSPEND(error);
>> --


-- 
regards,
-grygorii

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Rafael J. Wysocki @ 2016-05-20 12:18 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On Thursday, May 19, 2016 08:11:34 PM Grygorii Strashko wrote:
> if i'll not check for !is_async(dev) then pm_runtime_enable(dev)
> will be called twice for deviceX with this patch.

OK, thanks!

So to me, the problem is that we handle failures in that code inconsistently
depending on whether or not async suspend/resume is enabled for the device.

I'd rather make it consistent than add extra checks to it, so the patch below
is how I would fix this.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] PM / sleep: Handle failures in device_suspend_late() consistently

Grygorii Strashko reports:

 The PM runtime will be left disabled for the device if its
 .suspend_late() callback fails and async suspend is not allowed
 for this device. In this case device will not be added in
 dpm_late_early_list and dpm_resume_early() will ignore this
 device, as result PM runtime will be disabled for it forever
 (side effect: after 8 subsequent failures for the same device
 the PM runtime will be reenabled due to disable_depth overflow).

To fix this problem, add devices to dpm_late_early_list regardless
of whether or not device_suspend_late() returns errors for them.

That ensures that failures there are handled consistently for
all devices regardless of their async suspend/resume status.

Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -1267,14 +1267,15 @@ int dpm_suspend_late(pm_message_t state)
 		error = device_suspend_late(dev);
 
 		mutex_lock(&dpm_list_mtx);
+		if (!list_empty(&dev->power.entry))
+			list_move(&dev->power.entry, &dpm_late_early_list);
+
 		if (error) {
 			pm_dev_err(dev, state, " late", error);
 			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
 		}
-		if (!list_empty(&dev->power.entry))
-			list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
 
 		if (async_error)
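
The effect of moving the list_move() ahead of the error check can be sketched with a simplified model of the loop. Everything below is hypothetical illustration (the names, the array standing in for the device lists, the 'move_first' switch); it only mirrors the ordering change in the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device on the suspend list. */
struct late_dev {
	int disable_depth;		/* > 0 means runtime PM is disabled */
	int suspend_late_error;		/* 0 on success, negative on failure */
	bool on_late_early_list;
};

/* Models the dpm_suspend_late() loop; returns the first error seen. */
static int model_suspend_late(struct late_dev *devs, size_t n, bool move_first)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int error;

		devs[i].disable_depth++;	/* disabled for late suspend */
		error = devs[i].suspend_late_error;

		if (move_first)
			devs[i].on_late_early_list = true;	/* new position */
		if (error)
			return error;		/* the 'break' in the real loop */
		devs[i].on_late_early_list = true;	/* old position */
	}
	return 0;
}

/* Models dpm_resume_early(): only listed devices are re-enabled. */
static void model_resume_early(struct late_dev *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!devs[i].on_late_early_list)
			continue;
		devs[i].on_late_early_list = false;
		devs[i].disable_depth--;
	}
}

/* Suspends two devices (the second one fails), resumes, reports its depth. */
static int model_failing_dev_depth(bool move_first)
{
	struct late_dev devs[2] = { { 0, 0, false }, { 0, -5, false } };

	if (model_suspend_late(devs, 2, move_first))
		model_resume_early(devs, 2);
	return devs[1].disable_depth;
}
```

With move_first set to false the failing device is skipped on resume and stays disabled; with move_first set to true it is resumed like every other device and its counter returns to 0.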

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Grygorii Strashko @ 2016-05-20 16:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On 05/20/2016 03:18 PM, Rafael J. Wysocki wrote:
> I'd rather make it consistent than add extra checks to it, so the patch below
> is how I would fix this.
> 

Yep, it works too.
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>

By the way, there is a third option :)

+++ b/drivers/base/power/main.c
@@ -1211,8 +1211,7 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
                dev->power.is_late_suspended = true;
        } else {
                async_error = error;
-               if (!is_async(dev))
-                       pm_runtime_enable(dev);
+               error = 0;
        }



-- 
regards,
-grygorii

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Rafael J. Wysocki @ 2016-05-20 21:26 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On Friday, May 20, 2016 07:21:03 PM Grygorii Strashko wrote:
> Yep, it works too.
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>

OK, thanks!

Applied.

> By the way, there is a third option :)

Well, that at least would require a comment explaining what's going on.

> +++ b/drivers/base/power/main.c
> @@ -1211,8 +1211,7 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
>                 dev->power.is_late_suspended = true;
>         } else {
>                 async_error = error;
> -               if (!is_async(dev))
> -                       pm_runtime_enable(dev);
> +               error = 0;
>         }

Thanks,
Rafael

* Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late()
From: Grygorii Strashko @ 2016-05-23 15:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, linux-pm, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kevin Hilman, Ulf Hansson,
	Linux Kernel Mailing List

On 05/21/2016 12:26 AM, Rafael J. Wysocki wrote:
> OK, thanks!
>
> Applied.
>

Thanks.


-- 
regards,
-grygorii
