From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753469AbcEWPHQ (ORCPT ); Mon, 23 May 2016 11:07:16 -0400 Received: from devils.ext.ti.com ([198.47.26.153]:46056 "EHLO devils.ext.ti.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753119AbcEWPHO (ORCPT ); Mon, 23 May 2016 11:07:14 -0400 Subject: Re: [PATCH] PM / sleep: fix unbalanced pm runtime disable in __device_suspend_late() To: "Rafael J. Wysocki" References: <1463162628-16976-1-git-send-email-grygorii.strashko@ti.com> <4160153.ggAvdOZIvP@vostro.rjw.lan> <573F396F.3060803@ti.com> <1910982.6Hh19bFu2s@vostro.rjw.lan> CC: "Rafael J. Wysocki" , "linux-pm@vger.kernel.org" , Len Brown , Pavel Machek , Greg Kroah-Hartman , Kevin Hilman , Ulf Hansson , Linux Kernel Mailing List From: Grygorii Strashko Message-ID: <57431C93.5050100@ti.com> Date: Mon, 23 May 2016 18:06:59 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 MIME-Version: 1.0 In-Reply-To: <1910982.6Hh19bFu2s@vostro.rjw.lan> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/21/2016 12:26 AM, Rafael J. Wysocki wrote: > On Friday, May 20, 2016 07:21:03 PM Grygorii Strashko wrote: >> On 05/20/2016 03:18 PM, Rafael J. Wysocki wrote: >>> On Thursday, May 19, 2016 08:11:34 PM Grygorii Strashko wrote: >>>> On 05/19/2016 04:38 PM, Rafael J. Wysocki wrote: >>>>> On Fri, May 13, 2016 at 8:03 PM, Grygorii Strashko >>>>> wrote: >>>>>> The PM runtime will be left disabled for the device if its .suspend_late() >>>>>> callback fails and async suspend is not allowed for this device. In >>>>>> this case device will not be added in dpm_late_early_list and >>>>>> dpm_resume_early() will ignore this device, as result PM runtime will >>>>>> be disabled for it forever (side effect: after 8 subsequent failures >>>>>> for the same device the PM runtime will be reenabled due to >>>>>> disable_depth overflow). >>>>>> >>>>>> Hence, re-enable PM runtime in __device_suspend_late() if >>>>>> .suspend_late() callback fails and async suspend is not allowed for >>>>>> this device. >>>>>> >>>>>> Signed-off-by: Grygorii Strashko >>>>>> --- >>>>>> drivers/base/power/main.c | 7 +++++-- >>>>>> 1 file changed, 5 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c >>>>>> index 6e7c3cc..9b266e5 100644 >>>>>> --- a/drivers/base/power/main.c >>>>>> +++ b/drivers/base/power/main.c >>>>>> @@ -1207,10 +1207,13 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as >>>>>> } >>>>>> >>>>>> error = dpm_run_callback(callback, dev, state, info); >>>>>> - if (!error) >>>>>> + if (!error) { >>>>>> dev->power.is_late_suspended = true; >>>>>> - else >>>>>> + } else { >>>> Point [1] >>>>>> async_error = error; >>>>>> + if (!is_async(dev)) >>>>> >>>>> Why is the is_async() check necessary here? >>>> >>>> A: deviceX is suspended *async* and reached point [1], in this case: >>>> - deviceX has been added in dpm_late_early_list already >>>> - dpm_suspend_late() will detect async_error and call dpm_resume_early() >>>> - dpm_resume_early() will call device_resume_early() for deviceX >>>> - device_resume_early() will re-enable PM runtime >>>> { >>>> ... >>>> if (!dev->power.is_late_suspended) >>>> goto Out; >>>> >>>> ... >>>> Out: >>>> TRACE_RESUME(error); >>>> >>>> pm_runtime_enable(dev); >>>> ^^^^^^^^^^^^ >>>> complete_all(&dev->power.completion); >>>> return error; >>>> } >>>> >>>> >>>> B: deviceX is suspended *sync* and reached point [1], in this case: >>>> - deviceX has not been added in dpm_late_early_list yet >>>> - dpm_suspend_late() will detect sync_error and call dpm_resume_early() >>>> - dpm_resume_early() will ignore deviceX >>>> >>>> if i'll not check for !is_async(dev) then pm_runtime_enable(dev) >>>> will be called twice for deviceX with this patch. >>> >>> OK, thanks! >>> >>> So to me, the problem is that we handle failures in that code inconsistently >>> depending on whether or not async suspend/resume is enabled for the device. >>> >>> I'd rather make it consistent than add extra checks to it, so the patch below >>> is how I would fix this. >>> >>> --- >>> From: Rafael J. Wysocki >>> Subject: [PATCH] PM / sleep: Handle failures in device_suspend_late() consistently >>> >>> Grygorii Strashko reports: >>> >>> The PM runtime will be left disabled for the device if its >>> .suspend_late() callback fails and async suspend is not allowed >>> for this device. In this case device will not be added in >>> dpm_late_early_list and dpm_resume_early() will ignore this >>> device, as result PM runtime will be disabled for it forever >>> (side effect: after 8 subsequent failures for the same device >>> the PM runtime will be reenabled due to disable_depth overflow). >>> >>> To fix this problem, add devices to dpm_late_early_list regardless >>> of whether or not device_suspend_late() returns errors for them. >>> >>> That will ensure failures in there to be handled consistently for >>> all devices regardless of their async suspend/resume status. >>> >>> Reported-by: Grygorii Strashko >>> Signed-off-by: Rafael J. Wysocki >>> --- >>> drivers/base/power/main.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> Index: linux-pm/drivers/base/power/main.c >>> =================================================================== >>> --- linux-pm.orig/drivers/base/power/main.c >>> +++ linux-pm/drivers/base/power/main.c >>> @@ -1267,14 +1267,15 @@ int dpm_suspend_late(pm_message_t state) >>> error = device_suspend_late(dev); >>> >>> mutex_lock(&dpm_list_mtx); >>> + if (!list_empty(&dev->power.entry)) >>> + list_move(&dev->power.entry, &dpm_late_early_list); >>> + >>> if (error) { >>> pm_dev_err(dev, state, " late", error); >>> dpm_save_failed_dev(dev_name(dev)); >>> put_device(dev); >>> break; >>> } >>> - if (!list_empty(&dev->power.entry)) >>> - list_move(&dev->power.entry, &dpm_late_early_list); >>> put_device(dev); >>> >>> if (async_error) >>> >> >> Yep, it works too. >> Tested-by: Grygorii Strashko > > OK, thanks! > > Applied. > Thanks. -- regards, -grygorii