From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
To: Jani Nikula <jani.nikula@linux.intel.com>,
	Daniel Vetter <daniel@ffwll.ch>, Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-gfx@lists.freedesktop.org,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux PM <linux-pm@vger.kernel.org>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: [Intel-gfx] [PATCH v2] PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval
Date: Fri, 27 Nov 2015 15:44:48 +0100
Message-ID: <56586C60.5050104@intel.com>
In-Reply-To: <878u5j7jt9.fsf@intel.com>

On 11/27/2015 12:39 PM, Jani Nikula wrote:
> On Wed, 18 Nov 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Wed, Nov 18, 2015 at 03:28:38PM +0200, Imre Deak wrote:
>>> On ke, 2015-11-18 at 12:56 +0200, Imre Deak wrote:
>>>> The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
>>>> suspend hooks as errors, but they still show up as errors in dmesg. Tune
>>>> them down.
>>>>
>>>> One problem caused by this was noticed by Daniel: the i915 driver
>>>> returns EAGAIN to signal a temporary failure to suspend and as a request
>>>> towards the RPM core for scheduling a suspend again. This is a normal
>>>> event, but the resulting error message flags a breakage during the
>>>> driver's automated testing which parses dmesg and picks up the error.
>>>>
>>>> v2:
>>>> - fix compile breakage when CONFIG_PM_SLEEP=n (0-day builder)
>>>>
>>>> Reported-by: Daniel Vetter <daniel.vetter@intel.com>
>>>> Signed-off-by: Imre Deak <imre.deak@intel.com>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92992
>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> Rafael, can you please pick this up for 4.4? The spurious KERN_ERR noise
>> in dmesg is causing a lot of spurious failures in our (very recently put
>> into place) i915 CI system.
> Rafael, ping.

Well, so I'm not sure about this one.

And the question is ->

>>>> ---
>>>>   drivers/base/power/main.c |  7 +++++--
>>>>   drivers/pci/pci-driver.c  |  2 +-
>>>>   include/linux/pm.h        | 11 +++++++++--
>>>>   3 files changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
>>>> index 1710c26..39d2090 100644
>>>> --- a/drivers/base/power/main.c
>>>> +++ b/drivers/base/power/main.c
>>>> @@ -1679,9 +1679,12 @@ int dpm_suspend_start(pm_message_t state)
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(dpm_suspend_start);
>>>>   
>>>> -void __suspend_report_result(const char *function, void *fn, int ret)
>>>> +void __suspend_report_result(const char *function, void *fn, int ret,
>>>> +			     bool runtime_pm)
>>>>   {
>>>> -	if (ret)
>>>> +	if (runtime_pm && (ret == -EBUSY || ret == -EAGAIN))
>>>> +		printk(KERN_DEBUG "%s(): %pF returns %d\n", function, fn, ret);
>>>> +	else if (ret)
>>>>   		printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
>>>>   }

-> why are you adding overhead to this function, instead of -->

>>>>   EXPORT_SYMBOL_GPL(__suspend_report_result);
>>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>>> index 108a311..9569572 100644
>>>> --- a/drivers/pci/pci-driver.c
>>>> +++ b/drivers/pci/pci-driver.c
>>>> @@ -1142,7 +1142,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
>>>>   	pci_dev->state_saved = false;
>>>>   	pci_dev->no_d3cold = false;
>>>>   	error = pm->runtime_suspend(dev);
>>>> -	suspend_report_result(pm->runtime_suspend, error);
>>>> +	rpm_suspend_report_result(pm->runtime_suspend, error);

--> replacing the suspend_report_result() above with a direct printk() 
in the if (error) block below.

Surely, suspend_report_result() was not designed with runtime PM in mind 
and it was a mistake to use it here.  It just seemed to do the right 
thing, but it clearly doesn't.
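
To illustrate the idea, here is a rough userspace sketch of the decision
the error path would make. It is not the actual kernel change: the
log_level enum and both function names are hypothetical stand-ins, and
fprintf() takes the place of printk() with KERN_DEBUG or KERN_ERR.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for "no message", KERN_DEBUG and KERN_ERR. */
enum log_level { LOG_NONE, LOG_DEBUG, LOG_ERR };

/*
 * Pick the level for a runtime-suspend callback result.  -EBUSY and
 * -EAGAIN are not treated as errors by the runtime PM core, so they
 * only get a debug message; any other non-zero value is a real failure.
 */
static enum log_level rpm_suspend_log_level(int error)
{
	if (!error)
		return LOG_NONE;
	if (error == -EBUSY || error == -EAGAIN)
		return LOG_DEBUG;
	return LOG_ERR;
}

/*
 * Model of reporting directly in the caller's error path instead of
 * going through suspend_report_result().  Returns the level used.
 */
static enum log_level report_runtime_suspend(FILE *out, const char *cb_name,
					     int error)
{
	enum log_level level = rpm_suspend_log_level(error);

	if (level != LOG_NONE)
		fprintf(out, "[%s] %s returns %d\n",
			level == LOG_DEBUG ? "debug" : "err", cb_name, error);
	return level;
}
```

With this shape the system sleep reporting helper stays untouched, and
only the runtime PM caller knows that the two retryable codes are
harmless.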

>>>>   	if (error)
>>>>   		return error;
>>>>   	if (!pci_dev->d3cold_allowed)
>>>> diff --git a/include/linux/pm.h b/include/linux/pm.h
>>>> index 35d599e..54f37e3 100644
>>>> --- a/include/linux/pm.h
>>>> +++ b/include/linux/pm.h
>>>> @@ -702,11 +702,17 @@ extern int dpm_suspend_late(pm_message_t state);
>>>>   extern int dpm_suspend(pm_message_t state);
>>>>   extern int dpm_prepare(pm_message_t state);
>>>>   
>>>> -extern void __suspend_report_result(const char *function, void *fn, int ret);
>>>> +extern void __suspend_report_result(const char *function, void *fn, int ret,
>>>> +				    bool runtime_pm);
>>>>   
>>>>   #define suspend_report_result(fn, ret)					\
>>>>   	do {								\
>>>> -		__suspend_report_result(__func__, fn, ret);		\
>>>> +		__suspend_report_result(__func__, fn, ret, false);	\
>>>> +	} while (0)
>>>> +
>>>> +#define rpm_suspend_report_result(fn, ret)				\
>>>> +	do {								\
>>>> +		__suspend_report_result(__func__, fn, ret, true);	\
>>>>   	} while (0)
>>>>   
>>>>   extern int device_pm_wait_for_dev(struct device *sub, struct device *dev);
>>>> @@ -744,6 +750,7 @@ static inline int dpm_suspend_start(pm_message_t state)
>>>>   }
>>>>   
>>>>   #define suspend_report_result(fn, ret)		do {} while (0)
>>>> +#define rpm_suspend_report_result(fn, ret)	do {} while (0)
>>>>   
>>>>   static inline int device_pm_wait_for_dev(struct device *a, struct device *b)
>>>>   {

BTW, if you're changing PM code, it is good to CC linux-pm too (now 
done) and if you're changing PCI code, it is mandatory to CC linux-pci 
and the PCI maintainer (now done too).

Thanks,
Rafael


Thread overview: 18+ messages
2015-11-18  9:16 [PATCH] PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval Imre Deak
2015-11-18 10:16 ` kbuild test robot
2015-11-18 10:56 ` [PATCH v2] " Imre Deak
2015-11-18 13:28   ` Imre Deak
2015-11-18 14:19     ` Daniel Vetter
2015-11-27 11:39       ` Jani Nikula
2015-11-27 14:44         ` Rafael J. Wysocki [this message]
2015-11-27 14:44           ` Rafael J. Wysocki
2015-11-27 14:56           ` [Intel-gfx] " Imre Deak
2015-11-27 18:17   ` [PATCH v3] " Imre Deak
2015-11-27 22:23     ` Rafael J. Wysocki
2015-11-28  8:34     ` [PATCH v4] " Imre Deak
2015-11-30  2:07       ` Rafael J. Wysocki
2015-11-30 18:07       ` Bjorn Helgaas
2015-11-30 19:02       ` [PATCH v5] PCI / PM: Tune down retryable runtime suspend error messages Imre Deak
2015-12-02  1:54         ` Rafael J. Wysocki
2015-12-02  4:43           ` Bjorn Helgaas
2015-12-02  4:43             ` Bjorn Helgaas
