From: Brian Norris <briannorris@chromium.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>,
	Pavel Machek <pavel@ucw.cz>, Len Brown <len.brown@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Doug Anderson <dianders@chromium.org>,
	Brian Norris <computersforpeace@gmail.com>,
	Jeffy Chen <jeffy.chen@rock-chips.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Chuansheng Liu <chuansheng.liu@intel.com>,
	Kevin Hilman <khilman@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>
Subject: Re: [PATCH v2 2/2] PM / sleep: don't suspend parent when async child suspend_{noirq,late} fails
Date: Tue, 1 Nov 2016 22:07:07 -0700
Message-ID: <20161102050706.GA49402@google.com>
In-Reply-To: <50971906.K6xak2t6Z6@vostro.rjw.lan>

+ more genpd folks

On Wed, Nov 02, 2016 at 04:51:08AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, November 01, 2016 12:04:28 AM Dmitry Torokhov wrote:
> > On Mon, Oct 31, 2016 at 10:25 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > On Thursday, October 27, 2016 09:05:34 AM Brian Norris wrote:
> > >> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > >> index c58563581345..eaf6b53463a5 100644
> > >> --- a/drivers/base/power/main.c
> > >> +++ b/drivers/base/power/main.c
> > >> @@ -1040,6 +1040,9 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a
> > >>
> > >>       dpm_wait_for_children(dev, async);
> > >>
> > >> +     if (async_error)
> > >> +             goto Complete;
> > >> +
> > >
> > > This is a second check for async_error in this routine and is the first one
> > > really needed after adding this?
> > 
> > There is really no point in waiting for children to be suspended if
> > error has already been signalled; that's what first check achieves.
> > The 2nd check ensures that we abort suspend if any of the children
> > failed to suspend.
> > 
> > I'd say both checks are needed (well, 1st is helpful, 2nd is essential).
> 
> OK, fair enough.

Sort of agreed, although I'm still not sure how helpful the 1st one is;
kinda serves to complicate things, for little real benefit IMO (you
don't save much time by "not waiting" -- either the child quickly
notices the same error and complete()'s quickly, or else you're going to
wait for that child in the end anyway).
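
To make the roles of the two checks concrete, here is a minimal userspace
model (not kernel code, and not what main.c actually does). Apart from
async_error, every name in it is hypothetical. The "async" child is simply
run to completion inside the wait, which stands in for a child that fails
while its parent is blocked in dpm_wait_for_children(). The
check_after_wait flag switches the 2nd check on and off so both outcomes
can be compared:

```c
/* Userspace model only. All names except async_error are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	struct model_dev *child;
	bool fail;	/* this device's callback returns an error */
	bool done;	/* stands in for the per-device completion */
	bool suspended;	/* the callback ran and succeeded */
};

static int async_error;

static void model_suspend(struct model_dev *dev, bool check_after_wait);

/*
 * Stand-in for dpm_wait_for_children(): the child is run to completion
 * here, modelling an async child that finishes while the parent waits.
 */
static void model_wait_for_child(struct model_dev *dev, bool check_after_wait)
{
	if (dev->child && !dev->child->done)
		model_suspend(dev->child, check_after_wait);
}

static void model_suspend(struct model_dev *dev, bool check_after_wait)
{
	assert(dev != NULL);

	/* 1st check: an early-out that only skips the wait below. */
	if (async_error)
		goto complete;

	model_wait_for_child(dev, check_after_wait);

	/* 2nd check: the one that notices a child failing during the wait. */
	if (check_after_wait && async_error)
		goto complete;

	if (dev->fail)
		async_error = -1;
	else
		dev->suspended = true;

complete:
	dev->done = true;
}

/* Reports whether the parent was suspended although its child failed. */
static bool parent_suspended_after_child_failure(bool check_after_wait)
{
	struct model_dev child = { .fail = true };
	struct model_dev parent = { .child = &child };

	async_error = 0;
	model_suspend(&parent, check_after_wait);
	return parent.suspended;
}
```

In this model, parent_suspended_after_child_failure(false) reports that
the parent was suspended anyway (the $subject bug), while passing true
leaves the parent unsuspended. The 1st check never fires here, which fits
the point above that it is at most an optimization.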

I think it's also important to ask why we do this optimization in the
{late,noirq} cases, but we don't do this in __device_suspend(). As
demonstrated by the $subject bug, I think we would yield fewer bugs by
sharing code structure (if not the code itself) among the similar
phases.

I'm happy for you to take my current patch, of course, but I think some
further effort on making this consistent might be warranted. Either put
all of these short-circuit checks after the wait_for_children(), or else
add the same short-circuit for the missing case (__device_suspend()).
i.e., this (untested) patch:

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index e44944f4be77..2932a5bd892f 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1027,6 +1027,8 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a
 	TRACE_DEVICE(dev);
 	TRACE_SUSPEND(0);
 
+	dpm_wait_for_children(dev, async);
+
 	if (async_error)
 		goto Complete;
 
@@ -1038,8 +1040,6 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
-	dpm_wait_for_children(dev, async);
-
 	if (dev->pm_domain) {
 		info = "noirq power domain ";
 		callback = pm_noirq_op(&dev->pm_domain->ops, state);
@@ -1174,6 +1174,8 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
 
 	__pm_runtime_disable(dev, false);
 
+	dpm_wait_for_children(dev, async);
+
 	if (async_error)
 		goto Complete;
 
@@ -1185,8 +1187,6 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
-	dpm_wait_for_children(dev, async);
-
 	if (dev->pm_domain) {
 		info = "late power domain ";
 		callback = pm_late_early_op(&dev->pm_domain->ops, state);

---

I can test this and send it in proper form if that looks preferable.

P.S. To get slightly off-topic here (but speaking of noirq bugs): I
noticed the genpd code has comments like this scattered all over:

 * This function is only called in "noirq" and "syscore" stages of system power
 * transitions, so it need not acquire locks (all of the "noirq" callbacks are
 * executed sequentially, so it is guaranteed that it will never run twice in
 * parallel).

Isn't that no longer true, now that noirq suspend can be asynchronous?
Maybe we should grep for the phrase "need not acquire locks" throughout
the kernel, in order to find low-hanging fruit for race conditions :)
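
For illustration only, here is a small deterministic sketch of why the
"executed sequentially" assumption matters for unlocked state. It does not
claim that genpd has this particular counter on these paths; model_domain
and its count field are hypothetical. An unlocked increment is split into
its load and store halves, and the two functions replay the sequential and
the overlapping orderings by hand:

```c
/* Userspace model only. All names here are hypothetical. */
#include <assert.h>

struct model_domain {
	int count;	/* hypothetical per-domain counter, updated unlocked */
};

/* One unlocked "count++", split into the load and the store it implies. */
static int model_load(const struct model_domain *d)
{
	return d->count;
}

static void model_store(struct model_domain *d, int seen)
{
	d->count = seen + 1;
}

/* Callbacks A and B run one after the other, as the comment assumes. */
static int run_sequential(void)
{
	struct model_domain d = { 0 };
	int a, b;

	a = model_load(&d);
	model_store(&d, a);
	b = model_load(&d);
	model_store(&d, b);
	return d.count;		/* both increments land */
}

/* Callbacks A and B overlap, as async noirq suspend would allow. */
static int run_overlapping(void)
{
	struct model_domain d = { 0 };
	int a, b;

	a = model_load(&d);
	b = model_load(&d);	/* B reads before A has stored */
	model_store(&d, a);
	model_store(&d, b);	/* overwrites A's update */
	return d.count;
}
```

run_sequential() ends with a count of 2, while run_overlapping() ends with
1: one update is lost as soon as the two callbacks can interleave, which
is the kind of race a lock (or an atomic) would rule out.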
