linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Thierry Reding <thierry.reding@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Dmitry Osipenko <digetx@gmail.com>,
	Linux PM <linux-pm@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	linux-tegra <linux-tegra@vger.kernel.org>
Subject: Re: [PATCH 1/2] PM / runtime: Allow drivers to override runtime PM behaviour on sleep
Date: Fri, 29 Nov 2019 11:09:26 +0100	[thread overview]
Message-ID: <CAJZ5v0j56xrmiO6bvs3W-_pY0HjGhSUEbZYzjZHfm6gA-k605g@mail.gmail.com> (raw)
In-Reply-To: <20191129093357.GA2770902@ulmo>

On Fri, Nov 29, 2019 at 10:34 AM Thierry Reding
<thierry.reding@gmail.com> wrote:
>
> On Thu, Nov 28, 2019 at 11:03:57PM +0100, Rafael J. Wysocki wrote:
> > On Thursday, November 28, 2019 5:50:26 PM CET Thierry Reding wrote:
> > >
> > > --0F1p//8PRICkK4MW
> > > Content-Type: text/plain; charset=us-ascii
> > > Content-Disposition: inline
> > > Content-Transfer-Encoding: quoted-printable
> > >
> > > On Thu, Nov 28, 2019 at 05:14:51PM +0100, Rafael J. Wysocki wrote:
> > > > On Thu, Nov 28, 2019 at 5:03 PM Thierry Reding <thierry.reding@gmail.com>=
> > >  wrote:
> > > > >
> > > > > From: Thierry Reding <treding@nvidia.com>
> > > > >
> > > > > Currently the driver PM core will automatically acquire a runtime PM
> > > > > reference for devices before system sleep is entered. This is needed
> > > > > to avoid potential issues related to devices' parents getting put to
> > > > > runtime suspend at the wrong time and causing problems with their
> > > > > children.
> > > >=20
> > > > Not only for that.
> > > >=20
> > > > > In some cases drivers are carefully written to avoid such issues and
> > > > > the default behaviour can be changed to allow runtime PM to operate
> > > > > regularly during system sleep.
> > > >=20
> > > > But this change breaks quite a few assumptions in the core too, so no,
> > > > it can't be made.
> > >
> > > Anything in particular that I can look at? I'm not seeing any issues
> > > when I test this, which could of course mean that I'm just getting
> > > lucky.
> >
> > There are races and such that you may never hit during casual testing.
> >
> > > One thing that irritated me is that I think this used to work. I do
> > > recall testing suspend/resume a few years ago and devices would get
> > > properly runtime suspended/resumed.
> >
> > Not true at all.
> >
> > The PM core has always taken PM-runtime references on all devices pretty much
> > since when PM-runtime was introduced.
>
> You're right. I was finally able to find a toolchain that I could build
> an old version of the kernel with. I tested system suspend/resume on the
> v4.8 release, which is the first one that had the runtime PM changes as
> well as the subsystem suspend/resume support wired up, and I can't see
> the runtime PM callbacks invoked during system suspend/resume.
>
> So I must be misremembering, or I'm confusing it with some other tests I
> was running at the time.
>
> > > I did some digging but couldn't
> > > find anything that would have had an impact on this.
> > >
> > > Given that this is completely opt-in feature, why are you categorically
> > > NAK'ing this?
> >
> > The general problem is that if any device has been touched by system-wide
> > suspend code, it should not be subject to PM-runtime any more until the
> > subsequent system-wide resume is able to undo whatever the suspend did.
> >
> > Moreover, if a device is runtime-suspended, the system-wide suspend code
> > may mishandle it, in general.  That's why PM-runtime suspend is not allowed
> > during system-wide transitions at all.  And it has always been like that.
>
> For this particular use-case the above should all be irrelevant. None of
> the drivers involved here do anything special at system suspend, because
> runtime suspend already puts the devices into the lowest possible power
> state. Basically when these devices are put into runtime suspend, they
> are completely turned off. The only exception is for things like HDMI
> where the +5V pin remains powered, so that hotplug detection will work.
>
> The runtime PM state of the devices involved is managed by the subsystem
> system suspend/resume helpers in DRM/KMS. Basically those helpers turn
> off all the devices in the composite device, which ultimately results in
> their last runtime PM reference being released. So for system suspend
> and resume, these devices aren't touched, other than maybe for the PM
> core's internal book-keeping.

OK, so you actually want system-wide PM to work like PM-runtime on the
platform in question, but there are substantial differences.

First, PM-runtime suspend can be effectively disabled by user space
and system-wide suspend is always expected to work.

Second, if system wakeup devices are involved, their handling during
system-wide suspend depends on the return value of device_may_wakeup()
which depends on what user space does, whereas PM-runtime assumes
device wakeup to be always enabled.

> > For a specific platform you may be able to overcome these limitations if
> > you are careful enough, but certainly they are there in general and surely
> > you cannot prevent people from using your opt-in just because they think
> > that they know what they are doing.
>
> That's true. But the same thing is true for pretty much all other APIs.
> People obviously have to make sure they know what they're doing, just
> like they have to with any other API.
>
> I suppose the documentation for this new function is currently lacking a
> bit. Perhaps adding a big warning to this and listing the common
> pitfalls would help people make the right call about whether or not they
> can use this.

And then *somebody* would have to chase a ton of subtle issues
resulting from that.  No, thanks, but no thanks.

> > > Is there some other alternative that I can look into?
> >
> > First of all, ensure that the dpm_list ordering is what it should be on the
> > system/platform in question.  That can be done with the help of device links.
>
> I don't think we have device links for everything, but the deferred
> probe code should take care of ordering the dpm_list correctly because
> we do handle deferred probe properly in all cases.
>
> Also, the dpm_list ordering isn't very critical in this case. If the
> devices are allowed to runtime suspend during system sleep, the
> subsystem sleep helper will put them into runtime suspend at the correct
> time. This is propagated all the way through the display pipeline and
> that order is ensured by the subsystem helpers.

You are still not saying what happens if user space doesn't allow
PM-runtime to suspend the devices (by writing "on" to their "control"
files).

> > In addition, make sure that the devices needed to suspend other devices are
> > suspended in the noirq phase of system-wide suspend and resumed in the
> > noirq phase of system-wide resume.  Or at least all of the other devices
> > need to be suspended before them and resumed after them.
>
> We're fine on this front as well. We have run into such issues in the
> past, but I don't think there are any such issue left at the moment. I
> do have one pending fix for I2C suspend/resume which fixes an issue
> where some pinmuxing changes needed to get the HDMI DDC channel to work
> were not getting applied during resume.
>
> That I2C issue is related to this, I think. What I'm seeing is that when
> the system goes to sleep, the pinmux looses its programming at a
> hardware level, but the I2C driver doesn't know about it because it does
> not get runtime suspended.

Well, no, that's not the reason.  The real reason is that the handling
of that device during system-wide suspend does not follow the rules
followed by PM-runtime for it.

Switching system-wide PM over to PM-runtime to address that is not
going to work, because PM-runtime is not mandatory and system-wide PM
is.

> At runtime suspend it would switch the pinmux
> state to "idle" which would then match the system suspend state. Upon
> runtime resume it sets the "default" pinmux state, which will then
> restore the register programming.

So this logic needs to be implemented in the system-wide suspend flow as well.

> In the current case where runtime suspend/resume is prohibited during

Runtime suspend is, runtime resume isn't until the "late" suspend phase.

> system sleep, upon resume the I2C driver will assume that the pinmux
> state is still "default" and it won't reapply the state (it's actually
> the pinmux subsystem that makes this decision) and causes HDMI DDC
> transactions to time out.

So this is a bug in the system-wide suspend/resume flow that needs to
be addressed, but not by switching it over to PM-runtime.

> One simple fix for that is to use pm_runtime_force_suspend() and
> pm_runtime_force_resume() as system suspend/resume callbacks to make
> sure the I2C controller is runtime suspended/resumed during system
> sleep.
>
> Note that forcing runtime suspend/resume this way is suboptimal in the
> DRM/KMS case because the suspend/resume happens disconnected from the
> subsystem suspend/resume callbacks, which is not desired as that breaks
> some of the assumptions in those callbacks.

So there needs to be another way.

Have you looked at DPM_FLAG_SMART_SUSPEND?

> > These two things should allow you to cover the vast majority of cases if
> > not all of them without messing up with the rules.
>
> One alternative that I had thought about was to just ditch the runtime
> PM callbacks for this. However, there's one corner case where this may
> break. On early Tegra generations, the two display controllers are
> "coupled" in that the second one doesn't work if the first one is
> disabled. We describe that using a device link from the second to the
> first controller. This causes the first controller to be automatically
> be runtime resumed when the second controller is used. This only works
> via runtime PM, so if I don't use runtime PM I'd have to add special
> handling for that case.

Runtime resume during system-wide suspend and resume is basically fine
unless you try to do it in the "late" suspend phase or later, but that
limitation is kind of artificial.  [I was talking about that at the
LPC this year.]  It basically cannot be carried out in the part of
system-wide suspend after the core regards the device and its parent
etc as "suspended", but the definition of that may be adjusted IMO.

And using PM-runtime resume during system-wide resume may be fine too,
basically (as long as the ordering of that is not lead to any kind of
loop dependencies).

On the other hand, there is *zero* need for runtime suspend during
system-wide transitions and it is known problematic.

> Actually, there's another problem as well. Most of these devices use
> generic PM domains to power on/off the SoC partitions that they're in.
> If I side-step runtime PM, then I'd have to somehow find a way to
> explicitly control the PM domains.

That's a problem with genpd, I'd say.

> Another alternative would be to have a kind of hybrid approach where I
> leave runtime PM calls in the drivers, but disconnect the runtime PM
> callback implementations from that. That would at least fix the issue
> with the generic PM domains.
>
> However, it would not fix the problem with coupled display controllers
> because empty runtime PM callbacks wouldn't actually power up the first
> display controller when it is needed by the second controller. I would
> have to add infrastructure that basically duplicates some of runtime PM
> to fix that.
>
> So the bottom line is that runtime PM is still the best solution for
> this problem. It works really nice and is very consistent.
>
> Do you think adding better documentation to this new flag and the
> accessors would help remove your concerns about this?

No, it wouldn't.

Also your arguments are mostly about PM-runtime resume, which is a
different story.

  reply	other threads:[~2019-11-29 10:09 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-28 16:03 [PATCH 0/2] PM / runtime: Allow drivers to override runtime PM behaviour on sleep Thierry Reding
2019-11-28 16:03 ` [PATCH 1/2] " Thierry Reding
2019-11-28 16:14   ` Rafael J. Wysocki
2019-11-28 16:50     ` Thierry Reding
2019-11-28 22:03       ` Rafael J. Wysocki
2019-11-28 22:20         ` Rafael J. Wysocki
2019-11-29 10:08           ` Thierry Reding
2019-11-29 10:22             ` Rafael J. Wysocki
2019-11-29 12:07               ` Thierry Reding
2019-11-29 20:27                 ` Daniel Vetter
2019-12-04  0:02                 ` Rafael J. Wysocki
2019-11-29  9:33         ` Thierry Reding
2019-11-29 10:09           ` Rafael J. Wysocki [this message]
2019-11-29 11:44             ` Thierry Reding
2019-11-28 16:03 ` [PATCH 2/2] drm/tegra: Allow runtime suspend on system sleep Thierry Reding
2019-11-28 16:47 ` [PATCH 0/2] PM / runtime: Allow drivers to override runtime PM behaviour on sleep Daniel Vetter
2019-11-28 17:04   ` Thierry Reding

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJZ5v0j56xrmiO6bvs3W-_pY0HjGhSUEbZYzjZHfm6gA-k605g@mail.gmail.com \
    --to=rafael@kernel.org \
    --cc=digetx@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=len.brown@intel.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rjw@rjwysocki.net \
    --cc=thierry.reding@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).