From: Pingfan Liu <kernelfans@gmail.com>
To: Rafael Wysocki <rafael@kernel.org>
Cc: lukas@wunner.de, linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Grygorii Strashko <grygorii.strashko@ti.com>,
	Christoph Hellwig <hch@infradead.org>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Dave Young <dyoung@redhat.com>,
	linux-pci@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-pm@vger.kernel.org, kishon@ti.com
Subject: Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset
Date: Mon, 9 Jul 2018 16:40:52 +0800
Message-ID: <CAFgQCTvGJVBdGLz0R74jO3f1nY3aWprGoADUXzw4E6jxOxWHBg@mail.gmail.com>
In-Reply-To: <CAJZ5v0gSo-jFO0h+aLkeyGGxRL+tdsRtGi0WFhhGQbZvezJmkA@mail.gmail.com>

On Mon, Jul 9, 2018 at 3:48 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu <kernelfans@gmail.com> wrote:
> > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>
> >> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu <kernelfans@gmail.com> wrote:
> >> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> >> >>
> >> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >> >> >
> >> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner <lukas@wunner.de> wrote:
> >> >> > > [cc += Kishon Vijay Abraham]
> >> >> > >
> >> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> >> >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly is
> >> >> > >> a mistake.
> >> >> > >>
> >> >> > >> I'm not really sure what the intention of it was as the changelog of
> >> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be
> >> >> > >> insufficient without that change?)
> >> >> > >
> >> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> >> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC
> >> >> > > won't be found on the next boot.
> >> >> > >
> >> >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled
> >> >> > > as a regulator.  The regulator is enabled when the MMC probes and
> >> >> > > disabled on driver unbind and shutdown.  As a result, the pin is driven
> >> >> > > low on shutdown and the MMC is not found on the next boot.
> >> >> > >
> >> >> > > To fix this, another kludge was invented wherein the GPIO expander
> >> >> > > driving the reset pin unconditionally drives all its pins high on
> >> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> >> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state
> >> >> > > of all pcf lines").
> >> >> > >
> >> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to
> >> >> > > be executed after the MMC expander's ->shutdown hook.
> >> >> > >
> >> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset according
> >> >> > > to the probe order.  Apparently the MMC probes after the GPIO expander,
> >> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> >> >> > > available yet, see mmc_regulator_get_supply().
> >> >> > >
> >> >> > > Note, I'm just piecing the information together from git history,
> >> >> > > I'm not responsible for these kludges.  (I'm innocent!)
> >> >> >
> >> >> > Sure enough. :-)
> >> >> >
> >> >> > In any case, calling devices_kset_move_last() in really_probe() is
> >> >> > plain broken and if its only purpose was to address a single, arguably
> >> >> > kludgy, use case, let's just get rid of it in the first place IMO.
> >> >> >
> >> >> Yes, if it is only used for a single use case.
> >> >>
> >> > Thinking about it again, I see another potential issue with the
> >> > current code. device_link_add()->device_reorder_to_tail() can break
> >> > the "supplier<-consumer" order. While moving children after the
> >> > parent's supplier, it ignores the ordering of each child's consumers.
> >>
> >> What do you mean?
> >>
> > Drivers use device_link_add() to build the "supplier<-consumer" order
> > without knowing about each other. Hence the following ordering is
> > possible: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where
> > consumer_a consumes child_a.
>
> Well, what's the initial state of the list?
>
> > When device_link_add()->device_reorder_to_tail() moves all descendants
> > of consumerX to the tail, it breaks the "supplier<-consumer" order for
> > "consumer_a <- child_a".
>
> That depends on what the initial ordering of the list is and please
> note that circular dependencies are explicitly assumed to be not
> present.
>
> The assumption is that the initial ordering of the list reflects the
> correct suspend (or shutdown) order without the new link.  Therefore
> initially all children are located after their parents and all known
> consumers are located after their suppliers.
>
> If a new link is added, the new consumer goes to the end of the list
> and all of its children and all of its consumers go after it.
> device_reorder_to_tail() is recursive, so for each of the devices that
> went to the end of the list, all of its children and all of its
> consumers go after it and so on.
>
> Now, that operation doesn't change the order of any of the
> parent<-child or supplier<-consumer pairs that get moved and since all
> of the devices that depend on any device that get moved go to the end
> of list after it, the only devices that don't go to the end of list
> are guaranteed to not depend on any of them (they may be parents or
> suppliers of the devices that go to the end of the list, but not their
> children or consumers).
>
Thanks for the detailed explanation. It is clear now, and you are right.
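
As a rough illustration of the argument above, here is a toy userspace
model in Python. It is not the kernel code: Dev, reorder_to_tail(),
add_link() and order_is_valid() are hypothetical names, and the model only
assumes what is described above, namely that the moved device goes to the
tail and the same operation is then applied to each of its children and
each of its consumers.

```python
class Dev:
    """Toy stand-in for a device; not the kernel's struct device."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.consumers = []
        if parent is not None:
            parent.children.append(self)


def reorder_to_tail(order, dev):
    """Move dev to the tail, then do the same for its children and consumers."""
    order.remove(dev)
    order.append(dev)
    for child in dev.children:
        reorder_to_tail(order, child)
    for consumer in dev.consumers:
        reorder_to_tail(order, consumer)


def add_link(order, supplier, consumer):
    """Record a new supplier<-consumer link and reorder the consumer."""
    supplier.consumers.append(consumer)
    reorder_to_tail(order, consumer)


def order_is_valid(order):
    """True if every device precedes all of its children and consumers."""
    pos = {dev: i for i, dev in enumerate(order)}
    return all(pos[d] < pos[x] for d in order for x in d.children + d.consumers)


# The scenario from this thread: consumer_a already consumes child_a, and
# child_a is a child of consumerX. supplierX sits at the end of the list.
consumer_x = Dev("consumerX")
child_a = Dev("child_a", parent=consumer_x)
consumer_a = Dev("consumer_a")
supplier_x = Dev("supplierX")
child_a.consumers.append(consumer_a)

order = [consumer_x, child_a, consumer_a, supplier_x]
add_link(order, supplier_x, consumer_x)
print([dev.name for dev in order])
```

In this model, consumer_a is moved again after child_a because the
recursion visits child_a's consumers, so the "consumer_a <- child_a" pair
keeps its relative order.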

> > And we need recursion to resolve the items in
> > (consumer_a,..); each time we move a consumer behind its supplier,
> > we may break "parent<-child".
>
> I don't see this as per the above.
>
> Say, device_reorder_to_tail() moves a parent after its child.  This
> means that device_reorder_to_tail() was not called for the child after
> it had been called for the parent, but that is not true, because it is
> called for all of the children of each device that gets moved *after*
> moving that device.
>
Yes, you are right.

> >> > Besides this, both devices_kset_move_after/_before() and
> >> > device_pm_move_after/_before() essentially expose the shutdown order
> >> > to the indirect caller, and we cannot expect the caller to handle
> >> > it correctly. That should be the job of the driver core.
> >>
> >> Arguably so, but that's how those functions were designed and the
> >> callers should be aware of the limitation.
> >>
> >> If they aren't, there is a bug in the caller.
> >>
> > If we handle device_move()->device_pm_move_after/_before() more
> > carefully, as described above, then we can hide the detail from the
> > caller and keep the PM ordering info inside the core.
>
> Yes, we can.
>
> My point is that we have not been doing that so far and the current
> callers of those routines are expected to know that.
>
> We can do that to make the life of *future* callers easier (and maybe
> to simplify the current ones), but currently the caller is expected to
> do the right thing.
>
OK, I get your point.

> >> > It is hard to extract high-dimensional info and pack it into a
> >> > one-dimensional linked list.
> >>
> >> Well, yes and no.
> >>
> > By "hard", I mean that we need two interleaved recursions to make the
> > order correct. Otherwise, I think it is a bug or limitation.
>
> So the limitation is that circular dependencies may not exist, because
> if they did, there would be no suitable suspend/shutdown ordering
> between devices.
>
Yes.

> >> We know it for a fact that there is a linear ordering that will work.
> >> It is inefficient to figure it out every time during system suspend
> >> and resume, for one and that's why we have dpm_list.
> >>
> > Yeah, I agree that iterating over the device tree may hurt performance.
> > I guess the iteration will not account for the majority of the suspend
> > time compared to device_suspend(), which has to sync with the hardware.
> > But data would be more persuasive. Besides performance, do you have
> > any other concerns so far?
>
> I simply think that there should be one way to iterate over devices
> for both system-wide PM and shutdown.
>
> The reason why it is not like that today is because of the development
> history, but if it doesn't work and we want to fix it, let's just
> consolidate all of that.
>
> Now, system-wide suspend resume sometimes iterates the list in the
> reverse order which would be hard without having a list, wouldn't it?
>
Yes, it would be hard without having a list. I was just thinking of using
the device tree info to build a shadow list, and rebuilding that list only
when there is a new device_link_add() operation. For
device_add/_remove(), the shadow list can be modified directly.
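
A minimal sketch of that idea in Python, as a userspace toy rather than
kernel code: build_shadow_list() and its depends_on argument are
hypothetical, and the sketch assumes each device's parent and suppliers
are known up front. It places every device after everything it depends
on, and it fails on a circular dependency, since no valid ordering exists
in that case.

```python
from collections import deque


def build_shadow_list(devices, depends_on):
    """Return devices ordered so each comes after everything it depends on.

    depends_on maps a device name to the names of its parent and suppliers.
    Raises ValueError when a circular dependency makes ordering impossible.
    """
    # Number of not-yet-placed dependencies for each device.
    pending = {dev: len(set(depends_on.get(dev, ()))) for dev in devices}
    # Reverse map: who has to wait for a given device.
    dependents = {dev: [] for dev in devices}
    for dev in devices:
        for dep in set(depends_on.get(dev, ())):
            dependents[dep].append(dev)

    ready = deque(dev for dev in devices if pending[dev] == 0)
    ordered = []
    while ready:
        dev = ready.popleft()
        ordered.append(dev)
        for user in dependents[dev]:
            pending[user] -= 1
            if pending[user] == 0:
                ready.append(user)

    if len(ordered) != len(devices):
        raise ValueError("circular dependency: no valid ordering exists")
    return ordered


devices = ["consumerX", "child_a", "consumer_a", "supplierX"]
depends_on = {
    "consumerX": ["supplierX"],   # new supplier<-consumer link
    "child_a": ["consumerX"],     # parent<-child
    "consumer_a": ["child_a"],    # existing supplier<-consumer link
}
shadow = build_shadow_list(devices, depends_on)
print(shadow)                   # forward order
print(list(reversed(shadow)))   # reverse order, still needs a list
```

The list is still what makes the reverse walk cheap, so this only moves
the cost to the rebuild after each new link.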

Thanks,
Pingfan
