All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Peter Wang <peter.wang@mediatek.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1] PM-runtime: Check supplier_preactivated before release supplier
Date: Thu, 30 Jun 2022 18:28:21 +0200	[thread overview]
Message-ID: <CAJZ5v0gChpusk6JuTG+Zhd_qGR1N+s97Avn4ybdp7Ggpv_uRaQ@mail.gmail.com> (raw)
In-Reply-To: <f6ebfd39-a27a-8b1c-6a61-f9a63236961d@mediatek.com>

On Thu, Jun 30, 2022 at 5:19 PM Peter Wang <peter.wang@mediatek.com> wrote:
>
>
> On 6/30/22 10:47 PM, Rafael J. Wysocki wrote:
> > On Thu, Jun 30, 2022 at 4:26 PM Peter Wang <peter.wang@mediatek.com> wrote:
> >>
> >> On 6/30/22 12:01 AM, Rafael J. Wysocki wrote:
> >>> [Add CCs to linix-pm, LKML and Greg]
> >>>
> >>> On Wednesday, June 29, 2022 5:32:00 PM CEST Rafael J. Wysocki wrote:
> >>>> On Wed, Jun 29, 2022 at 4:47 PM Peter Wang <peter.wang@mediatek.com> wrote:
> >>>>> On 6/29/22 9:22 PM, Rafael J. Wysocki wrote:
> >>>>>> On Wed, Jun 29, 2022 at 5:02 AM Peter Wang <peter.wang@mediatek.com> wrote:
> >>>>>>> On 6/28/22 11:54 PM, Rafael J. Wysocki wrote:
> >>>>>>>> On Tue, Jun 28, 2022 at 3:53 AM Peter Wang <peter.wang@mediatek.com> wrote:
> >>>>>>>>> On 6/28/22 3:00 AM, Rafael J. Wysocki wrote:
> >>>>>>>>>> On Mon, Jun 13, 2022 at 2:08 PM <peter.wang@mediatek.com> wrote:
> >>>>>>>>>>> From: Peter Wang <peter.wang@mediatek.com>
> >>>>>>>>>>>
> >>>>>>>>>>> With divice link of DL_FLAG_PM_RUNTIME, if consumer call pm_runtime_get_suppliers
> >>>>>>>>>>> to prevent supplier enter suspend, pm_runtime_release_supplier should
> >>>>>>>>>>> check supplier_preactivated before let supplier enter suspend.
> >>>>>>>>>> Why?
> >>>>>>>>> because supplier_preactivated is true means supplier cannot enter
> >>>>>>>>> suspend, right?
> >>>>>>>> No, it doesn't mean that.
> >>>>>>> Hi Rafael,
> >>>>>>>
> >>>>>>> if supplier_preactivated is true, means someone call
> >>>>>>> pm_runtime_get_suppliers and
> >>>>>>> before pm_runtime_put_suppliers right? This section suppliers should not
> >>>>>>> enter suspend.
> >>>>>> No, this is not how this is expected to work.
> >>>>>>
> >>>>>> First off, the only caller of pm_runtime_get_suppliers() and
> >>>>>> pm_runtime_put_suppliers() is __driver_probe_device().  Really nobody
> >>>>>> else has any business that would require calling them.
> >>>>> Hi Rafael,
> >>>>>
> >>>>> Yes, you are right!
> >>>>> __driver_probe_device the only one use and just because
> >>>>> __driver_probe_device use
> >>>>> pm_runtime_get_suppliers cause problem.
> >>>>>
> >>>>>
> >>>>>> Second, the role of pm_runtime_get_suppliers() is to "preactivate" the
> >>>>>> suppliers before running probe for a consumer device and the role of
> >>>>> the role of pm_runtime_get_suppliers() is to "preactivate" the suppliers,
> >>>>> but suppliers may suspend immediately after preactivate right?
> >>>>> Here is just this case. this is first racing point.
> >>>>> Thread A: pm_runtime_get_suppliers                -> __driver_probe_device
> >>>>> Thread B: pm_runtime_release_supplier
> >>>>> Thread A: Run with supplier not preactivate      -> __driver_probe_device
> >>>>>
> >>>>>> pm_runtime_put_suppliers() is to do the cleanup in case the device is
> >>>>>> left in suspend after probing.
> >>>>>>
> >>>>>> IOW, pm_runtime_get_suppliers() is to ensure that the suppliers will
> >>>>>> be active until the probe callback takes over and the rest depends on
> >>>>>> that callback.
> >>>>> The problem of this racing will finally let consumer is active but
> >>>>> supplier is suspended.
> >>>> So it would be better to send a bug report regarding this.
> >>>>
> >>>>> The link relation is broken.
> >>>>> I know you may curious how it happened? right?
> >>>>> Honestly, I am not sure, but I think the second racing point
> >>>>> is rpm_get_suppliers and pm_runtime_put_suppliers(release rpm_active).
> >>>> I'm not sure what you mean by "the racing point".
> >>>>
> >>>> Yes, these functions can run concurrently.
> >>>>
> >>>>> So, I try to fix the first racing point and the problem is gone.
> >>>>> It is full meet expect, and the pm runtime will work smoothly after
> >>>>> __driver_probe_device done.
> >>>> I'm almost sure that there is at least one scenario that would be
> >>>> broken by this change.
> >>> That said, the code in there may be a bit overdesigned.
> >>>
> >>> Does the patch below help?
> >>>
> >>> ---
> >>>    drivers/base/power/runtime.c |   14 +-------------
> >>>    1 file changed, 1 insertion(+), 13 deletions(-)
> >>>
> >>> Index: linux-pm/drivers/base/power/runtime.c
> >>> ===================================================================
> >>> --- linux-pm.orig/drivers/base/power/runtime.c
> >>> +++ linux-pm/drivers/base/power/runtime.c
> >>> @@ -1768,7 +1768,6 @@ void pm_runtime_get_suppliers(struct dev
> >>>                if (link->flags & DL_FLAG_PM_RUNTIME) {
> >>>                        link->supplier_preactivated = true;
> >>>                        pm_runtime_get_sync(link->supplier);
> >>> -                     refcount_inc(&link->rpm_active);
> >>>                }
> >>>
> >>>        device_links_read_unlock(idx);
> >>> @@ -1788,19 +1787,8 @@ void pm_runtime_put_suppliers(struct dev
> >>>        list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> >>>                                device_links_read_lock_held())
> >>>                if (link->supplier_preactivated) {
> >>> -                     bool put;
> >>> -
> >>>                        link->supplier_preactivated = false;
> >>> -
> >>> -                     spin_lock_irq(&dev->power.lock);
> >>> -
> >>> -                     put = pm_runtime_status_suspended(dev) &&
> >>> -                           refcount_dec_not_one(&link->rpm_active);
> >>> -
> >>> -                     spin_unlock_irq(&dev->power.lock);
> >>> -
> >>> -                     if (put)
> >>> -                             pm_runtime_put(link->supplier);
> >>> +                     pm_runtime_put(link->supplier);
> >>>                }
> >>>
> >>>        device_links_read_unlock(idx);
> >>
> >> Hi Rafael,
> >>
> >> I think this patch solve the rpm_active racing problem.
> >> But it still have problem that
> >> pm_runtime_get_suppliers call pm_runtime_get_sync(link->supplier)
> >> and supplier could suspend immediately by other thread who call
> >> pm_runtime_release_supplier.
> > No, it won't, because pm_runtime_release_supplier() won't drop the
> > reference on the supplier taken by pm_runtime_get_suppliers(0 after
> > the patch.
>
> Hi Rafael,
>
> I think pm_runtime_release_supplier will always decrese the reference
> rpm_active count to 1 and check idle will let supplier enter suspend. Am
> I wrong?
>
> Could you explain why this patch won't drop the reference?

What matters is the supplier's PM-runtime usage counter and (with the
patch above applied) pm_runtime_get_suppliers() bumps it up via
pm_runtime_get_sync() and it doesn't bump up the device link's
rpm_active count at the same time.

This is important, because the number of times
pm_runtime_release_supplier() decrements the supplier's usage counter
is the same as the rpm_active count value at the beginning of that
function minus 1.  Now, rpm_active is 1 initially and every time it
gets incremented, the supplier's usage counter is also incremented.
Combined with the observation in the previous paragraph, this means
that after pm_runtime_get_suppliers() the value of the supplier's
PM-runtime usage counter will always be greater than the rpm_active
value minus 1, so pm_runtime_release_supplier() cannot decrement it
down to zero until pm_runtime_put_suppliers() runs.

  reply	other threads:[~2022-06-30 16:28 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-13 12:07 [PATCH v1] PM-runtime: Check supplier_preactivated before release supplier peter.wang
2022-06-13 12:07 ` peter.wang
2022-06-13 12:07 ` peter.wang
2022-06-22  6:09 ` Peter Wang
2022-06-22  6:48   ` Greg KH
2022-06-22  6:48     ` Greg KH
2022-06-27 14:14 ` Greg KH
2022-06-27 14:14   ` Greg KH
2022-06-27 14:27   ` Rafael J. Wysocki
2022-06-27 14:27     ` Rafael J. Wysocki
2022-06-28  1:49   ` Peter Wang
2022-06-27 19:00 ` Rafael J. Wysocki
2022-06-27 19:00   ` Rafael J. Wysocki
2022-06-28  1:53   ` Peter Wang
2022-06-28 15:54     ` Rafael J. Wysocki
2022-06-28 15:54       ` Rafael J. Wysocki
     [not found] ` <b55d5691-0b2d-56bb-26ff-dcac56770611@mediatek.com>
     [not found]   ` <CAJZ5v0gTpv2gt_Gm9rUd+8Jmp4=ij2=J20o7qO0sC-hm=w3=_A@mail.gmail.com>
2022-06-29 16:01     ` Rafael J. Wysocki
2022-06-30 14:26       ` Peter Wang
2022-06-30 14:47         ` Rafael J. Wysocki
2022-06-30 15:19           ` Peter Wang
2022-06-30 16:28             ` Rafael J. Wysocki [this message]
2022-07-01 10:21               ` Peter Wang
2022-08-02  3:19                 ` Peter Wang
2022-08-02 11:01                   ` Rafael J. Wysocki
2022-08-02 13:33                     ` Peter Wang
2022-10-12 10:31                       ` Nitin Rawat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJZ5v0gChpusk6JuTG+Zhd_qGR1N+s97Avn4ybdp7Ggpv_uRaQ@mail.gmail.com \
    --to=rafael@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peter.wang@mediatek.com \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.