All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ulf Hansson <ulf.hansson@linaro.org>
To: "zhangqing@rock-chips.com" <zhangqing@rock-chips.com>,
	rjw <rjw@rjwysocki.net>
Cc: heiko <heiko@sntech.de>,
	"Finley Xiao" <finley.xiao@rock-chips.com>,
	"len.brown" <len.brown@intel.com>, pavel <pavel@ucw.cz>,
	gregkh <gregkh@linuxfoundation.org>, khilman <khilman@kernel.org>,
	linux-rockchip <linux-rockchip@lists.infradead.org>,
	linux-pm <linux-pm@vger.kernel.org>,
	"Huang, Tao" <huangtao@rock-chips.com>,
	谢修鑫 <tony.xie@rock-chips.com>, 杨凯 <kever.yang@rock-chips.com>
Subject: Re: [REPORT_ISSUE]: RK3399 pd power down failed
Date: Tue, 23 Feb 2021 12:30:39 +0100	[thread overview]
Message-ID: <CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com> (raw)
In-Reply-To: <20210120172939160049119@rock-chips.com>

On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
<zhangqing@rock-chips.com> wrote:
>
> Hi, Heiko :
>
> In rk3399 evb board,  I found a probabilistic problem about PD. Turning off PD occasionally failed.
>
> log show:
> Open the vop
> #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
>
> close the vop
> #enter
>
>  # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> domain                          status          slaves
>     /device                                             runtime status
> ----------------------------------------------------------------------
> pd_vopl                         off
> pd_vopb                         on
>     /devices/platform/ff903f00.iommu                     suspended
>     /devices/platform/ff900000.vop                          suspended
>
> I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately,  PD power off may be failed.
> The process is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
>     -> rpm_idle(vop)
>         -> rpm_suspend(vop)
>             -> __update_runtime_status(dev, RPM_SUSPENDING)
>                 -> rpm_callback(vop)
>                     -> __rpm_callback(vop)
>                         -> do power off pd callback(genpd_power_off)
>                             -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                 -> Handle link device callbacks according to device link(rpm_put_suppliers)
>                                     -> pm_runtime_put(link->supplier)
>                                         -> queue_work(pm_wq, &dev->power.work), execute immediately
>                                             ->rpm_idle(iommu)
>                                                 -> rpm_suspend(iommu)
>                                                     -> rpm_callback(iommu)
>                                                         -> rk_iommu_suspend
>                                                             ->  do power off pd callback(genpd_power_off)
>                                                                 -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                                                     -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
>                                                                         -> vop do __update_runtime_status(dev, RPM_SUSPENDED)

So, rpm_suspend() tries to suspend the supplier device link via
rpm_put_suppliers(), before it has updated its consumer device's state
to RPM_SUSPENDED.

This looks worrying to me, both because it's seems wrong to allow a
supplier to be suspended before a consumers device's state has reached
RPM_SUSPENDED - but also because it's not consistent with the way we
treat parent/child devices. The child's state will always be set to
RPM_SUSPENDED, before we try to suspend its parent by calling
rpm_idle() for it in rpm_suspend().

Rafael, what's your take on this? Would it make sense to align the
behavior for consumer/supplier-links in rpm_suspend() according to
child/parents?

>
> So the whole process goes down, the PD is not really power off, but the state of the child devices is SUSPENDED.
> Pm_runtime_put (link->supplier) used to power off PD is asynchronous and is added to queue_work(pm_wq, &dev-bbb>wer-work);
> If the queue_work is queued, the PD shutdown will succeed
> The call is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
>     -> rpm_idle(vop)
>         -> rpm_suspend(vop)
>             -> __update_runtime_status(dev, RPM_SUSPENDING)
>                 -> rpm_callback(vop)
>                     -> __rpm_callback(vop)
>                         -> do power off pd callback(genpd_power_off)
>                             ->  list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                  -> Handle link device callbacks according to device link(rpm_put_suppliers)
>                                     -> pm_runtime_put(link->supplier)
>                                          -> queue_work(pm_wq, &dev->power.work),
>                                              -> vop do  __update_runtime_status(dev, RPM_SUSPENDED)
>                                                 -> do queue_work (pm_runtime_work)
>                                                     ->  rpm_idle(iommu)
>                                                          -> rpm_suspend(iommu)
>                                                             -> do callback( rk_iommu_suspend)
>                                                                 -> do callback (genpd_power_off)
>                                                                     -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspended, ff903f00.iommu : suspending,so not_suspended = 1; do _genpd_power_off, Really turn off PD。
>                                                                         -> iommu do  __update_runtime_status(dev, RPM_SUSPENDED)
>

[...]

Kind regards
Uffe

WARNING: multiple messages have this Message-ID (diff)
From: Ulf Hansson <ulf.hansson@linaro.org>
To: "zhangqing@rock-chips.com" <zhangqing@rock-chips.com>,
	rjw <rjw@rjwysocki.net>
Cc: "Huang, Tao" <huangtao@rock-chips.com>,
	"len.brown" <len.brown@intel.com>, heiko <heiko@sntech.de>,
	khilman <khilman@kernel.org>, gregkh <gregkh@linuxfoundation.org>,
	linux-pm <linux-pm@vger.kernel.org>,
	杨凯 <kever.yang@rock-chips.com>,
	linux-rockchip <linux-rockchip@lists.infradead.org>,
	谢修鑫 <tony.xie@rock-chips.com>,
	"Finley Xiao" <finley.xiao@rock-chips.com>, pavel <pavel@ucw.cz>
Subject: Re: [REPORT_ISSUE]: RK3399 pd power down failed
Date: Tue, 23 Feb 2021 12:30:39 +0100	[thread overview]
Message-ID: <CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com> (raw)
In-Reply-To: <20210120172939160049119@rock-chips.com>

On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
<zhangqing@rock-chips.com> wrote:
>
> Hi, Heiko :
>
> In rk3399 evb board,  I found a probabilistic problem about PD. Turning off PD occasionally failed.
>
> log show:
> Open the vop
> #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
>
> close the vop
> #enter
>
>  # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> domain                          status          slaves
>     /device                                             runtime status
> ----------------------------------------------------------------------
> pd_vopl                         off
> pd_vopb                         on
>     /devices/platform/ff903f00.iommu                     suspended
>     /devices/platform/ff900000.vop                          suspended
>
> I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately,  PD power off may be failed.
> The process is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
>     -> rpm_idle(vop)
>         -> rpm_suspend(vop)
>             -> __update_runtime_status(dev, RPM_SUSPENDING)
>                 -> rpm_callback(vop)
>                     -> __rpm_callback(vop)
>                         -> do power off pd callback(genpd_power_off)
>                             -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                 -> Handle link device callbacks according to device link(rpm_put_suppliers)
>                                     -> pm_runtime_put(link->supplier)
>                                         -> queue_work(pm_wq, &dev->power.work), execute immediately
>                                             ->rpm_idle(iommu)
>                                                 -> rpm_suspend(iommu)
>                                                     -> rpm_callback(iommu)
>                                                         -> rk_iommu_suspend
>                                                             ->  do power off pd callback(genpd_power_off)
>                                                                 -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                                                     -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
>                                                                         -> vop do __update_runtime_status(dev, RPM_SUSPENDED)

So, rpm_suspend() tries to suspend the supplier device link via
rpm_put_suppliers(), before it has updated its consumer device's state
to RPM_SUSPENDED.

This looks worrying to me, both because it's seems wrong to allow a
supplier to be suspended before a consumers device's state has reached
RPM_SUSPENDED - but also because it's not consistent with the way we
treat parent/child devices. The child's state will always be set to
RPM_SUSPENDED, before we try to suspend its parent by calling
rpm_idle() for it in rpm_suspend().

Rafael, what's your take on this? Would it make sense to align the
behavior for consumer/supplier-links in rpm_suspend() according to
child/parents?

>
> So the whole process goes down, the PD is not really power off, but the state of the child devices is SUSPENDED.
> Pm_runtime_put (link->supplier) used to power off PD is asynchronous and is added to queue_work(pm_wq, &dev-bbb>wer-work);
> If the queue_work is queued, the PD shutdown will succeed
> The call is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
>     -> rpm_idle(vop)
>         -> rpm_suspend(vop)
>             -> __update_runtime_status(dev, RPM_SUSPENDING)
>                 -> rpm_callback(vop)
>                     -> __rpm_callback(vop)
>                         -> do power off pd callback(genpd_power_off)
>                             ->  list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
>                                  -> Handle link device callbacks according to device link(rpm_put_suppliers)
>                                     -> pm_runtime_put(link->supplier)
>                                          -> queue_work(pm_wq, &dev->power.work),
>                                              -> vop do  __update_runtime_status(dev, RPM_SUSPENDED)
>                                                 -> do queue_work (pm_runtime_work)
>                                                     ->  rpm_idle(iommu)
>                                                          -> rpm_suspend(iommu)
>                                                             -> do callback( rk_iommu_suspend)
>                                                                 -> do callback (genpd_power_off)
>                                                                     -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspended, ff903f00.iommu : suspending,so not_suspended = 1; do _genpd_power_off, Really turn off PD。
>                                                                         -> iommu do  __update_runtime_status(dev, RPM_SUSPENDED)
>

[...]

Kind regards
Uffe

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

       reply	other threads:[~2021-02-23 11:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20210120172939160049119@rock-chips.com>
2021-02-23 11:30 ` Ulf Hansson [this message]
2021-02-23 11:30   ` [REPORT_ISSUE]: RK3399 pd power down failed Ulf Hansson
2021-02-23 17:09   ` Rafael J. Wysocki
2021-02-23 17:09     ` Rafael J. Wysocki
2021-02-24  3:37     ` elaine.zhang
2021-02-24  3:37       ` elaine.zhang
2021-02-24  9:42     ` Ulf Hansson
2021-02-24  9:42       ` Ulf Hansson
2021-02-24 12:50       ` Rafael J. Wysocki
2021-02-24 12:50         ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com \
    --to=ulf.hansson@linaro.org \
    --cc=finley.xiao@rock-chips.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko@sntech.de \
    --cc=huangtao@rock-chips.com \
    --cc=kever.yang@rock-chips.com \
    --cc=khilman@kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-rockchip@lists.infradead.org \
    --cc=pavel@ucw.cz \
    --cc=rjw@rjwysocki.net \
    --cc=tony.xie@rock-chips.com \
    --cc=zhangqing@rock-chips.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.