* Re: [REPORT_ISSUE]: RK3399 pd power down failed
[not found] <20210120172939160049119@rock-chips.com>
@ 2021-02-23 11:30 ` Ulf Hansson
2021-02-23 17:09 ` Rafael J. Wysocki
0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2021-02-23 11:30 UTC (permalink / raw)
To: zhangqing, rjw
Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm,
杨凯, linux-rockchip, 谢修鑫,
Finley Xiao, pavel
On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
<zhangqing@rock-chips.com> wrote:
>
> Hi, Heiko :
>
> In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed.
>
> log show:
> Open the vop
> #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
>
> close the vop
> #enter
>
> # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> domain status slaves
> /device runtime status
> ----------------------------------------------------------------------
> pd_vopl off
> pd_vopb on
> /devices/platform/ff903f00.iommu suspended
> /devices/platform/ff900000.vop suspended
>
> I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed.
> The process is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
> -> rpm_idle(vop)
> -> rpm_suspend(vop)
> -> __update_runtime_status(dev, RPM_SUSPENDING)
> -> rpm_callback(vop)
> -> __rpm_callback(vop)
> -> do power off pd callback(genpd_power_off)
> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
> -> Handle link device callbacks according to device link(rpm_put_suppliers)
> -> pm_runtime_put(link->supplier)
> -> queue_work(pm_wq, &dev->power.work), execute immediately
> ->rpm_idle(iommu)
> -> rpm_suspend(iommu)
> -> rpm_callback(iommu)
> -> rk_iommu_suspend
> -> do power off pd callback(genpd_power_off)
> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
> -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
> -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
So, rpm_suspend() tries to suspend the supplier device link via
rpm_put_suppliers(), before it has updated its consumer device's state
to RPM_SUSPENDED.
This looks worrying to me, both because it's seems wrong to allow a
supplier to be suspended before a consumers device's state has reached
RPM_SUSPENDED - but also because it's not consistent with the way we
treat parent/child devices. The child's state will always be set to
RPM_SUSPENDED, before we try to suspend its parent by calling
rpm_idle() for it in rpm_suspend().
Rafael, what's your take on this? Would it make sense to align the
behavior for consumer/supplier-links in rpm_suspend() according to
child/parents?
>
> So the whole process goes down, the PD is not really power off, but the state of the child devices is SUSPENDED.
> Pm_runtime_put (link->supplier) used to power off PD is asynchronous and is added to queue_work(pm_wq, &dev-bbb>wer-work);
> If the queue_work is queued, the PD shutdown will succeed
> The call is as follows:
>
> VOP requests to power off PD:
> pm_runtime_put_sync(vop->dev)
> -> rpm_idle(vop)
> -> rpm_suspend(vop)
> -> __update_runtime_status(dev, RPM_SUSPENDING)
> -> rpm_callback(vop)
> -> __rpm_callback(vop)
> -> do power off pd callback(genpd_power_off)
> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
> -> Handle link device callbacks according to device link(rpm_put_suppliers)
> -> pm_runtime_put(link->supplier)
> -> queue_work(pm_wq, &dev->power.work),
> -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
> -> do queue_work (pm_runtime_work)
> -> rpm_idle(iommu)
> -> rpm_suspend(iommu)
> -> do callback( rk_iommu_suspend)
> -> do callback (genpd_power_off)
> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspended, ff903f00.iommu : suspending,so not_suspended = 1; do _genpd_power_off, Really turn off PD。
> -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
>
[...]
Kind regards
Uffe
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed
2021-02-23 11:30 ` [REPORT_ISSUE]: RK3399 pd power down failed Ulf Hansson
@ 2021-02-23 17:09 ` Rafael J. Wysocki
2021-02-24 3:37 ` elaine.zhang
2021-02-24 9:42 ` Ulf Hansson
0 siblings, 2 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2021-02-23 17:09 UTC (permalink / raw)
To: Ulf Hansson
Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm,
zhangqing, 杨凯,
linux-rockchip, 谢修鑫,
Finley Xiao, pavel
On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote:
> On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
> <zhangqing@rock-chips.com> wrote:
> >
> > Hi, Heiko :
> >
> > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed.
> >
> > log show:
> > Open the vop
> > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
> >
> > close the vop
> > #enter
> >
> > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> > domain status slaves
> > /device runtime status
> > ----------------------------------------------------------------------
> > pd_vopl off
> > pd_vopb on
> > /devices/platform/ff903f00.iommu suspended
> > /devices/platform/ff900000.vop suspended
> >
> > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed.
> > The process is as follows:
> >
> > VOP requests to power off PD:
> > pm_runtime_put_sync(vop->dev)
> > -> rpm_idle(vop)
> > -> rpm_suspend(vop)
> > -> __update_runtime_status(dev, RPM_SUSPENDING)
> > -> rpm_callback(vop)
> > -> __rpm_callback(vop)
> > -> do power off pd callback(genpd_power_off)
> > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > -> Handle link device callbacks according to device link(rpm_put_suppliers)
> > -> pm_runtime_put(link->supplier)
> > -> queue_work(pm_wq, &dev->power.work), execute immediately
> > ->rpm_idle(iommu)
> > -> rpm_suspend(iommu)
> > -> rpm_callback(iommu)
> > -> rk_iommu_suspend
> > -> do power off pd callback(genpd_power_off)
> > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
> > -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
>
> So, rpm_suspend() tries to suspend the supplier device link via
> rpm_put_suppliers(), before it has updated its consumer device's state
> to RPM_SUSPENDED.
>
> This looks worrying to me, both because it's seems wrong to allow a
> supplier to be suspended before a consumers device's state has reached
> RPM_SUSPENDED - but also because it's not consistent with the way we
> treat parent/child devices. The child's state will always be set to
> RPM_SUSPENDED, before we try to suspend its parent by calling
> rpm_idle() for it in rpm_suspend().
>
> Rafael, what's your take on this? Would it make sense to align the
> behavior for consumer/supplier-links in rpm_suspend() according to
> child/parents?
Suspending the suppliers before changing the consumer RPM status to
"suspended" is indeed incorrect, which is something I overlooked when
writing the code in question.
Fortunately, it seems to be relatively easy to address.
Please see the appended tentative patch (untested). It also avoids reading
runtime_status outside the lock which is arguably fishy.
---
drivers/base/power/runtime.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru
if (dev->power.irq_safe) {
spin_unlock(&dev->power.lock);
+ } else if (!use_links) {
+ spin_unlock_irq(&dev->power.lock);
} else {
+ bool get = dev->power.runtime_status == RPM_RESUMING;
+
spin_unlock_irq(&dev->power.lock);
/*
@@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru
* routine returns, so it is safe to read the status outside of
* the lock.
*/
- if (use_links && dev->power.runtime_status == RPM_RESUMING) {
+ if (get) {
idx = device_links_read_lock();
retval = rpm_get_suppliers(dev);
@@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru
if (dev->power.irq_safe) {
spin_lock(&dev->power.lock);
+ } if (!use_links) {
+ spin_lock_irq(&dev->power.lock);
} else {
+ bool put;
+
+ spin_lock_irq(&dev->power.lock);
+
+ put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
+ if (put)
+ __update_runtime_status(dev, RPM_SUSPENDED);
+ else
+ put = dev->power.runtime_status == RPM_RESUMING && retval;
+
+ spin_unlock_irq(&dev->power.lock);
+
/*
* If the device is suspending and the callback has returned
* success, drop the usage counters of the suppliers that have
@@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru
*
* Do that if resume fails too.
*/
- if (use_links
- && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
- || (dev->power.runtime_status == RPM_RESUMING && retval))) {
+ if (put) {
idx = device_links_read_lock();
fail:
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed
2021-02-23 17:09 ` Rafael J. Wysocki
@ 2021-02-24 3:37 ` elaine.zhang
2021-02-24 9:42 ` Ulf Hansson
1 sibling, 0 replies; 5+ messages in thread
From: elaine.zhang @ 2021-02-24 3:37 UTC (permalink / raw)
To: Rafael J. Wysocki, Ulf Hansson
Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm,
杨凯, linux-rockchip, 谢修鑫,
Finley Xiao, pavel
Hi, Rafael:
在 2021/2/24 上午1:09, Rafael J. Wysocki 写道:
> On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote:
>> On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
>> <zhangqing@rock-chips.com> wrote:
>>> Hi, Heiko :
>>>
>>> In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed.
>>>
>>> log show:
>>> Open the vop
>>> #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
>>>
>>> close the vop
>>> #enter
>>>
>>> # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
>>> domain status slaves
>>> /device runtime status
>>> ----------------------------------------------------------------------
>>> pd_vopl off
>>> pd_vopb on
>>> /devices/platform/ff903f00.iommu suspended
>>> /devices/platform/ff900000.vop suspended
>>>
>>> I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed.
>>> The process is as follows:
>>>
>>> VOP requests to power off PD:
>>> pm_runtime_put_sync(vop->dev)
>>> -> rpm_idle(vop)
>>> -> rpm_suspend(vop)
>>> -> __update_runtime_status(dev, RPM_SUSPENDING)
>>> -> rpm_callback(vop)
>>> -> __rpm_callback(vop)
>>> -> do power off pd callback(genpd_power_off)
>>> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
>>> -> Handle link device callbacks according to device link(rpm_put_suppliers)
>>> -> pm_runtime_put(link->supplier)
>>> -> queue_work(pm_wq, &dev->power.work), execute immediately
>>> ->rpm_idle(iommu)
>>> -> rpm_suspend(iommu)
>>> -> rpm_callback(iommu)
>>> -> rk_iommu_suspend
>>> -> do power off pd callback(genpd_power_off)
>>> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
>>> -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
>>> -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
>> So, rpm_suspend() tries to suspend the supplier device link via
>> rpm_put_suppliers(), before it has updated its consumer device's state
>> to RPM_SUSPENDED.
>>
>> This looks worrying to me, both because it's seems wrong to allow a
>> supplier to be suspended before a consumers device's state has reached
>> RPM_SUSPENDED - but also because it's not consistent with the way we
>> treat parent/child devices. The child's state will always be set to
>> RPM_SUSPENDED, before we try to suspend its parent by calling
>> rpm_idle() for it in rpm_suspend().
>>
>> Rafael, what's your take on this? Would it make sense to align the
>> behavior for consumer/supplier-links in rpm_suspend() according to
>> child/parents?
> Suspending the suppliers before changing the consumer RPM status to
> "suspended" is indeed incorrect, which is something I overlooked when
> writing the code in question.
>
> Fortunately, it seems to be relatively easy to address.
>
> Please see the appended tentative patch (untested). It also avoids reading
> runtime_status outside the lock which is arguably fishy.
>
> ---
> drivers/base/power/runtime.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru
>
> if (dev->power.irq_safe) {
> spin_unlock(&dev->power.lock);
> + } else if (!use_links) {
> + spin_unlock_irq(&dev->power.lock);
> } else {
> + bool get = dev->power.runtime_status == RPM_RESUMING;
> +
> spin_unlock_irq(&dev->power.lock);
>
> /*
> @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru
> * routine returns, so it is safe to read the status outside of
> * the lock.
> */
> - if (use_links && dev->power.runtime_status == RPM_RESUMING) {
> + if (get) {
> idx = device_links_read_lock();
>
> retval = rpm_get_suppliers(dev);
> @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru
>
> if (dev->power.irq_safe) {
> spin_lock(&dev->power.lock);
> + } if (!use_links) {
> + spin_lock_irq(&dev->power.lock);
> } else {
> + bool put;
> +
> + spin_lock_irq(&dev->power.lock);
> +
> + put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
> + if (put)
> + __update_runtime_status(dev, RPM_SUSPENDED);
> + else
> + put = dev->power.runtime_status == RPM_RESUMING && retval;
> +
> + spin_unlock_irq(&dev->power.lock);
> +
> /*
> * If the device is suspending and the callback has returned
> * success, drop the usage counters of the suppliers that have
> @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru
> *
> * Do that if resume fails too.
> */
> - if (use_links
> - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
> - || (dev->power.runtime_status == RPM_RESUMING && retval))) {
> + if (put) {
> idx = device_links_read_lock();
>
> fail:
Thank you for your reply.
I have tested this patch, and it's works well.Perfect solution to our
problem.
We expect this patch to be committed to the mainline branch.
>
>
>
>
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed
2021-02-23 17:09 ` Rafael J. Wysocki
2021-02-24 3:37 ` elaine.zhang
@ 2021-02-24 9:42 ` Ulf Hansson
2021-02-24 12:50 ` Rafael J. Wysocki
1 sibling, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2021-02-24 9:42 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm,
zhangqing, 杨凯,
linux-rockchip, 谢修鑫,
Finley Xiao, pavel
On Tue, 23 Feb 2021 at 18:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote:
> > On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
> > <zhangqing@rock-chips.com> wrote:
> > >
> > > Hi, Heiko :
> > >
> > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed.
> > >
> > > log show:
> > > Open the vop
> > > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
> > >
> > > close the vop
> > > #enter
> > >
> > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> > > domain status slaves
> > > /device runtime status
> > > ----------------------------------------------------------------------
> > > pd_vopl off
> > > pd_vopb on
> > > /devices/platform/ff903f00.iommu suspended
> > > /devices/platform/ff900000.vop suspended
> > >
> > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed.
> > > The process is as follows:
> > >
> > > VOP requests to power off PD:
> > > pm_runtime_put_sync(vop->dev)
> > > -> rpm_idle(vop)
> > > -> rpm_suspend(vop)
> > > -> __update_runtime_status(dev, RPM_SUSPENDING)
> > > -> rpm_callback(vop)
> > > -> __rpm_callback(vop)
> > > -> do power off pd callback(genpd_power_off)
> > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > > -> Handle link device callbacks according to device link(rpm_put_suppliers)
> > > -> pm_runtime_put(link->supplier)
> > > -> queue_work(pm_wq, &dev->power.work), execute immediately
> > > ->rpm_idle(iommu)
> > > -> rpm_suspend(iommu)
> > > -> rpm_callback(iommu)
> > > -> rk_iommu_suspend
> > > -> do power off pd callback(genpd_power_off)
> > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
> > > -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
> >
> > So, rpm_suspend() tries to suspend the supplier device link via
> > rpm_put_suppliers(), before it has updated its consumer device's state
> > to RPM_SUSPENDED.
> >
> > This looks worrying to me, both because it's seems wrong to allow a
> > supplier to be suspended before a consumers device's state has reached
> > RPM_SUSPENDED - but also because it's not consistent with the way we
> > treat parent/child devices. The child's state will always be set to
> > RPM_SUSPENDED, before we try to suspend its parent by calling
> > rpm_idle() for it in rpm_suspend().
> >
> > Rafael, what's your take on this? Would it make sense to align the
> > behavior for consumer/supplier-links in rpm_suspend() according to
> > child/parents?
>
> Suspending the suppliers before changing the consumer RPM status to
> "suspended" is indeed incorrect, which is something I overlooked when
> writing the code in question.
>
> Fortunately, it seems to be relatively easy to address.
>
> Please see the appended tentative patch (untested). It also avoids reading
> runtime_status outside the lock which is arguably fishy.
Great, thanks for your quick reply!
A minor comment on the below change, but otherwise feel free add my
reviewed-by tag.
Kind regards
Uffe
>
> ---
> drivers/base/power/runtime.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru
>
> if (dev->power.irq_safe) {
> spin_unlock(&dev->power.lock);
> + } else if (!use_links) {
> + spin_unlock_irq(&dev->power.lock);
> } else {
> + bool get = dev->power.runtime_status == RPM_RESUMING;
> +
> spin_unlock_irq(&dev->power.lock);
>
> /*
> @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru
> * routine returns, so it is safe to read the status outside of
> * the lock.
> */
> - if (use_links && dev->power.runtime_status == RPM_RESUMING) {
> + if (get) {
> idx = device_links_read_lock();
>
> retval = rpm_get_suppliers(dev);
> @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru
>
> if (dev->power.irq_safe) {
> spin_lock(&dev->power.lock);
> + } if (!use_links) {
This should be an "else if", I think.
> + spin_lock_irq(&dev->power.lock);
> } else {
> + bool put;
> +
> + spin_lock_irq(&dev->power.lock);
> +
> + put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
> + if (put)
> + __update_runtime_status(dev, RPM_SUSPENDED);
> + else
> + put = dev->power.runtime_status == RPM_RESUMING && retval;
> +
> + spin_unlock_irq(&dev->power.lock);
> +
> /*
> * If the device is suspending and the callback has returned
> * success, drop the usage counters of the suppliers that have
> @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru
> *
> * Do that if resume fails too.
> */
> - if (use_links
> - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
> - || (dev->power.runtime_status == RPM_RESUMING && retval))) {
> + if (put) {
> idx = device_links_read_lock();
>
> fail:
>
>
>
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed
2021-02-24 9:42 ` Ulf Hansson
@ 2021-02-24 12:50 ` Rafael J. Wysocki
0 siblings, 0 replies; 5+ messages in thread
From: Rafael J. Wysocki @ 2021-02-24 12:50 UTC (permalink / raw)
To: Ulf Hansson
Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm,
zhangqing, Rafael J. Wysocki, 杨凯,
linux-rockchip, 谢修鑫,
Finley Xiao, pavel
On Wed, Feb 24, 2021 at 10:46 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Tue, 23 Feb 2021 at 18:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote:
> > > On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com
> > > <zhangqing@rock-chips.com> wrote:
> > > >
> > > > Hi, Heiko :
> > > >
> > > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed.
> > > >
> > > > log show:
> > > > Open the vop
> > > > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a
> > > >
> > > > close the vop
> > > > #enter
> > > >
> > > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary
> > > > domain status slaves
> > > > /device runtime status
> > > > ----------------------------------------------------------------------
> > > > pd_vopl off
> > > > pd_vopb on
> > > > /devices/platform/ff903f00.iommu suspended
> > > > /devices/platform/ff900000.vop suspended
> > > >
> > > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed.
> > > > The process is as follows:
> > > >
> > > > VOP requests to power off PD:
> > > > pm_runtime_put_sync(vop->dev)
> > > > -> rpm_idle(vop)
> > > > -> rpm_suspend(vop)
> > > > -> __update_runtime_status(dev, RPM_SUSPENDING)
> > > > -> rpm_callback(vop)
> > > > -> __rpm_callback(vop)
> > > > -> do power off pd callback(genpd_power_off)
> > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > > > -> Handle link device callbacks according to device link(rpm_put_suppliers)
> > > > -> pm_runtime_put(link->supplier)
> > > > -> queue_work(pm_wq, &dev->power.work), execute immediately
> > > > ->rpm_idle(iommu)
> > > > -> rpm_suspend(iommu)
> > > > -> rpm_callback(iommu)
> > > > -> rk_iommu_suspend
> > > > -> do power off pd callback(genpd_power_off)
> > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。
> > > > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED)
> > > > -> vop do __update_runtime_status(dev, RPM_SUSPENDED)
> > >
> > > So, rpm_suspend() tries to suspend the supplier device link via
> > > rpm_put_suppliers(), before it has updated its consumer device's state
> > > to RPM_SUSPENDED.
> > >
> > > This looks worrying to me, both because it's seems wrong to allow a
> > > supplier to be suspended before a consumers device's state has reached
> > > RPM_SUSPENDED - but also because it's not consistent with the way we
> > > treat parent/child devices. The child's state will always be set to
> > > RPM_SUSPENDED, before we try to suspend its parent by calling
> > > rpm_idle() for it in rpm_suspend().
> > >
> > > Rafael, what's your take on this? Would it make sense to align the
> > > behavior for consumer/supplier-links in rpm_suspend() according to
> > > child/parents?
> >
> > Suspending the suppliers before changing the consumer RPM status to
> > "suspended" is indeed incorrect, which is something I overlooked when
> > writing the code in question.
> >
> > Fortunately, it seems to be relatively easy to address.
> >
> > Please see the appended tentative patch (untested). It also avoids reading
> > runtime_status outside the lock which is arguably fishy.
>
> Great, thanks for your quick reply!
>
> A minor comment on the below change, but otherwise feel free add my
> reviewed-by tag.
Thanks!
I'm going to submit an updated patch that avoids some unnecessary
locking overhead.
> >
> > ---
> > drivers/base/power/runtime.c | 24 ++++++++++++++++++++----
> > 1 file changed, 20 insertions(+), 4 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru
> >
> > if (dev->power.irq_safe) {
> > spin_unlock(&dev->power.lock);
> > + } else if (!use_links) {
> > + spin_unlock_irq(&dev->power.lock);
> > } else {
> > + bool get = dev->power.runtime_status == RPM_RESUMING;
> > +
> > spin_unlock_irq(&dev->power.lock);
> >
> > /*
> > @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru
> > * routine returns, so it is safe to read the status outside of
> > * the lock.
> > */
> > - if (use_links && dev->power.runtime_status == RPM_RESUMING) {
> > + if (get) {
> > idx = device_links_read_lock();
> >
> > retval = rpm_get_suppliers(dev);
> > @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru
> >
> > if (dev->power.irq_safe) {
> > spin_lock(&dev->power.lock);
> > + } if (!use_links) {
>
> This should be an "else if", I think.
Yes, it should, thanks!
> > + spin_lock_irq(&dev->power.lock);
> > } else {
> > + bool put;
> > +
> > + spin_lock_irq(&dev->power.lock);
> > +
> > + put = dev->power.runtime_status == RPM_SUSPENDING && !retval;
> > + if (put)
> > + __update_runtime_status(dev, RPM_SUSPENDED);
> > + else
> > + put = dev->power.runtime_status == RPM_RESUMING && retval;
> > +
> > + spin_unlock_irq(&dev->power.lock);
> > +
> > /*
> > * If the device is suspending and the callback has returned
> > * success, drop the usage counters of the suppliers that have
> > @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru
> > *
> > * Do that if resume fails too.
> > */
> > - if (use_links
> > - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
> > - || (dev->power.runtime_status == RPM_RESUMING && retval))) {
> > + if (put) {
> > idx = device_links_read_lock();
> >
> > fail:
> >
> >
> >
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-02-24 12:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20210120172939160049119@rock-chips.com>
2021-02-23 11:30 ` [REPORT_ISSUE]: RK3399 pd power down failed Ulf Hansson
2021-02-23 17:09 ` Rafael J. Wysocki
2021-02-24 3:37 ` elaine.zhang
2021-02-24 9:42 ` Ulf Hansson
2021-02-24 12:50 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).