* Re: [REPORT_ISSUE]: RK3399 pd power down failed [not found] <20210120172939160049119@rock-chips.com> @ 2021-02-23 11:30 ` Ulf Hansson 2021-02-23 17:09 ` Rafael J. Wysocki 0 siblings, 1 reply; 5+ messages in thread From: Ulf Hansson @ 2021-02-23 11:30 UTC (permalink / raw) To: zhangqing, rjw Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm, 杨凯, linux-rockchip, 谢修鑫, Finley Xiao, pavel On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com <zhangqing@rock-chips.com> wrote: > > Hi, Heiko : > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed. > > log show: > Open the vop > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a > > close the vop > #enter > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary > domain status slaves > /device runtime status > ---------------------------------------------------------------------- > pd_vopl off > pd_vopb on > /devices/platform/ff903f00.iommu suspended > /devices/platform/ff900000.vop suspended > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed. > The process is as follows: > > VOP requests to power off PD: > pm_runtime_put_sync(vop->dev) > -> rpm_idle(vop) > -> rpm_suspend(vop) > -> __update_runtime_status(dev, RPM_SUSPENDING) > -> rpm_callback(vop) > -> __rpm_callback(vop) > -> do power off pd callback(genpd_power_off) > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 > -> Handle link device callbacks according to device link(rpm_put_suppliers) > -> pm_runtime_put(link->supplier) > -> queue_work(pm_wq, &dev->power.work), execute immediately > ->rpm_idle(iommu) > -> rpm_suspend(iommu) > -> rpm_callback(iommu) > -> rk_iommu_suspend > -> do power off pd callback(genpd_power_off) > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。 > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) > -> vop do __update_runtime_status(dev, RPM_SUSPENDED) So, rpm_suspend() tries to suspend the supplier device link via rpm_put_suppliers(), before it has updated its consumer device's state to RPM_SUSPENDED. This looks worrying to me, both because it's seems wrong to allow a supplier to be suspended before a consumers device's state has reached RPM_SUSPENDED - but also because it's not consistent with the way we treat parent/child devices. The child's state will always be set to RPM_SUSPENDED, before we try to suspend its parent by calling rpm_idle() for it in rpm_suspend(). Rafael, what's your take on this? Would it make sense to align the behavior for consumer/supplier-links in rpm_suspend() according to child/parents? > > So the whole process goes down, the PD is not really power off, but the state of the child devices is SUSPENDED. > Pm_runtime_put (link->supplier) used to power off PD is asynchronous and is added to queue_work(pm_wq, &dev-bbb>wer-work); > If the queue_work is queued, the PD shutdown will succeed > The call is as follows: > > VOP requests to power off PD: > pm_runtime_put_sync(vop->dev) > -> rpm_idle(vop) > -> rpm_suspend(vop) > -> __update_runtime_status(dev, RPM_SUSPENDING) > -> rpm_callback(vop) > -> __rpm_callback(vop) > -> do power off pd callback(genpd_power_off) > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 > -> Handle link device callbacks according to device link(rpm_put_suppliers) > -> pm_runtime_put(link->supplier) > -> queue_work(pm_wq, &dev->power.work), > -> vop do __update_runtime_status(dev, RPM_SUSPENDED) > -> do queue_work (pm_runtime_work) > -> rpm_idle(iommu) > -> rpm_suspend(iommu) > -> do callback( rk_iommu_suspend) > -> do callback (genpd_power_off) > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspended, ff903f00.iommu : suspending,so not_suspended = 1; do _genpd_power_off, Really turn off PD。 > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) > [...] Kind regards Uffe _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed 2021-02-23 11:30 ` [REPORT_ISSUE]: RK3399 pd power down failed Ulf Hansson @ 2021-02-23 17:09 ` Rafael J. Wysocki 2021-02-24 3:37 ` elaine.zhang 2021-02-24 9:42 ` Ulf Hansson 0 siblings, 2 replies; 5+ messages in thread From: Rafael J. Wysocki @ 2021-02-23 17:09 UTC (permalink / raw) To: Ulf Hansson Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm, zhangqing, 杨凯, linux-rockchip, 谢修鑫, Finley Xiao, pavel On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote: > On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com > <zhangqing@rock-chips.com> wrote: > > > > Hi, Heiko : > > > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed. > > > > log show: > > Open the vop > > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a > > > > close the vop > > #enter > > > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary > > domain status slaves > > /device runtime status > > ---------------------------------------------------------------------- > > pd_vopl off > > pd_vopb on > > /devices/platform/ff903f00.iommu suspended > > /devices/platform/ff900000.vop suspended > > > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed. > > The process is as follows: > > > > VOP requests to power off PD: > > pm_runtime_put_sync(vop->dev) > > -> rpm_idle(vop) > > -> rpm_suspend(vop) > > -> __update_runtime_status(dev, RPM_SUSPENDING) > > -> rpm_callback(vop) > > -> __rpm_callback(vop) > > -> do power off pd callback(genpd_power_off) > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > -> Handle link device callbacks according to device link(rpm_put_suppliers) > > -> pm_runtime_put(link->supplier) > > -> queue_work(pm_wq, &dev->power.work), execute immediately > > ->rpm_idle(iommu) > > -> rpm_suspend(iommu) > > -> rpm_callback(iommu) > > -> rk_iommu_suspend > > -> do power off pd callback(genpd_power_off) > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) > > -> vop do __update_runtime_status(dev, RPM_SUSPENDED) > > So, rpm_suspend() tries to suspend the supplier device link via > rpm_put_suppliers(), before it has updated its consumer device's state > to RPM_SUSPENDED. > > This looks worrying to me, both because it's seems wrong to allow a > supplier to be suspended before a consumers device's state has reached > RPM_SUSPENDED - but also because it's not consistent with the way we > treat parent/child devices. The child's state will always be set to > RPM_SUSPENDED, before we try to suspend its parent by calling > rpm_idle() for it in rpm_suspend(). > > Rafael, what's your take on this? Would it make sense to align the > behavior for consumer/supplier-links in rpm_suspend() according to > child/parents? Suspending the suppliers before changing the consumer RPM status to "suspended" is indeed incorrect, which is something I overlooked when writing the code in question. Fortunately, it seems to be relatively easy to address. Please see the appended tentative patch (untested). It also avoids reading runtime_status outside the lock which is arguably fishy. --- drivers/base/power/runtime.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) Index: linux-pm/drivers/base/power/runtime.c =================================================================== --- linux-pm.orig/drivers/base/power/runtime.c +++ linux-pm/drivers/base/power/runtime.c @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru if (dev->power.irq_safe) { spin_unlock(&dev->power.lock); + } else if (!use_links) { + spin_unlock_irq(&dev->power.lock); } else { + bool get = dev->power.runtime_status == RPM_RESUMING; + spin_unlock_irq(&dev->power.lock); /* @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru * routine returns, so it is safe to read the status outside of * the lock. */ - if (use_links && dev->power.runtime_status == RPM_RESUMING) { + if (get) { idx = device_links_read_lock(); retval = rpm_get_suppliers(dev); @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru if (dev->power.irq_safe) { spin_lock(&dev->power.lock); + } if (!use_links) { + spin_lock_irq(&dev->power.lock); } else { + bool put; + + spin_lock_irq(&dev->power.lock); + + put = dev->power.runtime_status == RPM_SUSPENDING && !retval; + if (put) + __update_runtime_status(dev, RPM_SUSPENDED); + else + put = dev->power.runtime_status == RPM_RESUMING && retval; + + spin_unlock_irq(&dev->power.lock); + /* * If the device is suspending and the callback has returned * success, drop the usage counters of the suppliers that have @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru * * Do that if resume fails too. */ - if (use_links - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval) - || (dev->power.runtime_status == RPM_RESUMING && retval))) { + if (put) { idx = device_links_read_lock(); fail: _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed 2021-02-23 17:09 ` Rafael J. Wysocki @ 2021-02-24 3:37 ` elaine.zhang 2021-02-24 9:42 ` Ulf Hansson 1 sibling, 0 replies; 5+ messages in thread From: elaine.zhang @ 2021-02-24 3:37 UTC (permalink / raw) To: Rafael J. Wysocki, Ulf Hansson Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm, 杨凯, linux-rockchip, 谢修鑫, Finley Xiao, pavel Hi, Rafael: 在 2021/2/24 上午1:09, Rafael J. Wysocki 写道: > On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote: >> On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com >> <zhangqing@rock-chips.com> wrote: >>> Hi, Heiko : >>> >>> In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed. >>> >>> log show: >>> Open the vop >>> #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a >>> >>> close the vop >>> #enter >>> >>> # cat sys/kernel/debug/pm_genpd/pm_genpd_summary >>> domain status slaves >>> /device runtime status >>> ---------------------------------------------------------------------- >>> pd_vopl off >>> pd_vopb on >>> /devices/platform/ff903f00.iommu suspended >>> /devices/platform/ff900000.vop suspended >>> >>> I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed. >>> The process is as follows: >>> >>> VOP requests to power off PD: >>> pm_runtime_put_sync(vop->dev) >>> -> rpm_idle(vop) >>> -> rpm_suspend(vop) >>> -> __update_runtime_status(dev, RPM_SUSPENDING) >>> -> rpm_callback(vop) >>> -> __rpm_callback(vop) >>> -> do power off pd callback(genpd_power_off) >>> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 >>> -> Handle link device callbacks according to device link(rpm_put_suppliers) >>> -> pm_runtime_put(link->supplier) >>> -> queue_work(pm_wq, &dev->power.work), execute immediately >>> ->rpm_idle(iommu) >>> -> rpm_suspend(iommu) >>> -> rpm_callback(iommu) >>> -> rk_iommu_suspend >>> -> do power off pd callback(genpd_power_off) >>> -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。 >>> -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) >>> -> vop do __update_runtime_status(dev, RPM_SUSPENDED) >> So, rpm_suspend() tries to suspend the supplier device link via >> rpm_put_suppliers(), before it has updated its consumer device's state >> to RPM_SUSPENDED. >> >> This looks worrying to me, both because it's seems wrong to allow a >> supplier to be suspended before a consumers device's state has reached >> RPM_SUSPENDED - but also because it's not consistent with the way we >> treat parent/child devices. The child's state will always be set to >> RPM_SUSPENDED, before we try to suspend its parent by calling >> rpm_idle() for it in rpm_suspend(). >> >> Rafael, what's your take on this? Would it make sense to align the >> behavior for consumer/supplier-links in rpm_suspend() according to >> child/parents? > Suspending the suppliers before changing the consumer RPM status to > "suspended" is indeed incorrect, which is something I overlooked when > writing the code in question. > > Fortunately, it seems to be relatively easy to address. > > Please see the appended tentative patch (untested). It also avoids reading > runtime_status outside the lock which is arguably fishy. > > --- > drivers/base/power/runtime.c | 24 ++++++++++++++++++++---- > 1 file changed, 20 insertions(+), 4 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru > > if (dev->power.irq_safe) { > spin_unlock(&dev->power.lock); > + } else if (!use_links) { > + spin_unlock_irq(&dev->power.lock); > } else { > + bool get = dev->power.runtime_status == RPM_RESUMING; > + > spin_unlock_irq(&dev->power.lock); > > /* > @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru > * routine returns, so it is safe to read the status outside of > * the lock. > */ > - if (use_links && dev->power.runtime_status == RPM_RESUMING) { > + if (get) { > idx = device_links_read_lock(); > > retval = rpm_get_suppliers(dev); > @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru > > if (dev->power.irq_safe) { > spin_lock(&dev->power.lock); > + } if (!use_links) { > + spin_lock_irq(&dev->power.lock); > } else { > + bool put; > + > + spin_lock_irq(&dev->power.lock); > + > + put = dev->power.runtime_status == RPM_SUSPENDING && !retval; > + if (put) > + __update_runtime_status(dev, RPM_SUSPENDED); > + else > + put = dev->power.runtime_status == RPM_RESUMING && retval; > + > + spin_unlock_irq(&dev->power.lock); > + > /* > * If the device is suspending and the callback has returned > * success, drop the usage counters of the suppliers that have > @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru > * > * Do that if resume fails too. > */ > - if (use_links > - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval) > - || (dev->power.runtime_status == RPM_RESUMING && retval))) { > + if (put) { > idx = device_links_read_lock(); > > fail: Thank you for your reply. I have tested this patch, and it's works well.Perfect solution to our problem. We expect this patch to be committed to the mainline branch. > > > > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed 2021-02-23 17:09 ` Rafael J. Wysocki 2021-02-24 3:37 ` elaine.zhang @ 2021-02-24 9:42 ` Ulf Hansson 2021-02-24 12:50 ` Rafael J. Wysocki 1 sibling, 1 reply; 5+ messages in thread From: Ulf Hansson @ 2021-02-24 9:42 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm, zhangqing, 杨凯, linux-rockchip, 谢修鑫, Finley Xiao, pavel On Tue, 23 Feb 2021 at 18:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote: > > On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com > > <zhangqing@rock-chips.com> wrote: > > > > > > Hi, Heiko : > > > > > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed. > > > > > > log show: > > > Open the vop > > > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a > > > > > > close the vop > > > #enter > > > > > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary > > > domain status slaves > > > /device runtime status > > > ---------------------------------------------------------------------- > > > pd_vopl off > > > pd_vopb on > > > /devices/platform/ff903f00.iommu suspended > > > /devices/platform/ff900000.vop suspended > > > > > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed. > > > The process is as follows: > > > > > > VOP requests to power off PD: > > > pm_runtime_put_sync(vop->dev) > > > -> rpm_idle(vop) > > > -> rpm_suspend(vop) > > > -> __update_runtime_status(dev, RPM_SUSPENDING) > > > -> rpm_callback(vop) > > > -> __rpm_callback(vop) > > > -> do power off pd callback(genpd_power_off) > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > > -> Handle link device callbacks according to device link(rpm_put_suppliers) > > > -> pm_runtime_put(link->supplier) > > > -> queue_work(pm_wq, &dev->power.work), execute immediately > > > ->rpm_idle(iommu) > > > -> rpm_suspend(iommu) > > > -> rpm_callback(iommu) > > > -> rk_iommu_suspend > > > -> do power off pd callback(genpd_power_off) > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) > > > -> vop do __update_runtime_status(dev, RPM_SUSPENDED) > > > > So, rpm_suspend() tries to suspend the supplier device link via > > rpm_put_suppliers(), before it has updated its consumer device's state > > to RPM_SUSPENDED. > > > > This looks worrying to me, both because it's seems wrong to allow a > > supplier to be suspended before a consumers device's state has reached > > RPM_SUSPENDED - but also because it's not consistent with the way we > > treat parent/child devices. The child's state will always be set to > > RPM_SUSPENDED, before we try to suspend its parent by calling > > rpm_idle() for it in rpm_suspend(). > > > > Rafael, what's your take on this? Would it make sense to align the > > behavior for consumer/supplier-links in rpm_suspend() according to > > child/parents? > > Suspending the suppliers before changing the consumer RPM status to > "suspended" is indeed incorrect, which is something I overlooked when > writing the code in question. > > Fortunately, it seems to be relatively easy to address. > > Please see the appended tentative patch (untested). It also avoids reading > runtime_status outside the lock which is arguably fishy. Great, thanks for your quick reply! A minor comment on the below change, but otherwise feel free add my reviewed-by tag. Kind regards Uffe > > --- > drivers/base/power/runtime.c | 24 ++++++++++++++++++++---- > 1 file changed, 20 insertions(+), 4 deletions(-) > > Index: linux-pm/drivers/base/power/runtime.c > =================================================================== > --- linux-pm.orig/drivers/base/power/runtime.c > +++ linux-pm/drivers/base/power/runtime.c > @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru > > if (dev->power.irq_safe) { > spin_unlock(&dev->power.lock); > + } else if (!use_links) { > + spin_unlock_irq(&dev->power.lock); > } else { > + bool get = dev->power.runtime_status == RPM_RESUMING; > + > spin_unlock_irq(&dev->power.lock); > > /* > @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru > * routine returns, so it is safe to read the status outside of > * the lock. > */ > - if (use_links && dev->power.runtime_status == RPM_RESUMING) { > + if (get) { > idx = device_links_read_lock(); > > retval = rpm_get_suppliers(dev); > @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru > > if (dev->power.irq_safe) { > spin_lock(&dev->power.lock); > + } if (!use_links) { This should be an "else if", I think. > + spin_lock_irq(&dev->power.lock); > } else { > + bool put; > + > + spin_lock_irq(&dev->power.lock); > + > + put = dev->power.runtime_status == RPM_SUSPENDING && !retval; > + if (put) > + __update_runtime_status(dev, RPM_SUSPENDED); > + else > + put = dev->power.runtime_status == RPM_RESUMING && retval; > + > + spin_unlock_irq(&dev->power.lock); > + > /* > * If the device is suspending and the callback has returned > * success, drop the usage counters of the suppliers that have > @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru > * > * Do that if resume fails too. > */ > - if (use_links > - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval) > - || (dev->power.runtime_status == RPM_RESUMING && retval))) { > + if (put) { > idx = device_links_read_lock(); > > fail: > > > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [REPORT_ISSUE]: RK3399 pd power down failed 2021-02-24 9:42 ` Ulf Hansson @ 2021-02-24 12:50 ` Rafael J. Wysocki 0 siblings, 0 replies; 5+ messages in thread From: Rafael J. Wysocki @ 2021-02-24 12:50 UTC (permalink / raw) To: Ulf Hansson Cc: Huang, Tao, len.brown, heiko, khilman, gregkh, linux-pm, zhangqing, Rafael J. Wysocki, 杨凯, linux-rockchip, 谢修鑫, Finley Xiao, pavel On Wed, Feb 24, 2021 at 10:46 AM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Tue, 23 Feb 2021 at 18:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > On Tuesday, February 23, 2021 12:30:39 PM CET Ulf Hansson wrote: > > > On Wed, 20 Jan 2021 at 10:30, zhangqing@rock-chips.com > > > <zhangqing@rock-chips.com> wrote: > > > > > > > > Hi, Heiko : > > > > > > > > In rk3399 evb board, I found a probabilistic problem about PD. Turning off PD occasionally failed. > > > > > > > > log show: > > > > Open the vop > > > > #modetest -M rockchip -s 42@36:1536x2048 -P 31@36:1536x2048@AR24 -a > > > > > > > > close the vop > > > > #enter > > > > > > > > # cat sys/kernel/debug/pm_genpd/pm_genpd_summary > > > > domain status slaves > > > > /device runtime status > > > > ---------------------------------------------------------------------- > > > > pd_vopl off > > > > pd_vopb on > > > > /devices/platform/ff903f00.iommu suspended > > > > /devices/platform/ff900000.vop suspended > > > > > > > > I have checked the codes and concluded that there is a window of time for PD to be closed when using the device link. Once queue_work is executed immediately, PD power off may be failed. > > > > The process is as follows: > > > > > > > > VOP requests to power off PD: > > > > pm_runtime_put_sync(vop->dev) > > > > -> rpm_idle(vop) > > > > -> rpm_suspend(vop) > > > > -> __update_runtime_status(dev, RPM_SUSPENDING) > > > > -> rpm_callback(vop) > > > > -> __rpm_callback(vop) > > > > -> do power off pd callback(genpd_power_off) > > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : active,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > > > -> Handle link device callbacks according to device link(rpm_put_suppliers) > > > > -> pm_runtime_put(link->supplier) > > > > -> queue_work(pm_wq, &dev->power.work), execute immediately > > > > ->rpm_idle(iommu) > > > > -> rpm_suspend(iommu) > > > > -> rpm_callback(iommu) > > > > -> rk_iommu_suspend > > > > -> do power off pd callback(genpd_power_off) > > > > -> list_for_each_entry(pdd, &genpd->dev_list, list_node), ff900000.vop: suspending, ff903f00.iommu : suspending,so not_suspended = 2 return -EBUSY; Not really power off PD。 > > > > -> iommu do __update_runtime_status(dev, RPM_SUSPENDED) > > > > -> vop do __update_runtime_status(dev, RPM_SUSPENDED) > > > > > > So, rpm_suspend() tries to suspend the supplier device link via > > > rpm_put_suppliers(), before it has updated its consumer device's state > > > to RPM_SUSPENDED. > > > > > > This looks worrying to me, both because it's seems wrong to allow a > > > supplier to be suspended before a consumers device's state has reached > > > RPM_SUSPENDED - but also because it's not consistent with the way we > > > treat parent/child devices. The child's state will always be set to > > > RPM_SUSPENDED, before we try to suspend its parent by calling > > > rpm_idle() for it in rpm_suspend(). > > > > > > Rafael, what's your take on this? Would it make sense to align the > > > behavior for consumer/supplier-links in rpm_suspend() according to > > > child/parents? > > > > Suspending the suppliers before changing the consumer RPM status to > > "suspended" is indeed incorrect, which is something I overlooked when > > writing the code in question. > > > > Fortunately, it seems to be relatively easy to address. > > > > Please see the appended tentative patch (untested). It also avoids reading > > runtime_status outside the lock which is arguably fishy. > > Great, thanks for your quick reply! > > A minor comment on the below change, but otherwise feel free add my > reviewed-by tag. Thanks! I'm going to submit an updated patch that avoids some unnecessary locking overhead. > > > > --- > > drivers/base/power/runtime.c | 24 ++++++++++++++++++++---- > > 1 file changed, 20 insertions(+), 4 deletions(-) > > > > Index: linux-pm/drivers/base/power/runtime.c > > =================================================================== > > --- linux-pm.orig/drivers/base/power/runtime.c > > +++ linux-pm/drivers/base/power/runtime.c > > @@ -330,7 +330,11 @@ static int __rpm_callback(int (*cb)(stru > > > > if (dev->power.irq_safe) { > > spin_unlock(&dev->power.lock); > > + } else if (!use_links) { > > + spin_unlock_irq(&dev->power.lock); > > } else { > > + bool get = dev->power.runtime_status == RPM_RESUMING; > > + > > spin_unlock_irq(&dev->power.lock); > > > > /* > > @@ -340,7 +344,7 @@ static int __rpm_callback(int (*cb)(stru > > * routine returns, so it is safe to read the status outside of > > * the lock. > > */ > > - if (use_links && dev->power.runtime_status == RPM_RESUMING) { > > + if (get) { > > idx = device_links_read_lock(); > > > > retval = rpm_get_suppliers(dev); > > @@ -355,7 +359,21 @@ static int __rpm_callback(int (*cb)(stru > > > > if (dev->power.irq_safe) { > > spin_lock(&dev->power.lock); > > + } if (!use_links) { > > This should be an "else if", I think. Yes, it should, thanks! > > + spin_lock_irq(&dev->power.lock); > > } else { > > + bool put; > > + > > + spin_lock_irq(&dev->power.lock); > > + > > + put = dev->power.runtime_status == RPM_SUSPENDING && !retval; > > + if (put) > > + __update_runtime_status(dev, RPM_SUSPENDED); > > + else > > + put = dev->power.runtime_status == RPM_RESUMING && retval; > > + > > + spin_unlock_irq(&dev->power.lock); > > + > > /* > > * If the device is suspending and the callback has returned > > * success, drop the usage counters of the suppliers that have > > @@ -363,9 +381,7 @@ static int __rpm_callback(int (*cb)(stru > > * > > * Do that if resume fails too. > > */ > > - if (use_links > > - && ((dev->power.runtime_status == RPM_SUSPENDING && !retval) > > - || (dev->power.runtime_status == RPM_RESUMING && retval))) { > > + if (put) { > > idx = device_links_read_lock(); > > > > fail: > > > > > > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-02-24 12:50 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20210120172939160049119@rock-chips.com> 2021-02-23 11:30 ` [REPORT_ISSUE]: RK3399 pd power down failed Ulf Hansson 2021-02-23 17:09 ` Rafael J. Wysocki 2021-02-24 3:37 ` elaine.zhang 2021-02-24 9:42 ` Ulf Hansson 2021-02-24 12:50 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).