* [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
@ 2020-04-30  4:10 Can Guo
  2020-04-30  5:08 ` Bart Van Assche
  0 siblings, 1 reply; 12+ messages in thread
From: Can Guo @ 2020-04-30  4:10 UTC (permalink / raw)
  To: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, bvanassche, linux-scsi,
	kernel-team, saravanak, salyzyn, cang
  Cc: James E.J. Bottomley, Martin K. Petersen, open list

During system resume, scsi_resume_device() decreases a request queue's
pm_only counter if the scsi device was quiesced before. But after that,
if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter
is still held (non-zero). The current scsi resume hook only sets the
RPM status of the scsi device and its request queue to RPM_ACTIVE, but
leaves the pm_only counter unchanged. This may leave the request
queue's pm_only counter non-zero after the resume hook returns, so
tasks waiting on mq_freeze_wq would never be woken up. Fix this by
calling blk_post_runtime_resume() if pm_only is non-zero to balance the
pm_only counter which is held by the scsi device's RPM ops.
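
For reference, the pm_only counter is manipulated by the block layer
helpers below (a condensed sketch of block/blk-core.c from this kernel
generation, not verbatim code); note that waiters on mq_freeze_wq are
only woken when the counter drops back to zero:

void blk_set_pm_only(struct request_queue *q)
{
	atomic_inc(&q->pm_only);
}

void blk_clear_pm_only(struct request_queue *q)
{
	int pm_only;

	pm_only = atomic_dec_return(&q->pm_only);
	WARN_ON_ONCE(pm_only < 0);
	if (pm_only == 0)
		wake_up_all(&q->mq_freeze_wq);
}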

(struct request_queue)0xFFFFFF815B69E938
	pm_only = (counter = 2),
	rpm_status = 0,
	dev = 0xFFFFFF815B0511A0,

((struct device)0xFFFFFF815B0511A0).power
	is_suspended = FALSE,
	runtime_status = RPM_ACTIVE,

(struct scsi_device)0xFFFFFF815B051000
	request_queue = 0xFFFFFF815B69E938,
	sdev_state = SDEV_RUNNING,
	quiesced_by = 0x0,

B::v.f_/task_
-000|__switch_to
-001|context_switch
-001|__schedule
-002|schedule
-003|blk_queue_enter(q = 0xFFFFFF815B69E938, flags = 0)
-004|generic_make_request
-005|submit_bio
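
The writer above is parked in blk_queue_enter(), whose wait condition
(condensed from block/blk-core.c of this kernel generation; simplified,
not verbatim) only lets non-PM requests through once pm_only returns to
zero:

	wait_event(q->mq_freeze_wq,
		   (!q->mq_freeze_depth &&
		    (pm || (blk_pm_request_resume(q),
			    !blk_queue_pm_only(q)))) ||
		   blk_queue_dying(q));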

Signed-off-by: Can Guo <cang@codeaurora.org>
---

Changes since v2:
- Rebased on 5.8-scsi-queue

Changes since v1:
- Added more debugging context info

 drivers/scsi/scsi_pm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index 3717eea..4804029 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -93,8 +93,10 @@ static int scsi_dev_type_resume(struct device *dev,
 		 */
 		if (!err && scsi_is_sdev_device(dev)) {
 			struct scsi_device *sdev = to_scsi_device(dev);
-
-			blk_set_runtime_active(sdev->request_queue);
+			if (blk_queue_pm_only(sdev->request_queue))
+				blk_post_runtime_resume(sdev->request_queue, 0);
+			else
+				blk_set_runtime_active(sdev->request_queue);
 		}
 	}
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30  4:10 [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume Can Guo
@ 2020-04-30  5:08 ` Bart Van Assche
  2020-04-30  5:40   ` Can Guo
  2020-04-30  9:11   ` Avri Altman
  0 siblings, 2 replies; 12+ messages in thread
From: Bart Van Assche @ 2020-04-30  5:08 UTC (permalink / raw)
  To: Can Guo, asutoshd, nguyenb, hongwus, rnayak, stanley.chu,
	alim.akhtar, beanhuo, Avri.Altman, bjorn.andersson, linux-scsi,
	kernel-team, saravanak, salyzyn
  Cc: James E.J. Bottomley, Martin K. Petersen, open list

On 2020-04-29 21:10, Can Guo wrote:
> During system resume, scsi_resume_device() decreases a request queue's
> pm_only counter if the scsi device was quiesced before. But after that,
> if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter
> is still held (non-zero). The current scsi resume hook only sets the
> RPM status of the scsi device and its request queue to RPM_ACTIVE, but
> leaves the pm_only counter unchanged. This may leave the request
> queue's pm_only counter non-zero after the resume hook returns, so
> tasks waiting on mq_freeze_wq would never be woken up. Fix this by
> calling blk_post_runtime_resume() if pm_only is non-zero to balance the
> pm_only counter which is held by the scsi device's RPM ops.

How was this issue discovered? How has this patch been tested?

Thanks,

Bart.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30  5:08 ` Bart Van Assche
@ 2020-04-30  5:40   ` Can Guo
  2020-04-30 20:32     ` Bart Van Assche
  2020-04-30  9:11   ` Avri Altman
  1 sibling, 1 reply; 12+ messages in thread
From: Can Guo @ 2020-04-30  5:40 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

Hi Bart,

On 2020-04-30 13:08, Bart Van Assche wrote:
> On 2020-04-29 21:10, Can Guo wrote:
>> During system resume, scsi_resume_device() decreases a request
>> queue's pm_only counter if the scsi device was quiesced before. But
>> after that, if the scsi device's RPM status is RPM_SUSPENDED, the
>> pm_only counter is still held (non-zero). The current scsi resume
>> hook only sets the RPM status of the scsi device and its request
>> queue to RPM_ACTIVE, but leaves the pm_only counter unchanged. This
>> may leave the request queue's pm_only counter non-zero after the
>> resume hook returns, so tasks waiting on mq_freeze_wq would never be
>> woken up. Fix this by calling blk_post_runtime_resume() if pm_only is
>> non-zero to balance the pm_only counter which is held by the scsi
>> device's RPM ops.
> 
> How was this issue discovered? How has this patch been tested?
> 
> Thanks,
> 
> Bart.

The issue was discovered during system suspend/resume testing, and it
is very easy to reproduce. After the system resumes, if this issue hits
some scsi devices, all bios sent to their request queues are blocked,
which may cause a system hang if those scsi devices are vital to system
functionality.

To make sure the patch works well, we have tested system suspend/resume
and verified that no system hang happens due to request queues being
blocked by an imbalanced pm_only counter.

Thanks,

Can Guo.


* RE: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30  5:08 ` Bart Van Assche
  2020-04-30  5:40   ` Can Guo
@ 2020-04-30  9:11   ` Avri Altman
  2020-04-30 12:38     ` Can Guo
  1 sibling, 1 reply; 12+ messages in thread
From: Avri Altman @ 2020-04-30  9:11 UTC (permalink / raw)
  To: Bart Van Assche, Can Guo, asutoshd, nguyenb, hongwus, rnayak,
	stanley.chu, alim.akhtar, beanhuo, bjorn.andersson, linux-scsi,
	kernel-team, saravanak, salyzyn
  Cc: James E.J. Bottomley, Martin K. Petersen, open list

 
> 
> On 2020-04-29 21:10, Can Guo wrote:
> > During system resume, scsi_resume_device() decreases a request queue's
> > pm_only counter if the scsi device was quiesced before. But after that,
> > if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter
> > is still held (non-zero). The current scsi resume hook only sets the
> > RPM status of the scsi device and its request queue to RPM_ACTIVE, but
> > leaves the pm_only counter unchanged. This may leave the request
> > queue's pm_only counter non-zero after the resume hook returns, so
> > tasks waiting on mq_freeze_wq would never be woken up. Fix this by
> > calling blk_post_runtime_resume() if pm_only is non-zero to balance the
> > pm_only counter which is held by the scsi device's RPM ops.
> 
> How was this issue discovered? How has this patch been tested?

I think this insight was originally gained as part of commit fb276f770118
(scsi: ufs: Enable block layer runtime PM for well-known logical units).

But I will let Can reply on that.

Thanks,
Avri

> 
> Thanks,
> 
> Bart.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30  9:11   ` Avri Altman
@ 2020-04-30 12:38     ` Can Guo
  0 siblings, 0 replies; 12+ messages in thread
From: Can Guo @ 2020-04-30 12:38 UTC (permalink / raw)
  To: Avri Altman
  Cc: Bart Van Assche, asutoshd, nguyenb, hongwus, rnayak, stanley.chu,
	alim.akhtar, beanhuo, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

Hi Avri,

On 2020-04-30 17:11, Avri Altman wrote:
>> 
>> On 2020-04-29 21:10, Can Guo wrote:
>> > During system resume, scsi_resume_device() decreases a request queue's
>> > pm_only counter if the scsi device was quiesced before. But after that,
>> > if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter
>> > is still held (non-zero). The current scsi resume hook only sets the
>> > RPM status of the scsi device and its request queue to RPM_ACTIVE, but
>> > leaves the pm_only counter unchanged. This may leave the request
>> > queue's pm_only counter non-zero after the resume hook returns, so
>> > tasks waiting on mq_freeze_wq would never be woken up. Fix this by
>> > calling blk_post_runtime_resume() if pm_only is non-zero to balance the
>> > pm_only counter which is held by the scsi device's RPM ops.
>> 
>> How was this issue discovered? How has this patch been tested?
> 
> I think this insight was originally gained as part of commit 
> fb276f770118
> (scsi: ufs: Enable block layer runtime PM for well-known logical units)
> 
> But I will let Can reply on that.
> 
> Thanks,
> Avri
> 

Thanks for pointing to that commit, but this is a different story here.
SCSI devices which have block layer runtime PM enabled can hit this
issue during system resume. On the contrary, those which have block
layer runtime PM disabled are immune to this issue.

Thanks,

Can Guo.
>> 
>> Thanks,
>> 
>> Bart.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30  5:40   ` Can Guo
@ 2020-04-30 20:32     ` Bart Van Assche
  2020-05-01  1:19       ` Can Guo
  2020-05-01  1:42       ` Can Guo
  0 siblings, 2 replies; 12+ messages in thread
From: Bart Van Assche @ 2020-04-30 20:32 UTC (permalink / raw)
  To: Can Guo
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-04-29 22:40, Can Guo wrote:
> On 2020-04-30 13:08, Bart Van Assche wrote:
>> On 2020-04-29 21:10, Can Guo wrote:
>>> During system resume, scsi_resume_device() decreases a request queue's
>>> pm_only counter if the scsi device was quiesced before. But after that,
>>> if the scsi device's RPM status is RPM_SUSPENDED, the pm_only counter
>>> is still held (non-zero). The current scsi resume hook only sets the
>>> RPM status of the scsi device and its request queue to RPM_ACTIVE, but
>>> leaves the pm_only counter unchanged. This may leave the request
>>> queue's pm_only counter non-zero after the resume hook returns, so
>>> tasks waiting on mq_freeze_wq would never be woken up. Fix this by
>>> calling blk_post_runtime_resume() if pm_only is non-zero to balance the
>>> pm_only counter which is held by the scsi device's RPM ops.
>>
>> How was this issue discovered? How has this patch been tested?
> 
> The issue was discovered during system suspend/resume testing, and it
> is very easy to reproduce. After the system resumes, if this issue hits
> some scsi devices, all bios sent to their request queues are blocked,
> which may cause a system hang if those scsi devices are vital to system
> functionality.
> 
> To make sure the patch works well, we have tested system suspend/resume
> and verified that no system hang happens due to request queues being
> blocked by an imbalanced pm_only counter.

Thanks, that's very interesting information. My concern with this patch
is that the power management code is not the only caller of
blk_set_pm_only() / blk_clear_pm_only(). E.g. the SCSI SPI code also
calls scsi_device_quiesce() and scsi_device_resume(). These last
functions call blk_set_pm_only() and blk_clear_pm_only(). More calls of
scsi_device_quiesce() and scsi_device_resume() might be added in the future.

Has it been considered to test directly whether a SCSI device has been
runtime suspended instead of relying on blk_queue_pm_only()? How about
using pm_runtime_status_suspended() or adding a function in
block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?
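
Such a helper in block/blk-pm.h could look like the sketch below
(hypothetical, for illustration only; blk_queue_runtime_suspended() is
not an existing function, and the q->dev check assumes blk-pm's
convention that a NULL q->dev means block layer runtime PM is disabled):

static inline bool blk_queue_runtime_suspended(struct request_queue *q)
{
	return q->dev && q->rpm_status == RPM_SUSPENDED;
}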

Thanks,

Bart.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30 20:32     ` Bart Van Assche
@ 2020-05-01  1:19       ` Can Guo
  2020-05-01  1:42       ` Can Guo
  1 sibling, 0 replies; 12+ messages in thread
From: Can Guo @ 2020-05-01  1:19 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-05-01 04:32, Bart Van Assche wrote:
> On 2020-04-29 22:40, Can Guo wrote:
>> On 2020-04-30 13:08, Bart Van Assche wrote:
>>> On 2020-04-29 21:10, Can Guo wrote:
>>>> During system resume, scsi_resume_device() decreases a request
>>>> queue's pm_only counter if the scsi device was quiesced before. But
>>>> after that, if the scsi device's RPM status is RPM_SUSPENDED, the
>>>> pm_only counter is still held (non-zero). The current scsi resume
>>>> hook only sets the RPM status of the scsi device and its request
>>>> queue to RPM_ACTIVE, but leaves the pm_only counter unchanged. This
>>>> may leave the request queue's pm_only counter non-zero after the
>>>> resume hook returns, so tasks waiting on mq_freeze_wq would never
>>>> be woken up. Fix this by calling blk_post_runtime_resume() if
>>>> pm_only is non-zero to balance the pm_only counter which is held by
>>>> the scsi device's RPM ops.
>>> 
>>> How was this issue discovered? How has this patch been tested?
>> 
>> The issue was discovered during system suspend/resume testing, and it
>> is very easy to reproduce. After the system resumes, if this issue
>> hits some scsi devices, all bios sent to their request queues are
>> blocked, which may cause a system hang if those scsi devices are
>> vital to system functionality.
>> 
>> To make sure the patch works well, we have tested system
>> suspend/resume and verified that no system hang happens due to
>> request queues being blocked by an imbalanced pm_only counter.
> 
> Thanks, that's very interesting information. My concern with this patch
> is that the power management code is not the only caller of
> blk_set_pm_only() / blk_clear_pm_only(). E.g. the SCSI SPI code also
> calls scsi_device_quiesce() and scsi_device_resume(). These last
> functions call blk_set_pm_only() and blk_clear_pm_only(). More calls of
> scsi_device_quiesce() and scsi_device_resume() might be added in the 
> future.
> 
> Has it been considered to test directly whether a SCSI device has been
> runtime suspended instead of relying on blk_queue_pm_only()? How about
> using pm_runtime_status_suspended() or adding a function in
> block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?
> 
> Thanks,
> 
> Bart.

Hi Bart,

Please let me address your concern.

First of all, it is allowed to call scsi_device_quiesce() multiple
times, but a sdev's request queue's pm_only counter can only be
increased once by scsi_device_quiesce(): if the sdev has already been
quiesced, scsi_device_set_state(sdev, SDEV_QUIESCE) inside
scsi_device_quiesce() returns -EINVAL (illegal state transition), in
which case blk_clear_pm_only() is called to decrease pm_only again, as
the condensed sketch below shows. So no matter how many times
scsi_device_quiesce() is called, it increases pm_only at most once.
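
A condensed sketch of that logic (simplified from
drivers/scsi/scsi_lib.c of this kernel generation; not verbatim):

int scsi_device_quiesce(struct scsi_device *sdev)
{
	struct request_queue *q = sdev->request_queue;
	int err;

	blk_set_pm_only(q);			/* pm_only++ */
	/* ... freeze/unfreeze the queue ... */
	err = scsi_device_set_state(sdev, SDEV_QUIESCE);
	if (err == 0)
		sdev->quiesced_by = current;
	else
		blk_clear_pm_only(q);		/* already quiesced: pm_only-- */
	return err;
}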

scsi_device_resume() behaves the same way: it calls blk_clear_pm_only()
only once, and only if the sdev was quiesced.

In short, after scsi_device_resume() returns in scsi_dev_type_resume(),
the pm_only counter should be 1 (if the sdev's runtime power status is
RPM_SUSPENDED) or 0 (if the sdev's runtime power status is RPM_ACTIVE).

> Has it been considered to test directly whether a SCSI device has been
> runtime suspended instead of relying on blk_queue_pm_only()? How about
> using pm_runtime_status_suspended() or adding a function in
> block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?

Yes, an earlier version of the patch did it that way, and it also
worked well, as the two ways are actually equivalent here. I kinda like
the current code because we should be confident that after
scsi_dev_type_resume() returns, pm_only must be 0. Different reviewers
may have different opinions; either way works well anyway.

Thanks,

Can Guo.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-04-30 20:32     ` Bart Van Assche
  2020-05-01  1:19       ` Can Guo
@ 2020-05-01  1:42       ` Can Guo
  2020-05-01  1:50         ` Bart Van Assche
  1 sibling, 1 reply; 12+ messages in thread
From: Can Guo @ 2020-05-01  1:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-05-01 04:32, Bart Van Assche wrote:
> On 2020-04-29 22:40, Can Guo wrote:
>> On 2020-04-30 13:08, Bart Van Assche wrote:
>>> On 2020-04-29 21:10, Can Guo wrote:
>>>> During system resume, scsi_resume_device() decreases a request
>>>> queue's pm_only counter if the scsi device was quiesced before. But
>>>> after that, if the scsi device's RPM status is RPM_SUSPENDED, the
>>>> pm_only counter is still held (non-zero). The current scsi resume
>>>> hook only sets the RPM status of the scsi device and its request
>>>> queue to RPM_ACTIVE, but leaves the pm_only counter unchanged. This
>>>> may leave the request queue's pm_only counter non-zero after the
>>>> resume hook returns, so tasks waiting on mq_freeze_wq would never
>>>> be woken up. Fix this by calling blk_post_runtime_resume() if
>>>> pm_only is non-zero to balance the pm_only counter which is held by
>>>> the scsi device's RPM ops.
>>> 
>>> How was this issue discovered? How has this patch been tested?
>> 
>> The issue was discovered during system suspend/resume testing, and it
>> is very easy to reproduce. After the system resumes, if this issue
>> hits some scsi devices, all bios sent to their request queues are
>> blocked, which may cause a system hang if those scsi devices are
>> vital to system functionality.
>> 
>> To make sure the patch works well, we have tested system
>> suspend/resume and verified that no system hang happens due to
>> request queues being blocked by an imbalanced pm_only counter.
> 
> Thanks, that's very interesting information. My concern with this patch
> is that the power management code is not the only caller of
> blk_set_pm_only() / blk_clear_pm_only(). E.g. the SCSI SPI code also
> calls scsi_device_quiesce() and scsi_device_resume(). These last
> functions call blk_set_pm_only() and blk_clear_pm_only(). More calls of
> scsi_device_quiesce() and scsi_device_resume() might be added in the 
> future.
> 
> Has it been considered to test directly whether a SCSI device has been
> runtime suspended instead of relying on blk_queue_pm_only()? How about
> using pm_runtime_status_suspended() or adding a function in
> block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?
> 
> Thanks,
> 
> Bart.

Hi Bart,

Slightly revised my previous mail.

Please let me address your concern.

First of all, it is allowed to call scsi_device_quiesce() multiple
times, but a sdev's request queue's pm_only counter can only be
increased once by scsi_device_quiesce(): if the sdev has already been
quiesced, scsi_device_set_state(sdev, SDEV_QUIESCE) inside
scsi_device_quiesce() returns -EINVAL (illegal state transition), in
which case blk_clear_pm_only() is called to decrease pm_only again. So
no matter how many times scsi_device_quiesce() is called, it increases
pm_only at most once.

scsi_device_resume() behaves the same way: it calls blk_clear_pm_only()
only once, and only if the sdev was quiesced.

In short, after scsi_device_resume() returns in scsi_dev_type_resume():
if a sdev has block layer runtime PM enabled (sdev->request_queue->dev
is not NULL), its queue's pm_only counter should be 1 (if the sdev's
runtime power status is RPM_SUSPENDED) or 0 (if the sdev's runtime
power status is RPM_ACTIVE). If a sdev has block layer runtime PM
disabled (sdev->request_queue->dev is NULL), its queue's pm_only
counter should be 0.

> Has it been considered to test directly whether a SCSI device has been
> runtime suspended instead of relying on blk_queue_pm_only()? How about
> using pm_runtime_status_suspended() or adding a function in
> block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?

Yes, an earlier version of the patch did it that way, and it also
worked well, as the two ways are actually equivalent here. I kinda like
the current code because we should be confident that after
scsi_dev_type_resume() returns, pm_only must be 0. Different reviewers
may have different opinions; either way works well anyway.

Thanks,

Can Guo.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-05-01  1:42       ` Can Guo
@ 2020-05-01  1:50         ` Bart Van Assche
  2020-05-01  5:12           ` Can Guo
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2020-05-01  1:50 UTC (permalink / raw)
  To: Can Guo
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-04-30 18:42, Can Guo wrote:
> On 2020-05-01 04:32, Bart Van Assche wrote:
> > Has it been considered to test directly whether a SCSI device has been
> > runtime suspended instead of relying on blk_queue_pm_only()? How about
> > using pm_runtime_status_suspended() or adding a function in
> > block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?
> 
> Yes, an earlier version of the patch did it that way, and it also
> worked well, as the two ways are actually equivalent here. I kinda
> like the current code because we should be confident that after
> scsi_dev_type_resume() returns, pm_only must be 0. Different reviewers
> may have different opinions; either way works well anyway.

Hi Can,

Please note that this is not a matter of a reviewer's personal
preference but a matter of correctness. blk_queue_pm_only() not only
returns a value > 0 if a SCSI device has been runtime suspended, but
also returns true if scsi_device_quiesce() was called for another
reason. Hence my request to test the "runtime suspended" status
directly instead of relying on blk_queue_pm_only().
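
Concretely, the difference between the two checks (a hedged sketch, not
verbatim kernel code):

	/* Patch v3: true whenever anyone has called blk_set_pm_only(),
	 * including scsi_device_quiesce() run for reasons unrelated to
	 * runtime PM, e.g. SPI domain validation. */
	if (blk_queue_pm_only(sdev->request_queue))
		blk_post_runtime_resume(sdev->request_queue, 0);

	/* Requested alternative: true only if this device was really
	 * runtime suspended. */
	if (pm_runtime_suspended(dev))
		blk_post_runtime_resume(sdev->request_queue, 0);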

Thanks,

Bart.


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-05-01  1:50         ` Bart Van Assche
@ 2020-05-01  5:12           ` Can Guo
  2020-05-01 17:56             ` Bart Van Assche
  0 siblings, 1 reply; 12+ messages in thread
From: Can Guo @ 2020-05-01  5:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-05-01 09:50, Bart Van Assche wrote:
> On 2020-04-30 18:42, Can Guo wrote:
>> On 2020-05-01 04:32, Bart Van Assche wrote:
>> > Has it been considered to test directly whether a SCSI device has been
>> > runtime suspended instead of relying on blk_queue_pm_only()? How about
>> > using pm_runtime_status_suspended() or adding a function in
>> > block/blk-pm.h that checks whether q->rpm_status == RPM_SUSPENDED?
>> 
>> Yes, an earlier version of the patch did it that way, and it also
>> worked well, as the two ways are actually equivalent here. I kinda
>> like the current code because we should be confident that after
>> scsi_dev_type_resume() returns, pm_only must be 0. Different
>> reviewers may have different opinions; either way works well anyway.
> 
> Hi Can,
> 
> Please note that this is not a matter of a reviewer's personal
> preference but a matter of correctness. blk_queue_pm_only() not only
> returns a value > 0 if a SCSI device has been runtime suspended, but
> also returns true if scsi_device_quiesce() was called for another
> reason. Hence my request to test the "runtime suspended" status
> directly instead of relying on blk_queue_pm_only().
> 
> Thanks,
> 
> Bart.

Hi Bart,

I agree we are pursuing correctness here, but as I said, I think both
ways are equally correct. I also agree with you that the alternative
way, see [2], is much easier to understand; we can take the alternative
way if you are OK with it.

[1] Currently, scsi_dev_type_resume() is the hook for resume, thaw and
restore. Per my understanding, when scsi_dev_type_resume() is running,
it is not possible for scsi_device_quiesce() to be called for this
sdev, at least not in the current code base. So it is OK to rely on
blk_queue_pm_only() in scsi_dev_type_resume().

[2] The alternative way, which I have tested, is as below. I think it
is what you asked for, if my understanding is right; please correct me
if I am wrong.

diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index 3717eea..d18271d 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -74,12 +74,15 @@ static int scsi_dev_type_resume(struct device *dev,
 {
        const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
        int err = 0;
+       bool was_rpm_suspended = false;

        err = cb(dev, pm);
        scsi_device_resume(to_scsi_device(dev));
        dev_dbg(dev, "scsi resume: %d\n", err);

        if (err == 0) {
+               was_rpm_suspended = pm_runtime_suspended(dev);
+
                pm_runtime_disable(dev);
                err = pm_runtime_set_active(dev);
                pm_runtime_enable(dev);
@@ -93,8 +96,10 @@ static int scsi_dev_type_resume(struct device *dev,
                 */
                if (!err && scsi_is_sdev_device(dev)) {
                        struct scsi_device *sdev = to_scsi_device(dev);
-
-                       blk_set_runtime_active(sdev->request_queue);
+                       if (was_rpm_suspended)
+                               blk_post_runtime_resume(sdev->request_queue, 0);
+                       else
+                               blk_set_runtime_active(sdev->request_queue);
                }
        }

Thanks,

Can Guo


* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-05-01  5:12           ` Can Guo
@ 2020-05-01 17:56             ` Bart Van Assche
  2020-05-02  1:59               ` Can Guo
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2020-05-01 17:56 UTC (permalink / raw)
  To: Can Guo
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-04-30 22:12, Can Guo wrote:
> diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
> index 3717eea..d18271d 100644
> --- a/drivers/scsi/scsi_pm.c
> +++ b/drivers/scsi/scsi_pm.c
> @@ -74,12 +74,15 @@ static int scsi_dev_type_resume(struct device *dev,
>  {
>         const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>         int err = 0;
> +       bool was_rpm_suspended = false;
> 
>         err = cb(dev, pm);
>         scsi_device_resume(to_scsi_device(dev));
>         dev_dbg(dev, "scsi resume: %d\n", err);
> 
>         if (err == 0) {
> +               was_rpm_suspended = pm_runtime_suspended(dev);
> +

How about renaming this variable to "was_runtime_suspended"? How about
moving the declaration of that variable inside the if-statement?

>                 pm_runtime_disable(dev);
>                 err = pm_runtime_set_active(dev);
>                 pm_runtime_enable(dev);
> @@ -93,8 +96,10 @@ static int scsi_dev_type_resume(struct device *dev,
>                  */
>                 if (!err && scsi_is_sdev_device(dev)) {
>                         struct scsi_device *sdev = to_scsi_device(dev);
> -
> -                       blk_set_runtime_active(sdev->request_queue);
> +                       if (was_rpm_suspended)
> +                              blk_post_runtime_resume(sdev->request_queue, 0);
> +                       else
> +                              blk_set_runtime_active(sdev->request_queue);
>                 }
>         }

Does other code always call both blk_pre_runtime_resume() and
blk_post_runtime_resume() upon runtime resume? How about adding a
blk_pre_runtime_resume() call before the blk_post_runtime_resume() call?

Thanks,

Bart.



* Re: [PATCH v3 1/1] scsi: pm: Balance pm_only counter of request queue during system resume
  2020-05-01 17:56             ` Bart Van Assche
@ 2020-05-02  1:59               ` Can Guo
  0 siblings, 0 replies; 12+ messages in thread
From: Can Guo @ 2020-05-02  1:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: asutoshd, nguyenb, hongwus, rnayak, stanley.chu, alim.akhtar,
	beanhuo, Avri.Altman, bjorn.andersson, linux-scsi, kernel-team,
	saravanak, salyzyn, James E.J. Bottomley, Martin K. Petersen,
	open list

On 2020-05-02 01:56, Bart Van Assche wrote:
> On 2020-04-30 22:12, Can Guo wrote:
>> diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
>> index 3717eea..d18271d 100644
>> --- a/drivers/scsi/scsi_pm.c
>> +++ b/drivers/scsi/scsi_pm.c
>> @@ -74,12 +74,15 @@ static int scsi_dev_type_resume(struct device *dev,
>>  {
>>         const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>>         int err = 0;
>> +       bool was_rpm_suspended = false;
>> 
>>         err = cb(dev, pm);
>>         scsi_device_resume(to_scsi_device(dev));
>>         dev_dbg(dev, "scsi resume: %d\n", err);
>> 
>>         if (err == 0) {
>> +               was_rpm_suspended = pm_runtime_suspended(dev);
>> +
> 
> How about renaming this variable to "was_runtime_suspended"? How about
> moving the declaration of that variable inside the if-statement?
> 

Sure, shall do; this patch was just a prototype which I made for
testing. If you are OK with this idea, I will send it as the next
version.

>>                 pm_runtime_disable(dev);
>>                 err = pm_runtime_set_active(dev);
>>                 pm_runtime_enable(dev);
>> @@ -93,8 +96,10 @@ static int scsi_dev_type_resume(struct device *dev,
>>                  */
>>                 if (!err && scsi_is_sdev_device(dev)) {
>>                         struct scsi_device *sdev = to_scsi_device(dev);
>> -
>> -                       blk_set_runtime_active(sdev->request_queue);
>> +                       if (was_rpm_suspended)
>> +                               blk_post_runtime_resume(sdev->request_queue, 0);
>> +                       else
>> +                               blk_set_runtime_active(sdev->request_queue);
>>                 }
>>         }
> 
> Does other code always call both blk_pre_runtime_resume() and
> blk_post_runtime_resume() upon runtime resume? How about adding a
> blk_pre_runtime_resume() call before the blk_post_runtime_resume() call?
> 
> Thanks,
> 
> Bart.

Yes, but adding a blk_pre_runtime_resume() here would be meaningless:
it only sets q->rpm_status to RPM_RESUMING, and blk_post_runtime_resume()
overrides that to RPM_ACTIVE for sure. Besides, this place comes after
the call to pm_runtime_set_active(), meaning the sdev is already
runtime active; in contrast with the real runtime resume routine, we
can treat it as if the sdev's runtime resume ops had returned 0, so we
can just call blk_post_runtime_resume() with err == 0.
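
For reference, the two helpers behave roughly as below (condensed from
block/blk-pm.c of this kernel generation; simplified, not verbatim):

void blk_pre_runtime_resume(struct request_queue *q)
{
	if (!q->dev)
		return;

	spin_lock_irq(&q->queue_lock);
	q->rpm_status = RPM_RESUMING;
	spin_unlock_irq(&q->queue_lock);
}

void blk_post_runtime_resume(struct request_queue *q, int err)
{
	if (!q->dev)
		return;

	spin_lock_irq(&q->queue_lock);
	if (!err) {
		q->rpm_status = RPM_ACTIVE;
		pm_runtime_mark_last_busy(q->dev);
		pm_request_autosuspend(q->dev);
	} else {
		q->rpm_status = RPM_SUSPENDED;
	}
	spin_unlock_irq(&q->queue_lock);

	if (!err)
		blk_clear_pm_only(q);	/* balances blk_pre_runtime_suspend() */
}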

Thanks,

Can Guo.

