linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] scsi: sd: add runtime pm to open / release
@ 2020-06-23 11:10 Martin Kepplinger
  2020-06-24 13:33 ` Bart Van Assche
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-06-23 11:10 UTC (permalink / raw)
  To: jejb, martin.petersen; +Cc: linux-scsi, linux-kernel, kernel, Martin Kepplinger

This add a very conservative but simple implementation for runtime PM
to the sd scsi driver:
Resume when opened (mounted) and suspend when released (unmounted).

Improvements that allow suspending while a device is "open" can
be added later, but now we save power when no filesystem is mounted
and runtime PM is enabled.

Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
---
 drivers/scsi/sd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..fe4cb7c50ec1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));
 
 	sdev = sdkp->device;
+	scsi_autopm_get_device(sdev);
 
 	/*
 	 * If the device is in error recovery, wait until it is done.
@@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 
 error_out:
 	scsi_disk_put(sdkp);
+
+	scsi_autopm_put_device(sdev);
+
 	return retval;	
 }
 
@@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
 
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n"));
 
+	scsi_autopm_put_device(sdev);
+
 	if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) {
 		if (scsi_block_when_processing_errors(sdev))
 			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-23 11:10 [PATCH] scsi: sd: add runtime pm to open / release Martin Kepplinger
@ 2020-06-24 13:33 ` Bart Van Assche
  2020-06-25  8:16   ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-24 13:33 UTC (permalink / raw)
  To: Martin Kepplinger, jejb, martin.petersen; +Cc: linux-scsi, linux-kernel, kernel

On 2020-06-23 04:10, Martin Kepplinger wrote:
> This add a very conservative but simple implementation for runtime PM
> to the sd scsi driver:
> Resume when opened (mounted) and suspend when released (unmounted).
> 
> Improvements that allow suspending while a device is "open" can
> be added later, but now we save power when no filesystem is mounted
> and runtime PM is enabled.
> 
> Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
> ---
>  drivers/scsi/sd.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..fe4cb7c50ec1 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));
>  
>  	sdev = sdkp->device;
> +	scsi_autopm_get_device(sdev);
>  
>  	/*
>  	 * If the device is in error recovery, wait until it is done.
> @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
>  
>  error_out:
>  	scsi_disk_put(sdkp);
> +
> +	scsi_autopm_put_device(sdev);
> +
>  	return retval;	
>  }
>  
> @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
>  
>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n"));
>  
> +	scsi_autopm_put_device(sdev);
> +
>  	if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) {
>  		if (scsi_block_when_processing_errors(sdev))
>  			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);

My understanding of the above patch is that it introduces a regression,
namely by disabling runtime suspend as long as an sd device is held open.

Bart.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-24 13:33 ` Bart Van Assche
@ 2020-06-25  8:16   ` Martin Kepplinger
  2020-06-25 14:52     ` Alan Stern
                       ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Martin Kepplinger @ 2020-06-25  8:16 UTC (permalink / raw)
  To: Bart Van Assche, jejb, martin.petersen, Alan Stern
  Cc: linux-scsi, linux-kernel, kernel

On 24.06.20 15:33, Bart Van Assche wrote:
> On 2020-06-23 04:10, Martin Kepplinger wrote:
>> This add a very conservative but simple implementation for runtime PM
>> to the sd scsi driver:
>> Resume when opened (mounted) and suspend when released (unmounted).
>>
>> Improvements that allow suspending while a device is "open" can
>> be added later, but now we save power when no filesystem is mounted
>> and runtime PM is enabled.
>>
>> Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
>> ---
>>  drivers/scsi/sd.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..fe4cb7c50ec1 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
>>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));
>>  
>>  	sdev = sdkp->device;
>> +	scsi_autopm_get_device(sdev);
>>  
>>  	/*
>>  	 * If the device is in error recovery, wait until it is done.
>> @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
>>  
>>  error_out:
>>  	scsi_disk_put(sdkp);
>> +
>> +	scsi_autopm_put_device(sdev);
>> +
>>  	return retval;	
>>  }
>>  
>> @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
>>  
>>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n"));
>>  
>> +	scsi_autopm_put_device(sdev);
>> +
>>  	if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) {
>>  		if (scsi_block_when_processing_errors(sdev))
>>  			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
> 
> My understanding of the above patch is that it introduces a regression,
> namely by disabling runtime suspend as long as an sd device is held open.
> 
> Bart.
> 
> 

hi Bart,

Alan says the same (on block request, the block layer should initiate a
runtime resume), so merging with the thread from
https://lore.kernel.org/linux-usb/8738e4d3-62b1-0144-107d-ff42000ed6c6@puri.sm/T/
now and answer to both Bart and Alan here:]

I see scsi-pm.c using the blk-pm.c API but I'm not sure how the block
layer would itself resume the scsi device (I use it via usb_storage, so
that usb_stor_resume() follows in my case but I guess that doesn't
matter here):

my understanding of "sd" is: enable runtime pm in probe(), so *allow*
the device to be suspended (if enabled by the user), but never
resume(?). Also, why isn't "autopm" used in its ioctl() implementation
(as opposed to in "sr")?

here's roughly what happens when enabling runtime PM in sysfs (again,
because sd_probe() calls autopm_put() and thus allows it:

[   27.384446] sd 0:0:0:0: scsi_runtime_suspend
[   27.432282] blk_pre_runtime_suspend
[   27.435783] sd_suspend_common
[   27.438782] blk_post_runtime_suspend
[   27.442427] scsi target0:0:0: scsi_runtime_suspend
[   27.447303] scsi host0: scsi_runtime_suspend

then I "mount /dev/sda1 /mnt" and none of the resume() functions get
called. To me it looks like the sd driver should initiate resuming, and
that's not implemented.

what am I doing wrong or overlooking? how exactly does (or should) the
block layer initiate resume here?

thanks again for your time,

                               martin

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-25  8:16   ` Martin Kepplinger
@ 2020-06-25 14:52     ` Alan Stern
  2020-06-26  3:53     ` Bart Van Assche
  2020-06-26 15:07     ` Bart Van Assche
  2 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-25 14:52 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, martin.petersen, linux-scsi, linux-kernel, kernel

On Thu, Jun 25, 2020 at 10:16:06AM +0200, Martin Kepplinger wrote:
> On 24.06.20 15:33, Bart Van Assche wrote:
> > On 2020-06-23 04:10, Martin Kepplinger wrote:
> >> This add a very conservative but simple implementation for runtime PM
> >> to the sd scsi driver:
> >> Resume when opened (mounted) and suspend when released (unmounted).
> >>
> >> Improvements that allow suspending while a device is "open" can
> >> be added later, but now we save power when no filesystem is mounted
> >> and runtime PM is enabled.
> >>
> >> Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
> >> ---
> >>  drivers/scsi/sd.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> >> index d90fefffe31b..fe4cb7c50ec1 100644
> >> --- a/drivers/scsi/sd.c
> >> +++ b/drivers/scsi/sd.c
> >> @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
> >>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));
> >>  
> >>  	sdev = sdkp->device;
> >> +	scsi_autopm_get_device(sdev);
> >>  
> >>  	/*
> >>  	 * If the device is in error recovery, wait until it is done.
> >> @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
> >>  
> >>  error_out:
> >>  	scsi_disk_put(sdkp);
> >> +
> >> +	scsi_autopm_put_device(sdev);
> >> +
> >>  	return retval;	
> >>  }
> >>  
> >> @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
> >>  
> >>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n"));
> >>  
> >> +	scsi_autopm_put_device(sdev);
> >> +
> >>  	if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) {
> >>  		if (scsi_block_when_processing_errors(sdev))
> >>  			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
> > 
> > My understanding of the above patch is that it introduces a regression,
> > namely by disabling runtime suspend as long as an sd device is held open.
> > 
> > Bart.
> > 
> > 
> 
> hi Bart,
> 
> Alan says the same (on block request, the block layer should initiate a
> runtime resume), so merging with the thread from
> https://lore.kernel.org/linux-usb/8738e4d3-62b1-0144-107d-ff42000ed6c6@puri.sm/T/
> now and answer to both Bart and Alan here:]
> 
> I see scsi-pm.c using the blk-pm.c API but I'm not sure how the block
> layer would itself resume the scsi device (I use it via usb_storage, so
> that usb_stor_resume() follows in my case but I guess that doesn't
> matter here):

The block layer does this in block/blk-core.c:blk_queue_enter(), as part 
of the condition check in the call to wait_event() near the end of the 
function.  The blk_pm_request_resume() inline routine calls 
pm_request_resume().

At least, that's what is _supposed_ to happen.  See commit 0d25bd072b49 
("block: Schedule runtime resume earlier").

> my understanding of "sd" is: enable runtime pm in probe(), so *allow*
> the device to be suspended (if enabled by the user), but never
> resume(?). Also, why isn't "autopm" used in its ioctl() implementation
> (as opposed to in "sr")?

I don't remember the reason.  It may be that the code in sr.c isn't 
needed.

> here's roughly what happens when enabling runtime PM in sysfs (again,
> because sd_probe() calls autopm_put() and thus allows it:
> 
> [   27.384446] sd 0:0:0:0: scsi_runtime_suspend
> [   27.432282] blk_pre_runtime_suspend
> [   27.435783] sd_suspend_common
> [   27.438782] blk_post_runtime_suspend
> [   27.442427] scsi target0:0:0: scsi_runtime_suspend
> [   27.447303] scsi host0: scsi_runtime_suspend
> 
> then I "mount /dev/sda1 /mnt" and none of the resume() functions get
> called. To me it looks like the sd driver should initiate resuming, and
> that's not implemented.
> 
> what am I doing wrong or overlooking? how exactly does (or should) the
> block layer initiate resume here?

I don't know what's going wrong.  Bart, can you look into it?  As far as I 
can tell, you're the last person to touch the block-layer's runtime PM 
code.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-25  8:16   ` Martin Kepplinger
  2020-06-25 14:52     ` Alan Stern
@ 2020-06-26  3:53     ` Bart Van Assche
  2020-06-26 15:07     ` Bart Van Assche
  2 siblings, 0 replies; 68+ messages in thread
From: Bart Van Assche @ 2020-06-26  3:53 UTC (permalink / raw)
  To: Martin Kepplinger, jejb, martin.petersen, Alan Stern
  Cc: linux-scsi, linux-kernel, kernel

On 2020-06-25 01:16, Martin Kepplinger wrote:
> Also, why isn't "autopm" used in its ioctl() implementation
> (as opposed to in "sr")?

Some of the scsi_autopm_{get,put}_device() calls in the sr driver
have been introduced by me before I fully understood runtime pm.
I will have a another look to see whether these are really
necessary and if not, post a patch to remove these again.

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-25  8:16   ` Martin Kepplinger
  2020-06-25 14:52     ` Alan Stern
  2020-06-26  3:53     ` Bart Van Assche
@ 2020-06-26 15:07     ` Bart Van Assche
  2020-06-26 15:44       ` Alan Stern
  2 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-26 15:07 UTC (permalink / raw)
  To: Martin Kepplinger, jejb, martin.petersen, Alan Stern
  Cc: linux-scsi, linux-kernel, kernel

On 2020-06-25 01:16, Martin Kepplinger wrote:
> here's roughly what happens when enabling runtime PM in sysfs (again,
> because sd_probe() calls autopm_put() and thus allows it:
> 
> [   27.384446] sd 0:0:0:0: scsi_runtime_suspend
> [   27.432282] blk_pre_runtime_suspend
> [   27.435783] sd_suspend_common
> [   27.438782] blk_post_runtime_suspend
> [   27.442427] scsi target0:0:0: scsi_runtime_suspend
> [   27.447303] scsi host0: scsi_runtime_suspend
> 
> then I "mount /dev/sda1 /mnt" and none of the resume() functions get
> called. To me it looks like the sd driver should initiate resuming, and
> that's not implemented.
> 
> what am I doing wrong or overlooking? how exactly does (or should) the
> block layer initiate resume here?

As far as I know runtime power management support in the sd driver is working
fine and is being used intensively by the UFS driver. The following commit was
submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi:
pm: Balance pm_only counter of request queue during system resume") # v5.7.
I'm not sure which bug is causing trouble on your setup but I think it's likely
that the root cause is somewhere else than in the block layer, the SCSI core
or the SCSI sd driver.

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-26 15:07     ` Bart Van Assche
@ 2020-06-26 15:44       ` Alan Stern
  2020-06-28  2:37         ` Bart Van Assche
  2020-06-29  9:42         ` Martin Kepplinger
  0 siblings, 2 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-26 15:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote:
> On 2020-06-25 01:16, Martin Kepplinger wrote:
> > here's roughly what happens when enabling runtime PM in sysfs (again,
> > because sd_probe() calls autopm_put() and thus allows it:
> > 
> > [   27.384446] sd 0:0:0:0: scsi_runtime_suspend
> > [   27.432282] blk_pre_runtime_suspend
> > [   27.435783] sd_suspend_common
> > [   27.438782] blk_post_runtime_suspend
> > [   27.442427] scsi target0:0:0: scsi_runtime_suspend
> > [   27.447303] scsi host0: scsi_runtime_suspend
> > 
> > then I "mount /dev/sda1 /mnt" and none of the resume() functions get
> > called. To me it looks like the sd driver should initiate resuming, and
> > that's not implemented.
> > 
> > what am I doing wrong or overlooking? how exactly does (or should) the
> > block layer initiate resume here?
> 
> As far as I know runtime power management support in the sd driver is working
> fine and is being used intensively by the UFS driver. The following commit was
> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi:
> pm: Balance pm_only counter of request queue during system resume") # v5.7.

I just looked at that commit for the first time.

Instead of making the SCSI driver do the work of deciding what routine to 
call, why not redefine blk_set_runtime_active(q) to simply call 
blk_post_runtime_resume(q, 0)?  Or vice versa: if err == 0 have 
blk_post_runtime_resume call blk_set_runtime_active?

After all, the two routines do almost the same thing -- and the bug 
addressed by this commit was caused by the difference in their behaviors.

If the device was already runtime-active during the system suspend, doing 
an extra clear of the pm_only counter won't hurt anything.

> I'm not sure which bug is causing trouble on your setup but I think it's likely
> that the root cause is somewhere else than in the block layer, the SCSI core
> or the SCSI sd driver.
> 
> Bart.

Martin's best approach would be to add some debugging code to find out why 
blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
doesn't lead to pm_request_resume().

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-26 15:44       ` Alan Stern
@ 2020-06-28  2:37         ` Bart Van Assche
  2020-06-28 13:10           ` Alan Stern
  2020-06-29  9:42         ` Martin Kepplinger
  1 sibling, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-28  2:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 2020-06-26 08:44, Alan Stern wrote:
> On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote:
>> As far as I know runtime power management support in the sd driver is working
>> fine and is being used intensively by the UFS driver. The following commit was
>> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi:
>> pm: Balance pm_only counter of request queue during system resume") # v5.7.
> 
> I just looked at that commit for the first time.
> 
> Instead of making the SCSI driver do the work of deciding what routine to 
> call, why not redefine blk_set_runtime_active(q) to simply call 
> blk_post_runtime_resume(q, 0)?  Or vice versa: if err == 0 have 
> blk_post_runtime_resume call blk_set_runtime_active?
> 
> After all, the two routines do almost the same thing -- and the bug 
> addressed by this commit was caused by the difference in their behaviors.
> 
> If the device was already runtime-active during the system suspend, doing 
> an extra clear of the pm_only counter won't hurt anything.

Hi Alan,

Do you want to submit a patch that implements this change or do you
perhaps expect me to do that?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-28  2:37         ` Bart Van Assche
@ 2020-06-28 13:10           ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-28 13:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Sat, Jun 27, 2020 at 07:37:54PM -0700, Bart Van Assche wrote:
> On 2020-06-26 08:44, Alan Stern wrote:
> > On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote:
> >> As far as I know runtime power management support in the sd driver is working
> >> fine and is being used intensively by the UFS driver. The following commit was
> >> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi:
> >> pm: Balance pm_only counter of request queue during system resume") # v5.7.
> > 
> > I just looked at that commit for the first time.
> > 
> > Instead of making the SCSI driver do the work of deciding what routine to 
> > call, why not redefine blk_set_runtime_active(q) to simply call 
> > blk_post_runtime_resume(q, 0)?  Or vice versa: if err == 0 have 
> > blk_post_runtime_resume call blk_set_runtime_active?
> > 
> > After all, the two routines do almost the same thing -- and the bug 
> > addressed by this commit was caused by the difference in their behaviors.
> > 
> > If the device was already runtime-active during the system suspend, doing 
> > an extra clear of the pm_only counter won't hurt anything.
> 
> Hi Alan,
> 
> Do you want to submit a patch that implements this change or do you
> perhaps expect me to do that?

I'll submit a patch in a few days.  I just wanted to check first that the 
idea was sound.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-26 15:44       ` Alan Stern
  2020-06-28  2:37         ` Bart Van Assche
@ 2020-06-29  9:42         ` Martin Kepplinger
  2020-06-29 16:15           ` Alan Stern
  1 sibling, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-06-29  9:42 UTC (permalink / raw)
  To: Alan Stern, Bart Van Assche
  Cc: jejb, Can Guo, martin.petersen, linux-scsi, linux-kernel, kernel



On 26.06.20 17:44, Alan Stern wrote:
> On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote:
>> On 2020-06-25 01:16, Martin Kepplinger wrote:
>>> here's roughly what happens when enabling runtime PM in sysfs (again,
>>> because sd_probe() calls autopm_put() and thus allows it:
>>>
>>> [   27.384446] sd 0:0:0:0: scsi_runtime_suspend
>>> [   27.432282] blk_pre_runtime_suspend
>>> [   27.435783] sd_suspend_common
>>> [   27.438782] blk_post_runtime_suspend
>>> [   27.442427] scsi target0:0:0: scsi_runtime_suspend
>>> [   27.447303] scsi host0: scsi_runtime_suspend
>>>
>>> then I "mount /dev/sda1 /mnt" and none of the resume() functions get
>>> called. To me it looks like the sd driver should initiate resuming, and
>>> that's not implemented.
>>>
>>> what am I doing wrong or overlooking? how exactly does (or should) the
>>> block layer initiate resume here?
>>
>> As far as I know runtime power management support in the sd driver is working
>> fine and is being used intensively by the UFS driver. The following commit was
>> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi:
>> pm: Balance pm_only counter of request queue during system resume") # v5.7.
> 
> I just looked at that commit for the first time.
> 
> Instead of making the SCSI driver do the work of deciding what routine to 
> call, why not redefine blk_set_runtime_active(q) to simply call 
> blk_post_runtime_resume(q, 0)?  Or vice versa: if err == 0 have 
> blk_post_runtime_resume call blk_set_runtime_active?
> 
> After all, the two routines do almost the same thing -- and the bug 
> addressed by this commit was caused by the difference in their behaviors.
> 
> If the device was already runtime-active during the system suspend, doing 
> an extra clear of the pm_only counter won't hurt anything.
> 
>> I'm not sure which bug is causing trouble on your setup but I think it's likely
>> that the root cause is somewhere else than in the block layer, the SCSI core
>> or the SCSI sd driver.
>>
>> Bart.
> 
> Martin's best approach would be to add some debugging code to find out why 
> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
> doesn't lead to pm_request_resume().
> 
> Alan Stern
> 

Hi Alan,

blk_queue_enter() always - especially when sd is runtime suspended and I
try to mount as above - sets success to be true for me, so never
continues down to bkl_pm_request_resume(). All I see is "PM: Removing
info for No Bus:sda1".

thanks,
                                martin

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29  9:42         ` Martin Kepplinger
@ 2020-06-29 16:15           ` Alan Stern
  2020-06-29 16:56             ` Bart Van Assche
                               ` (3 more replies)
  0 siblings, 4 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-29 16:15 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote:
> 
> 
> On 26.06.20 17:44, Alan Stern wrote:
> > Martin's best approach would be to add some debugging code to find out why 
> > blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
> > doesn't lead to pm_request_resume().
> > 
> > Alan Stern
> > 
> 
> Hi Alan,
> 
> blk_queue_enter() always - especially when sd is runtime suspended and I
> try to mount as above - sets success to be true for me, so never
> continues down to bkl_pm_request_resume(). All I see is "PM: Removing
> info for No Bus:sda1".

Aha.  Looking at this more closely, it's apparent that the code in 
blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
flag is set then the request can be issued regardless of the queue's 
runtime status.  That is not correct when the queue is suspended.

Below is my attempt to fix this up.  I'm not sure that the patch is 
entirely correct, but it should fix this logic bug.  I would appreciate a 
critical review.

Martin, does this fix the problem?

Alan Stern



Index: usb-devel/block/blk-core.c
===================================================================
--- usb-devel.orig/block/blk-core.c
+++ usb-devel/block/blk-core.c
@@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
 			 * responsible for ensuring that that counter is
 			 * globally visible before the queue is unfrozen.
 			 */
-			if (pm || !blk_queue_pm_only(q)) {
+			if ((pm && q->rpm_status != RPM_SUSPENDED) ||
+			    !blk_queue_pm_only(q)) {
 				success = true;
 			} else {
 				percpu_ref_put(&q->q_usage_counter);
@@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
 
 		wait_event(q->mq_freeze_wq,
 			   (!q->mq_freeze_depth &&
-			    (pm || (blk_pm_request_resume(q),
-				    !blk_queue_pm_only(q)))) ||
+			    blk_pm_resume_queue(pm, q)) ||
 			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
Index: usb-devel/block/blk-pm.h
===================================================================
--- usb-devel.orig/block/blk-pm.h
+++ usb-devel/block/blk-pm.h
@@ -6,11 +6,14 @@
 #include <linux/pm_runtime.h>
 
 #ifdef CONFIG_PM
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
 {
-	if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
-		       q->rpm_status == RPM_SUSPENDING))
-		pm_request_resume(q->dev);
+	if (!q->dev || !blk_queue_pm_only(q))
+		return 1;	/* Nothing to do */
+	if (pm && q->rpm_status != RPM_SUSPENDED)
+		return 1;	/* Request allowed */
+	pm_request_resume(q->dev);
+	return 0;
 }
 
 static inline void blk_pm_mark_last_busy(struct request *rq)
@@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st
 		--rq->q->nr_pending;
 }
 #else
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
 {
+	return 1;
 }
 
 static inline void blk_pm_mark_last_busy(struct request *rq)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29 16:15           ` Alan Stern
@ 2020-06-29 16:56             ` Bart Van Assche
  2020-06-29 17:40               ` Alan Stern
  2020-06-30  3:33             ` Martin Kepplinger
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-29 16:56 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: jejb, Can Guo, martin.petersen, linux-scsi, linux-kernel, kernel

On 2020-06-29 09:15, Alan Stern wrote:
> Aha.  Looking at this more closely, it's apparent that the code in 
> blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> flag is set then the request can be issued regardless of the queue's 
> runtime status.  That is not correct when the queue is suspended.

Please clarify why this is not correct.

> Index: usb-devel/block/blk-core.c
> ===================================================================
> --- usb-devel.orig/block/blk-core.c
> +++ usb-devel/block/blk-core.c
> @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
>  			 * responsible for ensuring that that counter is
>  			 * globally visible before the queue is unfrozen.
>  			 */
> -			if (pm || !blk_queue_pm_only(q)) {
> +			if ((pm && q->rpm_status != RPM_SUSPENDED) ||
> +			    !blk_queue_pm_only(q)) {
>  				success = true;
>  			} else {
>  				percpu_ref_put(&q->q_usage_counter);

Does the above change make it impossible to bring a suspended device
back to the RPM_ACTIVE state if the BLK_MQ_REQ_NOWAIT flag is set?

> @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
>  
>  		wait_event(q->mq_freeze_wq,
>  			   (!q->mq_freeze_depth &&
> -			    (pm || (blk_pm_request_resume(q),
> -				    !blk_queue_pm_only(q)))) ||
> +			    blk_pm_resume_queue(pm, q)) ||
>  			   blk_queue_dying(q));
>  		if (blk_queue_dying(q))
>  			return -ENODEV;
> Index: usb-devel/block/blk-pm.h
> ===================================================================
> --- usb-devel.orig/block/blk-pm.h
> +++ usb-devel/block/blk-pm.h
> @@ -6,11 +6,14 @@
>  #include <linux/pm_runtime.h>
>  
>  #ifdef CONFIG_PM
> -static inline void blk_pm_request_resume(struct request_queue *q)
> +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
>  {
> -	if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
> -		       q->rpm_status == RPM_SUSPENDING))
> -		pm_request_resume(q->dev);
> +	if (!q->dev || !blk_queue_pm_only(q))
> +		return 1;	/* Nothing to do */
> +	if (pm && q->rpm_status != RPM_SUSPENDED)
> +		return 1;	/* Request allowed */
> +	pm_request_resume(q->dev);
> +	return 0;
>  }

Does the above change, especially the " && q->rpm_status !=
RPM_SUSPENDED" part, make it impossible to bring a suspended device back
to the RPM_ACTIVE state?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29 16:56             ` Bart Van Assche
@ 2020-06-29 17:40               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-29 17:40 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Mon, Jun 29, 2020 at 09:56:49AM -0700, Bart Van Assche wrote:
> On 2020-06-29 09:15, Alan Stern wrote:
> > Aha.  Looking at this more closely, it's apparent that the code in 
> > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> > flag is set then the request can be issued regardless of the queue's 
> > runtime status.  That is not correct when the queue is suspended.
> 
> Please clarify why this is not correct.

As I understand it, BLK_MQ_REQ_PREEMPT is supposed to mean (among other 
things) that this request may be issued as part of the procedure for 
putting a device into a low-power state or returning it to a high-power 
state.  Consequently, requests with that flag set must be allowed while 
the queue is in the RPM_SUSPENDING or RPM_RESUMING runtime states -- as 
opposed to ordinary requests, which are allowed only in the RPM_ACTIVE 
state.

In the RPM_SUSPENDED state, however, the queue is entirely inactive.  Even 
if a request were to be issued somehow, it would fail because the system 
would not be able to transmit it to the device.  In other words, when the 
queue is in the RPM_SUSPENDED state, a resume must be requested before 
_any_ request can be issued.

> > Index: usb-devel/block/blk-core.c
> > ===================================================================
> > --- usb-devel.orig/block/blk-core.c
> > +++ usb-devel/block/blk-core.c
> > @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
> >  			 * responsible for ensuring that that counter is
> >  			 * globally visible before the queue is unfrozen.
> >  			 */
> > -			if (pm || !blk_queue_pm_only(q)) {
> > +			if ((pm && q->rpm_status != RPM_SUSPENDED) ||
> > +			    !blk_queue_pm_only(q)) {
> >  				success = true;
> >  			} else {
> >  				percpu_ref_put(&q->q_usage_counter);
> 
> Does the above change make it impossible to bring a suspended device
> back to the RPM_ACTIVE state if the BLK_MQ_REQ_NOWAIT flag is set?

The only case affected by this change is when BLK_MQ_REQ_PREEMPT is set 
and the queue is in the RPM_SUSPENDED state.  If BLK_MQ_REQ_NOWAIT was 
also set, the original code would set "success" to true, allowing the 
request to proceed even though it could not be carried out immediately -- 
a bug.

With the patch, such a request will fail without resuming the device.  I 
don't know whether that is the desired behavior or not, but at least it's 
not obviously a bug.

It does seem odd that blk_queue_enter() tests the queue's pm_only status 
and the request flag in two different spots (here and below).  Why does it 
do this?  It seems like an invitation for bugs.

> > @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
> >  
> >  		wait_event(q->mq_freeze_wq,
> >  			   (!q->mq_freeze_depth &&
> > -			    (pm || (blk_pm_request_resume(q),
> > -				    !blk_queue_pm_only(q)))) ||
> > +			    blk_pm_resume_queue(pm, q)) ||
> >  			   blk_queue_dying(q));
> >  		if (blk_queue_dying(q))
> >  			return -ENODEV;
> > Index: usb-devel/block/blk-pm.h
> > ===================================================================
> > --- usb-devel.orig/block/blk-pm.h
> > +++ usb-devel/block/blk-pm.h
> > @@ -6,11 +6,14 @@
> >  #include <linux/pm_runtime.h>
> >  
> >  #ifdef CONFIG_PM
> > -static inline void blk_pm_request_resume(struct request_queue *q)
> > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
> >  {
> > -	if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
> > -		       q->rpm_status == RPM_SUSPENDING))
> > -		pm_request_resume(q->dev);
> > +	if (!q->dev || !blk_queue_pm_only(q))
> > +		return 1;	/* Nothing to do */
> > +	if (pm && q->rpm_status != RPM_SUSPENDED)
> > +		return 1;	/* Request allowed */
> > +	pm_request_resume(q->dev);
> > +	return 0;
> >  }
> 
> Does the above change, especially the " && q->rpm_status !=
> RPM_SUSPENDED" part, make it impossible to bring a suspended device back
> to the RPM_ACTIVE state?

Just the opposite -- the change makes it _possible_ for a 
BLK_MQ_REQ_PREEMPT request to bring a suspended device back to the 
RPM_ACTIVE state.

Look at the existing code: If pm is true then blk_pm_request_resume() will 
be skipped, so the device won't be resumed.  With this patch -- in 
particular with the "&& q->rpm_status != RPM_SUSPENDED" part added -- the 
call won't be skipped and so the resume will take place.

The rather complicated syntax of the wait_event() call in the existing 
code contributes to this confusion.  One of the things my patch tries to 
do is make the code more straightforward and easier to grasp.

I admit that there are parts to this thing I don't understand.  The 
wait_event() call in blk_queue_enter(), for example: If we are waiting for 
the device to leave the RPM_SUSPENDED state (or enter the RPM_ACTIVE 
state), where does q->mq_freeze_wq get woken up?  There's no obvious spot 
in blk_{pre|post}_runtime_resume().

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29 16:15           ` Alan Stern
  2020-06-29 16:56             ` Bart Van Assche
@ 2020-06-30  3:33             ` Martin Kepplinger
  2020-06-30 13:38               ` Alan Stern
  2020-06-30 15:59             ` Bart Van Assche
  2020-07-30  8:05             ` Martin Kepplinger
  3 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-06-30  3:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 29.06.20 18:15, Alan Stern wrote:
> On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote:
>>
>>
>> On 26.06.20 17:44, Alan Stern wrote:
>>> Martin's best approach would be to add some debugging code to find out why 
>>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
>>> doesn't lead to pm_request_resume().
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> blk_queue_enter() always - especially when sd is runtime suspended and I
>> try to mount as above - sets success to be true for me, so never
>> continues down to bkl_pm_request_resume(). All I see is "PM: Removing
>> info for No Bus:sda1".
> 
> Aha.  Looking at this more closely, it's apparent that the code in 
> blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> flag is set then the request can be issued regardless of the queue's 
> runtime status.  That is not correct when the queue is suspended.
> 
> Below is my attempt to fix this up.  I'm not sure that the patch is 
> entirely correct, but it should fix this logic bug.  I would appreciate a 
> critical review.
> 
> Martin, does this fix the problem?
> 

not quite: mounting works and resuming itself indeed happens now when
copying a file, but the I/O itself doesn't, but says "device offline or
changed":

[  167.167615] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
hostbyte=0x00 driverbyte=0x08 cmd_age=0s
[  167.167630] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
[  167.167638] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
[  167.167648] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 24
c2 00 00 01 00
[  167.167658] blk_update_request: I/O error, dev sda, sector 9410 op
0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  167.178327] FAT-fs (sda1): FAT read failed (blocknr 1218)
[  167.183895] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  167.189695] blk_update_request: I/O error, dev sda, sector 5101888 op
0x0:(READ) flags 0x80700 phys_seg 8 prio class 0
[  167.200510] sd 0:0:0:0: [sda] tag#0 device offline or changed


and a later try to copy a file only yields (mostly my own debug prints):


[  371.110798] blk_queue_enter: wait_event: pm=0
[  371.300666] scsi_runtime_resume
[  371.303834] scsi_runtime_resume
[  371.307007] scsi_runtime_resume
[  371.310213] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  371.316011] blk_update_request: I/O error, dev sda, sector 5101888 op
0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  372.560690] scsi_runtime_suspend
[  372.563968] scsi_runtime_suspend
[  372.567237] scsi_runtime_suspend

thanks Alan for taking the time and trying to fix this! you're close.
what is missing?

                                martin

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30  3:33             ` Martin Kepplinger
@ 2020-06-30 13:38               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-06-30 13:38 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jun 30, 2020 at 05:33:25AM +0200, Martin Kepplinger wrote:
> > Martin, does this fix the problem?
> > 
> 
> not quite: mounting works and resuming itself indeed happens now when
> copying a file, but the I/O itself doesn't, but says "device offline or
> changed":
> 
> [  167.167615] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> [  167.167630] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> [  167.167638] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0

That code stands for "Not-ready to ready transition".  It isn't really an 
error, just a notification.  The command should have been retried.

> [  167.167648] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 24
> c2 00 00 01 00
> [  167.167658] blk_update_request: I/O error, dev sda, sector 9410 op
> 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [  167.178327] FAT-fs (sda1): FAT read failed (blocknr 1218)

And it should not have failed.  Martin or James, any ideas about this?

> [  167.183895] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [  167.189695] blk_update_request: I/O error, dev sda, sector 5101888 op
> 0x0:(READ) flags 0x80700 phys_seg 8 prio class 0
> [  167.200510] sd 0:0:0:0: [sda] tag#0 device offline or changed
> 
> 
> and a later try to copy a file only yields (mostly my own debug prints):
> 
> 
> [  371.110798] blk_queue_enter: wait_event: pm=0
> [  371.300666] scsi_runtime_resume
> [  371.303834] scsi_runtime_resume
> [  371.307007] scsi_runtime_resume
> [  371.310213] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [  371.316011] blk_update_request: I/O error, dev sda, sector 5101888 op
> 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

No way to tell from the log what caused this error.

> [  372.560690] scsi_runtime_suspend
> [  372.563968] scsi_runtime_suspend
> [  372.567237] scsi_runtime_suspend
> 
> thanks Alan for taking the time and trying to fix this! you're close.
> what is missing?

At this point I don't know.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29 16:15           ` Alan Stern
  2020-06-29 16:56             ` Bart Van Assche
  2020-06-30  3:33             ` Martin Kepplinger
@ 2020-06-30 15:59             ` Bart Van Assche
  2020-06-30 18:02               ` Alan Stern
  2020-07-30  8:05             ` Martin Kepplinger
  3 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-30 15:59 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: jejb, Can Guo, martin.petersen, linux-scsi, linux-kernel, kernel

On 2020-06-29 09:15, Alan Stern wrote:
> Aha.  Looking at this more closely, it's apparent that the code in 
> blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> flag is set then the request can be issued regardless of the queue's 
> runtime status.  That is not correct when the queue is suspended.

Are you sure of this? In the past (legacy block layer) no requests were
processed for queues in state RPM_SUSPENDED. However, that function and
its successor blk_pm_allow_request() are gone. The following code was
removed by commit 7cedffec8e75 ("block: Make blk_get_request() block for
non-PM requests while suspended").

static struct request *blk_pm_peek_request(struct request_queue *q,
                                           struct request *rq)
{
        if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
            (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM))))
                return NULL;
        else
                return rq;
}

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30 15:59             ` Bart Van Assche
@ 2020-06-30 18:02               ` Alan Stern
  2020-06-30 19:23                 ` Bart Van Assche
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-06-30 18:02 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jun 30, 2020 at 08:59:00AM -0700, Bart Van Assche wrote:
> On 2020-06-29 09:15, Alan Stern wrote:
> > Aha.  Looking at this more closely, it's apparent that the code in 
> > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> > flag is set then the request can be issued regardless of the queue's 
> > runtime status.  That is not correct when the queue is suspended.
> 
> Are you sure of this? In the past (legacy block layer) no requests were
> processed for queues in state RPM_SUSPENDED. However, that function and
> its successor blk_pm_allow_request() are gone. The following code was
> removed by commit 7cedffec8e75 ("block: Make blk_get_request() block for
> non-PM requests while suspended").
> 
> static struct request *blk_pm_peek_request(struct request_queue *q,
>                                            struct request *rq)
> {
>         if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
>             (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM))))
>                 return NULL;
>         else
>                 return rq;
> }

No, it wasn't.  Another routine, blk_pm_allow_request(), was removed by 
that commit, but almost no other code was deleted.  Maybe you're thinking 
of a different commit?

In any case, I don't understand the point you're trying to make.

Here's what I _do_ understand: While the queue is in the RPM_SUSPENDED 
state, it can't carry out any requests at all.  If a request is added to 
the queue, the queue must be resumed before the request can be issued to 
the lower-layer drivers -- no matter what flags are set in the request.

Right now there doesn't seem to be any mechanism for resuming the queue 
if an REQ_PREEMPT request is added while the queue is suspended.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30 18:02               ` Alan Stern
@ 2020-06-30 19:23                 ` Bart Van Assche
  2020-06-30 19:38                   ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-30 19:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 2020-06-30 11:02, Alan Stern wrote:
> Right now there doesn't seem to be any mechanism for resuming the queue 
> if an REQ_PREEMPT request is added while the queue is suspended.

I do not agree with the above statement. My understanding is that resuming
happens as follows if a request is submitted against a runtime suspended
queue owned by a SCSI LLD:

blk_queue_enter()
  -> blk_pm_request_resume()
    -> pm_request_resume(dev)
      -> __pm_runtime_resume(dev, RPM_ASYNC)
        -> rpm_resume(dev, RPM_ASYNC)
          -> dev->power.request = RPM_REQ_RESUME;
          -> queue_work(pm_wq, &dev->power.work)
            -> pm_runtime_work()
              -> rpm_resume(dev, RPM_NOWAIT)
                -> callback = scsi_runtime_resume;
                -> rpm_callback(callback, dev);
                  -> scsi_runtime_resume(dev);
                    -> sdev_runtime_resume(dev);
                      -> blk_pre_runtime_resume(sdev->request_queue);
                        -> q->rpm_status = RPM_RESUMING;
                      -> sd_resume(dev);
                        -> sd_start_stop_device(sdkp);
                          -> sd_pm_ops.runtime_resume == scsi_execute(sdp, START);
                            -> blk_get_request(..., ..., BLK_MQ_REQ_PREEMPT)
                              -> blk_mq_alloc_request()
                                -> blk_queue_enter()
                                -> __blk_mq_alloc_request()
                            -> blk_execute_rq()
                            -> blk_put_request()
                      -> blk_post_runtime_resume(sdev->request_queue);
                        -> q->rpm_status = RPM_ACTIVE;
                        -> pm_runtime_mark_last_busy(q->dev);
                        -> pm_request_autosuspend(q->dev);

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30 19:23                 ` Bart Van Assche
@ 2020-06-30 19:38                   ` Alan Stern
  2020-06-30 23:31                     ` Bart Van Assche
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-06-30 19:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jun 30, 2020 at 12:23:12PM -0700, Bart Van Assche wrote:
> On 2020-06-30 11:02, Alan Stern wrote:
> > Right now there doesn't seem to be any mechanism for resuming the queue 
> > if an REQ_PREEMPT request is added while the queue is suspended.
> 
> I do not agree with the above statement. My understanding is that resuming
> happens as follows if a request is submitted against a runtime suspended
> queue owned by a SCSI LLD:
> 
> blk_queue_enter()
>   -> blk_pm_request_resume()

Assume that BLK_MQ_REQ_PREEMPT is set in flags.  Then where exactly 
does blk_queue_enter(q, flags) call blk_pm_request_resume(q)?

Nowhere, as far as I can see.

Alan Stern

>     -> pm_request_resume(dev)
>       -> __pm_runtime_resume(dev, RPM_ASYNC)
>         -> rpm_resume(dev, RPM_ASYNC)
>           -> dev->power.request = RPM_REQ_RESUME;
>           -> queue_work(pm_wq, &dev->power.work)
>             -> pm_runtime_work()
>               -> rpm_resume(dev, RPM_NOWAIT)
>                 -> callback = scsi_runtime_resume;
>                 -> rpm_callback(callback, dev);
>                   -> scsi_runtime_resume(dev);
>                     -> sdev_runtime_resume(dev);
>                       -> blk_pre_runtime_resume(sdev->request_queue);
>                         -> q->rpm_status = RPM_RESUMING;
>                       -> sd_resume(dev);
>                         -> sd_start_stop_device(sdkp);
>                           -> sd_pm_ops.runtime_resume == scsi_execute(sdp, START);
>                             -> blk_get_request(..., ..., BLK_MQ_REQ_PREEMPT)
>                               -> blk_mq_alloc_request()
>                                 -> blk_queue_enter()
>                                 -> __blk_mq_alloc_request()
>                             -> blk_execute_rq()
>                             -> blk_put_request()
>                       -> blk_post_runtime_resume(sdev->request_queue);
>                         -> q->rpm_status = RPM_ACTIVE;
>                         -> pm_runtime_mark_last_busy(q->dev);
>                         -> pm_request_autosuspend(q->dev);
> 
> Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30 19:38                   ` Alan Stern
@ 2020-06-30 23:31                     ` Bart Van Assche
  2020-07-01  0:49                       ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-06-30 23:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 2020-06-30 12:38, Alan Stern wrote:
> Assume that BLK_MQ_REQ_PREEMPT is set in flags.  Then where exactly 
> does blk_queue_enter(q, flags) call blk_pm_request_resume(q)?

Please take a look at how the *current* implementation of runtime power
management works. Your question is relevant for the old implementation
of runtime power management but not for the current implementation.

Bart.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-30 23:31                     ` Bart Van Assche
@ 2020-07-01  0:49                       ` Alan Stern
  2020-07-06 16:41                         ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-07-01  0:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote:
> On 2020-06-30 12:38, Alan Stern wrote:
> > Assume that BLK_MQ_REQ_PREEMPT is set in flags.  Then where exactly 
> > does blk_queue_enter(q, flags) call blk_pm_request_resume(q)?
> 
> Please take a look at how the *current* implementation of runtime power
> management works. Your question is relevant for the old implementation
> of runtime power management but not for the current implementation.

What do you mean by "current"?  I have been looking at the implementation 
in 5.8-rc3 from Linus's tree.  Should I look somewhere else?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-01  0:49                       ` Alan Stern
@ 2020-07-06 16:41                         ` Alan Stern
  2020-07-28  7:02                           ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-07-06 16:41 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jun 30, 2020 at 20:49:58PM -0400, Alan Stern wrote:
> On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote:
> > On 2020-06-30 12:38, Alan Stern wrote:
> > > Assume that BLK_MQ_REQ_PREEMPT is set in flags.  Then where exactly 
> > > does blk_queue_enter(q, flags) call blk_pm_request_resume(q)?
> > 
> > Please take a look at how the *current* implementation of runtime power
> > management works. Your question is relevant for the old implementation
> > of runtime power management but not for the current implementation.
> 
> What do you mean by "current"?  I have been looking at the implementation 
> in 5.8-rc3 from Linus's tree.  Should I look somewhere else?

Any reply to this, or further concerns about the proposed patch?

I'd like to fix up this API, and it appears that you are the only
person to have worked on it significantly in the last two years.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-06 16:41                         ` Alan Stern
@ 2020-07-28  7:02                           ` Martin Kepplinger
  2020-07-28 20:02                             ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-28  7:02 UTC (permalink / raw)
  To: Alan Stern, Bart Van Assche
  Cc: jejb, Can Guo, martin.petersen, linux-scsi, linux-kernel, kernel

On 06.07.20 18:41, Alan Stern wrote:
> On Tue, Jun 30, 2020 at 20:49:58PM -0400, Alan Stern wrote:
>> On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote:
>>> On 2020-06-30 12:38, Alan Stern wrote:
>>>> Assume that BLK_MQ_REQ_PREEMPT is set in flags.  Then where exactly 
>>>> does blk_queue_enter(q, flags) call blk_pm_request_resume(q)?
>>>
>>> Please take a look at how the *current* implementation of runtime power
>>> management works. Your question is relevant for the old implementation
>>> of runtime power management but not for the current implementation.
>>
>> What do you mean by "current"?  I have been looking at the implementation 
>> in 5.8-rc3 from Linus's tree.  Should I look somewhere else?
> 
> Any reply to this, or further concerns about the proposed patch?
> 
> I'd like to fix up this API, and it appears that you are the only
> person to have worked on it significantly in the last two years.
> 
> Alan Stern
> 

Hi Alan,

Any API cleanup is of course welcome. I just wanted to remind you that
the underlying problem: broken block device runtime pm. Your initial
proposed fix "almost" did it and mounting works but during file access,
it still just looks like a runtime_resume is missing somewhere.

As we need to have that working at some point, I might look into it, but
someone who has experience in the block layer can surely do it more
efficiently.

to test, turn off polling:
echo 0 > /sys/module/block/parameters/events_dfl_poll_msecs

and enable runtime pm in your (scsi) device: power/autosuspend_delay_ms
and of course power/control (set to auto).

thanks,
                              martin

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-28  7:02                           ` Martin Kepplinger
@ 2020-07-28 20:02                             ` Alan Stern
  2020-07-29 14:12                               ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-07-28 20:02 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote:
> Hi Alan,
> 
> Any API cleanup is of course welcome. I just wanted to remind you that
> the underlying problem: broken block device runtime pm. Your initial
> proposed fix "almost" did it and mounting works but during file access,
> it still just looks like a runtime_resume is missing somewhere.

Well, I have tested that proposed fix several times, and on my system 
it's working perfectly.  When I stop accessing a drive it autosuspends, 
and when I access it again it gets resumed and works -- as you would 
expect.

> As we need to have that working at some point, I might look into it, but
> someone who has experience in the block layer can surely do it more
> efficiently.

I suspect that any problems you still face are caused by something else.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-28 20:02                             ` Alan Stern
@ 2020-07-29 14:12                               ` Martin Kepplinger
  2020-07-29 14:32                                 ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-29 14:12 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 28.07.20 22:02, Alan Stern wrote:
> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote:
>> Hi Alan,
>>
>> Any API cleanup is of course welcome. I just wanted to remind you that
>> the underlying problem: broken block device runtime pm. Your initial
>> proposed fix "almost" did it and mounting works but during file access,
>> it still just looks like a runtime_resume is missing somewhere.
> 
> Well, I have tested that proposed fix several times, and on my system 
> it's working perfectly.  When I stop accessing a drive it autosuspends, 
> and when I access it again it gets resumed and works -- as you would 
> expect.

that's weird. when I mount, everything looks good, "sda1". But as soon
as I cd to the mountpoint and do "ls" (on another SD card "ls" works but
actual file reading leads to the exact same errors), I get:

[   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
hostbyte=0x00 driverbyte=0x08 cmd_age=0s
[   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
[   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
[   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60
40 00 00 01 00
[   77.474678] blk_update_request: I/O error, dev sda, sector 24640 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   77.491628] blk_update_request: I/O error, dev sda, sector 24641 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   77.508051] blk_update_request: I/O error, dev sda, sector 24642 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed
(...)
[   77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   77.953434] FAT-fs (sda1): Directory bread(block 16448) failed
[   77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   77.965118] FAT-fs (sda1): Directory bread(block 16449) failed
[   77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   77.976802] FAT-fs (sda1): Directory bread(block 16450) failed
[   77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed
(...)
[   78.384929] FAT-fs (sda1): Filesystem has been set read-only
[  103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  103.076751] print_req_error: 118 callbacks suppressed
[  103.076760] blk_update_request: I/O error, dev sda, sector 9748 op
0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
[  103.087428] Buffer I/O error on dev sda1, logical block 1556, lost
async page write
[  103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  103.101123] blk_update_request: I/O error, dev sda, sector 17162 op
0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
[  103.111883] Buffer I/O error on dev sda1, logical block 8970, lost
async page write

> 
>> As we need to have that working at some point, I might look into it, but
>> someone who has experience in the block layer can surely do it more
>> efficiently.
> 
> I suspect that any problems you still face are caused by something else.
> 

I then formatted sda1 to ext2 (on the runtime suspend system testing
your patch) and that seems to have worked!

Again accessing the mountpoint then yield the very same "device offline
or changed" errors.

What kind of device are you testing? You should be easily able to
reproduce this using an "sd" device.

The problems must lie in the different other drivers we use I guess.

                             martin


> Alan Stern
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:12                               ` Martin Kepplinger
@ 2020-07-29 14:32                                 ` Alan Stern
  2020-07-29 14:44                                   ` Martin K. Petersen
                                                     ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Alan Stern @ 2020-07-29 14:32 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
> On 28.07.20 22:02, Alan Stern wrote:
> > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote:
> >> Hi Alan,
> >>
> >> Any API cleanup is of course welcome. I just wanted to remind you that
> >> the underlying problem: broken block device runtime pm. Your initial
> >> proposed fix "almost" did it and mounting works but during file access,
> >> it still just looks like a runtime_resume is missing somewhere.
> > 
> > Well, I have tested that proposed fix several times, and on my system 
> > it's working perfectly.  When I stop accessing a drive it autosuspends, 
> > and when I access it again it gets resumed and works -- as you would 
> > expect.
> 
> that's weird. when I mount, everything looks good, "sda1". But as soon
> as I cd to the mountpoint and do "ls" (on another SD card "ls" works but
> actual file reading leads to the exact same errors), I get:
> 
> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60
> 40 00 00 01 00

This error report comes from the SCSI layer, not the block layer.

> [   77.474678] blk_update_request: I/O error, dev sda, sector 24640 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   77.491628] blk_update_request: I/O error, dev sda, sector 24641 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   77.508051] blk_update_request: I/O error, dev sda, sector 24642 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed
> (...)
> [   77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   77.953434] FAT-fs (sda1): Directory bread(block 16448) failed
> [   77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   77.965118] FAT-fs (sda1): Directory bread(block 16449) failed
> [   77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   77.976802] FAT-fs (sda1): Directory bread(block 16450) failed
> [   77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed
> (...)
> [   78.384929] FAT-fs (sda1): Filesystem has been set read-only
> [  103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [  103.076751] print_req_error: 118 callbacks suppressed
> [  103.076760] blk_update_request: I/O error, dev sda, sector 9748 op
> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
> [  103.087428] Buffer I/O error on dev sda1, logical block 1556, lost
> async page write
> [  103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [  103.101123] blk_update_request: I/O error, dev sda, sector 17162 op
> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
> [  103.111883] Buffer I/O error on dev sda1, logical block 8970, lost
> async page write

I can't tell why you're getting that error.  In one of my tests the 
device returned the same kind of error status (Sense Key = 6, ASC = 
0x28) but the operation was then retried successfully.  Perhaps the 
problem lies in the device you are testing.

> >> As we need to have that working at some point, I might look into it, but
> >> someone who has experience in the block layer can surely do it more
> >> efficiently.
> > 
> > I suspect that any problems you still face are caused by something else.
> > 
> 
> I then formatted sda1 to ext2 (on the runtime suspend system testing
> your patch) and that seems to have worked!
> 
> Again accessing the mountpoint then yield the very same "device offline
> or changed" errors.
> 
> What kind of device are you testing? You should be easily able to
> reproduce this using an "sd" device.

I tested two devices: a SanDisk Cruzer USB flash drive and a 
g-mass-storage gadget running under dummy-hcd.  They each showed up as 
/dev/sdb on my system.

I haven't tried testing with an SD card.  If you have any specific 
sequence of commands you would like me to run, let me know.

> The problems must lie in the different other drivers we use I guess.

Or the devices.  Have you tried testing with a USB flash drive?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:32                                 ` Alan Stern
@ 2020-07-29 14:44                                   ` Martin K. Petersen
  2020-07-29 14:56                                     ` Alan Stern
  2020-07-29 14:46                                   ` James Bottomley
  2020-07-29 18:10                                   ` Douglas Gilbert
  2 siblings, 1 reply; 68+ messages in thread
From: Martin K. Petersen @ 2020-07-29 14:44 UTC (permalink / raw)
  To: Alan Stern
  Cc: Martin Kepplinger, Bart Van Assche, jejb, Can Guo,
	martin.petersen, linux-scsi, linux-kernel, kernel


Alan,

>> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60
>> 40 00 00 01 00
>
> This error report comes from the SCSI layer, not the block layer.

This the device telling us that the media (SD card?) has changed.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:32                                 ` Alan Stern
  2020-07-29 14:44                                   ` Martin K. Petersen
@ 2020-07-29 14:46                                   ` James Bottomley
  2020-07-29 14:53                                     ` James Bottomley
  2020-07-29 18:10                                   ` Douglas Gilbert
  2 siblings, 1 reply; 68+ messages in thread
From: James Bottomley @ 2020-07-29 14:46 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
> > On 28.07.20 22:02, Alan Stern wrote:
> > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
> > > wrote:
> > > > Hi Alan,
> > > > 
> > > > Any API cleanup is of course welcome. I just wanted to remind
> > > > you that the underlying problem: broken block device runtime
> > > > pm. Your initial proposed fix "almost" did it and mounting
> > > > works but during file access, it still just looks like a
> > > > runtime_resume is missing somewhere.
> > > 
> > > Well, I have tested that proposed fix several times, and on my
> > > system it's working perfectly.  When I stop accessing a drive it
> > > autosuspends, and when I access it again it gets resumed and
> > > works -- as you would expect.
> > 
> > that's weird. when I mount, everything looks good, "sda1". But as
> > soon as I cd to the mountpoint and do "ls" (on another SD card "ls"
> > works but actual file reading leads to the exact same errors), I
> > get:
> > 
> > [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> > hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> > [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> > [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> > [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00
> > 60 40 00 00 01 00
> 
> This error report comes from the SCSI layer, not the block layer.

That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
CHANGED" so it sounds like it something we should be ignoring.  Usually
this signals a problem, like you changed the medium manually (ejected
the CD).  But in this case you can tell us to expect this by setting

sdev->expecting_cc_ua

And we'll retry.  I think you need to set this on all resumed devices.

James


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:46                                   ` James Bottomley
@ 2020-07-29 14:53                                     ` James Bottomley
  2020-07-29 15:40                                       ` Martin Kepplinger
  2020-07-29 15:40                                       ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
  0 siblings, 2 replies; 68+ messages in thread
From: James Bottomley @ 2020-07-29 14:53 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
> > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
> > > On 28.07.20 22:02, Alan Stern wrote:
> > > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
> > > > wrote:
> > > > > Hi Alan,
> > > > > 
> > > > > Any API cleanup is of course welcome. I just wanted to remind
> > > > > you that the underlying problem: broken block device runtime
> > > > > pm. Your initial proposed fix "almost" did it and mounting
> > > > > works but during file access, it still just looks like a
> > > > > runtime_resume is missing somewhere.
> > > > 
> > > > Well, I have tested that proposed fix several times, and on my
> > > > system it's working perfectly.  When I stop accessing a drive
> > > > it autosuspends, and when I access it again it gets resumed and
> > > > works -- as you would expect.
> > > 
> > > that's weird. when I mount, everything looks good, "sda1". But as
> > > soon as I cd to the mountpoint and do "ls" (on another SD card
> > > "ls" works but actual file reading leads to the exact same
> > > errors), I get:
> > > 
> > > [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> > > hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> > > [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> > > [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> > > [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
> > > 00 60 40 00 00 01 00
> > 
> > This error report comes from the SCSI layer, not the block layer.
> 
> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> CHANGED" so it sounds like it something we should be
> ignoring.  Usually this signals a problem, like you changed the
> medium manually (ejected the CD).  But in this case you can tell us
> to expect this by setting
> 
> sdev->expecting_cc_ua
> 
> And we'll retry.  I think you need to set this on all resumed
> devices.

Actually, it's not quite that easy, we filter out this ASC/ASCQ
combination from the check because we should never ignore medium might
have changed events on running devices.  We could ignore it if we had a
flag to say the power has been yanked (perhaps an additional sdev flag
you set on resume) but we would still miss the case where you really
had powered off the drive and then changed the media ... if you can
regard this as the user's problem, then we might have a solution.

James
 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:44                                   ` Martin K. Petersen
@ 2020-07-29 14:56                                     ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-07-29 14:56 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Martin Kepplinger, Bart Van Assche, jejb, Can Guo, linux-scsi,
	linux-kernel, kernel

On Wed, Jul 29, 2020 at 10:44:26AM -0400, Martin K. Petersen wrote:
> 
> Alan,
> 
> >> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> >> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> >> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> >> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> >> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60
> >> 40 00 00 01 00
> >
> > This error report comes from the SCSI layer, not the block layer.
> 
> This the device telling us that the media (SD card?) has changed.

Ah yes, thank you.  I knew that SK=6 ASC=0x28 meant "Not Ready to Ready 
Transition", but I had forgotten the "(Media May Have Changed)" part.

This makes sense and is a reasonable thing to see, since many SD card 
readers lose track of whether or not the card has been changed when they 
go into suspend.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:53                                     ` James Bottomley
@ 2020-07-29 15:40                                       ` Martin Kepplinger
  2020-07-29 15:44                                         ` James Bottomley
  2020-07-29 15:40                                       ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
  1 sibling, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-29 15:40 UTC (permalink / raw)
  To: James Bottomley, Alan Stern
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 29.07.20 16:53, James Bottomley wrote:
> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
>> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
>>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
>>>> On 28.07.20 22:02, Alan Stern wrote:
>>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
>>>>> wrote:
>>>>>> Hi Alan,
>>>>>>
>>>>>> Any API cleanup is of course welcome. I just wanted to remind
>>>>>> you that the underlying problem: broken block device runtime
>>>>>> pm. Your initial proposed fix "almost" did it and mounting
>>>>>> works but during file access, it still just looks like a
>>>>>> runtime_resume is missing somewhere.
>>>>>
>>>>> Well, I have tested that proposed fix several times, and on my
>>>>> system it's working perfectly.  When I stop accessing a drive
>>>>> it autosuspends, and when I access it again it gets resumed and
>>>>> works -- as you would expect.
>>>>
>>>> that's weird. when I mount, everything looks good, "sda1". But as
>>>> soon as I cd to the mountpoint and do "ls" (on another SD card
>>>> "ls" works but actual file reading leads to the exact same
>>>> errors), I get:
>>>>
>>>> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>>>> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>>>> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>>>> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
>>>> 00 60 40 00 00 01 00
>>>
>>> This error report comes from the SCSI layer, not the block layer.
>>
>> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
>> CHANGED" so it sounds like it something we should be
>> ignoring.  Usually this signals a problem, like you changed the
>> medium manually (ejected the CD).  But in this case you can tell us
>> to expect this by setting
>>
>> sdev->expecting_cc_ua
>>
>> And we'll retry.  I think you need to set this on all resumed
>> devices.
> 
> Actually, it's not quite that easy, we filter out this ASC/ASCQ
> combination from the check because we should never ignore medium might
> have changed events on running devices.  We could ignore it if we had a
> flag to say the power has been yanked (perhaps an additional sdev flag
> you set on resume) but we would still miss the case where you really
> had powered off the drive and then changed the media ... if you can
> regard this as the user's problem, then we might have a solution.
> 
> James
>  

oh I see what you mean now, thanks for the ellaboration.

if I do the following change, things all look normal and runtime pm
works. I'm not 100% sure if just setting expecting_cc_ua in resume() is
"correct" but that looks like it is what you're talking about:

(note that this is of course with the one block layer diff applied that
Alan posted a few emails back)


--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
                 * so that we can deal with it there.
                 */
                if (scmd->device->expecting_cc_ua) {
-                       /*
-                        * Because some device does not queue unit
-                        * attentions correctly, we carefully check
-                        * additional sense code and qualifier so as
-                        * not to squash media change unit attention.
-                        */
-                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
-                               scmd->device->expecting_cc_ua = 0;
-                               return NEEDS_RETRY;
-                       }
+                       scmd->device->expecting_cc_ua = 0;
+                       return NEEDS_RETRY;
                }
                /*
                 * we might also expect a cc/ua if another LUN on the target
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..5ad847fed8b9 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
        if (!sdkp)      /* E.g.: runtime resume at the start of
sd_probe() */
                return 0;

+       sdkp->device->expecting_cc_ua = 1;
+
        if (!sdkp->device->manage_start_stop)
                return 0;

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:53                                     ` James Bottomley
  2020-07-29 15:40                                       ` Martin Kepplinger
@ 2020-07-29 15:40                                       ` Alan Stern
  2020-07-29 15:49                                         ` James Bottomley
  2020-07-29 15:52                                         ` Martin Kepplinger
  1 sibling, 2 replies; 68+ messages in thread
From: Alan Stern @ 2020-07-29 15:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Martin Kepplinger, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote:
> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
> > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
> > > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
> > > > On 28.07.20 22:02, Alan Stern wrote:
> > > > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
> > > > > wrote:
> > > > > > Hi Alan,
> > > > > > 
> > > > > > Any API cleanup is of course welcome. I just wanted to remind
> > > > > > you that the underlying problem: broken block device runtime
> > > > > > pm. Your initial proposed fix "almost" did it and mounting
> > > > > > works but during file access, it still just looks like a
> > > > > > runtime_resume is missing somewhere.
> > > > > 
> > > > > Well, I have tested that proposed fix several times, and on my
> > > > > system it's working perfectly.  When I stop accessing a drive
> > > > > it autosuspends, and when I access it again it gets resumed and
> > > > > works -- as you would expect.
> > > > 
> > > > that's weird. when I mount, everything looks good, "sda1". But as
> > > > soon as I cd to the mountpoint and do "ls" (on another SD card
> > > > "ls" works but actual file reading leads to the exact same
> > > > errors), I get:
> > > > 
> > > > [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> > > > hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> > > > [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> > > > [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> > > > [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
> > > > 00 60 40 00 00 01 00
> > > 
> > > This error report comes from the SCSI layer, not the block layer.
> > 
> > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> > CHANGED" so it sounds like it something we should be
> > ignoring.  Usually this signals a problem, like you changed the
> > medium manually (ejected the CD).  But in this case you can tell us
> > to expect this by setting
> > 
> > sdev->expecting_cc_ua
> > 
> > And we'll retry.  I think you need to set this on all resumed
> > devices.
> 
> Actually, it's not quite that easy, we filter out this ASC/ASCQ
> combination from the check because we should never ignore medium might
> have changed events on running devices.  We could ignore it if we had a
> flag to say the power has been yanked (perhaps an additional sdev flag
> you set on resume) but we would still miss the case where you really
> had powered off the drive and then changed the media ... if you can
> regard this as the user's problem, then we might have a solution.

Indeed, I was going to make the same point.

The only reasonable conclusion is that suspending these SD card readers 
isn't safe unless they don't contain a card -- or maybe just if the 
device file isn't open or mounted.

Although support for this sort of thing could be added to the kernel, 
for now it's best to rely on userspace doing the right thing.  The 
kernel doesn't even know which devices suffer from this problem.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 15:40                                       ` Martin Kepplinger
@ 2020-07-29 15:44                                         ` James Bottomley
  2020-07-29 16:43                                           ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: James Bottomley @ 2020-07-29 15:44 UTC (permalink / raw)
  To: Martin Kepplinger, Alan Stern
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote:
> On 29.07.20 16:53, James Bottomley wrote:
> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
> > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
[...]
> > > > This error report comes from the SCSI layer, not the block
> > > > layer.
> > > 
> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> > > CHANGED" so it sounds like it something we should be
> > > ignoring.  Usually this signals a problem, like you changed the
> > > medium manually (ejected the CD).  But in this case you can tell
> > > us to expect this by setting
> > > 
> > > sdev->expecting_cc_ua
> > > 
> > > And we'll retry.  I think you need to set this on all resumed
> > > devices.
> > 
> > Actually, it's not quite that easy, we filter out this ASC/ASCQ
> > combination from the check because we should never ignore medium
> > might have changed events on running devices.  We could ignore it
> > if we had a flag to say the power has been yanked (perhaps an
> > additional sdev flag you set on resume) but we would still miss the
> > case where you really had powered off the drive and then changed
> > the media ... if you can regard this as the user's problem, then we
> > might have a solution.
> > 
> > James
> >  
> 
> oh I see what you mean now, thanks for the ellaboration.
> 
> if I do the following change, things all look normal and runtime pm
> works. I'm not 100% sure if just setting expecting_cc_ua in resume()
> is "correct" but that looks like it is what you're talking about:
> 
> (note that this is of course with the one block layer diff applied
> that Alan posted a few emails back)
> 
> 
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>                  * so that we can deal with it there.
>                  */
>                 if (scmd->device->expecting_cc_ua) {
> -                       /*
> -                        * Because some device does not queue unit
> -                        * attentions correctly, we carefully check
> -                        * additional sense code and qualifier so as
> -                        * not to squash media change unit attention.
> -                        */
> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00)
> {
> -                               scmd->device->expecting_cc_ua = 0;
> -                               return NEEDS_RETRY;
> -                       }
> +                       scmd->device->expecting_cc_ua = 0;
> +                       return NEEDS_RETRY;

Well, yes, but you can't do this because it would lose us media change
events in the non-suspend/resume case which we really don't want. 
That's why I was suggesting a new flag.

James


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 15:40                                       ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
@ 2020-07-29 15:49                                         ` James Bottomley
  2020-07-29 16:17                                           ` Alan Stern
  2020-07-29 15:52                                         ` Martin Kepplinger
  1 sibling, 1 reply; 68+ messages in thread
From: James Bottomley @ 2020-07-29 15:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Martin Kepplinger, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Wed, 2020-07-29 at 11:40 -0400, Alan Stern wrote:
> On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote:
> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
[...]
> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> > > CHANGED" so it sounds like it something we should be
> > > ignoring.  Usually this signals a problem, like you changed the
> > > medium manually (ejected the CD).  But in this case you can tell
> > > us to expect this by setting
> > > 
> > > sdev->expecting_cc_ua
> > > 
> > > And we'll retry.  I think you need to set this on all resumed
> > > devices.
> > 
> > Actually, it's not quite that easy, we filter out this ASC/ASCQ
> > combination from the check because we should never ignore medium
> > might have changed events on running devices.  We could ignore it
> > if we had a flag to say the power has been yanked (perhaps an
> > additional sdev flag you set on resume) but we would still miss the
> > case where you really had powered off the drive and then changed
> > the media ... if you can regard this as the user's problem, then we
> > might have a solution.
> 
> Indeed, I was going to make the same point.
> 
> The only reasonable conclusion is that suspending these SD card
> readers isn't safe unless they don't contain a card -- or maybe just
> if the device file isn't open or mounted.

For standard removable media, I could see issuing an eject before
suspend to emphasize the point.  However, sd cards don't work like that
... they're not ejectable except manually and mostly used as the HDD of
embedded, so the 99% use case is that the user will suspend and resume
the device with the same sdd insert.  It does sound like a use case we
should support.

> Although support for this sort of thing could be added to the
> kernel, for now it's best to rely on userspace doing the right
> thing.  The kernel doesn't even know which devices suffer from this
> problem.

Can we solve from userspace the case where the user hasn't changed the
media and the resume fails because we don't know?

James


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 15:40                                       ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
  2020-07-29 15:49                                         ` James Bottomley
@ 2020-07-29 15:52                                         ` Martin Kepplinger
  1 sibling, 0 replies; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-29 15:52 UTC (permalink / raw)
  To: Alan Stern, James Bottomley
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel



On 29.07.20 17:40, Alan Stern wrote:
> On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote:
>> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
>>> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
>>>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
>>>>> On 28.07.20 22:02, Alan Stern wrote:
>>>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger
>>>>>> wrote:
>>>>>>> Hi Alan,
>>>>>>>
>>>>>>> Any API cleanup is of course welcome. I just wanted to remind
>>>>>>> you that the underlying problem: broken block device runtime
>>>>>>> pm. Your initial proposed fix "almost" did it and mounting
>>>>>>> works but during file access, it still just looks like a
>>>>>>> runtime_resume is missing somewhere.
>>>>>>
>>>>>> Well, I have tested that proposed fix several times, and on my
>>>>>> system it's working perfectly.  When I stop accessing a drive
>>>>>> it autosuspends, and when I access it again it gets resumed and
>>>>>> works -- as you would expect.
>>>>>
>>>>> that's weird. when I mount, everything looks good, "sda1". But as
>>>>> soon as I cd to the mountpoint and do "ls" (on another SD card
>>>>> "ls" works but actual file reading leads to the exact same
>>>>> errors), I get:
>>>>>
>>>>> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>>>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>>>>> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>>>>> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>>>>> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
>>>>> 00 60 40 00 00 01 00
>>>>
>>>> This error report comes from the SCSI layer, not the block layer.
>>>
>>> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
>>> CHANGED" so it sounds like it something we should be
>>> ignoring.  Usually this signals a problem, like you changed the
>>> medium manually (ejected the CD).  But in this case you can tell us
>>> to expect this by setting
>>>
>>> sdev->expecting_cc_ua
>>>
>>> And we'll retry.  I think you need to set this on all resumed
>>> devices.
>>
>> Actually, it's not quite that easy, we filter out this ASC/ASCQ
>> combination from the check because we should never ignore medium might
>> have changed events on running devices.  We could ignore it if we had a
>> flag to say the power has been yanked (perhaps an additional sdev flag
>> you set on resume) but we would still miss the case where you really
>> had powered off the drive and then changed the media ... if you can
>> regard this as the user's problem, then we might have a solution.
> 
> Indeed, I was going to make the same point.
> 
> The only reasonable conclusion is that suspending these SD card readers 
> isn't safe unless they don't contain a card -- or maybe just if the 
> device file isn't open or mounted.
> 
> Although support for this sort of thing could be added to the kernel, 
> for now it's best to rely on userspace doing the right thing.  The 
> kernel doesn't even know which devices suffer from this problem.
> 
> 

well, userspace can do something like "automatically unmount if not
used" but that is *way* more inefficient than if the kernel would
support this is some way or another, for SD cards.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 15:49                                         ` James Bottomley
@ 2020-07-29 16:17                                           ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-07-29 16:17 UTC (permalink / raw)
  To: James Bottomley
  Cc: Martin Kepplinger, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Wed, Jul 29, 2020 at 08:49:34AM -0700, James Bottomley wrote:
> On Wed, 2020-07-29 at 11:40 -0400, Alan Stern wrote:
> > On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote:
> > > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
> [...]
> > > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> > > > CHANGED" so it sounds like it something we should be
> > > > ignoring.  Usually this signals a problem, like you changed the
> > > > medium manually (ejected the CD).  But in this case you can tell
> > > > us to expect this by setting
> > > > 
> > > > sdev->expecting_cc_ua
> > > > 
> > > > And we'll retry.  I think you need to set this on all resumed
> > > > devices.
> > > 
> > > Actually, it's not quite that easy, we filter out this ASC/ASCQ
> > > combination from the check because we should never ignore medium
> > > might have changed events on running devices.  We could ignore it
> > > if we had a flag to say the power has been yanked (perhaps an
> > > additional sdev flag you set on resume) but we would still miss the
> > > case where you really had powered off the drive and then changed
> > > the media ... if you can regard this as the user's problem, then we
> > > might have a solution.
> > 
> > Indeed, I was going to make the same point.
> > 
> > The only reasonable conclusion is that suspending these SD card
> > readers isn't safe unless they don't contain a card -- or maybe just
> > if the device file isn't open or mounted.
> 
> For standard removable media, I could see issuing an eject before
> suspend to emphasize the point.

That's not a workable approach in general.  For example, you wouldn't 
want to eject a CD just because it hadn't been used for a couple of 
minutes and the drive was therefore about to be suspended.

>  However, sd cards don't work like that
> ... they're not ejectable except manually and mostly used as the HDD of
> embedded, so the 99% use case is that the user will suspend and resume
> the device with the same sdd insert.  It does sound like a use case we
> should support.

Agreed.

> > Although support for this sort of thing could be added to the
> > kernel, for now it's best to rely on userspace doing the right
> > thing.  The kernel doesn't even know which devices suffer from this
> > problem.
> 
> Can we solve from userspace the case where the user hasn't changed the
> media and the resume fails because we don't know?

The difficulty is recognizing situations where the media really was 
changed while the device was suspended.  Most devices, AFAIK, don't have 
any problem with this, but card readers often do.

I suppose the kernel could just rely on users not to change media in 
drives while they are mounted.  Then any problems that arise would be 
the users' fault.  Yes, this is nothing more than passing the buck, but 
it's hard to come up with any better approach.  Doesn't Windows do 
something like this?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 15:44                                         ` James Bottomley
@ 2020-07-29 16:43                                           ` Martin Kepplinger
  2020-07-29 18:25                                             ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-29 16:43 UTC (permalink / raw)
  To: James Bottomley, Alan Stern
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel



Am 29. Juli 2020 17:44:42 MESZ schrieb James Bottomley <James.Bottomley@HansenPartnership.com>:
>On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote:
>> On 29.07.20 16:53, James Bottomley wrote:
>> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
>> > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
>[...]
>> > > > This error report comes from the SCSI layer, not the block
>> > > > layer.
>> > > 
>> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
>> > > CHANGED" so it sounds like it something we should be
>> > > ignoring.  Usually this signals a problem, like you changed the
>> > > medium manually (ejected the CD).  But in this case you can tell
>> > > us to expect this by setting
>> > > 
>> > > sdev->expecting_cc_ua
>> > > 
>> > > And we'll retry.  I think you need to set this on all resumed
>> > > devices.
>> > 
>> > Actually, it's not quite that easy, we filter out this ASC/ASCQ
>> > combination from the check because we should never ignore medium
>> > might have changed events on running devices.  We could ignore it
>> > if we had a flag to say the power has been yanked (perhaps an
>> > additional sdev flag you set on resume) but we would still miss the
>> > case where you really had powered off the drive and then changed
>> > the media ... if you can regard this as the user's problem, then we
>> > might have a solution.
>> > 
>> > James
>> >  
>> 
>> oh I see what you mean now, thanks for the ellaboration.
>> 
>> if I do the following change, things all look normal and runtime pm
>> works. I'm not 100% sure if just setting expecting_cc_ua in resume()
>> is "correct" but that looks like it is what you're talking about:
>> 
>> (note that this is of course with the one block layer diff applied
>> that Alan posted a few emails back)
>> 
>> 
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>                  * so that we can deal with it there.
>>                  */
>>                 if (scmd->device->expecting_cc_ua) {
>> -                       /*
>> -                        * Because some device does not queue unit
>> -                        * attentions correctly, we carefully check
>> -                        * additional sense code and qualifier so as
>> -                        * not to squash media change unit attention.
>> -                        */
>> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00)
>> {
>> -                               scmd->device->expecting_cc_ua = 0;
>> -                               return NEEDS_RETRY;
>> -                       }
>> +                       scmd->device->expecting_cc_ua = 0;
>> +                       return NEEDS_RETRY;
>
>Well, yes, but you can't do this because it would lose us media change
>events in the non-suspend/resume case which we really don't want. 
>That's why I was suggesting a new flag.
>
>James

also if I set expecting_cc_ua in resume() only, like I did?

-- 
Martin Kepplinger
xmpp: martink@jabber.at
Sent from mobile.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 14:32                                 ` Alan Stern
  2020-07-29 14:44                                   ` Martin K. Petersen
  2020-07-29 14:46                                   ` James Bottomley
@ 2020-07-29 18:10                                   ` Douglas Gilbert
  2 siblings, 0 replies; 68+ messages in thread
From: Douglas Gilbert @ 2020-07-29 18:10 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 2020-07-29 10:32 a.m., Alan Stern wrote:
> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
>> On 28.07.20 22:02, Alan Stern wrote:
>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote:
>>>> Hi Alan,
>>>>
>>>> Any API cleanup is of course welcome. I just wanted to remind you that
>>>> the underlying problem: broken block device runtime pm. Your initial
>>>> proposed fix "almost" did it and mounting works but during file access,
>>>> it still just looks like a runtime_resume is missing somewhere.
>>>
>>> Well, I have tested that proposed fix several times, and on my system
>>> it's working perfectly.  When I stop accessing a drive it autosuspends,
>>> and when I access it again it gets resumed and works -- as you would
>>> expect.
>>
>> that's weird. when I mount, everything looks good, "sda1". But as soon
>> as I cd to the mountpoint and do "ls" (on another SD card "ls" works but
>> actual file reading leads to the exact same errors), I get:
>>
>> [   77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>> [   77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>> [   77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>> [   77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60
>> 40 00 00 01 00
> 
> This error report comes from the SCSI layer, not the block layer.

SCSI's first 11 byte command! I'm guessing the first byte is being
repeated and it's actually:
     28 00 00 00 60 40 00 00 01 00  [READ(10)]

That should be fixed. It should be something like: "...CDB in hex: 28 00 ...".

Doug Gilbert

>> [   77.474678] blk_update_request: I/O error, dev sda, sector 24640 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   77.491628] blk_update_request: I/O error, dev sda, sector 24641 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   77.508051] blk_update_request: I/O error, dev sda, sector 24642 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> (...)
>> [   77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   77.953434] FAT-fs (sda1): Directory bread(block 16448) failed
>> [   77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   77.965118] FAT-fs (sda1): Directory bread(block 16449) failed
>> [   77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   77.976802] FAT-fs (sda1): Directory bread(block 16450) failed
>> [   77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> (...)
>> [   78.384929] FAT-fs (sda1): Filesystem has been set read-only
>> [  103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [  103.076751] print_req_error: 118 callbacks suppressed
>> [  103.076760] blk_update_request: I/O error, dev sda, sector 9748 op
>> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
>> [  103.087428] Buffer I/O error on dev sda1, logical block 1556, lost
>> async page write
>> [  103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [  103.101123] blk_update_request: I/O error, dev sda, sector 17162 op
>> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0
>> [  103.111883] Buffer I/O error on dev sda1, logical block 8970, lost
>> async page write
> 
> I can't tell why you're getting that error.  In one of my tests the
> device returned the same kind of error status (Sense Key = 6, ASC =
> 0x28) but the operation was then retried successfully.  Perhaps the
> problem lies in the device you are testing.
> 
>>>> As we need to have that working at some point, I might look into it, but
>>>> someone who has experience in the block layer can surely do it more
>>>> efficiently.
>>>
>>> I suspect that any problems you still face are caused by something else.
>>>
>>
>> I then formatted sda1 to ext2 (on the runtime suspend system testing
>> your patch) and that seems to have worked!
>>
>> Again accessing the mountpoint then yield the very same "device offline
>> or changed" errors.
>>
>> What kind of device are you testing? You should be easily able to
>> reproduce this using an "sd" device.
> 
> I tested two devices: a SanDisk Cruzer USB flash drive and a
> g-mass-storage gadget running under dummy-hcd.  They each showed up as
> /dev/sdb on my system.
> 
> I haven't tried testing with an SD card.  If you have any specific
> sequence of commands you would like me to run, let me know.
> 
>> The problems must lie in the different other drivers we use I guess.
> 
> Or the devices.  Have you tried testing with a USB flash drive?
> 
> Alan Stern
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 16:43                                           ` Martin Kepplinger
@ 2020-07-29 18:25                                             ` Alan Stern
  2020-07-29 18:29                                               ` James Bottomley
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-07-29 18:25 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote:
> 
> 
> Am 29. Juli 2020 17:44:42 MESZ schrieb James Bottomley <James.Bottomley@HansenPartnership.com>:
> >On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote:
> >> On 29.07.20 16:53, James Bottomley wrote:
> >> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
> >> > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
> >[...]
> >> > > > This error report comes from the SCSI layer, not the block
> >> > > > layer.
> >> > > 
> >> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
> >> > > CHANGED" so it sounds like it something we should be
> >> > > ignoring.  Usually this signals a problem, like you changed the
> >> > > medium manually (ejected the CD).  But in this case you can tell
> >> > > us to expect this by setting
> >> > > 
> >> > > sdev->expecting_cc_ua
> >> > > 
> >> > > And we'll retry.  I think you need to set this on all resumed
> >> > > devices.
> >> > 
> >> > Actually, it's not quite that easy, we filter out this ASC/ASCQ
> >> > combination from the check because we should never ignore medium
> >> > might have changed events on running devices.  We could ignore it
> >> > if we had a flag to say the power has been yanked (perhaps an
> >> > additional sdev flag you set on resume) but we would still miss the
> >> > case where you really had powered off the drive and then changed
> >> > the media ... if you can regard this as the user's problem, then we
> >> > might have a solution.
> >> > 
> >> > James
> >> >  
> >> 
> >> oh I see what you mean now, thanks for the ellaboration.
> >> 
> >> if I do the following change, things all look normal and runtime pm
> >> works. I'm not 100% sure if just setting expecting_cc_ua in resume()
> >> is "correct" but that looks like it is what you're talking about:
> >> 
> >> (note that this is of course with the one block layer diff applied
> >> that Alan posted a few emails back)
> >> 
> >> 
> >> --- a/drivers/scsi/scsi_error.c
> >> +++ b/drivers/scsi/scsi_error.c
> >> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
> >>                  * so that we can deal with it there.
> >>                  */
> >>                 if (scmd->device->expecting_cc_ua) {
> >> -                       /*
> >> -                        * Because some device does not queue unit
> >> -                        * attentions correctly, we carefully check
> >> -                        * additional sense code and qualifier so as
> >> -                        * not to squash media change unit attention.
> >> -                        */
> >> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00)
> >> {
> >> -                               scmd->device->expecting_cc_ua = 0;
> >> -                               return NEEDS_RETRY;
> >> -                       }
> >> +                       scmd->device->expecting_cc_ua = 0;
> >> +                       return NEEDS_RETRY;
> >
> >Well, yes, but you can't do this because it would lose us media change
> >events in the non-suspend/resume case which we really don't want. 
> >That's why I was suggesting a new flag.
> >
> >James
> 
> also if I set expecting_cc_ua in resume() only, like I did?

That wouldn't make any difference.  The information sent by your card 
reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see it in 
the log).  So because of the code here in scsi_check_sense(), which you 
can't change, the Unit Attention sent by the card reader would not be 
retried even if you do set the flag in resume().

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 18:25                                             ` Alan Stern
@ 2020-07-29 18:29                                               ` James Bottomley
  2020-07-30  8:52                                                 ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: James Bottomley @ 2020-07-29 18:29 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote:
> On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote:
[...]
> > > > --- a/drivers/scsi/scsi_error.c
> > > > +++ b/drivers/scsi/scsi_error.c
> > > > @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd
> > > > *scmd)
> > > >                  * so that we can deal with it there.
> > > >                  */
> > > >                 if (scmd->device->expecting_cc_ua) {
> > > > -                       /*
> > > > -                        * Because some device does not queue
> > > > unit
> > > > -                        * attentions correctly, we carefully
> > > > check
> > > > -                        * additional sense code and qualifier
> > > > so as
> > > > -                        * not to squash media change unit
> > > > attention.
> > > > -                        */
> > > > -                       if (sshdr.asc != 0x28 || sshdr.ascq !=
> > > > 0x00)
> > > > {
> > > > -                               scmd->device->expecting_cc_ua =
> > > > 0;
> > > > -                               return NEEDS_RETRY;
> > > > -                       }
> > > > +                       scmd->device->expecting_cc_ua = 0;
> > > > +                       return NEEDS_RETRY;
> > > 
> > > Well, yes, but you can't do this because it would lose us media
> > > change events in the non-suspend/resume case which we really
> > > don't want.  That's why I was suggesting a new flag.
> > > 
> > > James
> > 
> > also if I set expecting_cc_ua in resume() only, like I did?
> 
> That wouldn't make any difference.  The information sent by your
> card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see
> it in the log).  So because of the code here in scsi_check_sense(),
> which you can't change, the Unit Attention sent by the card reader
> would not be  retried even if you do set the flag in resume().

But if we had a new flag, like expecting_media_change, you could set
that in resume and we could condition the above test in the code on it
and reset it and do a retry if it gets set.  I'm not saying we have to
do it this way, but it sounds like we have to do something in the
kernel, so I think the flag will become necessary but there might be a
bit of policy based dance around how it gets set in the kernel (to
avoid losing accurate media change events).

James


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-06-29 16:15           ` Alan Stern
                               ` (2 preceding siblings ...)
  2020-06-30 15:59             ` Bart Van Assche
@ 2020-07-30  8:05             ` Martin Kepplinger
  2020-07-30 15:14               ` Alan Stern
  3 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-30  8:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 29.06.20 18:15, Alan Stern wrote:
> On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote:
>>
>>
>> On 26.06.20 17:44, Alan Stern wrote:
>>> Martin's best approach would be to add some debugging code to find out why 
>>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
>>> doesn't lead to pm_request_resume().
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> blk_queue_enter() always - especially when sd is runtime suspended and I
>> try to mount as above - sets success to be true for me, so never
>> continues down to bkl_pm_request_resume(). All I see is "PM: Removing
>> info for No Bus:sda1".
> 
> Aha.  Looking at this more closely, it's apparent that the code in 
> blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> flag is set then the request can be issued regardless of the queue's 
> runtime status.  That is not correct when the queue is suspended.
> 
> Below is my attempt to fix this up.  I'm not sure that the patch is 
> entirely correct, but it should fix this logic bug.  I would appreciate a 
> critical review.
> 
> Martin, does this fix the problem?
> 
> Alan Stern

Hi Alan,

So in the block layer your change below indeed fixes the problem and if
you want to submit that 1:1 feel free to add

Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>

thanks for your help in this!

                       martin


> 
> 
> 
> Index: usb-devel/block/blk-core.c
> ===================================================================
> --- usb-devel.orig/block/blk-core.c
> +++ usb-devel/block/blk-core.c
> @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
>  			 * responsible for ensuring that that counter is
>  			 * globally visible before the queue is unfrozen.
>  			 */
> -			if (pm || !blk_queue_pm_only(q)) {
> +			if ((pm && q->rpm_status != RPM_SUSPENDED) ||
> +			    !blk_queue_pm_only(q)) {
>  				success = true;
>  			} else {
>  				percpu_ref_put(&q->q_usage_counter);
> @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
>  
>  		wait_event(q->mq_freeze_wq,
>  			   (!q->mq_freeze_depth &&
> -			    (pm || (blk_pm_request_resume(q),
> -				    !blk_queue_pm_only(q)))) ||
> +			    blk_pm_resume_queue(pm, q)) ||
>  			   blk_queue_dying(q));
>  		if (blk_queue_dying(q))
>  			return -ENODEV;
> Index: usb-devel/block/blk-pm.h
> ===================================================================
> --- usb-devel.orig/block/blk-pm.h
> +++ usb-devel/block/blk-pm.h
> @@ -6,11 +6,14 @@
>  #include <linux/pm_runtime.h>
>  
>  #ifdef CONFIG_PM
> -static inline void blk_pm_request_resume(struct request_queue *q)
> +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
>  {
> -	if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
> -		       q->rpm_status == RPM_SUSPENDING))
> -		pm_request_resume(q->dev);
> +	if (!q->dev || !blk_queue_pm_only(q))
> +		return 1;	/* Nothing to do */
> +	if (pm && q->rpm_status != RPM_SUSPENDED)
> +		return 1;	/* Request allowed */
> +	pm_request_resume(q->dev);
> +	return 0;
>  }
>  
>  static inline void blk_pm_mark_last_busy(struct request *rq)
> @@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st
>  		--rq->q->nr_pending;
>  }
>  #else
> -static inline void blk_pm_request_resume(struct request_queue *q)
> +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
>  {
> +	return 1;
>  }
>  
>  static inline void blk_pm_mark_last_busy(struct request *rq)
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-29 18:29                                               ` James Bottomley
@ 2020-07-30  8:52                                                 ` Martin Kepplinger
  2020-07-30  8:54                                                   ` Martin Kepplinger
  2020-07-30 15:10                                                   ` Alan Stern
  0 siblings, 2 replies; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-30  8:52 UTC (permalink / raw)
  To: James Bottomley, Alan Stern
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 29.07.20 20:29, James Bottomley wrote:
> On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote:
>> On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote:
> [...]
>>>>> --- a/drivers/scsi/scsi_error.c
>>>>> +++ b/drivers/scsi/scsi_error.c
>>>>> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd
>>>>> *scmd)
>>>>>                  * so that we can deal with it there.
>>>>>                  */
>>>>>                 if (scmd->device->expecting_cc_ua) {
>>>>> -                       /*
>>>>> -                        * Because some device does not queue
>>>>> unit
>>>>> -                        * attentions correctly, we carefully
>>>>> check
>>>>> -                        * additional sense code and qualifier
>>>>> so as
>>>>> -                        * not to squash media change unit
>>>>> attention.
>>>>> -                        */
>>>>> -                       if (sshdr.asc != 0x28 || sshdr.ascq !=
>>>>> 0x00)
>>>>> {
>>>>> -                               scmd->device->expecting_cc_ua =
>>>>> 0;
>>>>> -                               return NEEDS_RETRY;
>>>>> -                       }
>>>>> +                       scmd->device->expecting_cc_ua = 0;
>>>>> +                       return NEEDS_RETRY;
>>>>
>>>> Well, yes, but you can't do this because it would lose us media
>>>> change events in the non-suspend/resume case which we really
>>>> don't want.  That's why I was suggesting a new flag.
>>>>
>>>> James
>>>
>>> also if I set expecting_cc_ua in resume() only, like I did?
>>
>> That wouldn't make any difference.  The information sent by your
>> card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see
>> it in the log).  So because of the code here in scsi_check_sense(),
>> which you can't change, the Unit Attention sent by the card reader
>> would not be  retried even if you do set the flag in resume().
> 
> But if we had a new flag, like expecting_media_change, you could set
> that in resume and we could condition the above test in the code on it
> and reset it and do a retry if it gets set.  I'm not saying we have to
> do it this way, but it sounds like we have to do something in the
> kernel, so I think the flag will become necessary but there might be a
> bit of policy based dance around how it gets set in the kernel (to
> avoid losing accurate media change events).
> 
> James
> 

Maybe I should just start a new discussion with a patch, but the below
is what makes sense to me (when I understand you correctly) and seems to
work. I basically add a new flag, so that the old flags behave unchanged
and only call it during *runtime* resume for SD cards:


--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
                 * information that we should pass up to the upper-level
driver
                 * so that we can deal with it there.
                 */
-               if (scmd->device->expecting_cc_ua) {
+               if (scmd->device->expecting_cc_ua ||
+                   scmd->device->expecting_media_change) {
                        /*
                         * Because some device does not queue unit
                         * attentions correctly, we carefully check
                         * additional sense code and qualifier so as
-                        * not to squash media change unit attention.
+                        * not to squash media change unit attention;
+                        * unless expecting_media_change is set, indicating
+                        * that the media (most likely) didn't change
+                        * but a device only believes so (for example
+                        * because of suspend/resume).
                         */
-                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
-                               scmd->device->expecting_cc_ua = 0;
+                       if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) ||
+                           scmd->device->expecting_media_change) {
+                               scmd->device->expecting_media_change = 0;
                                return NEEDS_RETRY;
                        }
                }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..b647fab2b663 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -114,6 +114,7 @@ static void sd_shutdown(struct device *);
 static int sd_suspend_system(struct device *);
 static int sd_suspend_runtime(struct device *);
 static int sd_resume(struct device *);
+static int sd_resume_runtime(struct device *);
 static void sd_rescan(struct device *);
 static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt);
 static void sd_uninit_command(struct scsi_cmnd *SCpnt);
@@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = {
        .poweroff               = sd_suspend_system,
        .restore                = sd_resume,
        .runtime_suspend        = sd_suspend_runtime,
-       .runtime_resume         = sd_resume,
+       .runtime_resume         = sd_resume_runtime,
 };

 static struct scsi_driver sd_template = {
@@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev)
        return ret;
 }

+static int sd_resume_runtime(struct device *dev)
+{
+       struct scsi_disk *sdkp = dev_get_drvdata(dev);
+
+       /* Some SD cardreaders report media change when resuming from
suspend
+        * because they can't keep track during suspend. */
+
+       /* XXX This is not unproblematic though: We won't notice when a card
+        * was really changed during runtime suspend! We basically rely
on users
+        * to unmount or suspend before doing so. */
+       sdkp->device->expecting_media_change = 1;
+
+       return sd_resume(dev);
+}
+
 /**
  *     init_sd - entry point for this driver (both when built in or when
  *     a module).
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index bc5909033d13..8c8f053f71c8 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -169,6 +169,8 @@ struct scsi_device {
                                 * this device */
        unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
                                     * because we did a bus reset. */
+       unsigned expecting_media_change:1; /* Expecting media change
ASC/ASCQ
+                                             when it actually doesn't
change */
        unsigned use_10_for_rw:1; /* first try 10-byte read / write */
        unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
        unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-30  8:52                                                 ` Martin Kepplinger
@ 2020-07-30  8:54                                                   ` Martin Kepplinger
  2020-07-30 15:10                                                   ` Alan Stern
  1 sibling, 0 replies; 68+ messages in thread
From: Martin Kepplinger @ 2020-07-30  8:54 UTC (permalink / raw)
  To: James Bottomley, Alan Stern
  Cc: Bart Van Assche, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On 30.07.20 10:52, Martin Kepplinger wrote:
> On 29.07.20 20:29, James Bottomley wrote:
>> On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote:
>>> On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote:
>> [...]
>>>>>> --- a/drivers/scsi/scsi_error.c
>>>>>> +++ b/drivers/scsi/scsi_error.c
>>>>>> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd
>>>>>> *scmd)
>>>>>>                  * so that we can deal with it there.
>>>>>>                  */
>>>>>>                 if (scmd->device->expecting_cc_ua) {
>>>>>> -                       /*
>>>>>> -                        * Because some device does not queue
>>>>>> unit
>>>>>> -                        * attentions correctly, we carefully
>>>>>> check
>>>>>> -                        * additional sense code and qualifier
>>>>>> so as
>>>>>> -                        * not to squash media change unit
>>>>>> attention.
>>>>>> -                        */
>>>>>> -                       if (sshdr.asc != 0x28 || sshdr.ascq !=
>>>>>> 0x00)
>>>>>> {
>>>>>> -                               scmd->device->expecting_cc_ua =
>>>>>> 0;
>>>>>> -                               return NEEDS_RETRY;
>>>>>> -                       }
>>>>>> +                       scmd->device->expecting_cc_ua = 0;
>>>>>> +                       return NEEDS_RETRY;
>>>>>
>>>>> Well, yes, but you can't do this because it would lose us media
>>>>> change events in the non-suspend/resume case which we really
>>>>> don't want.  That's why I was suggesting a new flag.
>>>>>
>>>>> James
>>>>
>>>> also if I set expecting_cc_ua in resume() only, like I did?
>>>
>>> That wouldn't make any difference.  The information sent by your
>>> card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see
>>> it in the log).  So because of the code here in scsi_check_sense(),
>>> which you can't change, the Unit Attention sent by the card reader
>>> would not be  retried even if you do set the flag in resume().
>>
>> But if we had a new flag, like expecting_media_change, you could set
>> that in resume and we could condition the above test in the code on it
>> and reset it and do a retry if it gets set.  I'm not saying we have to
>> do it this way, but it sounds like we have to do something in the
>> kernel, so I think the flag will become necessary but there might be a
>> bit of policy based dance around how it gets set in the kernel (to
>> avoid losing accurate media change events).
>>
>> James
>>
> 
> Maybe I should just start a new discussion with a patch, but the below
> is what makes sense to me (when I understand you correctly) and seems to
> work. I basically add a new flag, so that the old flags behave unchanged
> and only call it during *runtime* resume for SD cards:
> 
> 
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>                  * information that we should pass up to the upper-level
> driver
>                  * so that we can deal with it there.
>                  */
> -               if (scmd->device->expecting_cc_ua) {
> +               if (scmd->device->expecting_cc_ua ||
> +                   scmd->device->expecting_media_change) {
>                         /*
>                          * Because some device does not queue unit
>                          * attentions correctly, we carefully check
>                          * additional sense code and qualifier so as
> -                        * not to squash media change unit attention.
> +                        * not to squash media change unit attention;
> +                        * unless expecting_media_change is set, indicating
> +                        * that the media (most likely) didn't change
> +                        * but a device only believes so (for example
> +                        * because of suspend/resume).
>                          */
> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
> -                               scmd->device->expecting_cc_ua = 0;
> +                       if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) ||
> +                           scmd->device->expecting_media_change) {
> +                               scmd->device->expecting_media_change = 0;

oops, I missed expecting_cc_ua here, but you get the idea.

>                                 return NEEDS_RETRY;
>                         }
>                 }
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..b647fab2b663 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *);
>  static int sd_suspend_system(struct device *);
>  static int sd_suspend_runtime(struct device *);
>  static int sd_resume(struct device *);
> +static int sd_resume_runtime(struct device *);
>  static void sd_rescan(struct device *);
>  static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt);
>  static void sd_uninit_command(struct scsi_cmnd *SCpnt);
> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = {
>         .poweroff               = sd_suspend_system,
>         .restore                = sd_resume,
>         .runtime_suspend        = sd_suspend_runtime,
> -       .runtime_resume         = sd_resume,
> +       .runtime_resume         = sd_resume_runtime,
>  };
> 
>  static struct scsi_driver sd_template = {
> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev)
>         return ret;
>  }
> 
> +static int sd_resume_runtime(struct device *dev)
> +{
> +       struct scsi_disk *sdkp = dev_get_drvdata(dev);
> +
> +       /* Some SD cardreaders report media change when resuming from
> suspend
> +        * because they can't keep track during suspend. */
> +
> +       /* XXX This is not unproblematic though: We won't notice when a card
> +        * was really changed during runtime suspend! We basically rely
> on users
> +        * to unmount or suspend before doing so. */
> +       sdkp->device->expecting_media_change = 1;
> +
> +       return sd_resume(dev);
> +}
> +
>  /**
>   *     init_sd - entry point for this driver (both when built in or when
>   *     a module).
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index bc5909033d13..8c8f053f71c8 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -169,6 +169,8 @@ struct scsi_device {
>                                  * this device */
>         unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>                                      * because we did a bus reset. */
> +       unsigned expecting_media_change:1; /* Expecting media change
> ASC/ASCQ
> +                                             when it actually doesn't
> change */
>         unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>         unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>         unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-30  8:52                                                 ` Martin Kepplinger
  2020-07-30  8:54                                                   ` Martin Kepplinger
@ 2020-07-30 15:10                                                   ` Alan Stern
  2020-08-04  9:39                                                     ` Martin Kepplinger
  1 sibling, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-07-30 15:10 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote:
> Maybe I should just start a new discussion with a patch, but the below
> is what makes sense to me (when I understand you correctly) and seems to
> work. I basically add a new flag, so that the old flags behave unchanged
> and only call it during *runtime* resume for SD cards:
> 
> 
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>                  * information that we should pass up to the upper-level
> driver
>                  * so that we can deal with it there.
>                  */
> -               if (scmd->device->expecting_cc_ua) {
> +               if (scmd->device->expecting_cc_ua ||
> +                   scmd->device->expecting_media_change) {
>                         /*
>                          * Because some device does not queue unit
>                          * attentions correctly, we carefully check
>                          * additional sense code and qualifier so as
> -                        * not to squash media change unit attention.
> +                        * not to squash media change unit attention;
> +                        * unless expecting_media_change is set, indicating
> +                        * that the media (most likely) didn't change
> +                        * but a device only believes so (for example
> +                        * because of suspend/resume).
>                          */
> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
> -                               scmd->device->expecting_cc_ua = 0;
> +                       if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) ||
> +                           scmd->device->expecting_media_change) {
> +                               scmd->device->expecting_media_change = 0;
>                                 return NEEDS_RETRY;
>                         }
>                 }
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..b647fab2b663 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *);
>  static int sd_suspend_system(struct device *);
>  static int sd_suspend_runtime(struct device *);
>  static int sd_resume(struct device *);
> +static int sd_resume_runtime(struct device *);
>  static void sd_rescan(struct device *);
>  static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt);
>  static void sd_uninit_command(struct scsi_cmnd *SCpnt);
> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = {
>         .poweroff               = sd_suspend_system,
>         .restore                = sd_resume,
>         .runtime_suspend        = sd_suspend_runtime,
> -       .runtime_resume         = sd_resume,
> +       .runtime_resume         = sd_resume_runtime,
>  };
> 
>  static struct scsi_driver sd_template = {
> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev)
>         return ret;
>  }
> 
> +static int sd_resume_runtime(struct device *dev)
> +{
> +       struct scsi_disk *sdkp = dev_get_drvdata(dev);
> +
> +       /* Some SD cardreaders report media change when resuming from
> suspend
> +        * because they can't keep track during suspend. */
> +
> +       /* XXX This is not unproblematic though: We won't notice when a card
> +        * was really changed during runtime suspend! We basically rely
> on users
> +        * to unmount or suspend before doing so. */
> +       sdkp->device->expecting_media_change = 1;
> +
> +       return sd_resume(dev);
> +}
> +
>  /**
>   *     init_sd - entry point for this driver (both when built in or when
>   *     a module).
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index bc5909033d13..8c8f053f71c8 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -169,6 +169,8 @@ struct scsi_device {
>                                  * this device */
>         unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>                                      * because we did a bus reset. */
> +       unsigned expecting_media_change:1; /* Expecting media change
> ASC/ASCQ
> +                                             when it actually doesn't
> change */
>         unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>         unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>         unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

That's pretty much what James was suggesting, except for one thing: You 
must not set sdkp->device->expecting_media_change to 1 for all devices 
in sd_runtime_resume().  Only for devices which may generate a spurious 
Unit Attention following runtime resume -- and maybe not even for all of 
them, depending on what the user wants.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-30  8:05             ` Martin Kepplinger
@ 2020-07-30 15:14               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-07-30 15:14 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, jejb, Can Guo, martin.petersen, linux-scsi,
	linux-kernel, kernel

On Thu, Jul 30, 2020 at 10:05:50AM +0200, Martin Kepplinger wrote:
> On 29.06.20 18:15, Alan Stern wrote:
> > On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote:
> >>
> >>
> >> On 26.06.20 17:44, Alan Stern wrote:
> >>> Martin's best approach would be to add some debugging code to find out why 
> >>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call 
> >>> doesn't lead to pm_request_resume().
> >>>
> >>> Alan Stern
> >>>
> >>
> >> Hi Alan,
> >>
> >> blk_queue_enter() always - especially when sd is runtime suspended and I
> >> try to mount as above - sets success to be true for me, so never
> >> continues down to bkl_pm_request_resume(). All I see is "PM: Removing
> >> info for No Bus:sda1".
> > 
> > Aha.  Looking at this more closely, it's apparent that the code in 
> > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> > flag is set then the request can be issued regardless of the queue's 
> > runtime status.  That is not correct when the queue is suspended.
> > 
> > Below is my attempt to fix this up.  I'm not sure that the patch is 
> > entirely correct, but it should fix this logic bug.  I would appreciate a 
> > critical review.
> > 
> > Martin, does this fix the problem?
> > 
> > Alan Stern
> 
> Hi Alan,
> 
> So in the block layer your change below indeed fixes the problem and if
> you want to submit that 1:1 feel free to add
> 
> Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
> 
> thanks for your help in this!

Thank you for _your_ help!

The next merge window is coming up soon.  I think I'll wait until it is 
over before submitting the patch (maintainers tend to be too busy to 
consider new patches during a merge window).

But I am still open to comments or criticism of the patch in the 
meantime.  There hasn't been any feedback since Bart's initial set of 
questions.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-07-30 15:10                                                   ` Alan Stern
@ 2020-08-04  9:39                                                     ` Martin Kepplinger
  2020-08-07  9:51                                                       ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-04  9:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 30.07.20 17:10, Alan Stern wrote:
> On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote:
>> Maybe I should just start a new discussion with a patch, but the below
>> is what makes sense to me (when I understand you correctly) and seems to
>> work. I basically add a new flag, so that the old flags behave unchanged
>> and only call it during *runtime* resume for SD cards:
>>
>>
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>                  * information that we should pass up to the upper-level
>> driver
>>                  * so that we can deal with it there.
>>                  */
>> -               if (scmd->device->expecting_cc_ua) {
>> +               if (scmd->device->expecting_cc_ua ||
>> +                   scmd->device->expecting_media_change) {
>>                         /*
>>                          * Because some device does not queue unit
>>                          * attentions correctly, we carefully check
>>                          * additional sense code and qualifier so as
>> -                        * not to squash media change unit attention.
>> +                        * not to squash media change unit attention;
>> +                        * unless expecting_media_change is set, indicating
>> +                        * that the media (most likely) didn't change
>> +                        * but a device only believes so (for example
>> +                        * because of suspend/resume).
>>                          */
>> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
>> -                               scmd->device->expecting_cc_ua = 0;
>> +                       if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) ||
>> +                           scmd->device->expecting_media_change) {
>> +                               scmd->device->expecting_media_change = 0;
>>                                 return NEEDS_RETRY;
>>                         }
>>                 }
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..b647fab2b663 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *);
>>  static int sd_suspend_system(struct device *);
>>  static int sd_suspend_runtime(struct device *);
>>  static int sd_resume(struct device *);
>> +static int sd_resume_runtime(struct device *);
>>  static void sd_rescan(struct device *);
>>  static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt);
>>  static void sd_uninit_command(struct scsi_cmnd *SCpnt);
>> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = {
>>         .poweroff               = sd_suspend_system,
>>         .restore                = sd_resume,
>>         .runtime_suspend        = sd_suspend_runtime,
>> -       .runtime_resume         = sd_resume,
>> +       .runtime_resume         = sd_resume_runtime,
>>  };
>>
>>  static struct scsi_driver sd_template = {
>> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev)
>>         return ret;
>>  }
>>
>> +static int sd_resume_runtime(struct device *dev)
>> +{
>> +       struct scsi_disk *sdkp = dev_get_drvdata(dev);
>> +
>> +       /* Some SD cardreaders report media change when resuming from
>> suspend
>> +        * because they can't keep track during suspend. */
>> +
>> +       /* XXX This is not unproblematic though: We won't notice when a card
>> +        * was really changed during runtime suspend! We basically rely
>> on users
>> +        * to unmount or suspend before doing so. */
>> +       sdkp->device->expecting_media_change = 1;
>> +
>> +       return sd_resume(dev);
>> +}
>> +
>>  /**
>>   *     init_sd - entry point for this driver (both when built in or when
>>   *     a module).
>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>> index bc5909033d13..8c8f053f71c8 100644
>> --- a/include/scsi/scsi_device.h
>> +++ b/include/scsi/scsi_device.h
>> @@ -169,6 +169,8 @@ struct scsi_device {
>>                                  * this device */
>>         unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>>                                      * because we did a bus reset. */
>> +       unsigned expecting_media_change:1; /* Expecting media change
>> ASC/ASCQ
>> +                                             when it actually doesn't
>> change */
>>         unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>>         unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>>         unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
> 
> That's pretty much what James was suggesting, except for one thing: You 
> must not set sdkp->device->expecting_media_change to 1 for all devices 
> in sd_runtime_resume().  Only for devices which may generate a spurious 
> Unit Attention following runtime resume -- and maybe not even for all of 
> them, depending on what the user wants.
> 
> Alan Stern
> 

when I mount the SD card myself or via Nautilus, things work. When I put
sth like:

/dev/sda1 /mnt/sda1 auto auto,nofail 0 2

into fstab, I *still* get (constantly) when accessing the files:

[   50.838061] sd 0:0:0:0: [sda] tag#0 device offline or changed

why could that be? is there another place we would add such a new flag
(not only resume())?

                              martin

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-04  9:39                                                     ` Martin Kepplinger
@ 2020-08-07  9:51                                                       ` Martin Kepplinger
  2020-08-07 14:30                                                         ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-07  9:51 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 04.08.20 11:39, Martin Kepplinger wrote:
> On 30.07.20 17:10, Alan Stern wrote:
>> On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote:
>>> Maybe I should just start a new discussion with a patch, but the below
>>> is what makes sense to me (when I understand you correctly) and seems to
>>> work. I basically add a new flag, so that the old flags behave unchanged
>>> and only call it during *runtime* resume for SD cards:
>>>
>>>
>>> --- a/drivers/scsi/scsi_error.c
>>> +++ b/drivers/scsi/scsi_error.c
>>> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>>                  * information that we should pass up to the upper-level
>>> driver
>>>                  * so that we can deal with it there.
>>>                  */
>>> -               if (scmd->device->expecting_cc_ua) {
>>> +               if (scmd->device->expecting_cc_ua ||
>>> +                   scmd->device->expecting_media_change) {
>>>                         /*
>>>                          * Because some device does not queue unit
>>>                          * attentions correctly, we carefully check
>>>                          * additional sense code and qualifier so as
>>> -                        * not to squash media change unit attention.
>>> +                        * not to squash media change unit attention;
>>> +                        * unless expecting_media_change is set, indicating
>>> +                        * that the media (most likely) didn't change
>>> +                        * but a device only believes so (for example
>>> +                        * because of suspend/resume).
>>>                          */
>>> -                       if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
>>> -                               scmd->device->expecting_cc_ua = 0;
>>> +                       if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) ||
>>> +                           scmd->device->expecting_media_change) {
>>> +                               scmd->device->expecting_media_change = 0;
>>>                                 return NEEDS_RETRY;
>>>                         }
>>>                 }
>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>> index d90fefffe31b..b647fab2b663 100644
>>> --- a/drivers/scsi/sd.c
>>> +++ b/drivers/scsi/sd.c
>>> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *);
>>>  static int sd_suspend_system(struct device *);
>>>  static int sd_suspend_runtime(struct device *);
>>>  static int sd_resume(struct device *);
>>> +static int sd_resume_runtime(struct device *);
>>>  static void sd_rescan(struct device *);
>>>  static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt);
>>>  static void sd_uninit_command(struct scsi_cmnd *SCpnt);
>>> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = {
>>>         .poweroff               = sd_suspend_system,
>>>         .restore                = sd_resume,
>>>         .runtime_suspend        = sd_suspend_runtime,
>>> -       .runtime_resume         = sd_resume,
>>> +       .runtime_resume         = sd_resume_runtime,
>>>  };
>>>
>>>  static struct scsi_driver sd_template = {
>>> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev)
>>>         return ret;
>>>  }
>>>
>>> +static int sd_resume_runtime(struct device *dev)
>>> +{
>>> +       struct scsi_disk *sdkp = dev_get_drvdata(dev);
>>> +
>>> +       /* Some SD cardreaders report media change when resuming from
>>> suspend
>>> +        * because they can't keep track during suspend. */
>>> +
>>> +       /* XXX This is not unproblematic though: We won't notice when a card
>>> +        * was really changed during runtime suspend! We basically rely
>>> on users
>>> +        * to unmount or suspend before doing so. */
>>> +       sdkp->device->expecting_media_change = 1;
>>> +
>>> +       return sd_resume(dev);
>>> +}
>>> +
>>>  /**
>>>   *     init_sd - entry point for this driver (both when built in or when
>>>   *     a module).
>>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>>> index bc5909033d13..8c8f053f71c8 100644
>>> --- a/include/scsi/scsi_device.h
>>> +++ b/include/scsi/scsi_device.h
>>> @@ -169,6 +169,8 @@ struct scsi_device {
>>>                                  * this device */
>>>         unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>>>                                      * because we did a bus reset. */
>>> +       unsigned expecting_media_change:1; /* Expecting media change
>>> ASC/ASCQ
>>> +                                             when it actually doesn't
>>> change */
>>>         unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>>>         unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>>>         unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
>>
>> That's pretty much what James was suggesting, except for one thing: You 
>> must not set sdkp->device->expecting_media_change to 1 for all devices 
>> in sd_runtime_resume().  Only for devices which may generate a spurious 
>> Unit Attention following runtime resume -- and maybe not even for all of 
>> them, depending on what the user wants.
>>
>> Alan Stern
>>
> 
> when I mount the SD card myself or via Nautilus, things work. When I put
> sth like:
> 
> /dev/sda1 /mnt/sda1 auto auto,nofail 0 2
> 
> into fstab, I *still* get (constantly) when accessing the files:
> 
> [   50.838061] sd 0:0:0:0: [sda] tag#0 device offline or changed
> 
> why could that be? is there another place we would add such a new flag
> (not only resume())?
> 
>                               martin
> 


it's really strange: below is the change I'm trying. Of course that's
only for testing the functionality, nothing how a patch could look like.

While I remember it had worked, now (weirdly since I tried that mounting
via fstab) it doesn't anymore!

What I understand (not much): I handle the error with "retry" via the
new flag, but scsi_decide_disposition() returns SUCCESS because of "no
more retries"; but it's the first and only time it's called.

How can this be? What am I missing?



--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
 				return NEEDS_RETRY;
 			}
 		}
+		if (scmd->device->expecting_media_change) {
+			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
+				scmd->device->expecting_media_change = 0;
+				return NEEDS_RETRY;
+			}
+		}
+
 		/*
 		 * we might also expect a cc/ua if another LUN on the target
 		 * reported a UA with an ASC/ASCQ of 3F 0E -
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..bb583e403b81 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;

+	sdkp->device->expecting_media_change = 1;
+
 	if (!sdkp->device->manage_start_stop)
 		return 0;

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index bc5909033d13..f5fc1af68e00 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -169,6 +169,7 @@ struct scsi_device {
 				 * this device */
 	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
 				     * because we did a bus reset. */
+	unsigned expecting_media_change:1;
 	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
 	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
 	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-07  9:51                                                       ` Martin Kepplinger
@ 2020-08-07 14:30                                                         ` Alan Stern
  2020-08-08  6:59                                                           ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-07 14:30 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote:
> it's really strange: below is the change I'm trying. Of course that's
> only for testing the functionality, nothing how a patch could look like.
> 
> While I remember it had worked, now (weirdly since I tried that mounting
> via fstab) it doesn't anymore!
> 
> What I understand (not much): I handle the error with "retry" via the
> new flag, but scsi_decide_disposition() returns SUCCESS because of "no
> more retries"; but it's the first and only time it's called.

Are you saying that scmd->allowed is set to 0?  Or is scsi_notry_cmd() 
returning a nonzero value?  Whichever is true, why does it happen that 
way?

What is the failing command?  Is it a READ(10)?

> How can this be? What am I missing?

It's kind of hard to tell without seeing the error messages or system 
log or any debugging information.

Alan Stern

> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>  				return NEEDS_RETRY;
>  			}
>  		}
> +		if (scmd->device->expecting_media_change) {
> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
> +				scmd->device->expecting_media_change = 0;
> +				return NEEDS_RETRY;
> +			}
> +		}
> +
>  		/*
>  		 * we might also expect a cc/ua if another LUN on the target
>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index d90fefffe31b..bb583e403b81 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
>  	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
>  		return 0;
> 
> +	sdkp->device->expecting_media_change = 1;
> +
>  	if (!sdkp->device->manage_start_stop)
>  		return 0;
> 
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index bc5909033d13..f5fc1af68e00 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -169,6 +169,7 @@ struct scsi_device {
>  				 * this device */
>  	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>  				     * because we did a bus reset. */
> +	unsigned expecting_media_change:1;
>  	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>  	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>  	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-07 14:30                                                         ` Alan Stern
@ 2020-08-08  6:59                                                           ` Martin Kepplinger
  2020-08-08 15:05                                                             ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-08  6:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 07.08.20 16:30, Alan Stern wrote:
> On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote:
>> it's really strange: below is the change I'm trying. Of course that's
>> only for testing the functionality, nothing how a patch could look like.
>>
>> While I remember it had worked, now (weirdly since I tried that mounting
>> via fstab) it doesn't anymore!
>>
>> What I understand (not much): I handle the error with "retry" via the
>> new flag, but scsi_decide_disposition() returns SUCCESS because of "no
>> more retries"; but it's the first and only time it's called.
> 
> Are you saying that scmd->allowed is set to 0?  Or is scsi_notry_cmd() 
> returning a nonzero value?  Whichever is true, why does it happen that 
> way?

scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed).

why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if
(status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true,
then at the end it returns 1 because of REQ_FAILFAST_DEV.

that seems to come from the block layer. why and when? could I change
that so that the scsi error handling stays in control?

> 
> What is the failing command?  Is it a READ(10)?

Not sure how I'd answer that, but here's the test to trigger the error:

mount /dev/sda1 /mnt
cd /mnt
ls
cp file ~/ (if ls "works" and doesn't yet trigger the error)

and that's the (familiar looking) logs when doing so. again: despite the
mentioned workaround in scsi_error and the new expected_media_change
flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems
to override what I want to do is scsi_error:

[   55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
hostbyte=0x00 driverbyte=0x08 cmd_age=0s
[   55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
[   55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
[   55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc
e0 00 00 01 00
[   55.557666] blk_update_request: I/O error, dev sda, sector 589024 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   55.574691] blk_update_request: I/O error, dev sda, sector 589025 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   55.591562] blk_update_request: I/O error, dev sda, sector 589026 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed
(... goes on with the same)


> 
>> How can this be? What am I missing?
> 
> It's kind of hard to tell without seeing the error messages or system 
> log or any debugging information.

thanks a lot for getting back to me,

                           martin


> 
> Alan Stern
> 
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>  				return NEEDS_RETRY;
>>  			}
>>  		}
>> +		if (scmd->device->expecting_media_change) {
>> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +				scmd->device->expecting_media_change = 0;
>> +				return NEEDS_RETRY;
>> +			}
>> +		}
>> +
>>  		/*
>>  		 * we might also expect a cc/ua if another LUN on the target
>>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..bb583e403b81 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
>>  	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
>>  		return 0;
>>
>> +	sdkp->device->expecting_media_change = 1;
>> +
>>  	if (!sdkp->device->manage_start_stop)
>>  		return 0;
>>
>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>> index bc5909033d13..f5fc1af68e00 100644
>> --- a/include/scsi/scsi_device.h
>> +++ b/include/scsi/scsi_device.h
>> @@ -169,6 +169,7 @@ struct scsi_device {
>>  				 * this device */
>>  	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>>  				     * because we did a bus reset. */
>> +	unsigned expecting_media_change:1;
>>  	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>>  	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>>  	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-08  6:59                                                           ` Martin Kepplinger
@ 2020-08-08 15:05                                                             ` Alan Stern
  2020-08-09  9:20                                                               ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-08 15:05 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Sat, Aug 08, 2020 at 08:59:09AM +0200, Martin Kepplinger wrote:
> On 07.08.20 16:30, Alan Stern wrote:
> > On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote:
> >> it's really strange: below is the change I'm trying. Of course that's
> >> only for testing the functionality, nothing how a patch could look like.
> >>
> >> While I remember it had worked, now (weirdly since I tried that mounting
> >> via fstab) it doesn't anymore!
> >>
> >> What I understand (not much): I handle the error with "retry" via the
> >> new flag, but scsi_decide_disposition() returns SUCCESS because of "no
> >> more retries"; but it's the first and only time it's called.
> > 
> > Are you saying that scmd->allowed is set to 0?  Or is scsi_notry_cmd() 
> > returning a nonzero value?  Whichever is true, why does it happen that 
> > way?
> 
> scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed).
> 
> why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if
> (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true,
> then at the end it returns 1 because of REQ_FAILFAST_DEV.
> 
> that seems to come from the block layer. why and when? could I change
> that so that the scsi error handling stays in control?

The only place I see where that flag might get set is in 
blk_mq_bio_to_request() in block/blk-mq.c, which does:

	if (bio->bi_opf & REQ_RAHEAD)
		rq->cmd_flags |= REQ_FAILFAST_MASK;

So apparently read-ahead reads are supposed to fail fast (i.e., without 
retries), presumably because they are optional after all.

> > What is the failing command?  Is it a READ(10)?
> 
> Not sure how I'd answer that, but here's the test to trigger the error:
> 
> mount /dev/sda1 /mnt
> cd /mnt
> ls
> cp file ~/ (if ls "works" and doesn't yet trigger the error)
> 
> and that's the (familiar looking) logs when doing so. again: despite the
> mentioned workaround in scsi_error and the new expected_media_change
> flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems
> to override what I want to do is scsi_error:
> 
> [   55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
> [   55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
> [   55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
> [   55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc
> e0 00 00 01 00

Yes, 0x28 is READ(10).  Likely this is a read-ahead request, although I 
don't know how we can tell for sure.

> [   55.557666] blk_update_request: I/O error, dev sda, sector 589024 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   55.574691] blk_update_request: I/O error, dev sda, sector 589025 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed
> [   55.591562] blk_update_request: I/O error, dev sda, sector 589026 op
> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [   55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed
> (... goes on with the same)

Is such a drastic response really appropriate for the failure of a 
read-ahead request?  It seems like a more logical response would be to 
let the request fail but keep the device online.

Of course, that would only solve part of your problem -- your log would 
still get filled with those "I/O error" messages even though they 
wouldn't be fatal.  Probably a better approach would be to make the new 
expecting_media_change flag override scsi_no_retry_cmd().

But this is not my area of expertise.  Maybe someone else will have more 
to say.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-08 15:05                                                             ` Alan Stern
@ 2020-08-09  9:20                                                               ` Martin Kepplinger
  2020-08-09 15:26                                                                 ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-09  9:20 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 08.08.20 17:05, Alan Stern wrote:
> On Sat, Aug 08, 2020 at 08:59:09AM +0200, Martin Kepplinger wrote:
>> On 07.08.20 16:30, Alan Stern wrote:
>>> On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote:
>>>> it's really strange: below is the change I'm trying. Of course that's
>>>> only for testing the functionality, nothing how a patch could look like.
>>>>
>>>> While I remember it had worked, now (weirdly since I tried that mounting
>>>> via fstab) it doesn't anymore!
>>>>
>>>> What I understand (not much): I handle the error with "retry" via the
>>>> new flag, but scsi_decide_disposition() returns SUCCESS because of "no
>>>> more retries"; but it's the first and only time it's called.
>>>
>>> Are you saying that scmd->allowed is set to 0?  Or is scsi_notry_cmd() 
>>> returning a nonzero value?  Whichever is true, why does it happen that 
>>> way?
>>
>> scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed).
>>
>> why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if
>> (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true,
>> then at the end it returns 1 because of REQ_FAILFAST_DEV.
>>
>> that seems to come from the block layer. why and when? could I change
>> that so that the scsi error handling stays in control?
> 
> The only place I see where that flag might get set is in 
> blk_mq_bio_to_request() in block/blk-mq.c, which does:
> 
> 	if (bio->bi_opf & REQ_RAHEAD)
> 		rq->cmd_flags |= REQ_FAILFAST_MASK;
> 
> So apparently read-ahead reads are supposed to fail fast (i.e., without 
> retries), presumably because they are optional after all.
> 
>>> What is the failing command?  Is it a READ(10)?
>>
>> Not sure how I'd answer that, but here's the test to trigger the error:
>>
>> mount /dev/sda1 /mnt
>> cd /mnt
>> ls
>> cp file ~/ (if ls "works" and doesn't yet trigger the error)
>>
>> and that's the (familiar looking) logs when doing so. again: despite the
>> mentioned workaround in scsi_error and the new expected_media_change
>> flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems
>> to override what I want to do is scsi_error:
>>
>> [   55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>> [   55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>> [   55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>> [   55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc
>> e0 00 00 01 00
> 
> Yes, 0x28 is READ(10).  Likely this is a read-ahead request, although I 
> don't know how we can tell for sure.
> 
>> [   55.557666] blk_update_request: I/O error, dev sda, sector 589024 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   55.574691] blk_update_request: I/O error, dev sda, sector 589025 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> [   55.591562] blk_update_request: I/O error, dev sda, sector 589026 op
>> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed
>> (... goes on with the same)
> 
> Is such a drastic response really appropriate for the failure of a 
> read-ahead request?  It seems like a more logical response would be to 
> let the request fail but keep the device online.
> 
> Of course, that would only solve part of your problem -- your log would 
> still get filled with those "I/O error" messages even though they 
> wouldn't be fatal.  Probably a better approach would be to make the new 
> expecting_media_change flag override scsi_no_retry_cmd().
> 
> But this is not my area of expertise.  Maybe someone else will have more 
> to say.
> 
> Alan Stern
> 

Hey Alan, I'm really glad for that, I suspected some of this but I have
little experience in scsi/block layers, so that is super helpful.

I'd appreciate an opinion on the below workaround that *seems* to work
now (let's see, I had thought so before :)

Whether or not this helps to find a real solution, let's see. But
integration of such a flag in the error handling paths is what's
interesting for now:


--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
 				return NEEDS_RETRY;
 			}
 		}
+		if (scmd->device->expecting_media_change) {
+			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
+				/* clear expecting_media_change in
+				 * scsi_noretry_cmd() because we need
+				 * to override possible "failfast" overrides
+				 * that block readahead can cause.
+				 */
+				return NEEDS_RETRY;
+			}
+		}
+
 		/*
 		 * we might also expect a cc/ua if another LUN on the target
 		 * reported a UA with an ASC/ASCQ of 3F 0E -
@@ -1744,6 +1755,15 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 		return (scmd->request->cmd_flags & REQ_FAILFAST_DRIVER);
 	}

+	/*
+	 * We need to have retries when expecting_media_change is set.
+	 * So we need to return success and clear the flag here.
+	 */
+	if (scmd->device->expecting_media_change) {
+		scmd->device->expecting_media_change = 0;
+		return 0;
+	}
+
 	if (status_byte(scmd->result) != CHECK_CONDITION)
 		return 0;

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..bb583e403b81 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;

+	sdkp->device->expecting_media_change = 1;
+
 	if (!sdkp->device->manage_start_stop)
 		return 0;

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index bc5909033d13..f5fc1af68e00 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -169,6 +169,7 @@ struct scsi_device {
 				 * this device */
 	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
 				     * because we did a bus reset. */
+	unsigned expecting_media_change:1;
 	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
 	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
 	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
--

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-09  9:20                                                               ` Martin Kepplinger
@ 2020-08-09 15:26                                                                 ` Alan Stern
  2020-08-10 12:03                                                                   ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-09 15:26 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Sun, Aug 09, 2020 at 11:20:22AM +0200, Martin Kepplinger wrote:
> Hey Alan, I'm really glad for that, I suspected some of this but I have
> little experience in scsi/block layers, so that is super helpful.
> 
> I'd appreciate an opinion on the below workaround that *seems* to work
> now (let's see, I had thought so before :)
> 
> Whether or not this helps to find a real solution, let's see. But
> integration of such a flag in the error handling paths is what's
> interesting for now:
> 
> 
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>  				return NEEDS_RETRY;
>  			}
>  		}
> +		if (scmd->device->expecting_media_change) {
> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
> +				/* clear expecting_media_change in
> +				 * scsi_noretry_cmd() because we need
> +				 * to override possible "failfast" overrides
> +				 * that block readahead can cause.
> +				 */
> +				return NEEDS_RETRY;

This is a somewhat fragile approach.  You don't know for certain that 
scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
from other places.

It would be better to clear the expecting_media_change flag just before 
returning from scsi_decide_disposition.  That way its use is localized 
to one routine, not spread out between two.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-09 15:26                                                                 ` Alan Stern
@ 2020-08-10 12:03                                                                   ` Martin Kepplinger
  2020-08-10 14:13                                                                     ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-10 12:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 09.08.20 17:26, Alan Stern wrote:
> On Sun, Aug 09, 2020 at 11:20:22AM +0200, Martin Kepplinger wrote:
>> Hey Alan, I'm really glad for that, I suspected some of this but I have
>> little experience in scsi/block layers, so that is super helpful.
>>
>> I'd appreciate an opinion on the below workaround that *seems* to work
>> now (let's see, I had thought so before :)
>>
>> Whether or not this helps to find a real solution, let's see. But
>> integration of such a flag in the error handling paths is what's
>> interesting for now:
>>
>>
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>  				return NEEDS_RETRY;
>>  			}
>>  		}
>> +		if (scmd->device->expecting_media_change) {
>> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +				/* clear expecting_media_change in
>> +				 * scsi_noretry_cmd() because we need
>> +				 * to override possible "failfast" overrides
>> +				 * that block readahead can cause.
>> +				 */
>> +				return NEEDS_RETRY;
> 
> This is a somewhat fragile approach.  You don't know for certain that 
> scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
> from other places.
> 
> It would be better to clear the expecting_media_change flag just before 
> returning from scsi_decide_disposition.  That way its use is localized 
> to one routine, not spread out between two.
> 
> Alan Stern
> 

Hi Alan,

maybe you're right. I initially just thought that I'd allow for specific
error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY,
ERROR) despite having the flag set.

The below version works equally fine for me but I'm not sure if it's
actually more safe.

James, when exposing a new writable sysfs option like
"suspend_no_media_change"(?) that drivers can check before setting the
new "expecting_media_change" flag (during resume), would this addition
make sense to you?

thanks,

                      martin



--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
 				return NEEDS_RETRY;
 			}
 		}
+		if (scmd->device->expecting_media_change) {
+			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
+				/*
+				 * clear the expecting_media_change in
+				 * scsi_decide_disposition() because we
+				 * need to catch possible "fail fast" overrides
+				 * that block readahead can cause.
+				 */
+				return NEEDS_RETRY;
+			}
+		}
+
 		/*
 		 * we might also expect a cc/ua if another LUN on the target
 		 * reported a UA with an ASC/ASCQ of 3F 0E -
@@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	 * the request was not marked fast fail.  Note that above,
 	 * even if the request is marked fast fail, we still requeue
 	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
-	if ((++scmd->retries) <= scmd->allowed
-	    && !scsi_noretry_cmd(scmd)) {
-		return NEEDS_RETRY;
+	if ((++scmd->retries) <= scmd->allowed) {
+		/*
+		 * but scsi_noretry_cmd() cannot override the
+		 * expecting_media_change flag.
+		 */
+		if (!scsi_noretry_cmd(scmd) ||
+		    scmd->device->expecting_media_change) {
+			scmd->device->expecting_media_change = 0;
+			return NEEDS_RETRY;
+		} else {
+			/* marked fast fail and not expected. */
+			return SUCCESS;
+		}
 	} else {
 		/*
 		 * no more retries - report this one back to upper level.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..bb583e403b81 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;

+	sdkp->device->expecting_media_change = 1;
+
 	if (!sdkp->device->manage_start_stop)
 		return 0;

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index bc5909033d13..f5fc1af68e00 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -169,6 +169,7 @@ struct scsi_device {
 				 * this device */
 	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
 				     * because we did a bus reset. */
+	unsigned expecting_media_change:1;
 	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
 	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
 	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
-- 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-10 12:03                                                                   ` Martin Kepplinger
@ 2020-08-10 14:13                                                                     ` Alan Stern
  2020-08-11  7:55                                                                       ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-10 14:13 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote:
> On 09.08.20 17:26, Alan Stern wrote:
> > This is a somewhat fragile approach.  You don't know for certain that 
> > scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
> > from other places.
> > 
> > It would be better to clear the expecting_media_change flag just before 
> > returning from scsi_decide_disposition.  That way its use is localized 
> > to one routine, not spread out between two.
> > 
> > Alan Stern
> > 
> 
> Hi Alan,
> 
> maybe you're right. I initially just thought that I'd allow for specific
> error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY,
> ERROR) despite having the flag set.
> 
> The below version works equally fine for me but I'm not sure if it's
> actually more safe.
> 
> James, when exposing a new writable sysfs option like
> "suspend_no_media_change"(?) that drivers can check before setting the
> new "expecting_media_change" flag (during resume), would this addition
> make sense to you?
> 
> thanks,
> 
>                       martin
> 
> 
> 
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>  				return NEEDS_RETRY;
>  			}
>  		}
> +		if (scmd->device->expecting_media_change) {
> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
> +				/*
> +				 * clear the expecting_media_change in
> +				 * scsi_decide_disposition() because we
> +				 * need to catch possible "fail fast" overrides
> +				 * that block readahead can cause.
> +				 */
> +				return NEEDS_RETRY;
> +			}
> +		}
> +
>  		/*
>  		 * we might also expect a cc/ua if another LUN on the target
>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
> @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
>  	 * the request was not marked fast fail.  Note that above,
>  	 * even if the request is marked fast fail, we still requeue
>  	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
> -	if ((++scmd->retries) <= scmd->allowed
> -	    && !scsi_noretry_cmd(scmd)) {
> -		return NEEDS_RETRY;
> +	if ((++scmd->retries) <= scmd->allowed) {
> +		/*
> +		 * but scsi_noretry_cmd() cannot override the
> +		 * expecting_media_change flag.
> +		 */
> +		if (!scsi_noretry_cmd(scmd) ||
> +		    scmd->device->expecting_media_change) {
> +			scmd->device->expecting_media_change = 0;
> +			return NEEDS_RETRY;
> +		} else {
> +			/* marked fast fail and not expected. */
> +			return SUCCESS;
> +		}
>  	} else {

This may not matter...  but it's worth pointing out that 
expecting_media_change doesn't get cleared if ++scmd->retries > 
scmd->allowed.

It also doesn't get cleared in cases where the device _doesn't_ 
report a Unit Attention.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-10 14:13                                                                     ` Alan Stern
@ 2020-08-11  7:55                                                                       ` Martin Kepplinger
  2020-08-11 13:48                                                                         ` Alan Stern
  2020-08-23 14:57                                                                         ` [PATCH] block: Fix bug in runtime-resume handling Alan Stern
  0 siblings, 2 replies; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-11  7:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On 10.08.20 16:13, Alan Stern wrote:
> On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote:
>> On 09.08.20 17:26, Alan Stern wrote:
>>> This is a somewhat fragile approach.  You don't know for certain that 
>>> scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
>>> from other places.
>>>
>>> It would be better to clear the expecting_media_change flag just before 
>>> returning from scsi_decide_disposition.  That way its use is localized 
>>> to one routine, not spread out between two.
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> maybe you're right. I initially just thought that I'd allow for specific
>> error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY,
>> ERROR) despite having the flag set.
>>
>> The below version works equally fine for me but I'm not sure if it's
>> actually more safe.
>>
>> James, when exposing a new writable sysfs option like
>> "suspend_no_media_change"(?) that drivers can check before setting the
>> new "expecting_media_change" flag (during resume), would this addition
>> make sense to you?
>>
>> thanks,
>>
>>                       martin
>>
>>
>>
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>  				return NEEDS_RETRY;
>>  			}
>>  		}
>> +		if (scmd->device->expecting_media_change) {
>> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +				/*
>> +				 * clear the expecting_media_change in
>> +				 * scsi_decide_disposition() because we
>> +				 * need to catch possible "fail fast" overrides
>> +				 * that block readahead can cause.
>> +				 */
>> +				return NEEDS_RETRY;
>> +			}
>> +		}
>> +
>>  		/*
>>  		 * we might also expect a cc/ua if another LUN on the target
>>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
>> @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
>>  	 * the request was not marked fast fail.  Note that above,
>>  	 * even if the request is marked fast fail, we still requeue
>>  	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
>> -	if ((++scmd->retries) <= scmd->allowed
>> -	    && !scsi_noretry_cmd(scmd)) {
>> -		return NEEDS_RETRY;
>> +	if ((++scmd->retries) <= scmd->allowed) {
>> +		/*
>> +		 * but scsi_noretry_cmd() cannot override the
>> +		 * expecting_media_change flag.
>> +		 */
>> +		if (!scsi_noretry_cmd(scmd) ||
>> +		    scmd->device->expecting_media_change) {
>> +			scmd->device->expecting_media_change = 0;
>> +			return NEEDS_RETRY;
>> +		} else {
>> +			/* marked fast fail and not expected. */
>> +			return SUCCESS;
>> +		}
>>  	} else {
> 
> This may not matter...  but it's worth pointing out that 
> expecting_media_change doesn't get cleared if ++scmd->retries > 
> scmd->allowed.

absolutely worth pointing out and I'm not yet sure about that one.

> 
> It also doesn't get cleared in cases where the device _doesn't_ 
> report a Unit Attention.

true. but don't we set the flag for a future UA we don't yet know of? If
we would want to clear it outside of a UA, I think we'd need to keep
track of a suspend/resume cycle and if we see that we *had* successfully
"done requests" after resuming, we could clear it...

> 
> Alan Stern
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] scsi: sd: add runtime pm to open / release
  2020-08-11  7:55                                                                       ` Martin Kepplinger
@ 2020-08-11 13:48                                                                         ` Alan Stern
  2020-08-23 14:57                                                                         ` [PATCH] block: Fix bug in runtime-resume handling Alan Stern
  1 sibling, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-08-11 13:48 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: James Bottomley, Bart Van Assche, Can Guo, martin.petersen,
	linux-scsi, linux-kernel, kernel

On Tue, Aug 11, 2020 at 09:55:54AM +0200, Martin Kepplinger wrote:
> On 10.08.20 16:13, Alan Stern wrote:
> > This may not matter...  but it's worth pointing out that 
> > expecting_media_change doesn't get cleared if ++scmd->retries > 
> > scmd->allowed.
> 
> absolutely worth pointing out and I'm not yet sure about that one.
> 
> > 
> > It also doesn't get cleared in cases where the device _doesn't_ 
> > report a Unit Attention.
> 
> true. but don't we set the flag for a future UA we don't yet know of? If
> we would want to clear it outside of a UA, I think we'd need to keep
> track of a suspend/resume cycle and if we see that we *had* successfully
> "done requests" after resuming, we could clear it...

The point is that expecting_media_change should apply only to the _next_ 
command.  It should be cleared after _every_ command.  You do not want 
to leave it hanging around -- if you do then you will miss real media 
changes.

There's a potential issue when a bunch of commands get sent and 
completed all at once, but hopefully it won't matter here.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH] block: Fix bug in runtime-resume handling
  2020-08-11  7:55                                                                       ` Martin Kepplinger
  2020-08-11 13:48                                                                         ` Alan Stern
@ 2020-08-23 14:57                                                                         ` Alan Stern
  2020-08-24 17:48                                                                           ` Bart Van Assche
  1 sibling, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-23 14:57 UTC (permalink / raw)
  To: Jens Axboe, Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

Runtime power-management support in the block layer has somehow gotten
messed up in the past few years.  This area of code has been through a
lot of churn and it's not easy to track exactly when the bug addressed
here was introduced; it was present for a while, then fixed for a
while, and then it returned.

At any rate, the problem is that the block layer code thinks that it's
okay to issue a request with the BLK_MQ_REQ_PREEMPT flag set at any
time, even when the queue and underlying device are runtime suspended.
This belief is wrong; the flag merely indicates that it's okay to
issue the request while the queue and device are in the process of
suspending or resuming.  When they are actually suspended, no requests
may be issued.

The symptom of this bug is that a runtime-suspended block device
doesn't resume as it should.  The request which should cause a runtime
resume instead gets issued directly, without resuming the device
first.  Of course the device can't handle it properly, the I/O
fails, and the device remains suspended.

The problem is fixed by checking that the queue's runtime-PM status
isn't RPM_SUSPENDED before allowing a request to be issued, and
queuing a runtime-resume request if it is.  In particular, the inline
blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and
the code is unified by merging the surrounding checks into the
routine.  If the queue isn't set up for runtime PM, or there currently
is no restriction on allowed requests, the request is allowed.
Likewise if the BLK_MQ_REQ_PREEMPT flag is set and the status isn't
RPM_SUSPENDED.  Otherwise a runtime resume is queued and the request
is blocked until conditions are more suitable.

Reported-and-tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Bart Van Assche <bvanassche@acm.org>
CC: <stable@vger.kernel.org> # 5.2

---

The bug goes back way before 5.2, but other changes will prevent the
patch from applying directly to earlier kernels, so I'm limiting the 
@stable updates to 5.2 and after.


[as1941]


 block/blk-core.c |    6 +++---
 block/blk-pm.h   |   14 +++++++++-----
 2 files changed, 12 insertions(+), 8 deletions(-)

Index: usb-devel/block/blk-core.c
===================================================================
--- usb-devel.orig/block/blk-core.c
+++ usb-devel/block/blk-core.c
@@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
 			 * responsible for ensuring that that counter is
 			 * globally visible before the queue is unfrozen.
 			 */
-			if (pm || !blk_queue_pm_only(q)) {
+			if ((pm && q->rpm_status != RPM_SUSPENDED) ||
+			    !blk_queue_pm_only(q)) {
 				success = true;
 			} else {
 				percpu_ref_put(&q->q_usage_counter);
@@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
 
 		wait_event(q->mq_freeze_wq,
 			   (!q->mq_freeze_depth &&
-			    (pm || (blk_pm_request_resume(q),
-				    !blk_queue_pm_only(q)))) ||
+			    blk_pm_resume_queue(pm, q)) ||
 			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
Index: usb-devel/block/blk-pm.h
===================================================================
--- usb-devel.orig/block/blk-pm.h
+++ usb-devel/block/blk-pm.h
@@ -6,11 +6,14 @@
 #include <linux/pm_runtime.h>
 
 #ifdef CONFIG_PM
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
 {
-	if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
-		       q->rpm_status == RPM_SUSPENDING))
-		pm_request_resume(q->dev);
+	if (!q->dev || !blk_queue_pm_only(q))
+		return 1;	/* Nothing to do */
+	if (pm && q->rpm_status != RPM_SUSPENDED)
+		return 1;	/* Request allowed */
+	pm_request_resume(q->dev);
+	return 0;
 }
 
 static inline void blk_pm_mark_last_busy(struct request *rq)
@@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st
 		--rq->q->nr_pending;
 }
 #else
-static inline void blk_pm_request_resume(struct request_queue *q)
+static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
 {
+	return 1;
 }
 
 static inline void blk_pm_mark_last_busy(struct request *rq)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-23 14:57                                                                         ` [PATCH] block: Fix bug in runtime-resume handling Alan Stern
@ 2020-08-24 17:48                                                                           ` Bart Van Assche
  2020-08-24 20:13                                                                             ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-08-24 17:48 UTC (permalink / raw)
  To: Alan Stern, Jens Axboe, Martin Kepplinger
  Cc: Can Guo, linux-scsi, linux-block, kernel

On 2020-08-23 07:57, Alan Stern wrote:
> The problem is fixed by checking that the queue's runtime-PM status
> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Can, can you help with testing this patch?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-24 17:48                                                                           ` Bart Van Assche
@ 2020-08-24 20:13                                                                             ` Alan Stern
  2020-08-26  7:48                                                                               ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-24 20:13 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
> On 2020-08-23 07:57, Alan Stern wrote:
> > The problem is fixed by checking that the queue's runtime-PM status
> > isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> 
> Can, can you help with testing this patch?
> 
> Thanks,
> 
> Bart.

Martin:

(I forgot to ask this question several weeks ago, while you were running 
your tests.  Better ask it now before I forget again...)

I suspect the old runtime-PM code in the block layer would have worked 
okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not 
been set.  Do you know why the flag was set, or what line of code caused 
it to be set?

As I recall, the request that failed was a read-ahead.  It's not clear 
to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.  
Although there may be other reasons for having that flag, at this point 
we're concerned with requests that are part of the runtime-resume 
procedure.  A read-ahead does not fall into that category.

Do you think you can find the answer?  Perhaps we should add a separate 
BLK_MQ_REQ_PM flag.

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-24 20:13                                                                             ` Alan Stern
@ 2020-08-26  7:48                                                                               ` Martin Kepplinger
  2020-08-27 17:42                                                                                 ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-26  7:48 UTC (permalink / raw)
  To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On 24.08.20 22:13, Alan Stern wrote:
> On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
>> On 2020-08-23 07:57, Alan Stern wrote:
>>> The problem is fixed by checking that the queue's runtime-PM status
>>> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
>>
>> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>>
>> Can, can you help with testing this patch?
>>
>> Thanks,
>>
>> Bart.
> 
> Martin:
> 
> (I forgot to ask this question several weeks ago, while you were running 
> your tests.  Better ask it now before I forget again...)
> 
> I suspect the old runtime-PM code in the block layer would have worked 
> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not 
> been set.  Do you know why the flag was set, or what line of code caused 
> it to be set?

Correct. if not set, I could handle all I need in the scsi error path.

> 
> As I recall, the request that failed was a read-ahead.  It's not clear 
> to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.  
> Although there may be other reasons for having that flag, at this point 
> we're concerned with requests that are part of the runtime-resume 
> procedure.  A read-ahead does not fall into that category.
> 
> Do you think you can find the answer?  Perhaps we should add a separate 
> BLK_MQ_REQ_PM flag.

I guess I can but give me some time as I'm on vacations currently.

> 
> Alan Stern
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-26  7:48                                                                               ` Martin Kepplinger
@ 2020-08-27 17:42                                                                                 ` Martin Kepplinger
  2020-08-27 20:29                                                                                   ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-27 17:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On 26.08.20 09:48, Martin Kepplinger wrote:
> On 24.08.20 22:13, Alan Stern wrote:
>> On Mon, Aug 24, 2020 at 10:48:32AM -0700, Bart Van Assche wrote:
>>> On 2020-08-23 07:57, Alan Stern wrote:
>>>> The problem is fixed by checking that the queue's runtime-PM status
>>>> isn't RPM_SUSPENDED before allowing a request to be issued, [ ... ]
>>>
>>> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>>>
>>> Can, can you help with testing this patch?
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>> Martin:
>>
>> (I forgot to ask this question several weeks ago, while you were running 
>> your tests.  Better ask it now before I forget again...)
>>
>> I suspect the old runtime-PM code in the block layer would have worked 
>> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not 
>> been set.  Do you know why the flag was set, or what line of code caused 
>> it to be set?
> 
> Correct. if not set, I could handle all I need in the scsi error path.

this thread becomes a bit confusing. I thought about REQ_FAILFAST_DEV
but you're talking about something different.

the only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
__scsi_execute() which is the case when mounting/unmounting. At least
that about the only place I can find.

I remember *only* your block pm fix would let me mount/unmount, but not
use files yet (REQ_FAILFAST_DEV and so on).

When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
on to blk_get_request() in __scsi_execute(), that line gets executed
exactly once during startup and I'm missing the /dev/sda device from the
cardreader then.

Is this what you're asking?


> 
>>
>> As I recall, the request that failed was a read-ahead.  It's not clear 
>> to me why such a request would need to have BLK_MQ_REQ_PREEMPT set.  
>> Although there may be other reasons for having that flag, at this point 
>> we're concerned with requests that are part of the runtime-resume 
>> procedure.  A read-ahead does not fall into that category.
>>
>> Do you think you can find the answer?  Perhaps we should add a separate 
>> BLK_MQ_REQ_PM flag.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-27 17:42                                                                                 ` Martin Kepplinger
@ 2020-08-27 20:29                                                                                   ` Alan Stern
  2020-08-29  7:24                                                                                     ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-27 20:29 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On Thu, Aug 27, 2020 at 07:42:43PM +0200, Martin Kepplinger wrote:
> On 26.08.20 09:48, Martin Kepplinger wrote:
> > On 24.08.20 22:13, Alan Stern wrote:

> >> Martin:
> >>
> >> (I forgot to ask this question several weeks ago, while you were running 
> >> your tests.  Better ask it now before I forget again...)
> >>
> >> I suspect the old runtime-PM code in the block layer would have worked 
> >> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not 
> >> been set.  Do you know why the flag was set, or what line of code caused 
> >> it to be set?
> > 
> > Correct. if not set, I could handle all I need in the scsi error path.
> 
> this thread becomes a bit confusing. I thought about REQ_FAILFAST_DEV
> but you're talking about something different.
> 
> the only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
> __scsi_execute() which is the case when mounting/unmounting. At least
> that about the only place I can find.

Ah yes, I see what you mean.

> I remember *only* your block pm fix would let me mount/unmount, but not
> use files yet (REQ_FAILFAST_DEV and so on).
> 
> When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
> on to blk_get_request() in __scsi_execute(), that line gets executed
> exactly once during startup and I'm missing the /dev/sda device from the
> cardreader then.
> 
> Is this what you're asking?

Not quite sure, but it doesn't matter.  Removing BLK_MQ_REQ_PREEMPT in 
__scsi_execute() is probably not a safe thing to do.

Instead, look at sd_resume().  That routine calls __scsi_execute() 
indirectly through sd_start_stop_device(), and the only reason it does 
this is because the sdkp->device->manage_start_stop flag is set.  You 
ought to be able to clear this flag in sysfs, by writing to 
/sys/block/sda/device/scsi_disk/*/manage_start_stop.  If you do this 
before allowing the card reader to go into runtime suspend, does it then 
resume okay?

(Yes, I know you still won't be able to read it because of the FAILFAST 
flag.  I just want to know if the runtime resume actually takes place.)

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-27 20:29                                                                                   ` Alan Stern
@ 2020-08-29  7:24                                                                                     ` Martin Kepplinger
  2020-08-29 15:26                                                                                       ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-29  7:24 UTC (permalink / raw)
  To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On 27.08.20 22:29, Alan Stern wrote:
> On Thu, Aug 27, 2020 at 07:42:43PM +0200, Martin Kepplinger wrote:
>> On 26.08.20 09:48, Martin Kepplinger wrote:
>>> On 24.08.20 22:13, Alan Stern wrote:
> 
>>>> Martin:
>>>>
>>>> (I forgot to ask this question several weeks ago, while you were running 
>>>> your tests.  Better ask it now before I forget again...)
>>>>
>>>> I suspect the old runtime-PM code in the block layer would have worked 
>>>> okay in your SD cardreader test if the BLK_MQ_REQ_PREEMPT flag had not 
>>>> been set.  Do you know why the flag was set, or what line of code caused 
>>>> it to be set?
>>>
>>> Correct. if not set, I could handle all I need in the scsi error path.
>>
>> this thread becomes a bit confusing. I thought about REQ_FAILFAST_DEV
>> but you're talking about something different.
>>
>> the only place I see BLK_MQ_REQ_PREEMPT getting passed on is in
>> __scsi_execute() which is the case when mounting/unmounting. At least
>> that about the only place I can find.
> 
> Ah yes, I see what you mean.
> 
>> I remember *only* your block pm fix would let me mount/unmount, but not
>> use files yet (REQ_FAILFAST_DEV and so on).
>>
>> When I revert your fix and remove BLK_MQ_REQ_PREEMPT from being passed
>> on to blk_get_request() in __scsi_execute(), that line gets executed
>> exactly once during startup and I'm missing the /dev/sda device from the
>> cardreader then.
>>
>> Is this what you're asking?
> 
> Not quite sure, but it doesn't matter.  Removing BLK_MQ_REQ_PREEMPT in 
> __scsi_execute() is probably not a safe thing to do.
> 
> Instead, look at sd_resume().  That routine calls __scsi_execute() 
> indirectly through sd_start_stop_device(), and the only reason it does 
> this is because the sdkp->device->manage_start_stop flag is set.  You 
> ought to be able to clear this flag in sysfs, by writing to 
> /sys/block/sda/device/scsi_disk/*/manage_start_stop.  If you do this 
> before allowing the card reader to go into runtime suspend, does it then 
> resume okay?

manage_start_stop in sysfs is 0 here.

> 
> (Yes, I know you still won't be able to read it because of the FAILFAST 
> flag.  I just want to know if the runtime resume actually takes place.)
> 
> Alan Stern
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-29  7:24                                                                                     ` Martin Kepplinger
@ 2020-08-29 15:26                                                                                       ` Alan Stern
  2020-08-29 16:33                                                                                         ` Martin Kepplinger
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-29 15:26 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On Sat, Aug 29, 2020 at 09:24:30AM +0200, Martin Kepplinger wrote:
> On 27.08.20 22:29, Alan Stern wrote:
> > Instead, look at sd_resume().  That routine calls __scsi_execute() 
> > indirectly through sd_start_stop_device(), and the only reason it does 
> > this is because the sdkp->device->manage_start_stop flag is set.  You 
> > ought to be able to clear this flag in sysfs, by writing to 
> > /sys/block/sda/device/scsi_disk/*/manage_start_stop.  If you do this 
> > before allowing the card reader to go into runtime suspend, does it then 
> > resume okay?
> 
> manage_start_stop in sysfs is 0 here.

Hmmm.  I'm wondering about something you wrote back in June 
(https://marc.info/?l=linux-scsi&m=159345778431615&w=2):

	blk_queue_enter() always - especially when sd is runtime 
	suspended and I try to mount as above - sets success to be true 
	for me, so never continues down to bkl_pm_request_resume(). All 
	I see is "PM: Removing info for No Bus:sda1".

blk_queue_enter() would always set success to be true because pm 
(derived from the BLK_MQ_REQ_PREEMPT flag) is true.  But why was the 
BLK_MQ_REQ_PREEMPT flag set?  In other words, where was 
blk_queue_enter() called from?

Can you get a stack trace (i.e., call dump_stack()) at exactly this 
point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED?  Or 
do you already know the answer?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-29 15:26                                                                                       ` Alan Stern
@ 2020-08-29 16:33                                                                                         ` Martin Kepplinger
  2020-08-29 18:56                                                                                           ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Martin Kepplinger @ 2020-08-29 16:33 UTC (permalink / raw)
  To: Alan Stern; +Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On 29.08.20 17:26, Alan Stern wrote:
> On Sat, Aug 29, 2020 at 09:24:30AM +0200, Martin Kepplinger wrote:
>> On 27.08.20 22:29, Alan Stern wrote:
>>> Instead, look at sd_resume().  That routine calls __scsi_execute() 
>>> indirectly through sd_start_stop_device(), and the only reason it does 
>>> this is because the sdkp->device->manage_start_stop flag is set.  You 
>>> ought to be able to clear this flag in sysfs, by writing to 
>>> /sys/block/sda/device/scsi_disk/*/manage_start_stop.  If you do this 
>>> before allowing the card reader to go into runtime suspend, does it then 
>>> resume okay?
>>
>> manage_start_stop in sysfs is 0 here.
> 
> Hmmm.  I'm wondering about something you wrote back in June 
> (https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
> 
> 	blk_queue_enter() always - especially when sd is runtime 
> 	suspended and I try to mount as above - sets success to be true 
> 	for me, so never continues down to bkl_pm_request_resume(). All 
> 	I see is "PM: Removing info for No Bus:sda1".
> 
> blk_queue_enter() would always set success to be true because pm 
> (derived from the BLK_MQ_REQ_PREEMPT flag) is true.  But why was the 
> BLK_MQ_REQ_PREEMPT flag set?  In other words, where was 
> blk_queue_enter() called from?
> 
> Can you get a stack trace (i.e., call dump_stack()) at exactly this 
> point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED?  Or 
> do you already know the answer?
> 
>

I reverted any scsi/block out-of-tree fixes for this.

when I try to mount, pm is TRUE (BLK_MQ_REQ_PREEMT set) and that's the
first stack trace I get in this condition, inside of blk_queue_enter():

There is more, but I don't know if that's interesting.

[   38.642202] CPU: 2 PID: 1522 Comm: mount Not tainted 5.8.0-1-librem5 #487
[   38.642207] Hardware name: Purism Librem 5r3 (DT)
[   38.642213] Call trace:
[   38.642233]  dump_backtrace+0x0/0x210
[   38.642242]  show_stack+0x20/0x30
[   38.642252]  dump_stack+0xc8/0x128
[   38.642262]  blk_queue_enter+0x1b8/0x2d8
[   38.642271]  blk_mq_alloc_request+0x54/0xb0
[   38.642277]  blk_get_request+0x34/0x78
[   38.642286]  __scsi_execute+0x60/0x1c8
[   38.642291]  scsi_test_unit_ready+0x88/0x118
[   38.642298]  sd_check_events+0x110/0x158
[   38.642306]  disk_check_events+0x68/0x188
[   38.642312]  disk_clear_events+0x84/0x198
[   38.642320]  check_disk_change+0x38/0x90
[   38.642325]  sd_open+0x60/0x148
[   38.642330]  __blkdev_get+0xcc/0x4c8
[   38.642335]  __blkdev_get+0x278/0x4c8
[   38.642339]  blkdev_get+0x128/0x1a8
[   38.642345]  blkdev_open+0x98/0xb0
[   38.642354]  do_dentry_open+0x130/0x3c8
[   38.642359]  vfs_open+0x34/0x40
[   38.642366]  path_openat+0xa30/0xe40
[   38.642372]  do_filp_open+0x84/0x100
[   38.642377]  do_sys_openat2+0x1f4/0x2b0
[   38.642382]  do_sys_open+0x60/0xa8
(...)

and of course it doesn't work and /dev/sda1 disappears, see the initial
discussion that led to your fix.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-29 16:33                                                                                         ` Martin Kepplinger
@ 2020-08-29 18:56                                                                                           ` Alan Stern
  2020-08-30  0:38                                                                                             ` Bart Van Assche
  0 siblings, 1 reply; 68+ messages in thread
From: Alan Stern @ 2020-08-29 18:56 UTC (permalink / raw)
  To: Martin Kepplinger
  Cc: Bart Van Assche, Can Guo, linux-scsi, linux-block, kernel

On Sat, Aug 29, 2020 at 06:33:26PM +0200, Martin Kepplinger wrote:
> On 29.08.20 17:26, Alan Stern wrote:
> > Hmmm.  I'm wondering about something you wrote back in June 
> > (https://marc.info/?l=linux-scsi&m=159345778431615&w=2):
> > 
> > 	blk_queue_enter() always - especially when sd is runtime 
> > 	suspended and I try to mount as above - sets success to be true 
> > 	for me, so never continues down to bkl_pm_request_resume(). All 
> > 	I see is "PM: Removing info for No Bus:sda1".
> > 
> > blk_queue_enter() would always set success to be true because pm 
> > (derived from the BLK_MQ_REQ_PREEMPT flag) is true.  But why was the 
> > BLK_MQ_REQ_PREEMPT flag set?  In other words, where was 
> > blk_queue_enter() called from?
> > 
> > Can you get a stack trace (i.e., call dump_stack()) at exactly this 
> > point, that is, when pm is true and q->rpm_status is RPM_SUSPENDED?  Or 
> > do you already know the answer?
> > 
> >
> 
> I reverted any scsi/block out-of-tree fixes for this.
> 
> when I try to mount, pm is TRUE (BLK_MQ_REQ_PREEMT set) and that's the
> first stack trace I get in this condition, inside of blk_queue_enter():
> 
> There is more, but I don't know if that's interesting.
> 
> [   38.642202] CPU: 2 PID: 1522 Comm: mount Not tainted 5.8.0-1-librem5 #487
> [   38.642207] Hardware name: Purism Librem 5r3 (DT)
> [   38.642213] Call trace:
> [   38.642233]  dump_backtrace+0x0/0x210
> [   38.642242]  show_stack+0x20/0x30
> [   38.642252]  dump_stack+0xc8/0x128
> [   38.642262]  blk_queue_enter+0x1b8/0x2d8
> [   38.642271]  blk_mq_alloc_request+0x54/0xb0
> [   38.642277]  blk_get_request+0x34/0x78
> [   38.642286]  __scsi_execute+0x60/0x1c8
> [   38.642291]  scsi_test_unit_ready+0x88/0x118
> [   38.642298]  sd_check_events+0x110/0x158
> [   38.642306]  disk_check_events+0x68/0x188
> [   38.642312]  disk_clear_events+0x84/0x198
> [   38.642320]  check_disk_change+0x38/0x90
> [   38.642325]  sd_open+0x60/0x148
> [   38.642330]  __blkdev_get+0xcc/0x4c8
> [   38.642335]  __blkdev_get+0x278/0x4c8
> [   38.642339]  blkdev_get+0x128/0x1a8
> [   38.642345]  blkdev_open+0x98/0xb0
> [   38.642354]  do_dentry_open+0x130/0x3c8
> [   38.642359]  vfs_open+0x34/0x40
> [   38.642366]  path_openat+0xa30/0xe40
> [   38.642372]  do_filp_open+0x84/0x100
> [   38.642377]  do_sys_openat2+0x1f4/0x2b0
> [   38.642382]  do_sys_open+0x60/0xa8
> (...)
> 
> and of course it doesn't work and /dev/sda1 disappears, see the initial
> discussion that led to your fix.

Great!  That's exactly what I was looking for, thank you.

Bart, this is a perfect example of the potential race I've been talking 
about in the other email thread.  Suppose thread 0 is carrying out a 
runtime suspend of a SCSI disk and at the same time, thread 1 is opening 
the disk's block device (as we see in the stack trace here).  Then we 
could have the following:

	Thread 0		Thread 1
	--------		--------
	Start runtime suspend
	blk_pre_runtime_suspend calls
	  blk_set_pm_only and sets
	  q->rpm_status to RPM_SUSPENDING

				Call sd_open -> ... -> scsi_test_unit_ready
				  -> __scsi_execute -> ...
				  -> blk_queue_enter
				Sees BLK_MQ_REQ_PREEMPT set and
				  RPM_SUSPENDING queue status, so does 
				  not postpone the request

	blk_post_runtime_suspend sets
	  q->rpm_status to RPM_SUSPENDED
	The drive goes into runtime suspend

				Issues the TEST UNIT READY request
				Request fails because the drive is suspended

One way to avoid this race is mutual exclusion: We could make sd_open 
prevent the drive from being runtime suspended until it returns.  
However I don't like this approach; it would mean tracking down every 
possible pathway to __scsi_execute and making sure that runtime suspend 
is blocked.

A more fine-grained approach would be to have __scsi_execute itself call 
scsi_autopm_get/put_device whenever the rq_flags argument doesn't 
contain RQF_PM.  This way we wouldn't have to worry about missing any 
possiible pathways.  But it relies on an implicit assumption that 
__scsi_execute is the only place where the PREEMPT flag gets set.

A third possibility is the approach I outlined before, adding a 
BLK_MQ_REQ_PM flag.  But to avoid the deadlock you pointed out, I would 
make blk_queue_enter smarter about whether to postpone a request.  The 
logic would go like this:

	If !blk_queue_pm_only:
		Allow
	If !BLK_MQ_REQ_PREEMPT:
		Postpone
	If q->rpm_status is RPM_ACTIVE:
		Allow
	If !BLK_MQ_REQ_PM:
		Postpone
	If q->rpm_status is RPM_SUSPENDED:
		Postpone
	Else:
		Allow

The assumption here is that the PREEMPT flag is set whenever the PM flag 
is.

I believe either the second or third possibility would work.  The second 
looks to be the simplest

What do you think?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-29 18:56                                                                                           ` Alan Stern
@ 2020-08-30  0:38                                                                                             ` Bart Van Assche
  2020-08-30  1:06                                                                                               ` Alan Stern
  0 siblings, 1 reply; 68+ messages in thread
From: Bart Van Assche @ 2020-08-30  0:38 UTC (permalink / raw)
  To: Alan Stern, Martin Kepplinger; +Cc: Can Guo, linux-scsi, linux-block, kernel

On 2020-08-29 11:56, Alan Stern wrote:
> A third possibility is the approach I outlined before, adding a 
> BLK_MQ_REQ_PM flag.  But to avoid the deadlock you pointed out, I would 
> make blk_queue_enter smarter about whether to postpone a request.  The 
> logic would go like this:
> 
> 	If !blk_queue_pm_only:
> 		Allow
> 	If !BLK_MQ_REQ_PREEMPT:
> 		Postpone
> 	If q->rpm_status is RPM_ACTIVE:
> 		Allow
> 	If !BLK_MQ_REQ_PM:
> 		Postpone
> 	If q->rpm_status is RPM_SUSPENDED:
> 		Postpone
> 	Else:
> 		Allow
> 
> The assumption here is that the PREEMPT flag is set whenever the PM flag 
> is.

Hi Alan,

Although interesting, these solutions sound like workarounds to me. How
about fixing the root cause by modifying the SCSI DV implementation such
that it doesn't use scsi_device_quiesce() anymore()? That change would
allow to remove BLK_MQ_REQ_PREEMPT / RQF_PREEMPT from the block layer and
move these flags into the SCSI and IDE code.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] block: Fix bug in runtime-resume handling
  2020-08-30  0:38                                                                                             ` Bart Van Assche
@ 2020-08-30  1:06                                                                                               ` Alan Stern
  0 siblings, 0 replies; 68+ messages in thread
From: Alan Stern @ 2020-08-30  1:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin Kepplinger, Can Guo, linux-scsi, linux-block, kernel

On Sat, Aug 29, 2020 at 05:38:50PM -0700, Bart Van Assche wrote:
> On 2020-08-29 11:56, Alan Stern wrote:
> > A third possibility is the approach I outlined before, adding a 
> > BLK_MQ_REQ_PM flag.  But to avoid the deadlock you pointed out, I would 
> > make blk_queue_enter smarter about whether to postpone a request.  The 
> > logic would go like this:
> > 
> > 	If !blk_queue_pm_only:
> > 		Allow
> > 	If !BLK_MQ_REQ_PREEMPT:
> > 		Postpone
> > 	If q->rpm_status is RPM_ACTIVE:
> > 		Allow
> > 	If !BLK_MQ_REQ_PM:
> > 		Postpone
> > 	If q->rpm_status is RPM_SUSPENDED:
> > 		Postpone
> > 	Else:
> > 		Allow
> > 
> > The assumption here is that the PREEMPT flag is set whenever the PM flag 
> > is.
> 
> Hi Alan,
> 
> Although interesting, these solutions sound like workarounds to me. How
> about fixing the root cause by modifying the SCSI DV implementation such
> that it doesn't use scsi_device_quiesce() anymore()? That change would
> allow to remove BLK_MQ_REQ_PREEMPT / RQF_PREEMPT from the block layer and
> move these flags into the SCSI and IDE code.

That's a perfectly reasonable approach, but I have no idea how to do it.
Any suggestions?

Alan Stern

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2020-08-30  1:06 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-23 11:10 [PATCH] scsi: sd: add runtime pm to open / release Martin Kepplinger
2020-06-24 13:33 ` Bart Van Assche
2020-06-25  8:16   ` Martin Kepplinger
2020-06-25 14:52     ` Alan Stern
2020-06-26  3:53     ` Bart Van Assche
2020-06-26 15:07     ` Bart Van Assche
2020-06-26 15:44       ` Alan Stern
2020-06-28  2:37         ` Bart Van Assche
2020-06-28 13:10           ` Alan Stern
2020-06-29  9:42         ` Martin Kepplinger
2020-06-29 16:15           ` Alan Stern
2020-06-29 16:56             ` Bart Van Assche
2020-06-29 17:40               ` Alan Stern
2020-06-30  3:33             ` Martin Kepplinger
2020-06-30 13:38               ` Alan Stern
2020-06-30 15:59             ` Bart Van Assche
2020-06-30 18:02               ` Alan Stern
2020-06-30 19:23                 ` Bart Van Assche
2020-06-30 19:38                   ` Alan Stern
2020-06-30 23:31                     ` Bart Van Assche
2020-07-01  0:49                       ` Alan Stern
2020-07-06 16:41                         ` Alan Stern
2020-07-28  7:02                           ` Martin Kepplinger
2020-07-28 20:02                             ` Alan Stern
2020-07-29 14:12                               ` Martin Kepplinger
2020-07-29 14:32                                 ` Alan Stern
2020-07-29 14:44                                   ` Martin K. Petersen
2020-07-29 14:56                                     ` Alan Stern
2020-07-29 14:46                                   ` James Bottomley
2020-07-29 14:53                                     ` James Bottomley
2020-07-29 15:40                                       ` Martin Kepplinger
2020-07-29 15:44                                         ` James Bottomley
2020-07-29 16:43                                           ` Martin Kepplinger
2020-07-29 18:25                                             ` Alan Stern
2020-07-29 18:29                                               ` James Bottomley
2020-07-30  8:52                                                 ` Martin Kepplinger
2020-07-30  8:54                                                   ` Martin Kepplinger
2020-07-30 15:10                                                   ` Alan Stern
2020-08-04  9:39                                                     ` Martin Kepplinger
2020-08-07  9:51                                                       ` Martin Kepplinger
2020-08-07 14:30                                                         ` Alan Stern
2020-08-08  6:59                                                           ` Martin Kepplinger
2020-08-08 15:05                                                             ` Alan Stern
2020-08-09  9:20                                                               ` Martin Kepplinger
2020-08-09 15:26                                                                 ` Alan Stern
2020-08-10 12:03                                                                   ` Martin Kepplinger
2020-08-10 14:13                                                                     ` Alan Stern
2020-08-11  7:55                                                                       ` Martin Kepplinger
2020-08-11 13:48                                                                         ` Alan Stern
2020-08-23 14:57                                                                         ` [PATCH] block: Fix bug in runtime-resume handling Alan Stern
2020-08-24 17:48                                                                           ` Bart Van Assche
2020-08-24 20:13                                                                             ` Alan Stern
2020-08-26  7:48                                                                               ` Martin Kepplinger
2020-08-27 17:42                                                                                 ` Martin Kepplinger
2020-08-27 20:29                                                                                   ` Alan Stern
2020-08-29  7:24                                                                                     ` Martin Kepplinger
2020-08-29 15:26                                                                                       ` Alan Stern
2020-08-29 16:33                                                                                         ` Martin Kepplinger
2020-08-29 18:56                                                                                           ` Alan Stern
2020-08-30  0:38                                                                                             ` Bart Van Assche
2020-08-30  1:06                                                                                               ` Alan Stern
2020-07-29 15:40                                       ` [PATCH] scsi: sd: add runtime pm to open / release Alan Stern
2020-07-29 15:49                                         ` James Bottomley
2020-07-29 16:17                                           ` Alan Stern
2020-07-29 15:52                                         ` Martin Kepplinger
2020-07-29 18:10                                   ` Douglas Gilbert
2020-07-30  8:05             ` Martin Kepplinger
2020-07-30 15:14               ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).