* [PATCH 0/3] Reduce ATA disk resume time @ 2022-06-28 22:21 Bart Van Assche 2022-06-28 22:21 ` [PATCH 1/3] scsi: core: Move the definition of SCSI_QUEUE_DELAY Bart Van Assche ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-28 22:21 UTC (permalink / raw) To: Martin K . Petersen; +Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche Hi Martin, Recently it was reported that patch "scsi: core: pm: Rely on the device driver core for async power management" causes resuming to take longer if an ATA disk is present. This patch series fixes that regression. Please consider this patch series for kernel v5.20. Thanks, Bart. Bart Van Assche (3): scsi: core: Move the definition of SCSI_QUEUE_DELAY scsi: core: Retry after a delay if the device is becoming ready scsi: sd: Rework asynchronous resume support drivers/scsi/scsi_error.c | 4 +- drivers/scsi/scsi_lib.c | 14 +++---- drivers/scsi/sd.c | 79 ++++++++++++++++++++++++++++++--------- drivers/scsi/sd.h | 5 +++ 4 files changed, 75 insertions(+), 27 deletions(-) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/3] scsi: core: Move the definition of SCSI_QUEUE_DELAY 2022-06-28 22:21 [PATCH 0/3] Reduce ATA disk resume time Bart Van Assche @ 2022-06-28 22:21 ` Bart Van Assche 2022-06-28 22:21 ` [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready Bart Van Assche 2022-06-28 22:21 ` [PATCH 3/3] scsi: sd: Rework asynchronous resume support Bart Van Assche 2 siblings, 0 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-28 22:21 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Ming Lei, Hannes Reinecke, John Garry Move the definition of SCSI_QUEUE_DELAY to just above the function that uses it. Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/scsi_lib.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 6ffc9e4258a8..2aca0a838ca5 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -75,13 +75,6 @@ int scsi_init_sense_cache(struct Scsi_Host *shost) return ret; } -/* - * When to reinvoke queueing after a resource shortage. It's 3 msecs to - * not change behaviour from the previous unplug mechanism, experimentation - * may prove this needs changing. - */ -#define SCSI_QUEUE_DELAY 3 - static void scsi_set_blocked(struct scsi_cmnd *cmd, int reason) { @@ -1648,6 +1641,13 @@ static void scsi_mq_put_budget(struct request_queue *q, int budget_token) sbitmap_put(&sdev->budget_map, budget_token); } +/* + * When to reinvoke queueing after a resource shortage. It's 3 msecs to + * not change behaviour from the previous unplug mechanism, experimentation + * may prove this needs changing. + */ +#define SCSI_QUEUE_DELAY 3 + static int scsi_mq_get_budget(struct request_queue *q) { struct scsi_device *sdev = q->queuedata; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready 2022-06-28 22:21 [PATCH 0/3] Reduce ATA disk resume time Bart Van Assche 2022-06-28 22:21 ` [PATCH 1/3] scsi: core: Move the definition of SCSI_QUEUE_DELAY Bart Van Assche @ 2022-06-28 22:21 ` Bart Van Assche 2022-06-29 1:21 ` Ming Lei 2022-06-28 22:21 ` [PATCH 3/3] scsi: sd: Rework asynchronous resume support Bart Van Assche 2 siblings, 1 reply; 11+ messages in thread From: Bart Van Assche @ 2022-06-28 22:21 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Ming Lei, Hannes Reinecke, John Garry If a logical unit reports that it is becoming ready, retry the command after a delay instead of retrying immediately. Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/scsi_error.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 49ef864df581..fb7e363c4c00 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -625,10 +625,10 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd) return NEEDS_RETRY; /* * if the device is in the process of becoming ready, we - * should retry. + * should retry after a delay. */ if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01)) - return NEEDS_RETRY; + return ADD_TO_MLQUEUE; /* * if the device is not started, we need to wake * the error handler to start the motor ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready 2022-06-28 22:21 ` [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready Bart Van Assche @ 2022-06-29 1:21 ` Ming Lei 2022-06-29 22:06 ` Bart Van Assche 0 siblings, 1 reply; 11+ messages in thread From: Ming Lei @ 2022-06-29 1:21 UTC (permalink / raw) To: Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Hannes Reinecke, John Garry On Tue, Jun 28, 2022 at 03:21:30PM -0700, Bart Van Assche wrote: > If a logical unit reports that it is becoming ready, retry the command > after a delay instead of retrying immediately. > > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.de> > Cc: John Garry <john.garry@huawei.com> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/scsi_error.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 49ef864df581..fb7e363c4c00 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -625,10 +625,10 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd) > return NEEDS_RETRY; > /* > * if the device is in the process of becoming ready, we > - * should retry. > + * should retry after a delay. > */ > if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01)) > - return NEEDS_RETRY; > + return ADD_TO_MLQUEUE; The above code & commit log just said changing to retry after a delay, but not explains why, care to document reason why the delay is useful? Thanks Ming ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready 2022-06-29 1:21 ` Ming Lei @ 2022-06-29 22:06 ` Bart Van Assche 0 siblings, 0 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-29 22:06 UTC (permalink / raw) To: Ming Lei Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Hannes Reinecke, John Garry On 6/28/22 18:21, Ming Lei wrote: > On Tue, Jun 28, 2022 at 03:21:30PM -0700, Bart Van Assche wrote: >> If a logical unit reports that it is becoming ready, retry the command >> after a delay instead of retrying immediately. >> >> Cc: Ming Lei <ming.lei@redhat.com> >> Cc: Hannes Reinecke <hare@suse.de> >> Cc: John Garry <john.garry@huawei.com> >> Signed-off-by: Bart Van Assche <bvanassche@acm.org> >> --- >> drivers/scsi/scsi_error.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c >> index 49ef864df581..fb7e363c4c00 100644 >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -625,10 +625,10 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> /* >> * if the device is in the process of becoming ready, we >> - * should retry. >> + * should retry after a delay. >> */ >> if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01)) >> - return NEEDS_RETRY; >> + return ADD_TO_MLQUEUE; > > The above code & commit log just said changing to retry after a delay, but not > explains why, care to document reason why the delay is useful? I came up with this patch because I was concerned about the impact of the LOGICAL UNIT IS IN PROCESS OF BECOMING READY response on the START command. While reviewing SBC-4 I noticed that that response can be produced for any command but not for the START command with IMM=0. So I think this patch can be dropped since sd_start_stop_device() does not set the IMM flag. Thanks, Bart. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-28 22:21 [PATCH 0/3] Reduce ATA disk resume time Bart Van Assche 2022-06-28 22:21 ` [PATCH 1/3] scsi: core: Move the definition of SCSI_QUEUE_DELAY Bart Van Assche 2022-06-28 22:21 ` [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready Bart Van Assche @ 2022-06-28 22:21 ` Bart Van Assche 2022-06-29 6:02 ` Hannes Reinecke 2022-06-30 16:23 ` John Garry 2 siblings, 2 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-28 22:21 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Ming Lei, Hannes Reinecke, John Garry, ericspero, jason600.groome For some technologies, e.g. an ATA bus, resuming can take multiple seconds. Waiting for resume to finish can cause a very noticeable delay. Hence this patch that restores the behavior from before patch "scsi: core: pm: Rely on the device driver core for async power management" for most SCSI devices. This patch introduces a behavior change: if the START command fails, do not consider this as a SCSI disk resume failure. Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: ericspero@icloud.com Cc: jason600.groome@gmail.com Tested-by: jason600.groome@gmail.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880 Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/sd.c | 79 ++++++++++++++++++++++++++++++++++++----------- drivers/scsi/sd.h | 5 +++ 2 files changed, 66 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 895b56c8f25e..06888b675e71 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -103,6 +103,7 @@ static void sd_config_discard(struct scsi_disk *, unsigned int); static void sd_config_write_same(struct scsi_disk *); static int sd_revalidate_disk(struct gendisk *); static void sd_unlock_native_capacity(struct gendisk *disk); +static void sd_start_done_work(struct work_struct *work); static int sd_probe(struct device *); static int sd_remove(struct device *); static void sd_shutdown(struct device *); @@ -3463,6 +3464,7 @@ static int sd_probe(struct device *dev) sdkp->max_retries = SD_MAX_RETRIES; atomic_set(&sdkp->openers, 0); atomic_set(&sdkp->device->ioerr_cnt, 0); + INIT_WORK(&sdkp->start_done_work, sd_start_done_work); if (!sdp->request_queue->rq_timeout) { if (sdp->type != TYPE_MOD) @@ -3585,12 +3587,64 @@ static void scsi_disk_release(struct device *dev) kfree(sdkp); } +/* Process sense data after a START command finished. */ +static void sd_start_done_work(struct work_struct *work) +{ + struct scsi_disk *sdkp = container_of(work, typeof(*sdkp), + start_done_work); + struct scsi_sense_hdr sshdr; + int res = sdkp->start_result; + + if (res == 0) + return; + + sd_print_result(sdkp, "Start/Stop Unit failed", res); + if (res > 0 && scsi_normalize_sense(sdkp->start_sense_buffer, + sdkp->start_sense_len, &sshdr)) + sd_print_sense_hdr(sdkp, &sshdr); +} + +/* A START command finished. May be called from interrupt context. */ +static void sd_start_done(struct request *req, blk_status_t status) +{ + const struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); + struct scsi_disk *sdkp = scsi_disk(req->q->disk); + + sdkp->start_result = scmd->result; + WARN_ON_ONCE(scmd->sense_len > SCSI_SENSE_BUFFERSIZE); + sdkp->start_sense_len = scmd->sense_len; + memcpy(sdkp->start_sense_buffer, scmd->sense_buffer, scmd->sense_len); + WARN_ON_ONCE(!schedule_work(&sdkp->start_done_work)); +} + +/* Submit a START command asynchronously. */ +static int sd_submit_start(struct scsi_disk *sdkp, u8 cmd[], u8 cmd_len) +{ + struct scsi_device *sdev = sdkp->device; + struct request_queue *q = sdev->request_queue; + struct request *req; + struct scsi_cmnd *scmd; + + req = scsi_alloc_request(q, REQ_OP_DRV_IN, BLK_MQ_REQ_PM); + if (IS_ERR(req)) + return PTR_ERR(req); + + scmd = blk_mq_rq_to_pdu(req); + scmd->cmd_len = cmd_len; + memcpy(scmd->cmnd, cmd, cmd_len); + scmd->allowed = sdkp->max_retries; + req->timeout = SD_TIMEOUT; + req->rq_flags |= RQF_PM | RQF_QUIET; + req->end_io = sd_start_done; + blk_execute_rq_nowait(req, /*at_head=*/true); + + return 0; +} + static int sd_start_stop_device(struct scsi_disk *sdkp, int start) { unsigned char cmd[6] = { START_STOP }; /* START_VALID */ - struct scsi_sense_hdr sshdr; struct scsi_device *sdp = sdkp->device; - int res; if (start) cmd[4] |= 1; /* START */ @@ -3601,23 +3655,10 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) if (!scsi_device_online(sdp)) return -ENODEV; - res = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, - SD_TIMEOUT, sdkp->max_retries, 0, RQF_PM, NULL); - if (res) { - sd_print_result(sdkp, "Start/Stop Unit failed", res); - if (res > 0 && scsi_sense_valid(&sshdr)) { - sd_print_sense_hdr(sdkp, &sshdr); - /* 0x3a is medium not present */ - if (sshdr.asc == 0x3a) - res = 0; - } - } + /* Wait until processing of sense data has finished. */ + flush_work(&sdkp->start_done_work); - /* SCSI error codes must not go to the generic layer */ - if (res) - return -EIO; - - return 0; + return sd_submit_start(sdkp, cmd, sizeof(cmd)); } /* @@ -3644,6 +3685,8 @@ static void sd_shutdown(struct device *dev) sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); sd_start_stop_device(sdkp, 0); } + + flush_work(&sdkp->start_done_work); } static int sd_suspend_common(struct device *dev, bool ignore_stop_errors) diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 5eea762f84d1..b89187761d61 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -150,6 +150,11 @@ struct scsi_disk { unsigned urswrz : 1; unsigned security : 1; unsigned ignore_medium_access_errors : 1; + + int start_result; + u32 start_sense_len; + u8 start_sense_buffer[SCSI_SENSE_BUFFERSIZE]; + struct work_struct start_done_work; }; #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev) ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-28 22:21 ` [PATCH 3/3] scsi: sd: Rework asynchronous resume support Bart Van Assche @ 2022-06-29 6:02 ` Hannes Reinecke 2022-06-30 16:09 ` Bart Van Assche 2022-06-30 16:23 ` John Garry 1 sibling, 1 reply; 11+ messages in thread From: Hannes Reinecke @ 2022-06-29 6:02 UTC (permalink / raw) To: Bart Van Assche, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Ming Lei, John Garry, ericspero, jason600.groome On 6/29/22 00:21, Bart Van Assche wrote: > For some technologies, e.g. an ATA bus, resuming can take multiple > seconds. Waiting for resume to finish can cause a very noticeable delay. > Hence this patch that restores the behavior from before patch "scsi: > core: pm: Rely on the device driver core for async power management" for > most SCSI devices. > > This patch introduces a behavior change: if the START command fails, do > not consider this as a SCSI disk resume failure. > > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.de> > Cc: John Garry <john.garry@huawei.com> > Cc: ericspero@icloud.com > Cc: jason600.groome@gmail.com > Tested-by: jason600.groome@gmail.com > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880 > Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/sd.c | 79 ++++++++++++++++++++++++++++++++++++----------- > drivers/scsi/sd.h | 5 +++ > 2 files changed, 66 insertions(+), 18 deletions(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 895b56c8f25e..06888b675e71 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -103,6 +103,7 @@ static void sd_config_discard(struct scsi_disk *, unsigned int); > static void sd_config_write_same(struct scsi_disk *); > static int sd_revalidate_disk(struct gendisk *); > static void sd_unlock_native_capacity(struct gendisk *disk); > +static void sd_start_done_work(struct work_struct *work); > static int sd_probe(struct device *); > static int sd_remove(struct device *); > static void sd_shutdown(struct device *); > @@ -3463,6 +3464,7 @@ static int sd_probe(struct device *dev) > sdkp->max_retries = SD_MAX_RETRIES; > atomic_set(&sdkp->openers, 0); > atomic_set(&sdkp->device->ioerr_cnt, 0); > + INIT_WORK(&sdkp->start_done_work, sd_start_done_work); > > if (!sdp->request_queue->rq_timeout) { > if (sdp->type != TYPE_MOD) > @@ -3585,12 +3587,64 @@ static void scsi_disk_release(struct device *dev) > kfree(sdkp); > } > > +/* Process sense data after a START command finished. */ > +static void sd_start_done_work(struct work_struct *work) > +{ > + struct scsi_disk *sdkp = container_of(work, typeof(*sdkp), > + start_done_work); > + struct scsi_sense_hdr sshdr; > + int res = sdkp->start_result; > + > + if (res == 0) > + return; > + > + sd_print_result(sdkp, "Start/Stop Unit failed", res); Surely START/STOP unit can succeed, no? > + if (res > 0 && scsi_normalize_sense(sdkp->start_sense_buffer, > + sdkp->start_sense_len, &sshdr)) > + sd_print_sense_hdr(sdkp, &sshdr); > +} > + > +/* A START command finished. May be called from interrupt context. */ > +static void sd_start_done(struct request *req, blk_status_t status) > +{ > + const struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); > + struct scsi_disk *sdkp = scsi_disk(req->q->disk); > + > + sdkp->start_result = scmd->result; > + WARN_ON_ONCE(scmd->sense_len > SCSI_SENSE_BUFFERSIZE); > + sdkp->start_sense_len = scmd->sense_len; > + memcpy(sdkp->start_sense_buffer, scmd->sense_buffer, scmd->sense_len); > + WARN_ON_ONCE(!schedule_work(&sdkp->start_done_work)); > +} > + > +/* Submit a START command asynchronously. */ > +static int sd_submit_start(struct scsi_disk *sdkp, u8 cmd[], u8 cmd_len) > +{ > + struct scsi_device *sdev = sdkp->device; > + struct request_queue *q = sdev->request_queue; > + struct request *req; > + struct scsi_cmnd *scmd; > + > + req = scsi_alloc_request(q, REQ_OP_DRV_IN, BLK_MQ_REQ_PM); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > + scmd = blk_mq_rq_to_pdu(req); > + scmd->cmd_len = cmd_len; > + memcpy(scmd->cmnd, cmd, cmd_len); > + scmd->allowed = sdkp->max_retries; > + req->timeout = SD_TIMEOUT; > + req->rq_flags |= RQF_PM | RQF_QUIET; > + req->end_io = sd_start_done; > + blk_execute_rq_nowait(req, /*at_head=*/true); > + > + return 0; > +} > + > static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > { > unsigned char cmd[6] = { START_STOP }; /* START_VALID */ > - struct scsi_sense_hdr sshdr; > struct scsi_device *sdp = sdkp->device; > - int res; > > if (start) > cmd[4] |= 1; /* START */ > @@ -3601,23 +3655,10 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > if (!scsi_device_online(sdp)) > return -ENODEV; > > - res = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > - SD_TIMEOUT, sdkp->max_retries, 0, RQF_PM, NULL); > - if (res) { > - sd_print_result(sdkp, "Start/Stop Unit failed", res); > - if (res > 0 && scsi_sense_valid(&sshdr)) { > - sd_print_sense_hdr(sdkp, &sshdr); > - /* 0x3a is medium not present */ > - if (sshdr.asc == 0x3a) > - res = 0; > - } > - } > + /* Wait until processing of sense data has finished. */ > + flush_work(&sdkp->start_done_work); > > - /* SCSI error codes must not go to the generic layer */ > - if (res) > - return -EIO; > - > - return 0; > + return sd_submit_start(sdkp, cmd, sizeof(cmd)); > } > > /* > @@ -3644,6 +3685,8 @@ static void sd_shutdown(struct device *dev) > sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); > sd_start_stop_device(sdkp, 0); > } > + > + flush_work(&sdkp->start_done_work); > } > > static int sd_suspend_common(struct device *dev, bool ignore_stop_errors) > diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h > index 5eea762f84d1..b89187761d61 100644 > --- a/drivers/scsi/sd.h > +++ b/drivers/scsi/sd.h > @@ -150,6 +150,11 @@ struct scsi_disk { > unsigned urswrz : 1; > unsigned security : 1; > unsigned ignore_medium_access_errors : 1; > + > + int start_result; > + u32 start_sense_len; > + u8 start_sense_buffer[SCSI_SENSE_BUFFERSIZE]; > + struct work_struct start_done_work; > }; > #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev) > Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-29 6:02 ` Hannes Reinecke @ 2022-06-30 16:09 ` Bart Van Assche 0 siblings, 0 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-30 16:09 UTC (permalink / raw) To: Hannes Reinecke, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Ming Lei, John Garry, ericspero, jason600.groome On 6/28/22 23:02, Hannes Reinecke wrote: > On 6/29/22 00:21, Bart Van Assche wrote: >> +/* Process sense data after a START command finished. */ >> +static void sd_start_done_work(struct work_struct *work) >> +{ >> + struct scsi_disk *sdkp = container_of(work, typeof(*sdkp), >> + start_done_work); >> + struct scsi_sense_hdr sshdr; >> + int res = sdkp->start_result; >> + >> + if (res == 0) >> + return; >> + >> + sd_print_result(sdkp, "Start/Stop Unit failed", res); > > Surely START/STOP unit can succeed, no? Yes, hence the "if (res == 0) return;" code. Did I perhaps misunderstand your question? Bart. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-28 22:21 ` [PATCH 3/3] scsi: sd: Rework asynchronous resume support Bart Van Assche 2022-06-29 6:02 ` Hannes Reinecke @ 2022-06-30 16:23 ` John Garry 2022-06-30 18:57 ` Bart Van Assche 1 sibling, 1 reply; 11+ messages in thread From: John Garry @ 2022-06-30 16:23 UTC (permalink / raw) To: Bart Van Assche, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Ming Lei, Hannes Reinecke, ericspero, jason600.groome On 28/06/2022 23:21, Bart Van Assche wrote: > For some technologies, e.g. an ATA bus, resuming can take multiple > seconds. Waiting for resume to finish can cause a very noticeable delay. > Hence this patch that restores the behavior from before patch "scsi: > core: pm: Rely on the device driver core for async power management" for > most SCSI devices. > > This patch introduces a behavior change: if the START command fails, do > not consider this as a SCSI disk resume failure. > > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.de> > Cc: John Garry <john.garry@huawei.com> > Cc: ericspero@icloud.com > Cc: jason600.groome@gmail.com > Tested-by: jason600.groome@gmail.com > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880 > Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/sd.c | 79 ++++++++++++++++++++++++++++++++++++----------- > drivers/scsi/sd.h | 5 +++ > 2 files changed, 66 insertions(+), 18 deletions(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 895b56c8f25e..06888b675e71 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -103,6 +103,7 @@ static void sd_config_discard(struct scsi_disk *, unsigned int); > static void sd_config_write_same(struct scsi_disk *); > static int sd_revalidate_disk(struct gendisk *); > static void sd_unlock_native_capacity(struct gendisk *disk); > +static void sd_start_done_work(struct work_struct *work); > static int sd_probe(struct device *); > static int sd_remove(struct device *); > static void sd_shutdown(struct device *); > @@ -3463,6 +3464,7 @@ static int sd_probe(struct device *dev) > sdkp->max_retries = SD_MAX_RETRIES; > atomic_set(&sdkp->openers, 0); > atomic_set(&sdkp->device->ioerr_cnt, 0); > + INIT_WORK(&sdkp->start_done_work, sd_start_done_work); > > if (!sdp->request_queue->rq_timeout) { > if (sdp->type != TYPE_MOD) > @@ -3585,12 +3587,64 @@ static void scsi_disk_release(struct device *dev) > kfree(sdkp); > } > > +/* Process sense data after a START command finished. */ > +static void sd_start_done_work(struct work_struct *work) > +{ > + struct scsi_disk *sdkp = container_of(work, typeof(*sdkp), > + start_done_work); > + struct scsi_sense_hdr sshdr; > + int res = sdkp->start_result; > + > + if (res == 0) > + return; > + > + sd_print_result(sdkp, "Start/Stop Unit failed", res); > + if (res > 0 && scsi_normalize_sense(sdkp->start_sense_buffer, > + sdkp->start_sense_len, &sshdr)) > + sd_print_sense_hdr(sdkp, &sshdr); nit: maybe you can reduce indentation, like: if (res < 0) return; if (scsi_normalize_sense(sdkp->start_sense_buffer, sdkp->start_sense_len, &sshdr)) { sd_print_sense_hdr(sdkp, &sshdr); } > +} > + > +/* A START command finished. May be called from interrupt context. */ > +static void sd_start_done(struct request *req, blk_status_t status) > +{ > + const struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); > + struct scsi_disk *sdkp = scsi_disk(req->q->disk); > + > + sdkp->start_result = scmd->result; > + WARN_ON_ONCE(scmd->sense_len > SCSI_SENSE_BUFFERSIZE); If scmd->sense_len > SCSI_SENSE_BUFFERSIZE, do you really want to go on to copy at sdkp->start_sense_buffer (which is of size SCSI_SENSE_BUFFERSIZE)? Won't that cause a corruption? > + sdkp->start_sense_len = scmd->sense_len; > + memcpy(sdkp->start_sense_buffer, scmd->sense_buffer, scmd->sense_len); > + WARN_ON_ONCE(!schedule_work(&sdkp->start_done_work)); > +} > + > +/* Submit a START command asynchronously. */ > +static int sd_submit_start(struct scsi_disk *sdkp, u8 cmd[], u8 cmd_len) > +{ > + struct scsi_device *sdev = sdkp->device; > + struct request_queue *q = sdev->request_queue; > + struct request *req; > + struct scsi_cmnd *scmd; > + > + req = scsi_alloc_request(q, REQ_OP_DRV_IN, BLK_MQ_REQ_PM); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > + scmd = blk_mq_rq_to_pdu(req); > + scmd->cmd_len = cmd_len; > + memcpy(scmd->cmnd, cmd, cmd_len); > + scmd->allowed = sdkp->max_retries; > + req->timeout = SD_TIMEOUT; > + req->rq_flags |= RQF_PM | RQF_QUIET; > + req->end_io = sd_start_done; > + blk_execute_rq_nowait(req, /*at_head=*/true); > + > + return 0; > +} > + > static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > { > unsigned char cmd[6] = { START_STOP }; /* START_VALID */ > - struct scsi_sense_hdr sshdr; > struct scsi_device *sdp = sdkp->device; > - int res; > > if (start) > cmd[4] |= 1; /* START */ > @@ -3601,23 +3655,10 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > if (!scsi_device_online(sdp)) > return -ENODEV; > > - res = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr, > - SD_TIMEOUT, sdkp->max_retries, 0, RQF_PM, NULL); > - if (res) { > - sd_print_result(sdkp, "Start/Stop Unit failed", res); > - if (res > 0 && scsi_sense_valid(&sshdr)) { > - sd_print_sense_hdr(sdkp, &sshdr); > - /* 0x3a is medium not present */ > - if (sshdr.asc == 0x3a) > - res = 0; > - } > - } > + /* Wait until processing of sense data has finished. */ > + flush_work(&sdkp->start_done_work); > > - /* SCSI error codes must not go to the generic layer */ > - if (res) > - return -EIO; > - > - return 0; > + return sd_submit_start(sdkp, cmd, sizeof(cmd)); > } > > /* > @@ -3644,6 +3685,8 @@ static void sd_shutdown(struct device *dev) > sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); > sd_start_stop_device(sdkp, 0); > } > + > + flush_work(&sdkp->start_done_work); > } > > static int sd_suspend_common(struct device *dev, bool ignore_stop_errors) > diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h > index 5eea762f84d1..b89187761d61 100644 > --- a/drivers/scsi/sd.h > +++ b/drivers/scsi/sd.h > @@ -150,6 +150,11 @@ struct scsi_disk { > unsigned urswrz : 1; > unsigned security : 1; > unsigned ignore_medium_access_errors : 1; > + > + int start_result; > + u32 start_sense_len; > + u8 start_sense_buffer[SCSI_SENSE_BUFFERSIZE]; > + struct work_struct start_done_work; > }; > #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev) > > . ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-30 16:23 ` John Garry @ 2022-06-30 18:57 ` Bart Van Assche 2022-06-30 19:28 ` Bart Van Assche 0 siblings, 1 reply; 11+ messages in thread From: Bart Van Assche @ 2022-06-30 18:57 UTC (permalink / raw) To: John Garry, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Ming Lei, Hannes Reinecke, ericspero, jason600.groome On 6/30/22 09:23, John Garry wrote: > On 28/06/2022 23:21, Bart Van Assche wrote: >> +/* A START command finished. May be called from interrupt context. */ >> +static void sd_start_done(struct request *req, blk_status_t status) >> +{ >> + const struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); >> + struct scsi_disk *sdkp = scsi_disk(req->q->disk); >> + >> + sdkp->start_result = scmd->result; >> + WARN_ON_ONCE(scmd->sense_len > SCSI_SENSE_BUFFERSIZE); > > If scmd->sense_len > SCSI_SENSE_BUFFERSIZE, do you really want to go on > to copy at sdkp->start_sense_buffer (which is of size > SCSI_SENSE_BUFFERSIZE)? Won't that cause a corruption? scsi_mq_init_request() allocates a buffer with size SCSI_SENSE_BUFFERSIZE. SCSI LLDs copy sense data into that buffer. I am not aware of any SCSI LLD that modifies the cmd->sense_buffer pointer. So if scmd->sense_len would be larger than SCSI_SENSE_BUFFERSIZE that either indicates that the LLD reported a sense length that is too large or that it wrote outside the bounds of the sense buffer. Do we really need to add a protection in the SCSI core against buggy LLDs? Thanks, Bart. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] scsi: sd: Rework asynchronous resume support 2022-06-30 18:57 ` Bart Van Assche @ 2022-06-30 19:28 ` Bart Van Assche 0 siblings, 0 replies; 11+ messages in thread From: Bart Van Assche @ 2022-06-30 19:28 UTC (permalink / raw) To: John Garry, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Ming Lei, Hannes Reinecke, ericspero, jason600.groome On 6/30/22 11:57, Bart Van Assche wrote: > On 6/30/22 09:23, John Garry wrote: >> On 28/06/2022 23:21, Bart Van Assche wrote: >>> +/* A START command finished. May be called from interrupt context. */ >>> +static void sd_start_done(struct request *req, blk_status_t status) >>> +{ >>> + const struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req); >>> + struct scsi_disk *sdkp = scsi_disk(req->q->disk); >>> + >>> + sdkp->start_result = scmd->result; >>> + WARN_ON_ONCE(scmd->sense_len > SCSI_SENSE_BUFFERSIZE); >> >> If scmd->sense_len > SCSI_SENSE_BUFFERSIZE, do you really want to go >> on to copy at sdkp->start_sense_buffer (which is of size >> SCSI_SENSE_BUFFERSIZE)? Won't that cause a corruption? > > scsi_mq_init_request() allocates a buffer with size > SCSI_SENSE_BUFFERSIZE. SCSI LLDs copy sense data into that buffer. I am > not aware of any SCSI LLD that modifies the cmd->sense_buffer pointer. > So if scmd->sense_len would be larger than SCSI_SENSE_BUFFERSIZE that > either indicates that the LLD reported a sense length that is too large > or that it wrote outside the bounds of the sense buffer. Do we really > need to add a protection in the SCSI core against buggy LLDs? A result of the above is that SCSI_SENSE_BUFFERSIZE bytes can be copied instead of scmd->sense_len. I will make that change. Bart. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-06-30 19:28 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-28 22:21 [PATCH 0/3] Reduce ATA disk resume time Bart Van Assche 2022-06-28 22:21 ` [PATCH 1/3] scsi: core: Move the definition of SCSI_QUEUE_DELAY Bart Van Assche 2022-06-28 22:21 ` [PATCH 2/3] scsi: core: Retry after a delay if the device is becoming ready Bart Van Assche 2022-06-29 1:21 ` Ming Lei 2022-06-29 22:06 ` Bart Van Assche 2022-06-28 22:21 ` [PATCH 3/3] scsi: sd: Rework asynchronous resume support Bart Van Assche 2022-06-29 6:02 ` Hannes Reinecke 2022-06-30 16:09 ` Bart Van Assche 2022-06-30 16:23 ` John Garry 2022-06-30 18:57 ` Bart Van Assche 2022-06-30 19:28 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).