All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv8 0/3] New EH command timeout handler
@ 2013-10-31 13:02 Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 1/3] scsi: Fix erratic device offline during EH Hannes Reinecke
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-10-31 13:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel,
	James Smart, Hannes Reinecke

Hi all,

this patchset implements a new SCSI EH command timeout handler
which will be sending command aborts inline without actually
engaging SCSI EH.
SCSI EH will only be invoked if command abort fails.

In addition the commands will be returned directly
if the command abort succeeded, cutting down recovery
times dramatically.

With the original SCSI EH I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s

real	2m22.657s
user	0m0.013s
sys	0m0.145s

With this patchset I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s

real	0m52.163s
user	0m0.012s
sys	0m0.145s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.

Changes to the original version:
- Use a private list in scsi_eh_abort_handler to avoid
  list starvation (pointed out by Joern Engel)
- Terminate command aborts when the first abort fails
- Do not attempt command aborts if the host is already in recovery
  or if the device is removed.
- Flush abort workqueue if the device is removed.

Changes to v2:
- Removed eh_entry initialisation
- Convert to per-command workqueue

Changes to v3:
- Use delayed_work
- Enable new eh timeout handler for virtio, SAS, and FC
- Modify logging messages to include scmd pointer

Changes to v4:
- Remove stubs when enabling new eh timeout handler
  for other drivers

Changes to v5:
- Enable new eh timeout handler per default
- Update documentation

Changes to v6:
- Include changes from James Bottomley for erratic device
  offline patch
- Rearrange patches
- Update SCSI midlayer documentation

Changes to v7:
- Merge obsolete patch

Hannes Reinecke (2):
  scsi: improved eh timeout handler
  scsi: Update documentation

James Bottomley (1):
  scsi: Fix erratic device offline during EH

 Documentation/scsi/scsi_eh.txt          |  69 +++++++------
 Documentation/scsi/scsi_mid_low_api.txt |   9 +-
 drivers/scsi/scsi.c                     |   9 +-
 drivers/scsi/scsi_error.c               | 178 ++++++++++++++++++++++++++++----
 drivers/scsi/scsi_priv.h                |   2 +
 drivers/scsi/sd.c                       |  26 +++--
 include/scsi/scsi_cmnd.h                |   2 +
 include/scsi/scsi_driver.h              |   2 +-
 include/scsi/scsi_host.h                |   5 +
 9 files changed, 229 insertions(+), 73 deletions(-)

-- 
1.8.1.4


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] scsi: Fix erratic device offline during EH
  2013-10-31 13:02 [PATCHv8 0/3] New EH command timeout handler Hannes Reinecke
@ 2013-10-31 13:02 ` Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 2/3] scsi: improved eh timeout handler Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 3/3] scsi: Update documentation Hannes Reinecke
  2 siblings, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-10-31 13:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel,
	James Smart, Hannes Reinecke

From: James Bottomley <jbottomley@parallels.com>

Commit 18a4d0a22ed6c54b67af7718c305cd010f09ddf8
(Handle disk devices which can not process medium access commands)
was introduced to offline any device which cannot process medium
access commands.
However, commit 3eef6257de48ff84a5d98ca533685df8a3beaeb8
(Reduce error recovery time by reducing use of TURs) reduced
the number of TURs by sending it only on the first failing
command, which might or might not be a medium access command.
So in combination this results in an erratic device offlining
during EH; if the command where the TUR was sent upon happens
to be a medium access command the device will be set offline,
if not everything proceeds as normal.

This patch moves the check to the final test, eliminating
this problem.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
---
 drivers/scsi/scsi_error.c  | 26 +++++++++++++++++---------
 drivers/scsi/sd.c          | 26 ++++++++++++--------------
 include/scsi/scsi_driver.h |  2 +-
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e8bee9f..67c0014 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -941,12 +941,6 @@ retry:
 
 	scsi_eh_restore_cmnd(scmd, &ses);
 
-	if (scmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
-		struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
-		if (sdrv->eh_action)
-			rtn = sdrv->eh_action(scmd, cmnd, cmnd_size, rtn);
-	}
-
 	return rtn;
 }
 
@@ -964,6 +958,16 @@ static int scsi_request_sense(struct scsi_cmnd *scmd)
 	return scsi_send_eh_cmnd(scmd, NULL, 0, scmd->device->eh_timeout, ~0);
 }
 
+static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
+{
+	if (scmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
+		struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
+		if (sdrv->eh_action)
+			rtn = sdrv->eh_action(scmd, rtn);
+	}
+	return rtn;
+}
+
 /**
  * scsi_eh_finish_cmd - Handle a cmd that eh is finished with.
  * @scmd:	Original SCSI cmd that eh has finished.
@@ -1142,7 +1146,9 @@ static int scsi_eh_test_devices(struct list_head *cmd_list,
 
 		list_for_each_entry_safe(scmd, next, cmd_list, eh_entry)
 			if (scmd->device == sdev) {
-				if (finish_cmds)
+				if (finish_cmds &&
+				    (try_stu ||
+				     scsi_eh_action(scmd, SUCCESS) == SUCCESS))
 					scsi_eh_finish_cmd(scmd, done_q);
 				else
 					list_move_tail(&scmd->eh_entry, work_q);
@@ -1283,7 +1289,8 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 			    !scsi_eh_tur(stu_scmd)) {
 				list_for_each_entry_safe(scmd, next,
 							  work_q, eh_entry) {
-					if (scmd->device == sdev)
+					if (scmd->device == sdev &&
+					    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
 						scsi_eh_finish_cmd(scmd, done_q);
 				}
 			}
@@ -1350,7 +1357,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 			    !scsi_eh_tur(bdr_scmd)) {
 				list_for_each_entry_safe(scmd, next,
 							 work_q, eh_entry) {
-					if (scmd->device == sdev)
+					if (scmd->device == sdev &&
+					    scsi_eh_action(scmd, rtn) != FAILED)
 						scsi_eh_finish_cmd(scmd,
 								   done_q);
 				}
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index fd874b8..1929838 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -110,7 +110,7 @@ static int sd_suspend_runtime(struct device *);
 static int sd_resume(struct device *);
 static void sd_rescan(struct device *);
 static int sd_done(struct scsi_cmnd *);
-static int sd_eh_action(struct scsi_cmnd *, unsigned char *, int, int);
+static int sd_eh_action(struct scsi_cmnd *, int);
 static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
 static void scsi_disk_release(struct device *cdev);
 static void sd_print_sense_hdr(struct scsi_disk *, struct scsi_sense_hdr *);
@@ -1551,23 +1551,23 @@ static const struct block_device_operations sd_fops = {
 /**
  *	sd_eh_action - error handling callback
  *	@scmd:		sd-issued command that has failed
- *	@eh_cmnd:	The command that was sent during error handling
- *	@eh_cmnd_len:	Length of eh_cmnd in bytes
  *	@eh_disp:	The recovery disposition suggested by the midlayer
  *
- *	This function is called by the SCSI midlayer upon completion of
- *	an error handling command (TEST UNIT READY, START STOP UNIT,
- *	etc.) The command sent to the device by the error handler is
- *	stored in eh_cmnd. The result of sending the eh command is
- *	passed in eh_disp.
+ *	This function is called by the SCSI midlayer upon completion of an
+ *	error test command (currently TEST UNIT READY). The result of sending
+ *	the eh command is passed in eh_disp.  We're looking for devices that
+ *	fail medium access commands but are OK with non access commands like
+ *	test unit ready (so wrongly see the device as having a successful
+ *	recovery)
  **/
-static int sd_eh_action(struct scsi_cmnd *scmd, unsigned char *eh_cmnd,
-			int eh_cmnd_len, int eh_disp)
+static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp)
 {
 	struct scsi_disk *sdkp = scsi_disk(scmd->request->rq_disk);
 
 	if (!scsi_device_online(scmd->device) ||
-	    !scsi_medium_access_command(scmd))
+	    !scsi_medium_access_command(scmd) ||
+	    host_byte(scmd->result) != DID_TIME_OUT ||
+	    eh_disp != SUCCESS)
 		return eh_disp;
 
 	/*
@@ -1577,9 +1577,7 @@ static int sd_eh_action(struct scsi_cmnd *scmd, unsigned char *eh_cmnd,
 	 * process of recovering or has it suffered an internal failure
 	 * that prevents access to the storage medium.
 	 */
-	if (host_byte(scmd->result) == DID_TIME_OUT && eh_disp == SUCCESS &&
-	    eh_cmnd_len && eh_cmnd[0] == TEST_UNIT_READY)
-		sdkp->medium_access_timed_out++;
+	sdkp->medium_access_timed_out++;
 
 	/*
 	 * If the device keeps failing read/write commands but TEST UNIT
diff --git a/include/scsi/scsi_driver.h b/include/scsi/scsi_driver.h
index d443aa0..20fdfc2 100644
--- a/include/scsi/scsi_driver.h
+++ b/include/scsi/scsi_driver.h
@@ -16,7 +16,7 @@ struct scsi_driver {
 
 	void (*rescan)(struct device *);
 	int (*done)(struct scsi_cmnd *);
-	int (*eh_action)(struct scsi_cmnd *, unsigned char *, int, int);
+	int (*eh_action)(struct scsi_cmnd *, int);
 };
 #define to_scsi_driver(drv) \
 	container_of((drv), struct scsi_driver, gendrv)
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] scsi: improved eh timeout handler
  2013-10-31 13:02 [PATCHv8 0/3] New EH command timeout handler Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 1/3] scsi: Fix erratic device offline during EH Hannes Reinecke
@ 2013-10-31 13:02 ` Hannes Reinecke
  2013-10-31 15:49   ` Christoph Hellwig
  2013-11-01  6:10   ` Ren Mingxin
  2013-10-31 13:02 ` [PATCH 3/3] scsi: Update documentation Hannes Reinecke
  2 siblings, 2 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-10-31 13:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel,
	James Smart, Hannes Reinecke

When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Ren Mingxin <renmx@cn.fujitsu.com>
---
 drivers/scsi/scsi.c       |   3 +
 drivers/scsi/scsi_error.c | 152 ++++++++++++++++++++++++++++++++++++++++++----
 drivers/scsi/scsi_priv.h  |   2 +
 include/scsi/scsi_cmnd.h  |   2 +
 include/scsi/scsi_host.h  |   5 ++
 5 files changed, 151 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fe0bcb1..2b04a57 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -297,6 +297,7 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, gfp_t gfp_mask)
 
 		cmd->device = dev;
 		INIT_LIST_HEAD(&cmd->list);
+		INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
 		spin_lock_irqsave(&dev->list_lock, flags);
 		list_add_tail(&cmd->list, &dev->cmd_list);
 		spin_unlock_irqrestore(&dev->list_lock, flags);
@@ -353,6 +354,8 @@ void scsi_put_command(struct scsi_cmnd *cmd)
 	list_del_init(&cmd->list);
 	spin_unlock_irqrestore(&cmd->device->list_lock, flags);
 
+	cancel_delayed_work(&cmd->abort_work);
+
 	__scsi_put_command(cmd->device->host, cmd, &sdev->sdev_gendev);
 }
 EXPORT_SYMBOL(scsi_put_command);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 67c0014..8bd2615 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -53,6 +53,8 @@ static void scsi_eh_done(struct scsi_cmnd *scmd);
 #define HOST_RESET_SETTLE_TIME  (10)
 
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_try_to_abort_cmd(struct scsi_host_template *,
+				 struct scsi_cmnd *);
 
 /* called with shost->host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -100,6 +102,117 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 }
 
 /**
+ * scmd_eh_abort_handler - Handle command aborts
+ * @work:	command to be aborted.
+ */
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n",
+				    scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort failed, rtn %d\n",
+				    scmd, rtn));
+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "scmd %p terminate "
+				    "aborted command\n", scmd));
+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_command(scmd);
+	}
+}
+
+/**
+ * scsi_abort_command - schedule a command abort
+ * @scmd:	scmd to abort.
+ *
+ * We only need to abort commands after a command timeout
+ */
+int
+scsi_abort_command(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct Scsi_Host *shost = sdev->host;
+	unsigned long flags;
+
+	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) {
+		/*
+		 * command abort timed out.
+		 */
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort timeout\n", scmd));
+		cancel_delayed_work(&scmd->abort_work);
+		return FAILED;
+	}
+
+	/*
+	 * Do not try a command abort if
+	 * SCSI EH has already started.
+	 */
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_in_recovery(shost)) {
+		spin_unlock_irqrestore(shost->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p not aborting, host in recovery\n",
+				    scmd));
+		return FAILED;
+	}
+
+	if (shost->eh_deadline && !shost->last_reset)
+		shost->last_reset = jiffies;
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
+	scmd->eh_eflags |= SCSI_EH_ABORT_SCHEDULED;
+	SCSI_LOG_ERROR_RECOVERY(3,
+		scmd_printk(KERN_INFO, scmd,
+			    "scmd %p abort scheduled\n", scmd));
+	schedule_delayed_work(&scmd->abort_work, HZ / 100);
+	return SUCCESS;
+}
+EXPORT_SYMBOL_GPL(scsi_abort_command);
+
+/**
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd:	scmd to run eh on.
  * @eh_flag:	optional SCSI_EH flag.
@@ -125,6 +238,8 @@ int scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
 		shost->last_reset = jiffies;
 
 	ret = 1;
+	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
+		eh_flag &= ~SCSI_EH_CANCEL_CMD;
 	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
@@ -161,6 +276,10 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	else if (host->hostt->eh_timed_out)
 		rtn = host->hostt->eh_timed_out(scmd);
 
+	if (rtn == BLK_EH_NOT_HANDLED && !host->hostt->no_async_abort)
+		if (scsi_abort_command(scmd) == SUCCESS)
+			return BLK_EH_NOT_HANDLED;
+
 	scmd->result |= DID_TIME_OUT << 16;
 
 	if (unlikely(rtn == BLK_EH_NOT_HANDLED &&
@@ -1577,7 +1696,7 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q,
 }
 
 /**
- * scsi_noretry_cmd - determinte if command should be failed fast
+ * scsi_noretry_cmd - determine if command should be failed fast
  * @scmd:	SCSI cmd to examine.
  */
 int scsi_noretry_cmd(struct scsi_cmnd *scmd)
@@ -1585,6 +1704,8 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	switch (host_byte(scmd->result)) {
 	case DID_OK:
 		break;
+	case DID_TIME_OUT:
+		goto check_type;
 	case DID_BUS_BUSY:
 		return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);
 	case DID_PARITY:
@@ -1598,18 +1719,19 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 		return (scmd->request->cmd_flags & REQ_FAILFAST_DRIVER);
 	}
 
-	switch (status_byte(scmd->result)) {
-	case CHECK_CONDITION:
-		/*
-		 * assume caller has checked sense and determinted
-		 * the check condition was retryable.
-		 */
-		if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
-		    scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
-			return 1;
-	}
+	switch (status_byte(scmd->result) != CHECK_CONDITION)
+		return 0;
 
-	return 0;
+check_type:
+	/*
+	 * assume caller has checked sense and determined
+	 * the check condition was retryable.
+	 */
+	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
+	    scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
+		return 1;
+	else
+		return 0;
 }
 
 /**
@@ -1659,9 +1781,13 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * looks good.  drop through, and check the next byte.
 		 */
 		break;
+	case DID_ABORT:
+		if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) {
+			scmd->result |= DID_TIME_OUT << 16;
+			return SUCCESS;
+		}
 	case DID_NO_CONNECT:
 	case DID_BAD_TARGET:
-	case DID_ABORT:
 		/*
 		 * note - this means that we just report the status back
 		 * to the top level driver, not that we actually think
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 8f9a0ca..f079a59 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -19,6 +19,7 @@ struct scsi_nl_hdr;
  * Scsi Error Handler Flags
  */
 #define SCSI_EH_CANCEL_CMD	0x0001	/* Cancel this cmd */
+#define SCSI_EH_ABORT_SCHEDULED	0x0002	/* Abort has been scheduled */
 
 #define SCSI_SENSE_VALID(scmd) \
 	(((scmd)->sense_buffer[0] & 0x70) == 0x70)
@@ -66,6 +67,7 @@ extern int __init scsi_init_devinfo(void);
 extern void scsi_exit_devinfo(void);
 
 /* scsi_error.c */
+extern void scmd_eh_abort_handler(struct work_struct *work);
 extern enum blk_eh_timer_return scsi_times_out(struct request *req);
 extern int scsi_error_handler(void *host);
 extern int scsi_decide_disposition(struct scsi_cmnd *cmd);
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index de5f5d8..e3137e3 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -55,6 +55,7 @@ struct scsi_cmnd {
 	struct scsi_device *device;
 	struct list_head list;  /* scsi_cmnd participates in queue lists */
 	struct list_head eh_entry; /* entry for the host eh_cmd_q */
+	struct delayed_work abort_work;
 	int eh_eflags;		/* Used by error handlr */
 
 	/*
@@ -144,6 +145,7 @@ extern void scsi_put_command(struct scsi_cmnd *);
 extern void __scsi_put_command(struct Scsi_Host *, struct scsi_cmnd *,
 			       struct device *);
 extern void scsi_finish_command(struct scsi_cmnd *cmd);
+extern enum blk_eh_timer_return scsi_abort_command(struct scsi_cmnd *cmd);
 
 extern void *scsi_kmap_atomic_sg(struct scatterlist *sg, int sg_count,
 				 size_t *offset, size_t *len);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 5460849..85430ae 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -476,6 +476,11 @@ struct scsi_host_template {
 	unsigned ordered_tag:1;
 
 	/*
+	 * True if asynchronous aborts are not supported
+	 */
+	unsigned no_async_abort:1;
+
+	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */
 	unsigned int max_host_blocked;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] scsi: Update documentation
  2013-10-31 13:02 [PATCHv8 0/3] New EH command timeout handler Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 1/3] scsi: Fix erratic device offline during EH Hannes Reinecke
  2013-10-31 13:02 ` [PATCH 2/3] scsi: improved eh timeout handler Hannes Reinecke
@ 2013-10-31 13:02 ` Hannes Reinecke
  2 siblings, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-10-31 13:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel,
	James Smart, Hannes Reinecke

The documentation has gone out-of-sync, so update it to
the current status.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 Documentation/scsi/scsi_eh.txt          | 69 +++++++++++++++++++--------------
 Documentation/scsi/scsi_mid_low_api.txt |  9 ++++-
 drivers/scsi/scsi.c                     |  6 +--
 3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 6ff16b6..a0c8511 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -42,20 +42,14 @@ discussion.
 
  Once LLDD gets hold of a scmd, either the LLDD will complete the
 command by calling scsi_done callback passed from midlayer when
-invoking hostt->queuecommand() or SCSI midlayer will time it out.
+invoking hostt->queuecommand() or the block layer will time it out.
 
 
 [1-2-1] Completing a scmd w/ scsi_done
 
  For all non-EH commands, scsi_done() is the completion callback.  It
-does the following.
-
- 1. Delete timeout timer.  If it fails, it means that timeout timer
-    has expired and is going to finish the command.  Just return.
-
- 2. Link scmd to per-cpu scsi_done_q using scmd->en_entry
-
- 3. Raise SCSI_SOFTIRQ
+just calls blk_complete_request() to delete the block layer timer and
+raise SCSI_SOFTIRQ
 
  SCSI_SOFTIRQ handler scsi_softirq calls scsi_decide_disposition() to
 determine what to do with the command.  scsi_decide_disposition()
@@ -64,10 +58,12 @@ with the command.
 
  - SUCCESS
 	scsi_finish_command() is invoked for the command.  The
-	function does some maintenance choirs and notify completion by
-	calling scmd->done() callback, which, for fs requests, would
-	be HLD completion callback - sd:sd_rw_intr, sr:rw_intr,
-	st:st_intr.
+	function does some maintenance chores and then calls
+	scsi_io_completion() to finish the I/O.
+	scsi_io_completion() then notifies the block layer on
+	the completed request by calling blk_end_request and
+	friends or figures out what to do with the remainder
+	of the data in case of an error.
 
  - NEEDS_RETRY
  - ADD_TO_MLQUEUE
@@ -86,33 +82,45 @@ function
  1. invokes optional hostt->eh_timed_out() callback.  Return value can
     be one of
 
-    - EH_HANDLED
-	This indicates that eh_timed_out() dealt with the timeout.  The
-	scmd is passed to __scsi_done() and thus linked into per-cpu
-	scsi_done_q.  Normal command completion described in [1-2-1]
-	follows.
+    - BLK_EH_HANDLED
+	This indicates that eh_timed_out() dealt with the timeout.
+	The command is passed back to the block layer and completed
+	via __blk_complete_requests().
+
+	*NOTE* After returning BLK_EH_HANDLED the SCSI layer is
+	assumed to be finished with the command, and no other
+	functions from the SCSI layer will be called. So this
+	should typically only be returned if the eh_timed_out()
+	handler raced with normal completion.
 
-    - EH_RESET_TIMER
+    - BLK_EH_RESET_TIMER
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.  This action is counted as a
 	retry and only allowed scmd->allowed + 1(!) times.  Once the
-	limit is reached, action for EH_NOT_HANDLED is taken instead.
+	limit is reached, action for BLK_EH_NOT_HANDLED is taken instead.
 
-	*NOTE* This action is racy as the LLDD could finish the scmd
-	after the timeout has expired but before it's added back.  In
-	such cases, scsi_done() would think that timeout has occurred
-	and return without doing anything.  We lose completion and the
-	command will time out again.
-
-    - EH_NOT_HANDLED
-	This is the same as when eh_timed_out() callback doesn't exist.
+    - BLK_EH_NOT_HANDLED
+        eh_timed_out() callback did not handle the command.
 	Step #2 is taken.
 
+ 2. If the host supports asynchronous completion (as indicated by the
+    no_async_abort setting in the host template) scsi_abort_command()
+    is invoked to schedule an asynchrous abort. If that fails
+    Step #3 is taken.
+
  2. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the
     command.  See [1-3] for more information.
 
+[1-3] Asynchronous command aborts
+
+ After a timeout occurs a command abort is scheduled from
+ scsi_abort_command(). If the abort is successful the command
+ will either be retried (if the number of retries is not exhausted)
+ or terminated with DID_TIME_OUT.
+ Otherwise scsi_eh_scmd_add() is invoked for the command.
+ See [1-4] for more information.
 
-[1-3] How EH takes over
+[1-4] How EH takes over
 
  scmds enter EH via scsi_eh_scmd_add(), which does the following.
 
@@ -320,7 +328,8 @@ scmd->allowed.
 
     <<scsi_eh_abort_cmds>>
 
-	This action is taken for each timed out command.
+	This action is taken for each timed out command when
+	no_async_abort is enabled in the host template.
 	hostt->eh_abort_handler() is invoked for each scmd.  The
 	handler returns SUCCESS if it has succeeded to make LLDD and
 	all related hardware forget about the scmd.
diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt
index 2b06aba..d6a9bde 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -882,8 +882,11 @@ Details:
  *
  *      Calling context: kernel thread
  *
- *      Notes: Invoked from scsi_eh thread. No other commands will be
- *      queued on current host during eh.
+ *      Notes: If 'no_async_abort' is defined this callback
+ *  	will be invoked from scsi_eh thread. No other commands
+ *	will then be queued on current host during eh.
+ *	Otherwise it will be called whenever scsi_times_out()
+ *      is called due to a command timeout.
  *
  *      Optionally defined in: LLD
  **/
@@ -1257,6 +1260,8 @@ of interest:
                    address space
     use_clustering - 1=>SCSI commands in mid level's queue can be merged,
                      0=>disallow SCSI command merging
+    no_async_abort - 1=>Asynchronous aborts are not supported
+                     0=>Timed-out commands will be aborted asynchronously
     hostt        - pointer to driver's struct scsi_host_template from which
                    this struct Scsi_Host instance was spawned
     hostt->proc_name  - name of LLD. This is the driver name that sysfs uses
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 2b04a57..d8afec8 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -745,15 +745,13 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 }
 
 /**
- * scsi_done - Enqueue the finished SCSI command into the done queue.
+ * scsi_done - Invoke completion on finished SCSI command.
  * @cmd: The SCSI Command for which a low-level device driver (LLDD) gives
  * ownership back to SCSI Core -- i.e. the LLDD has finished with it.
  *
  * Description: This function is the mid-level's (SCSI Core) interrupt routine,
  * which regains ownership of the SCSI command (de facto) from a LLDD, and
- * enqueues the command to the done queue for further processing.
- *
- * This is the producer of the done queue who enqueues at the tail.
+ * calls blk_complete_request() for further processing.
  *
  * This function is interrupt context safe.
  */
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-10-31 13:02 ` [PATCH 2/3] scsi: improved eh timeout handler Hannes Reinecke
@ 2013-10-31 15:49   ` Christoph Hellwig
  2013-11-04 13:36     ` Hannes Reinecke
  2013-11-01  6:10   ` Ren Mingxin
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2013-10-31 15:49 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Ren Mingxin,
	Joern Engel, James Smart

Looks reasonable to me, but a few minor nitpicks:

> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> +	if (scsi_host_eh_past_deadline(sdev->host)) {

I don't have the implementation of scsi_host_eh_past_deadline in my
local tree, but do we really need the host lock for it?

> +int
> +scsi_abort_command(struct scsi_cmnd *scmd)

Semms like this should be static and not exported in the current version
of the code?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-10-31 13:02 ` [PATCH 2/3] scsi: improved eh timeout handler Hannes Reinecke
  2013-10-31 15:49   ` Christoph Hellwig
@ 2013-11-01  6:10   ` Ren Mingxin
  1 sibling, 0 replies; 17+ messages in thread
From: Ren Mingxin @ 2013-11-01  6:10 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Joern Engel, James Smart

Hi, Hannes:

I'm sorry that I don't know why you didn't consider my former patch
below which not only raises the minimum valid value of 'eh_deadline'
as '0' for your former patchset but also includes some fix for your
this patchset:

http://www.spinics.net/lists/linux-scsi/msg69361.html

If you think I'd post the minimum value issue as a improvement when
your patchset is accepted, I've no problem;-)

On 10/31/2013 09:02 PM, Hannes Reinecke wrote:
> +void
> +scmd_eh_abort_handler(struct work_struct *work)
> +{
> +	struct scsi_cmnd *scmd =
> +		container_of(work, struct scsi_cmnd, abort_work.work);
> +	struct scsi_device *sdev = scmd->device;
> +	unsigned long flags;
> +	int rtn;
> +
> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> +		spin_unlock_irqrestore(sdev->host->host_lock, flags);
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_INFO, scmd,
> +				    "scmd %p eh timeout, not aborting\n",
> +				    scmd));
> +	} else {
> +		spin_unlock_irqrestore(sdev->host->host_lock, flags);
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_INFO, scmd,
> +				    "aborting command %p\n", scmd));
> +		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
> +		if (rtn == SUCCESS) {
> +			scmd->result |= DID_TIME_OUT<<  16;
> +			if (!scsi_noretry_cmd(scmd)&&
> +			    (++scmd->retries<= scmd->allowed)) {

scsi_host_eh_past_deadline() should also be checked here before long
term retrying.

>
> +				SCSI_LOG_ERROR_RECOVERY(3,
> +					scmd_printk(KERN_WARNING, scmd,
> +						    "scmd %p retry "
> +						    "aborted command\n", scmd));
> +				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
> +			} else {
> +				SCSI_LOG_ERROR_RECOVERY(3,
> +					scmd_printk(KERN_WARNING, scmd,
> +						    "scmd %p finish "
> +						    "aborted command\n", scmd));
> +				scsi_finish_command(scmd);
> +			}
> +			return;
> +		}
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_INFO, scmd,
> +				    "scmd %p abort failed, rtn %d\n",
> +				    scmd, rtn));
> +	}
> +
> +	if (scsi_eh_scmd_add(scmd, 0)) {

scsi_finish_command() should be invoked if scsi_eh_scmd_add() is
returned on failure.

Thanks,
Ren

>
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_WARNING, scmd,
> +				    "scmd %p terminate "
> +				    "aborted command\n", scmd));
> +		scmd->result |= DID_TIME_OUT<<  16;
> +		scsi_finish_command(scmd);
> +	}
> +}
> <snip>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-10-31 15:49   ` Christoph Hellwig
@ 2013-11-04 13:36     ` Hannes Reinecke
  2013-11-04 14:25       ` James Bottomley
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2013-11-04 13:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, linux-scsi, Ren Mingxin, Joern Engel, James Smart

On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
> Looks reasonable to me, but a few minor nitpicks:
> 
>> +	spin_lock_irqsave(sdev->host->host_lock, flags);
>> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> 
> I don't have the implementation of scsi_host_eh_past_deadline in my
> local tree, but do we really need the host lock for it?
> 
Yes. The eh_deadline variable might be set from an interrupt context
or from userland, so we need to protect access to it.

>> +int
>> +scsi_abort_command(struct scsi_cmnd *scmd)
> 
> Seems like this should be static and not exported in the current version
> of the code?
> 
Yep. Will be doing so.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-11-04 13:36     ` Hannes Reinecke
@ 2013-11-04 14:25       ` James Bottomley
  2013-11-04 14:46         ` Hannes Reinecke
  0 siblings, 1 reply; 17+ messages in thread
From: James Bottomley @ 2013-11-04 14:25 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel, James Smart

On Mon, 2013-11-04 at 14:36 +0100, Hannes Reinecke wrote:
> On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
> > Looks reasonable to me, but a few minor nitpicks:
> > 
> >> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> >> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> > 
> > I don't have the implementation of scsi_host_eh_past_deadline in my
> > local tree, but do we really need the host lock for it?
> > 
> Yes. The eh_deadline variable might be set from an interrupt context
> or from userland, so we need to protect access to it.

That's not really true.  on all our supported architectures 32 bit
reads/writes are atomic, which means that if one CPU writes a word at
the same time another reads one, the reader is guaranteed to see either
the old or the new data.  Given the expense of lock cache line bouncing
on the newer architectures, we really want to avoid a spinlock where
possible.

In this case, the problem with the implementation is that the writer
might set eh_deadline to zero, but this is fixable in
scsi_host_eh_past_deadline() by checking for zero before and after the
time_before (for the zero to non-zero and non-zero to zero cases).

James


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-11-04 14:25       ` James Bottomley
@ 2013-11-04 14:46         ` Hannes Reinecke
  2013-11-04 14:50           ` James Bottomley
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2013-11-04 14:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel, James Smart

[-- Attachment #1: Type: text/plain, Size: 1517 bytes --]

On 11/04/2013 03:25 PM, James Bottomley wrote:
> On Mon, 2013-11-04 at 14:36 +0100, Hannes Reinecke wrote:
>> On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
>>> Looks reasonable to me, but a few minor nitpicks:
>>>
>>>> +	spin_lock_irqsave(sdev->host->host_lock, flags);
>>>> +	if (scsi_host_eh_past_deadline(sdev->host)) {
>>>
>>> I don't have the implementation of scsi_host_eh_past_deadline in my
>>> local tree, but do we really need the host lock for it?
>>>
>> Yes. The eh_deadline variable might be set from an interrupt context
>> or from userland, so we need to protect access to it.
> 
> That's not really true.  on all our supported architectures 32 bit
> reads/writes are atomic, which means that if one CPU writes a word at
> the same time another reads one, the reader is guaranteed to see either
> the old or the new data.  Given the expense of lock cache line bouncing
> on the newer architectures, we really want to avoid a spinlock where
> possible.
> 
> In this case, the problem with the implementation is that the writer
> might set eh_deadline to zero, but this is fixable in
> scsi_host_eh_past_deadline() by checking for zero before and after the
> time_before (for the zero to non-zero and non-zero to zero cases).
> 
IE you mean something like that attached patch?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[-- Attachment #2: tmp-eh-unlock.patch --]
[-- Type: text/x-patch, Size: 5597 bytes --]

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6a137fa..8abf7ba 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -94,8 +94,10 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 	if (!shost->last_reset || !shost->eh_deadline)
 		return 0;
 
+	/* Double check eh_deadline to catch atomic updates */
 	if (time_before(jiffies,
-			shost->last_reset + shost->eh_deadline))
+			shost->last_reset + shost->eh_deadline) &&
+	    shost->eh_deadline)
 		return 0;
 
 	return 1;
@@ -114,15 +116,12 @@ scmd_eh_abort_handler(struct work_struct *work)
 	unsigned long flags;
 	int rtn;
 
-	spin_lock_irqsave(sdev->host->host_lock, flags);
 	if (scsi_host_eh_past_deadline(sdev->host)) {
-		spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3,
 			scmd_printk(KERN_INFO, scmd,
 				    "scmd %p eh timeout, not aborting\n",
 				    scmd));
 	} else {
-		spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3,
 			scmd_printk(KERN_INFO, scmd,
 				    "aborting command %p\n", scmd));
@@ -1140,16 +1139,13 @@ int scsi_eh_get_sense(struct list_head *work_q,
 			continue;
 
 		shost = scmd->device->host;
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
 						  "%s: requesting sense\n",
 						  current->comm));
@@ -1242,19 +1238,15 @@ static int scsi_eh_test_devices(struct list_head *cmd_list,
 		sdev = scmd->device;
 
 		if (!try_stu) {
-			spin_lock_irqsave(sdev->host->host_lock, flags);
 			if (scsi_host_eh_past_deadline(sdev->host)) {
 				/* Push items back onto work_q */
 				list_splice_init(cmd_list, work_q);
-				spin_unlock_irqrestore(sdev->host->host_lock,
-						       flags);
 				SCSI_LOG_ERROR_RECOVERY(3,
 					shost_printk(KERN_INFO, sdev->host,
 						     "skip %s, past eh deadline",
 						     __func__));
 				break;
 			}
-			spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		}
 
 		finish_cmds = !scsi_device_online(scmd->device) ||
@@ -1301,9 +1293,7 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
 		if (!(scmd->eh_eflags & SCSI_EH_CANCEL_CMD))
 			continue;
 		shost = scmd->device->host;
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			list_splice_init(&check_list, work_q);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
@@ -1311,7 +1301,6 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: aborting cmd:"
 						  "0x%p\n", current->comm,
 						  scmd));
@@ -1378,16 +1367,13 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 	unsigned long flags;
 
 	shost_for_each_device(sdev, shost) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		stu_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
@@ -1445,16 +1431,13 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 	int rtn;
 
 	shost_for_each_device(sdev, shost) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		bdr_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev) {
@@ -1517,9 +1500,7 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
 		unsigned int id;
 		unsigned long flags;
 
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			/* push back on work queue for further processing */
 			list_splice_init(&check_list, work_q);
 			list_splice_init(&tmp_list, work_q);
@@ -1529,7 +1510,6 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		scmd = list_entry(tmp_list.next, struct scsi_cmnd, eh_entry);
 		id = scmd_id(scmd);
@@ -1584,9 +1564,7 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
 	 */
 
 	for (channel = 0; channel <= shost->max_channel; channel++) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			list_splice_init(&check_list, work_q);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
@@ -1594,7 +1572,6 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		chan_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry) {

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-11-04 14:46         ` Hannes Reinecke
@ 2013-11-04 14:50           ` James Bottomley
  2013-11-04 15:43             ` Hannes Reinecke
  0 siblings, 1 reply; 17+ messages in thread
From: James Bottomley @ 2013-11-04 14:50 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel, James Smart

On Mon, 2013-11-04 at 15:46 +0100, Hannes Reinecke wrote:
> On 11/04/2013 03:25 PM, James Bottomley wrote:
> > On Mon, 2013-11-04 at 14:36 +0100, Hannes Reinecke wrote:
> >> On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
> >>> Looks reasonable to me, but a few minor nitpicks:
> >>>
> >>>> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> >>>> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> >>>
> >>> I don't have the implementation of scsi_host_eh_past_deadline in my
> >>> local tree, but do we really need the host lock for it?
> >>>
> >> Yes. The eh_deadline variable might be set from an interrupt context
> >> or from userland, so we need to protect access to it.
> > 
> > That's not really true.  on all our supported architectures 32 bit
> > reads/writes are atomic, which means that if one CPU writes a word at
> > the same time another reads one, the reader is guaranteed to see either
> > the old or the new data.  Given the expense of lock cache line bouncing
> > on the newer architectures, we really want to avoid a spinlock where
> > possible.
> > 
> > In this case, the problem with the implementation is that the writer
> > might set eh_deadline to zero, but this is fixable in
> > scsi_host_eh_past_deadline() by checking for zero before and after the
> > time_before (for the zero to non-zero and non-zero to zero cases).
> > 
> IE you mean something like that attached patch?

Yes (except that there should be a comment explaining why we do the read
twice), I think the cost of the extra read check is much less than the
spinlock on all of our platforms.

James


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-11-04 14:50           ` James Bottomley
@ 2013-11-04 15:43             ` Hannes Reinecke
  2013-11-05  1:07               ` James Bottomley
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2013-11-04 15:43 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel, James Smart

[-- Attachment #1: Type: text/plain, Size: 2084 bytes --]

On 11/04/2013 03:50 PM, James Bottomley wrote:
> On Mon, 2013-11-04 at 15:46 +0100, Hannes Reinecke wrote:
>> On 11/04/2013 03:25 PM, James Bottomley wrote:
>>> On Mon, 2013-11-04 at 14:36 +0100, Hannes Reinecke wrote:
>>>> On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
>>>>> Looks reasonable to me, but a few minor nitpicks:
>>>>>
>>>>>> +	spin_lock_irqsave(sdev->host->host_lock, flags);
>>>>>> +	if (scsi_host_eh_past_deadline(sdev->host)) {
>>>>>
>>>>> I don't have the implementation of scsi_host_eh_past_deadline in my
>>>>> local tree, but do we really need the host lock for it?
>>>>>
>>>> Yes. The eh_deadline variable might be set from an interrupt context
>>>> or from userland, so we need to protect access to it.
>>>
>>> That's not really true.  on all our supported architectures 32 bit
>>> reads/writes are atomic, which means that if one CPU writes a word at
>>> the same time another reads one, the reader is guaranteed to see either
>>> the old or the new data.  Given the expense of lock cache line bouncing
>>> on the newer architectures, we really want to avoid a spinlock where
>>> possible.
>>>
>>> In this case, the problem with the implementation is that the writer
>>> might set eh_deadline to zero, but this is fixable in
>>> scsi_host_eh_past_deadline() by checking for zero before and after the
>>> time_before (for the zero to non-zero and non-zero to zero cases).
>>>
>> IE you mean something like that attached patch?
> 
> Yes (except that there should be a comment explaining why we do the read
> twice), I think the cost of the extra read check is much less than the
> spinlock on all of our platforms.
> 
So, this is what I've ended up with; sadly I had to use 'volatile'
here which checkpatch doesn't like.
I _could_ move eh_deadline to be atomic, that would avoid the
'volatile' setting. Feels like an overkill, though.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[-- Attachment #2: 0003-scsi-Unlock-accesses-to-eh_deadline.patch --]
[-- Type: text/x-patch, Size: 7927 bytes --]

>From 283f1b50e833fad969323531ccd0ce889a5e4044 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Mon, 4 Nov 2013 16:23:36 +0100
Subject: [PATCH 3/5] scsi: Unlock accesses to eh_deadline

32bit accesses are guaranteed to be atomic, so we can remove
the spinlock when checking for eh_deadline. We only need to
make sure to catch any updates which might happened during
the call to time_before(); if so we just recheck with the
correct value.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_error.c | 54 +++++++++++++++--------------------------------
 1 file changed, 17 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 7eecbb5..d122e89 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -91,13 +91,26 @@ EXPORT_SYMBOL_GPL(scsi_schedule_eh);
 
 static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 {
-	if (!shost->last_reset || !shost->eh_deadline)
-		return 0;
+	volatile int eh_deadline;
 
-	if (time_before(jiffies,
-			shost->last_reset + shost->eh_deadline))
+recheck:
+	eh_deadline = shost->eh_deadline;
+	if (!shost->last_reset || !eh_deadline)
 		return 0;
 
+	/*
+	 * 32bit accesses are guaranteed to be atomic
+	 * (on all supported architectures), so instead
+	 * of using a spinlock we can as well double check
+	 * if eh_deadline has been modified after the
+	 * time_before call; if so we need to recheck
+	 * with the correct values.
+	 */
+	if (time_before(jiffies, shost->last_reset + eh_deadline)) {
+		if (eh_deadline != shost->eh_deadline)
+			goto recheck;
+		return 0;
+	}
 	return 1;
 }
 
@@ -111,18 +124,14 @@ scmd_eh_abort_handler(struct work_struct *work)
 	struct scsi_cmnd *scmd =
 		container_of(work, struct scsi_cmnd, abort_work.work);
 	struct scsi_device *sdev = scmd->device;
-	unsigned long flags;
 	int rtn;
 
-	spin_lock_irqsave(sdev->host->host_lock, flags);
 	if (scsi_host_eh_past_deadline(sdev->host)) {
-		spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3,
 			scmd_printk(KERN_INFO, scmd,
 				    "scmd %p eh timeout, not aborting\n",
 				    scmd));
 	} else {
-		spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3,
 			scmd_printk(KERN_INFO, scmd,
 				    "aborting command %p\n", scmd));
@@ -1132,7 +1141,6 @@ int scsi_eh_get_sense(struct list_head *work_q,
 	struct scsi_cmnd *scmd, *next;
 	struct Scsi_Host *shost;
 	int rtn;
-	unsigned long flags;
 
 	list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
 		if ((scmd->eh_eflags & SCSI_EH_CANCEL_CMD) ||
@@ -1140,16 +1148,13 @@ int scsi_eh_get_sense(struct list_head *work_q,
 			continue;
 
 		shost = scmd->device->host;
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
 						  "%s: requesting sense\n",
 						  current->comm));
@@ -1235,26 +1240,21 @@ static int scsi_eh_test_devices(struct list_head *cmd_list,
 	struct scsi_cmnd *scmd, *next;
 	struct scsi_device *sdev;
 	int finish_cmds;
-	unsigned long flags;
 
 	while (!list_empty(cmd_list)) {
 		scmd = list_entry(cmd_list->next, struct scsi_cmnd, eh_entry);
 		sdev = scmd->device;
 
 		if (!try_stu) {
-			spin_lock_irqsave(sdev->host->host_lock, flags);
 			if (scsi_host_eh_past_deadline(sdev->host)) {
 				/* Push items back onto work_q */
 				list_splice_init(cmd_list, work_q);
-				spin_unlock_irqrestore(sdev->host->host_lock,
-						       flags);
 				SCSI_LOG_ERROR_RECOVERY(3,
 					shost_printk(KERN_INFO, sdev->host,
 						     "skip %s, past eh deadline",
 						     __func__));
 				break;
 			}
-			spin_unlock_irqrestore(sdev->host->host_lock, flags);
 		}
 
 		finish_cmds = !scsi_device_online(scmd->device) ||
@@ -1295,15 +1295,12 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
 	LIST_HEAD(check_list);
 	int rtn;
 	struct Scsi_Host *shost;
-	unsigned long flags;
 
 	list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
 		if (!(scmd->eh_eflags & SCSI_EH_CANCEL_CMD))
 			continue;
 		shost = scmd->device->host;
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			list_splice_init(&check_list, work_q);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
@@ -1311,7 +1308,6 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: aborting cmd:"
 						  "0x%p\n", current->comm,
 						  scmd));
@@ -1375,19 +1371,15 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 {
 	struct scsi_cmnd *scmd, *stu_scmd, *next;
 	struct scsi_device *sdev;
-	unsigned long flags;
 
 	shost_for_each_device(sdev, shost) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		stu_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
@@ -1441,20 +1433,16 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 {
 	struct scsi_cmnd *scmd, *bdr_scmd, *next;
 	struct scsi_device *sdev;
-	unsigned long flags;
 	int rtn;
 
 	shost_for_each_device(sdev, shost) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
 					    "skip %s, past eh deadline\n",
 					     __func__));
 			break;
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 		bdr_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev) {
@@ -1515,11 +1503,8 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
 		struct scsi_cmnd *next, *scmd;
 		int rtn;
 		unsigned int id;
-		unsigned long flags;
 
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			/* push back on work queue for further processing */
 			list_splice_init(&check_list, work_q);
 			list_splice_init(&tmp_list, work_q);
@@ -1529,7 +1514,6 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		scmd = list_entry(tmp_list.next, struct scsi_cmnd, eh_entry);
 		id = scmd_id(scmd);
@@ -1574,7 +1558,6 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
 	LIST_HEAD(check_list);
 	unsigned int channel;
 	int rtn;
-	unsigned long flags;
 
 	/*
 	 * we really want to loop over the various channels, and do this on
@@ -1584,9 +1567,7 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
 	 */
 
 	for (channel = 0; channel <= shost->max_channel; channel++) {
-		spin_lock_irqsave(shost->host_lock, flags);
 		if (scsi_host_eh_past_deadline(shost)) {
-			spin_unlock_irqrestore(shost->host_lock, flags);
 			list_splice_init(&check_list, work_q);
 			SCSI_LOG_ERROR_RECOVERY(3,
 				shost_printk(KERN_INFO, shost,
@@ -1594,7 +1575,6 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
 					     __func__));
 			return list_empty(work_q);
 		}
-		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		chan_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry) {
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-11-04 15:43             ` Hannes Reinecke
@ 2013-11-05  1:07               ` James Bottomley
  0 siblings, 0 replies; 17+ messages in thread
From: James Bottomley @ 2013-11-05  1:07 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, linux-scsi, Ren Mingxin, Joern Engel, James Smart

On Mon, 2013-11-04 at 16:43 +0100, Hannes Reinecke wrote:
> On 11/04/2013 03:50 PM, James Bottomley wrote:
> > On Mon, 2013-11-04 at 15:46 +0100, Hannes Reinecke wrote:
> >> On 11/04/2013 03:25 PM, James Bottomley wrote:
> >>> On Mon, 2013-11-04 at 14:36 +0100, Hannes Reinecke wrote:
> >>>> On 10/31/2013 04:49 PM, Christoph Hellwig wrote:
> >>>>> Looks reasonable to me, but a few minor nitpicks:
> >>>>>
> >>>>>> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> >>>>>> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> >>>>>
> >>>>> I don't have the implementation of scsi_host_eh_past_deadline in my
> >>>>> local tree, but do we really need the host lock for it?
> >>>>>
> >>>> Yes. The eh_deadline variable might be set from an interrupt context
> >>>> or from userland, so we need to protect access to it.
> >>>
> >>> That's not really true.  on all our supported architectures 32 bit
> >>> reads/writes are atomic, which means that if one CPU writes a word at
> >>> the same time another reads one, the reader is guaranteed to see either
> >>> the old or the new data.  Given the expense of lock cache line bouncing
> >>> on the newer architectures, we really want to avoid a spinlock where
> >>> possible.
> >>>
> >>> In this case, the problem with the implementation is that the writer
> >>> might set eh_deadline to zero, but this is fixable in
> >>> scsi_host_eh_past_deadline() by checking for zero before and after the
> >>> time_before (for the zero to non-zero and non-zero to zero cases).
> >>>
> >> IE you mean something like that attached patch?
> > 
> > Yes (except that there should be a comment explaining why we do the read
> > twice), I think the cost of the extra read check is much less than the
> > spinlock on all of our platforms.
> > 
> So, this is what I've ended up with; sadly I had to use 'volatile'
> here which checkpatch doesn't like.

Why?  Volatile has no real meaning on a local variable.  You can just do
an ordinary eh_deadline = shost->eh_deadline; and it will see either the
before or after value.

> I _could_ move eh_deadline to be atomic, that would avoid the
> 'volatile' setting. Feels like an overkill, though.

Please dump the recheck loop and just check for zero again.  The race is
acceptable because we're not trying to mediate it in a meaningful way.
As long as the result is consistent with either the before or after
value, that's fine.

James


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-09-20  7:59   ` Ren Mingxin
@ 2013-10-02 16:24     ` Hannes Reinecke
  0 siblings, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-10-02 16:24 UTC (permalink / raw)
  To: James Bottomley
  Cc: Ren Mingxin, linux-scsi, Ewan Milne, Joern Engel, James Smart,
	Bart Van Assche, Roland Dreier

On 09/20/2013 09:59 AM, Ren Mingxin wrote:
> Hi, Hannes:
>
> On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
>> +scmd_eh_abort_handler(struct work_struct *work)
>> +{
>> +    struct scsi_cmnd *scmd =
>> +        container_of(work, struct scsi_cmnd, abort_work.work);
>> +    struct scsi_device *sdev = scmd->device;
>> +    unsigned long flags;
>> +    int rtn;
>> +
>> +    spin_lock_irqsave(sdev->host->host_lock, flags);
>> +    if (scsi_host_eh_past_deadline(sdev->host)) {
>> +        spin_unlock_irqrestore(sdev->host->host_lock, flags);
>> +        SCSI_LOG_ERROR_RECOVERY(3,
>> +            scmd_printk(KERN_INFO, scmd,
>> +                    "scmd %p eh timeout, not aborting\n", scmd));
>> +    } else {
>> +        spin_unlock_irqrestore(sdev->host->host_lock, flags);
>> +        SCSI_LOG_ERROR_RECOVERY(3,
>> +            scmd_printk(KERN_INFO, scmd,
>> +                    "aborting command %p\n", scmd));
>> +        rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
>> +        if (rtn == SUCCESS) {
>> +            scmd->result |= DID_TIME_OUT<<  16;
>> +            if (!scsi_noretry_cmd(scmd)&&
>> +                (++scmd->retries<= scmd->allowed)) {
>
> I think scsi_host_eh_past_deadline() should be checked here like:
>
> -            if (!scsi_noretry_cmd(scmd)&&
> +            if (!scsi_host_eh_past_deadline(sdev->host)&&
> +                !scsi_noretry_cmd(scmd)&&
>
> According to my test, once retry requires 30 seconds. If eh_deadline
> is reached, we can stop EH here without waiting for long term
> retrying.
>

Hmm. We could. Providing the eh_deadline patches are going in.

James, should I resend the patchset?
Does it even have a chance of getting in?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-09-02 11:58 ` [PATCH 2/3] scsi: improved eh " Hannes Reinecke
  2013-09-11  9:16   ` Ren Mingxin
@ 2013-09-20  7:59   ` Ren Mingxin
  2013-10-02 16:24     ` Hannes Reinecke
  1 sibling, 1 reply; 17+ messages in thread
From: Ren Mingxin @ 2013-09-20  7:59 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: James Bottomley, linux-scsi, Ewan Milne, Joern Engel,
	James Smart, Bart Van Assche, Roland Dreier

Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
> +scmd_eh_abort_handler(struct work_struct *work)
> +{
> +	struct scsi_cmnd *scmd =
> +		container_of(work, struct scsi_cmnd, abort_work.work);
> +	struct scsi_device *sdev = scmd->device;
> +	unsigned long flags;
> +	int rtn;
> +
> +	spin_lock_irqsave(sdev->host->host_lock, flags);
> +	if (scsi_host_eh_past_deadline(sdev->host)) {
> +		spin_unlock_irqrestore(sdev->host->host_lock, flags);
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_INFO, scmd,
> +				    "scmd %p eh timeout, not aborting\n", scmd));
> +	} else {
> +		spin_unlock_irqrestore(sdev->host->host_lock, flags);
> +		SCSI_LOG_ERROR_RECOVERY(3,
> +			scmd_printk(KERN_INFO, scmd,
> +				    "aborting command %p\n", scmd));
> +		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
> +		if (rtn == SUCCESS) {
> +			scmd->result |= DID_TIME_OUT<<  16;
> +			if (!scsi_noretry_cmd(scmd)&&
> +			    (++scmd->retries<= scmd->allowed)) {

I think scsi_host_eh_past_deadline() should be checked here like:

-			if (!scsi_noretry_cmd(scmd)&&
+			if (!scsi_host_eh_past_deadline(sdev->host)&&
+			    !scsi_noretry_cmd(scmd)&&

According to my test, once retry requires 30 seconds. If eh_deadline
is reached, we can stop EH here without waiting for long term
retrying.

Thanks,
Ren

>
> +				SCSI_LOG_ERROR_RECOVERY(3,
> +					scmd_printk(KERN_WARNING, scmd,
> +						    "scmd %p retry "
> +						    "aborted command\n", scmd));
> +				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
> +			} else {
> +				SCSI_LOG_ERROR_RECOVERY(3,
> +					scmd_printk(KERN_WARNING, scmd,
> +						    "scmd %p finish "
> +						    "aborted command\n", scmd));
> +				scsi_finish_command(scmd);
> +			}
> +			return;
> +		}


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-09-11  9:16   ` Ren Mingxin
@ 2013-09-12 20:49     ` Hannes Reinecke
  0 siblings, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-09-12 20:49 UTC (permalink / raw)
  To: Ren Mingxin
  Cc: James Bottomley, linux-scsi, Ewan Milne, Joern Engel,
	James Smart, Bart Van Assche, Roland Dreier

On 09/11/2013 11:16 AM, Ren Mingxin wrote:
> Hi, Hannes:
>
> On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
>> If abort succeeds the command is either retried or terminated,
>> depending on the number of allowed retries. However, 'eh_eflags'
>> records the abort, so if the retry would fail again the
>> command is pushed onto the error handler without trying to
>> abort it (again); it'll be cleared up from SCSI EH.
>
> I'm still thinking about the aborting 'scsi_eh_abort_cmds()' in SCSI
> EH - does it make sense to abort in SCSI EH since we've tried to
> abort via your scsi_abort_command()? Though the aborting in SCSI EH
> will handle commands which havn't been aborted in scsi_abort_command
> since EH has been engaged.
>
Well, with the original design scsi_abort_command() was optional, and
only enabled for selected HBAs.
And I would rather keep it the original implementation, as there might
be HBAs who don't like inline aborts.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] scsi: improved eh timeout handler
  2013-09-02 11:58 ` [PATCH 2/3] scsi: improved eh " Hannes Reinecke
@ 2013-09-11  9:16   ` Ren Mingxin
  2013-09-12 20:49     ` Hannes Reinecke
  2013-09-20  7:59   ` Ren Mingxin
  1 sibling, 1 reply; 17+ messages in thread
From: Ren Mingxin @ 2013-09-11  9:16 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: James Bottomley, linux-scsi, Ewan Milne, Joern Engel,
	James Smart, Bart Van Assche, Roland Dreier

Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
> If abort succeeds the command is either retried or terminated,
> depending on the number of allowed retries. However, 'eh_eflags'
> records the abort, so if the retry would fail again the
> command is pushed onto the error handler without trying to
> abort it (again); it'll be cleared up from SCSI EH.

I'm still thinking about the aborting 'scsi_eh_abort_cmds()' in SCSI
EH - does it make sense to abort in SCSI EH since we've tried to
abort via your scsi_abort_command()? Though the aborting in SCSI EH
will handle commands which havn't been aborted in scsi_abort_command
since EH has been engaged.

Thanks,
Ren

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/3] scsi: improved eh timeout handler
  2013-09-02 11:58 [PATCHv6 0/3] New EH command timeout handler Hannes Reinecke
@ 2013-09-02 11:58 ` Hannes Reinecke
  2013-09-11  9:16   ` Ren Mingxin
  2013-09-20  7:59   ` Ren Mingxin
  0 siblings, 2 replies; 17+ messages in thread
From: Hannes Reinecke @ 2013-09-02 11:58 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, Ewan Milne, Ren Mingxin, Joern Engel, James Smart,
	Bart Van Assche, Roland Dreier, Hannes Reinecke

When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Ren Mingxin <renmx@cn.fujitsu.com>
---
 drivers/scsi/scsi.c       |   3 +
 drivers/scsi/scsi_error.c | 149 ++++++++++++++++++++++++++++++++++++++++++----
 drivers/scsi/scsi_priv.h  |   2 +
 include/scsi/scsi_cmnd.h  |   1 +
 include/scsi/scsi_host.h  |   5 ++
 5 files changed, 147 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fe0bcb1..2b04a57 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -297,6 +297,7 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, gfp_t gfp_mask)
 
 		cmd->device = dev;
 		INIT_LIST_HEAD(&cmd->list);
+		INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
 		spin_lock_irqsave(&dev->list_lock, flags);
 		list_add_tail(&cmd->list, &dev->cmd_list);
 		spin_unlock_irqrestore(&dev->list_lock, flags);
@@ -353,6 +354,8 @@ void scsi_put_command(struct scsi_cmnd *cmd)
 	list_del_init(&cmd->list);
 	spin_unlock_irqrestore(&cmd->device->list_lock, flags);
 
+	cancel_delayed_work(&cmd->abort_work);
+
 	__scsi_put_command(cmd->device->host, cmd, &sdev->sdev_gendev);
 }
 EXPORT_SYMBOL(scsi_put_command);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c88cb7e..8eaf524 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -53,6 +53,7 @@ static void scsi_eh_done(struct scsi_cmnd *scmd);
 #define HOST_RESET_SETTLE_TIME  (10)
 
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_try_to_abort_cmd(struct scsi_host_template *, struct scsi_cmnd *);
 
 /* called with shost->host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -100,6 +101,115 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 }
 
 /**
+ * scmd_eh_abort_handler - Handle command aborts
+ * @work:	command to be aborted.
+ */
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n", scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort failed, rtn %d\n",
+				    scmd, rtn));
+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "scmd %p terminate "
+				    "aborted command\n", scmd));
+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_command(scmd);
+	}
+}
+
+/**
+ * scsi_abort_command - schedule a command abort
+ * @scmd:	scmd to abort.
+ *
+ * We only need to abort commands after a command timeout
+ */
+static int
+scsi_abort_command(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct Scsi_Host *shost = sdev->host;
+	unsigned long flags;
+
+	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) {
+		/*
+		 * command abort timed out.
+		 */
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort timeout\n", scmd));
+		cancel_delayed_work(&scmd->abort_work);
+		return FAILED;
+	}
+
+	/*
+	 * Do not try a command abort if
+	 * SCSI EH has already started.
+	 */
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_in_recovery(shost)) {
+		spin_unlock_irqrestore(shost->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p not aborting, host in recovery\n",
+				    scmd));
+		return FAILED;
+	}
+
+	if (shost->eh_deadline && !shost->last_reset)
+		shost->last_reset = jiffies;
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
+	scmd->eh_eflags |= SCSI_EH_ABORT_SCHEDULED;
+	SCSI_LOG_ERROR_RECOVERY(3,
+		scmd_printk(KERN_INFO, scmd,
+			    "scmd %p abort scheduled\n", scmd));
+	schedule_delayed_work(&scmd->abort_work, HZ / 100);
+	return SUCCESS;
+}
+
+/**
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd:	scmd to run eh on.
  * @eh_flag:	optional SCSI_EH flag.
@@ -125,6 +235,8 @@ int scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
 		shost->last_reset = jiffies;
 
 	ret = 1;
+	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
+		eh_flag &= ~SCSI_EH_CANCEL_CMD;
 	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
@@ -161,6 +273,10 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	else if (host->hostt->eh_timed_out)
 		rtn = host->hostt->eh_timed_out(scmd);
 
+	if (rtn == BLK_EH_NOT_HANDLED && !host->hostt->no_async_abort)
+		if (scsi_abort_command(scmd) == SUCCESS)
+			return BLK_EH_NOT_HANDLED;
+
 	scmd->result |= DID_TIME_OUT << 16;
 
 	if (unlikely(rtn == BLK_EH_NOT_HANDLED &&
@@ -1577,7 +1693,7 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q,
 }
 
 /**
- * scsi_noretry_cmd - determinte if command should be failed fast
+ * scsi_noretry_cmd - determine if command should be failed fast
  * @scmd:	SCSI cmd to examine.
  */
 int scsi_noretry_cmd(struct scsi_cmnd *scmd)
@@ -1585,6 +1701,8 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	switch (host_byte(scmd->result)) {
 	case DID_OK:
 		break;
+	case DID_TIME_OUT:
+		goto check_type;
 	case DID_BUS_BUSY:
 		return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);
 	case DID_PARITY:
@@ -1598,18 +1716,19 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 		return (scmd->request->cmd_flags & REQ_FAILFAST_DRIVER);
 	}
 
-	switch (status_byte(scmd->result)) {
-	case CHECK_CONDITION:
-		/*
-		 * assume caller has checked sense and determinted
-		 * the check condition was retryable.
-		 */
-		if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
-		    scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
-			return 1;
-	}
+	switch (status_byte(scmd->result) != CHECK_CONDITION)
+		return 0;
 
-	return 0;
+check_type:
+	/*
+	 * assume caller has checked sense and determined
+	 * the check condition was retryable.
+	 */
+	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
+	    scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)
+		return 1;
+	else
+		return 0;
 }
 
 /**
@@ -1659,9 +1778,13 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * looks good.  drop through, and check the next byte.
 		 */
 		break;
+	case DID_ABORT:
+		if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) {
+			scmd->result |= DID_TIME_OUT << 16;
+			return SUCCESS;
+		}
 	case DID_NO_CONNECT:
 	case DID_BAD_TARGET:
-	case DID_ABORT:
 		/*
 		 * note - this means that we just report the status back
 		 * to the top level driver, not that we actually think
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 8f9a0ca..f079a59 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -19,6 +19,7 @@ struct scsi_nl_hdr;
  * Scsi Error Handler Flags
  */
 #define SCSI_EH_CANCEL_CMD	0x0001	/* Cancel this cmd */
+#define SCSI_EH_ABORT_SCHEDULED	0x0002	/* Abort has been scheduled */
 
 #define SCSI_SENSE_VALID(scmd) \
 	(((scmd)->sense_buffer[0] & 0x70) == 0x70)
@@ -66,6 +67,7 @@ extern int __init scsi_init_devinfo(void);
 extern void scsi_exit_devinfo(void);
 
 /* scsi_error.c */
+extern void scmd_eh_abort_handler(struct work_struct *work);
 extern enum blk_eh_timer_return scsi_times_out(struct request *req);
 extern int scsi_error_handler(void *host);
 extern int scsi_decide_disposition(struct scsi_cmnd *cmd);
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index de5f5d8..91558a1 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -55,6 +55,7 @@ struct scsi_cmnd {
 	struct scsi_device *device;
 	struct list_head list;  /* scsi_cmnd participates in queue lists */
 	struct list_head eh_entry; /* entry for the host eh_cmd_q */
+	struct delayed_work abort_work;
 	int eh_eflags;		/* Used by error handlr */
 
 	/*
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 896bb05..6ca9174 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -476,6 +476,11 @@ struct scsi_host_template {
 	unsigned ordered_tag:1;
 
 	/*
+	 * True if asynchronous aborts are not supported
+	 */
+	unsigned no_async_abort:1;
+
+	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */
 	unsigned int max_host_blocked;
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-11-05  1:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-31 13:02 [PATCHv8 0/3] New EH command timeout handler Hannes Reinecke
2013-10-31 13:02 ` [PATCH 1/3] scsi: Fix erratic device offline during EH Hannes Reinecke
2013-10-31 13:02 ` [PATCH 2/3] scsi: improved eh timeout handler Hannes Reinecke
2013-10-31 15:49   ` Christoph Hellwig
2013-11-04 13:36     ` Hannes Reinecke
2013-11-04 14:25       ` James Bottomley
2013-11-04 14:46         ` Hannes Reinecke
2013-11-04 14:50           ` James Bottomley
2013-11-04 15:43             ` Hannes Reinecke
2013-11-05  1:07               ` James Bottomley
2013-11-01  6:10   ` Ren Mingxin
2013-10-31 13:02 ` [PATCH 3/3] scsi: Update documentation Hannes Reinecke
  -- strict thread matches above, loose matches on Subject: below --
2013-09-02 11:58 [PATCHv6 0/3] New EH command timeout handler Hannes Reinecke
2013-09-02 11:58 ` [PATCH 2/3] scsi: improved eh " Hannes Reinecke
2013-09-11  9:16   ` Ren Mingxin
2013-09-12 20:49     ` Hannes Reinecke
2013-09-20  7:59   ` Ren Mingxin
2013-10-02 16:24     ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.