[PATCH 0/14] IB SRP initiator patches for kernel 3.11

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/14] IB SRP initiator patches for kernel 3.11
@ 2013-06-12 13:17 Bart Van Assche
  2013-06-12 13:28 ` [PATCH 07/14] scsi_transport_srp: Add transport layer error handling Bart Van Assche
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
  0 siblings, 2 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:17 UTC (permalink / raw)
  To: linux-rdma, linux-scsi, Roland Dreier, David Dillow, Vu Pham,
	Sebastian Riemer

The purpose of this InfiniBand SRP initiator patch series is as follows:
- Make the SRP initiator driver better suited for use in a H.A. setup.
   Speed up failover by reducing the IB RC retry count and by notifying
   multipathd faster about transport layer failures by adding
   fast_io_fail_tmo and dev_loss_tmo parameters.
- Make the SRP initiator better suited for use on NUMA systems by
   making the HCA completion vector configurable.

Some of the patches in this patch series were posted previously as "Make 
ib_srp better suited for H.A. purposes". The changes compared to version 
five of that patch series are:
- Left out patches that are already upstream.
- Made it possible to set dev_loss_tmo to "off". This is useful in a
   setup using initiator side mirroring to avoid that new /dev/sd* names
   are reassigned after a failover or cable pull and reinsert.
- Added kernel module parameters to ib_srp for configuring default
   values of the fast_io_fail_tmo and dev_loss_tmo parameters.
- Added a patch from Dotan Barak that fixes a kernel oops during rmmod
   triggered by resource allocation failure at module load time.
- Avoid duplicate connections by refusing relogins instead of dropping
   duplicate connections, as proposed by Sebastian Riemer.
- Added a patch from Sebastian Riemer for failing SCSI commands
   silently.
- Added a patch from Vu Pham to make the transport layer (IB RC) retry
   count configurable.
- Made HCA completion vector configurable.

Changes since v4:
- Added a patch for removing SCSI devices upon a port down event

Changes since v3:
- Restored the dev_loss_tmo and fast_io_fail_tmo sysfs attributes.
- Included a patch to fix an ib_srp crash that could be triggered by
   cable pulling.

Changes since v2:
- Addressed the v2 review comments.
- Dropped the patches that have already been merged.
- Dropped the patches for integration with multipathd.
- Dropped the micro-optimization of the IB completion handlers.

The individual patches in this series are as follows:
0001-IB-srp-Fix-remove_one-crash-due-to-resource-exhausti.patch
0002-IB-srp-Fix-race-between-srp_queuecommand-and-srp_cla.patch
0003-IB-srp-Avoid-that-srp_reset_host-is-skipped-after-a-.patch
0004-IB-srp-Skip-host-settle-delay.patch
0005-IB-srp-Maintain-a-single-connection-per-I_T-nexus.patch
0006-IB-srp-Keep-rport-as-long-as-the-IB-transport-layer.patch
0007-scsi_transport_srp-Add-transport-layer-error-handlin.patch
0008-IB-srp-Add-srp_terminate_io.patch
0009-IB-srp-Use-SRP-transport-layer-error-recovery.patch
0010-IB-srp-Start-timers-if-a-transport-layer-error-occur.patch
0011-IB-srp-Fail-SCSI-commands-silently.patch
0012-IB-srp-Make-HCA-completion-vector-configurable.patch
0013-IB-srp-Make-transport-layer-retry-count-configurable.patch
0014-IB-srp-Bump-driver-version-and-release-date.patch


^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
@ 2013-06-12 13:20   ` Bart Van Assche
       [not found]     ` <51B875A4.7040903-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:21   ` [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req() Bart Van Assche
                     ` (11 subsequent siblings)
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:20 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, Dotan Barak,
	Eli Cohen

If the add_one callback fails during driver load no resources are
allocated so there isn't a need to release any resources. Trying
to clean the resource may lead to the following kernel panic:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0132331>] srp_remove_one+0x31/0x240 [ib_srp]
RIP: 0010:[<ffffffffa0132331>]  [<ffffffffa0132331>] srp_remove_one+0x31/0x240 [ib_srp]
Process rmmod (pid: 4562, threadinfo ffff8800dd738000, task ffff8801167e60c0)
Call Trace:
 [<ffffffffa024500e>] ib_unregister_client+0x4e/0x120 [ib_core]
 [<ffffffffa01361bd>] srp_cleanup_module+0x15/0x71 [ib_srp]
 [<ffffffff810ac6a4>] sys_delete_module+0x194/0x260
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

[bvanassche: Shortened patch description]
Signed-off-by: Dotan Barak <dotanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Reviewed-by: Eli Cohen <eli-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 7ccf328..368d160 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device *device)
 	struct srp_target_port *target;
 
 	srp_dev = ib_get_client_data(device, &srp_client);
+	if (!srp_dev)
+		return;
 
 	list_for_each_entry_safe(host, tmp_host, &srp_dev->dev_list, list) {
 		device_unregister(&host->dev);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:20   ` [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion Bart Van Assche
@ 2013-06-12 13:21   ` Bart Van Assche
       [not found]     ` <51B875EE.3030702-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:23   ` [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error Bart Van Assche
                     ` (10 subsequent siblings)
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:21 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Avoid that srp_claim_command() can claim a command while
srp_queuecommand() is still busy queueing the same command.
Found this via source reading.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 368d160..9c638dd 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1367,7 +1367,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 
 	req = list_first_entry(&target->free_reqs, struct srp_request, list);
 	list_del(&req->list);
-	spin_unlock_irqrestore(&target->lock, flags);
 
 	dev = target->srp_host->srp_dev->dev;
 	ib_dma_sync_single_for_cpu(dev, iu->dma, target->max_iu_len,
@@ -1401,6 +1400,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
 		goto err_unmap;
 	}
+	spin_unlock_irqrestore(&target->lock, flags);
 
 	return 0;
 
@@ -1409,8 +1409,6 @@ err_unmap:
 
 err_iu:
 	srp_put_tx_iu(target, iu, SRP_IU_CMD);
-
-	spin_lock_irqsave(&target->lock, flags);
 	list_add(&req->list, &target->free_reqs);
 
 err_unlock:
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:20   ` [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion Bart Van Assche
  2013-06-12 13:21   ` [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req() Bart Van Assche
@ 2013-06-12 13:23   ` Bart Van Assche
       [not found]     ` <51B87638.50102-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:24   ` [PATCH 04/14] IB/srp: Skip host settle delay Bart Van Assche
                     ` (9 subsequent siblings)
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:23 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

The SCSI error handler assumes that the transport layer is
operational if an eh_abort_handler() returns SUCCESS. Hence let
srp_abort() only return SUCCESS if sending the ABORT TASK task
management function succeeded. This patch avoids that the SCSI
error handler skips the srp_reset_host() call after a transport
layer error.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 9c638dd..fb37b47 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1742,18 +1742,23 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
 	struct srp_request *req = (struct srp_request *) scmnd->host_scribble;
+	int ret;
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
 	if (!req || !srp_claim_req(target, req, scmnd))
 		return FAILED;
-	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
-			  SRP_TSK_ABORT_TASK);
+	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
+			      SRP_TSK_ABORT_TASK) == 0 ||
+	    target->transport_offline)
+		ret = SUCCESS;
+	else
+		ret = FAILED;
 	srp_free_req(target, req, scmnd, 0);
 	scmnd->result = DID_ABORT << 16;
 	scmnd->scsi_done(scmnd);
 
-	return SUCCESS;
+	return ret;
 }
 
 static int srp_reset_device(struct scsi_cmnd *scmnd)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 04/14] IB/srp: Skip host settle delay
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2013-06-12 13:23   ` [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error Bart Van Assche
@ 2013-06-12 13:24   ` Bart Van Assche
       [not found]     ` <51B87689.8030806-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:25   ` [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus Bart Van Assche
                     ` (8 subsequent siblings)
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:24 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

The SRP initiator implements host reset by reconnecting to the SRP
target. That means that communication with the target is possible
as soon as host reset finished. Hence skip the host settle delay.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index fb37b47..be12780 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1949,6 +1949,7 @@ static struct scsi_host_template srp_template = {
 	.eh_abort_handler		= srp_abort,
 	.eh_device_reset_handler	= srp_reset_device,
 	.eh_host_reset_handler		= srp_reset_host,
+	.skip_settle_delay		= true,
 	.sg_tablesize			= SRP_DEF_SG_TABLESIZE,
 	.can_queue			= SRP_CMD_SQ_SIZE,
 	.this_id			= -1,
-- 
1.7.10.4


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (3 preceding siblings ...)
  2013-06-12 13:24   ` [PATCH 04/14] IB/srp: Skip host settle delay Bart Van Assche
@ 2013-06-12 13:25   ` Bart Van Assche
       [not found]     ` <51B876BF.4070400-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:26   ` [PATCH 06/14] IB/srp: Keep rport as long as the IB transport layer Bart Van Assche
                     ` (7 subsequent siblings)
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:25 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

An SRP target is required to maintain a single connection between
initiator and target. This means that if the 'add_target' attribute
is used to create a second connection to a target that the first
connection will be logged out and that the SCSI error handler will
kick in. The SCSI error handler will cause the SRP initiator to
reconnect, which will cause I/O over the second connection to fail.
Avoid such ping-pong behavior by disabling relogins. Note: if
reconnecting manually is necessary, that is possible by deleting
and recreating an rport via sysfs.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   38 +++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index be12780..1a73b24 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -556,6 +556,36 @@ static void srp_rport_delete(struct srp_rport *rport)
 	srp_queue_remove_work(target);
 }
 
+/**
+ * srp_conn_unique() - check whether the connection to a target is unique
+ */
+static bool srp_conn_unique(struct srp_host *host,
+			    struct srp_target_port *target)
+{
+	struct srp_target_port *t;
+	bool ret = false;
+
+	if (target->state == SRP_TARGET_REMOVED)
+		goto out;
+
+	ret = true;
+
+	spin_lock(&host->target_lock);
+	list_for_each_entry(t, &host->target_list, list) {
+		if (t != target &&
+		    target->id_ext == t->id_ext &&
+		    target->ioc_guid == t->ioc_guid &&
+		    target->initiator_ext == t->initiator_ext) {
+			ret = false;
+			break;
+		}
+	}
+	spin_unlock(&host->target_lock);
+
+out:
+	return ret;
+}
+
 static int srp_connect_target(struct srp_target_port *target)
 {
 	int retries = 3;
@@ -2261,6 +2291,14 @@ static ssize_t srp_create_target(struct device *dev,
 	if (ret)
 		goto err;
 
+	if (!srp_conn_unique(target->srp_host, target)) {
+		shost_printk(KERN_INFO, target->scsi_host,
+			     PFX "Already connected to target port %.*s\n",
+			     (int)count, buf);
+		ret = -EEXIST;
+		goto err;
+	}
+
 	if (!host->srp_dev->fmr_pool && !target->allow_ext_sg &&
 				target->cmd_sg_cnt < target->sg_tablesize) {
 		pr_warn("No FMR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n");
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 06/14] IB/srp: Keep rport as long as the IB transport layer
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (4 preceding siblings ...)
  2013-06-12 13:25   ` [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus Bart Van Assche
@ 2013-06-12 13:26   ` Bart Van Assche
  2013-06-12 13:29   ` [PATCH 08/14] IB/srp: Add srp_terminate_io() Bart Van Assche
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:26 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi,
	James Bottomley

Keep the rport data structure around after srp_remove_host() has
finished until cleanup of the IB transport layer has finished
completely. This is necessary because later patches use the rport
pointer inside the queuecommand callback. Without this patch
accessing the rport from inside a queuecommand callback is racy
because srp_remove_host() must be invoked before scsi_remove_host()
and because the queuecommand callback may get invoked after
srp_remove_host() has finished.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: James Bottomley <JBottomley-MU7nAjRaF3makBO8gow8eQ@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 drivers/scsi/scsi_transport_srp.c   |   18 ++++++++++++++++++
 include/scsi/scsi_transport_srp.h   |    2 ++
 4 files changed, 24 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1a73b24..2bd0587 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -526,11 +526,13 @@ static void srp_remove_target(struct srp_target_port *target)
 	WARN_ON_ONCE(target->state != SRP_TARGET_REMOVED);
 
 	srp_del_scsi_host_attr(target->scsi_host);
+	srp_rport_get(target->rport);
 	srp_remove_host(target->scsi_host);
 	scsi_remove_host(target->scsi_host);
 	srp_disconnect_target(target);
 	ib_destroy_cm_id(target->cm_id);
 	srp_free_target_ib(target);
+	srp_rport_put(target->rport);
 	srp_free_req_data(target);
 	scsi_host_put(target->scsi_host);
 }
@@ -2009,6 +2011,7 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	}
 
 	rport->lld_data = target;
+	target->rport = rport;
 
 	spin_lock(&host->target_lock);
 	list_add_tail(&target->list, &host->target_list);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 66fbedd..1817ed5 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -153,6 +153,7 @@ struct srp_target_port {
 	u16			io_class;
 	struct srp_host	       *srp_host;
 	struct Scsi_Host       *scsi_host;
+	struct srp_rport       *rport;
 	char			target_name[32];
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index f379c7f..f7ba94a 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -185,6 +185,24 @@ static int srp_host_match(struct attribute_container *cont, struct device *dev)
 }
 
 /**
+ * srp_rport_get() - increment rport reference count
+ */
+void srp_rport_get(struct srp_rport *rport)
+{
+	get_device(&rport->dev);
+}
+EXPORT_SYMBOL(srp_rport_get);
+
+/**
+ * srp_rport_put() - decrement rport reference count
+ */
+void srp_rport_put(struct srp_rport *rport)
+{
+	put_device(&rport->dev);
+}
+EXPORT_SYMBOL(srp_rport_put);
+
+/**
  * srp_rport_add - add a SRP remote port to the device hierarchy
  * @shost:	scsi host the remote port is connected to.
  * @ids:	The port id for the remote port.
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index ff0f04a..5a2d2d1 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -38,6 +38,8 @@ extern struct scsi_transport_template *
 srp_attach_transport(struct srp_function_template *);
 extern void srp_release_transport(struct scsi_transport_template *);
 
+extern void srp_rport_get(struct srp_rport *rport);
+extern void srp_rport_put(struct srp_rport *rport);
 extern struct srp_rport *srp_rport_add(struct Scsi_Host *,
 				       struct srp_rport_identifiers *);
 extern void srp_rport_del(struct srp_rport *);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-12 13:17 [PATCH 0/14] IB SRP initiator patches for kernel 3.11 Bart Van Assche
@ 2013-06-12 13:28 ` Bart Van Assche
       [not found]   ` <51B8777B.5050201-HInyCGIudOg@public.gmane.org>
  2013-06-23 21:13   ` Mike Christie
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
  1 sibling, 2 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:28 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, linux-scsi,
	James Bottomley

Add the necessary functions in the SRP transport module to allow
an SRP initiator driver to implement transport layer error handling
similar to the functionality already provided by the FC transport
layer. This includes:
- Support for implementing fast_io_fail_tmo, the time that should
  elapse after having detected a transport layer problem and
  before failing I/O.
- Support for implementing dev_loss_tmo, the time that should
  elapse after having detected a transport layer problem and
  before removing a remote port.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: David Dillow <dillowda@ornl.gov>
Cc: Vu Pham <vu@mellanox.com>
Cc: Sebastian Riemer <sebastian.riemer@profitbricks.com>
---
 Documentation/ABI/stable/sysfs-transport-srp |   36 ++
 drivers/scsi/scsi_transport_srp.c            |  489 +++++++++++++++++++++++++-
 include/scsi/scsi_transport_srp.h            |   54 ++-
 3 files changed, 576 insertions(+), 3 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp
index b36fb0d..b7759b1 100644
--- a/Documentation/ABI/stable/sysfs-transport-srp
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -5,6 +5,23 @@ Contact:	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
 Description:	Instructs an SRP initiator to disconnect from a target and to
 		remove all LUNs imported from that target.
 
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
+Date:		September 1, 2013
+KernelVersion:	3.11
+Contact:	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
+Description:	Number of seconds the SCSI layer will wait after a transport
+		layer error has been observed before removing a target port.
+		Zero means immediate removal.
+
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/fast_io_fail_tmo
+Date:		September 1, 2013
+KernelVersion:	3.11
+Contact:	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
+Description:	Number of seconds the SCSI layer will wait after a transport
+		layer error has been observed before failing I/O. Zero means
+		immediate removal. A negative value will disable this
+		behavior.
+
 What:		/sys/class/srp_remote_ports/port-<h>:<n>/port_id
 Date:		June 27, 2007
 KernelVersion:	2.6.24
@@ -12,8 +29,27 @@ Contact:	linux-scsi@vger.kernel.org
 Description:	16-byte local SRP port identifier in hexadecimal format. An
 		example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
 
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/reconnect_delay
+Date:		September 1, 2013
+KernelVersion:	3.11
+Contact:	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
+Description:	Number of seconds the SCSI layer will wait after a reconnect
+		attempt failed before retrying.
+
 What:		/sys/class/srp_remote_ports/port-<h>:<n>/roles
 Date:		June 27, 2007
 KernelVersion:	2.6.24
 Contact:	linux-scsi@vger.kernel.org
 Description:	Role of the remote port. Either "SRP Initiator" or "SRP Target".
+
+What:		/sys/class/srp_remote_ports/port-<h>:<n>/state
+Date:		September 1, 2013
+KernelVersion:	3.11
+Contact:	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
+Description:	State of the transport layer to the remote port. "running" if
+		the transport layer is operational; "blocked" if a transport
+		layer error has been encountered but the fail_io_fast_tmo
+		timer has not yet fired; "fail-fast" after the
+		fail_io_fast_tmo timer has fired and before the "dev_loss_tmo"
+		timer has fired; "lost" after the "dev_loss_tmo" timer has
+		fired and before the port is finally removed.
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index f7ba94a..53b34b3 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -24,12 +24,15 @@
 #include <linux/err.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/delay.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_transport.h>
+#include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_transport_srp.h>
+#include "scsi_priv.h"
 #include "scsi_transport_srp_internal.h"
 
 struct srp_host_attrs {
@@ -38,7 +41,7 @@ struct srp_host_attrs {
 #define to_srp_host_attrs(host)	((struct srp_host_attrs *)(host)->shost_data)
 
 #define SRP_HOST_ATTRS 0
-#define SRP_RPORT_ATTRS 3
+#define SRP_RPORT_ATTRS 8
 
 struct srp_internal {
 	struct scsi_transport_template t;
@@ -54,6 +57,28 @@ struct srp_internal {
 
 #define	dev_to_rport(d)	container_of(d, struct srp_rport, dev)
 #define transport_class_to_srp_rport(dev) dev_to_rport((dev)->parent)
+static inline struct Scsi_Host *rport_to_shost(struct srp_rport *r)
+{
+	return dev_to_shost(r->dev.parent);
+}
+
+/**
+ * srp_tmo_valid() - check timeout combination validity
+ *
+ * If no fast I/O fail timeout has been configured then the device loss timeout
+ * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
+ * been configured then it must be below the device loss timeout.
+ */
+int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
+{
+	return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
+		dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+		|| (0 <= fast_io_fail_tmo &&
+		    (dev_loss_tmo < 0 ||
+		     (fast_io_fail_tmo < dev_loss_tmo &&
+		      dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
+}
+EXPORT_SYMBOL_GPL(srp_tmo_valid);
 
 static int srp_host_setup(struct transport_container *tc, struct device *dev,
 			  struct device *cdev)
@@ -134,10 +159,441 @@ static ssize_t store_srp_rport_delete(struct device *dev,
 
 static DEVICE_ATTR(delete, S_IWUSR, NULL, store_srp_rport_delete);
 
+static ssize_t show_srp_rport_state(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	static const char *const state_name[] = {
+		[SRP_RPORT_RUNNING]	= "running",
+		[SRP_RPORT_BLOCKED]	= "blocked",
+		[SRP_RPORT_FAIL_FAST]	= "fail-fast",
+		[SRP_RPORT_LOST]	= "lost",
+	};
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	enum srp_rport_state state = rport->state;
+
+	return sprintf(buf, "%s\n",
+		       (unsigned)state < ARRAY_SIZE(state_name) ?
+		       state_name[state] : "???");
+}
+
+static DEVICE_ATTR(state, S_IRUGO, show_srp_rport_state, NULL);
+
+static ssize_t show_reconnect_delay(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	return sprintf(buf, "%d\n", rport->reconnect_delay);
+}
+
+static ssize_t store_reconnect_delay(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, const size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	char ch[16];
+	int res, delay;
+
+	sprintf(ch, "%.*s", min_t(int, sizeof(ch) - 1, count), buf);
+	res = kstrtoint(ch, 0, &delay);
+	if (res)
+		goto out;
+
+	res = mutex_lock_interruptible(&rport->mutex);
+	if (res)
+		goto out;
+	if (rport->reconnect_delay <= 0 && delay > 0 &&
+	    rport->state != SRP_RPORT_RUNNING) {
+		queue_delayed_work(system_long_wq, &rport->reconnect_work,
+				   delay * HZ);
+	} else if (delay <= 0) {
+		cancel_delayed_work(&rport->reconnect_work);
+	}
+	rport->reconnect_delay = delay;
+	mutex_unlock(&rport->mutex);
+
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(reconnect_delay, S_IRUGO | S_IWUSR, show_reconnect_delay,
+		   store_reconnect_delay);
+
+static ssize_t show_failed_reconnects(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	return sprintf(buf, "%d\n", rport->failed_reconnects);
+}
+
+static DEVICE_ATTR(failed_reconnects, S_IRUGO, show_failed_reconnects, NULL);
+
+static ssize_t show_srp_rport_fast_io_fail_tmo(struct device *dev,
+					       struct device_attribute *attr,
+					       char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	if (rport->fast_io_fail_tmo >= 0)
+		return sprintf(buf, "%d\n", rport->fast_io_fail_tmo);
+	else
+		return sprintf(buf, "off\n");
+}
+
+static ssize_t store_srp_rport_fast_io_fail_tmo(struct device *dev,
+						struct device_attribute *attr,
+						const char *buf, size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	char ch[16], *p;
+	int res;
+	int fast_io_fail_tmo;
+
+	sprintf(ch, "%.*s", min_t(int, sizeof(ch) - 1, count), buf);
+	p = strchr(ch, '\n');
+	if (p)
+		*p = '\0';
+
+	if (strcmp(ch, "off") != 0) {
+		res = kstrtoint(ch, 0, &fast_io_fail_tmo);
+		if (res)
+			goto out;
+	} else {
+		fast_io_fail_tmo = -1;
+	}
+	res = srp_tmo_valid(fast_io_fail_tmo, rport->dev_loss_tmo);
+	if (res)
+		goto out;
+	rport->fast_io_fail_tmo = fast_io_fail_tmo;
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(fast_io_fail_tmo, S_IRUGO | S_IWUSR,
+		   show_srp_rport_fast_io_fail_tmo,
+		   store_srp_rport_fast_io_fail_tmo);
+
+static ssize_t show_srp_rport_dev_loss_tmo(struct device *dev,
+					   struct device_attribute *attr,
+					   char *buf)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+
+	if (rport->dev_loss_tmo >= 0)
+		return sprintf(buf, "%d\n", rport->dev_loss_tmo);
+	else
+		return sprintf(buf, "off\n");
+}
+
+static ssize_t store_srp_rport_dev_loss_tmo(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct srp_rport *rport = transport_class_to_srp_rport(dev);
+	char ch[16];
+	int res;
+	int dev_loss_tmo;
+
+	sprintf(ch, "%.*s", min_t(int, sizeof(ch) - 1, count), buf);
+	if (strcmp(ch, "off") != 0) {
+		res = kstrtoint(ch, 0, &dev_loss_tmo);
+		if (res)
+			goto out;
+	} else {
+		dev_loss_tmo = -1;
+	}
+	res = srp_tmo_valid(rport->fast_io_fail_tmo, dev_loss_tmo);
+	if (res)
+		goto out;
+	rport->dev_loss_tmo = dev_loss_tmo;
+	res = count;
+
+out:
+	return res;
+}
+
+static DEVICE_ATTR(dev_loss_tmo, S_IRUGO | S_IWUSR,
+		   show_srp_rport_dev_loss_tmo,
+		   store_srp_rport_dev_loss_tmo);
+
+static int srp_rport_set_state(struct srp_rport *rport,
+			       enum srp_rport_state new_state)
+{
+	enum srp_rport_state old_state = rport->state;
+
+	lockdep_assert_held(&rport->mutex);
+
+	switch (new_state) {
+	case SRP_RPORT_RUNNING:
+		switch (old_state) {
+		case SRP_RPORT_LOST:
+			goto invalid;
+		default:
+			break;
+		}
+		break;
+	case SRP_RPORT_BLOCKED:
+		switch (old_state) {
+		case SRP_RPORT_RUNNING:
+			break;
+		default:
+			goto invalid;
+		}
+		break;
+	case SRP_RPORT_FAIL_FAST:
+		switch (old_state) {
+		case SRP_RPORT_LOST:
+			goto invalid;
+		default:
+			break;
+		}
+		break;
+	case SRP_RPORT_LOST:
+		break;
+	}
+	rport->state = new_state;
+	return 0;
+
+invalid:
+	return -EINVAL;
+}
+
+/**
+ * scsi_request_fn_active - number of kernel threads inside scsi_request_fn()
+ */
+static int scsi_request_fn_active(struct Scsi_Host *shost)
+{
+	struct scsi_device *sdev;
+	struct request_queue *q;
+	int request_fn_active = 0;
+
+	shost_for_each_device(sdev, shost) {
+		q = sdev->request_queue;
+
+		spin_lock_irq(q->queue_lock);
+		request_fn_active += q->request_fn_active;
+		spin_unlock_irq(q->queue_lock);
+	}
+
+	return request_fn_active;
+}
+
+/**
+ * srp_reconnect_rport - reconnect by invoking srp_function_template.reconnect()
+ *
+ * Blocks SCSI command queueing before invoking reconnect() such that the
+ * scsi_host_template.queuecommand() won't be invoked concurrently with
+ * reconnect(). This is important since a reconnect() implementation may
+ * reallocate resources needed by queuecommand(). Please note that this
+ * function neither waits until outstanding requests have finished nor tries
+ * to abort these. It is the responsibility of the reconnect() function to
+ * finish outstanding commands before reconnecting to the target port.
+ */
+int srp_reconnect_rport(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i = to_srp_internal(shost->transportt);
+	struct scsi_device *sdev;
+	int res;
+
+	pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev));
+
+	res = mutex_lock_interruptible(&rport->mutex);
+	if (res) {
+		pr_debug("%s: mutex_lock_interruptible() returned %d\n",
+			 dev_name(&shost->shost_gendev), res);
+		goto out;
+	}
+
+	spin_lock_irq(shost->host_lock);
+	scsi_block_requests(shost);
+	spin_unlock_irq(shost->host_lock);
+
+	while (scsi_request_fn_active(shost))
+		msleep(20);
+
+	res = i->f->reconnect(rport);
+	scsi_unblock_requests(shost);
+	pr_debug("%s (state %d): transport.reconnect() returned %d\n",
+		 dev_name(&shost->shost_gendev), rport->state, res);
+	if (res == 0) {
+		cancel_delayed_work(&rport->fast_io_fail_work);
+		cancel_delayed_work(&rport->dev_loss_work);
+		rport->failed_reconnects = 0;
+		scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING);
+		srp_rport_set_state(rport, SRP_RPORT_RUNNING);
+		/*
+		 * It can occur that after fast_io_fail_tmo expired and before
+		 * dev_loss_tmo expired that the SCSI error handler has
+		 * offlined one or more devices. scsi_target_unblock() doesn't
+		 * change the state of these devices into running, so do that
+		 * explicitly.
+		 */
+		spin_lock_irq(shost->host_lock);
+		__shost_for_each_device(sdev, shost)
+			if (sdev->sdev_state == SDEV_OFFLINE)
+				sdev->sdev_state = SDEV_RUNNING;
+		spin_unlock_irq(shost->host_lock);
+	} else if (rport->state == SRP_RPORT_RUNNING) {
+		srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
+		scsi_target_unblock(&shost->shost_gendev,
+				    SDEV_TRANSPORT_OFFLINE);
+	}
+	mutex_unlock(&rport->mutex);
+
+out:
+	return res;
+}
+EXPORT_SYMBOL(srp_reconnect_rport);
+
+static void srp_reconnect_work(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, reconnect_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	int delay, res;
+
+	res = srp_reconnect_rport(rport);
+	if (res != 0) {
+		shost_printk(KERN_ERR, shost,
+			     "reconnect attempt %d failed (%d)\n",
+			     ++rport->failed_reconnects, res);
+		delay = rport->reconnect_delay *
+			min(100, max(1, rport->failed_reconnects - 10));
+		if (delay > 0)
+			queue_delayed_work(system_long_wq,
+					   &rport->reconnect_work, delay * HZ);
+	}
+}
+
+static void __rport_fast_io_fail_timedout(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i;
+
+	lockdep_assert_held(&rport->mutex);
+
+	if (srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST))
+		return;
+	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
+
+	/* Involve the LLDD if possible to terminate all io on the rport. */
+	i = to_srp_internal(shost->transportt);
+	if (i->f->terminate_rport_io)
+		i->f->terminate_rport_io(rport);
+}
+
+/**
+ * rport_fast_io_fail_timedout() - fast I/O failure timeout handler
+ *
+ * Unblocks the SCSI host.
+ */
+static void rport_fast_io_fail_timedout(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, fast_io_fail_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+
+	pr_debug("fast_io_fail_tmo expired for %s.\n",
+		 dev_name(&shost->shost_gendev));
+
+	mutex_lock(&rport->mutex);
+	__rport_fast_io_fail_timedout(rport);
+	mutex_unlock(&rport->mutex);
+}
+
+/**
+ * rport_dev_loss_timedout() - device loss timeout handler
+ *
+ * Note: rport->ft->rport_delete must either unblock the SCSI host or schedule
+ * SCSI host removal.
+ */
+static void rport_dev_loss_timedout(struct work_struct *work)
+{
+	struct srp_rport *rport = container_of(to_delayed_work(work),
+					struct srp_rport, dev_loss_work);
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct srp_internal *i = to_srp_internal(shost->transportt);
+
+	pr_err("dev_loss_tmo expired for %s.\n",
+	       dev_name(&shost->shost_gendev));
+
+	mutex_lock(&rport->mutex);
+	WARN_ON(srp_rport_set_state(rport, SRP_RPORT_LOST) != 0);
+	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
+	mutex_unlock(&rport->mutex);
+
+	i->f->rport_delete(rport);
+}
+
+/**
+ * srp_start_tl_fail_timers() - start the transport layer failure timers
+ *
+ * Start the transport layer fast I/O failure and device loss timers. Do not
+ * modify a timer that was already started.
+ */
+void srp_start_tl_fail_timers(struct srp_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	int delay;
+
+	mutex_lock(&rport->mutex);
+	pr_debug("%s current state: %d\n", dev_name(&shost->shost_gendev),
+		 rport->state);
+	if (srp_rport_set_state(rport, SRP_RPORT_BLOCKED) != 0)
+		goto out_unlock;
+	pr_debug("%s new state: %d\n", dev_name(&shost->shost_gendev),
+		 rport->state);
+	scsi_target_block(&shost->shost_gendev);
+
+	if (rport->fast_io_fail_tmo >= 0)
+		queue_delayed_work(system_long_wq, &rport->fast_io_fail_work,
+				   1UL * rport->fast_io_fail_tmo * HZ);
+	if (rport->dev_loss_tmo >= 0)
+		queue_delayed_work(system_long_wq, &rport->dev_loss_work,
+				   1UL * rport->dev_loss_tmo * HZ);
+	delay = rport->reconnect_delay;
+	if (delay > 0)
+		queue_delayed_work(system_long_wq, &rport->reconnect_work,
+				   1UL * delay * HZ);
+out_unlock:
+	mutex_unlock(&rport->mutex);
+}
+EXPORT_SYMBOL(srp_start_tl_fail_timers);
+
+/**
+ * srp_timed_out - SRP transport intercept of the SCSI timeout EH
+ *
+ * If a timeout occurs while an rport is in the blocked state, ask the SCSI
+ * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
+ * handle the timeout (BLK_EH_NOT_HANDLED).
+ *
+ * Note: This function is called from soft-IRQ context and with the request
+ * queue lock held.
+ */
+static enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+
+	pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
+	return scsi_device_blocked(sdev) ? BLK_EH_RESET_TIMER :
+		BLK_EH_NOT_HANDLED;
+}
+
 static void srp_rport_release(struct device *dev)
 {
 	struct srp_rport *rport = dev_to_rport(dev);
 
+	cancel_delayed_work_sync(&rport->reconnect_work);
+	cancel_delayed_work_sync(&rport->fast_io_fail_work);
+	cancel_delayed_work_sync(&rport->dev_loss_work);
+
 	put_device(dev->parent);
 	kfree(rport);
 }
@@ -214,12 +670,15 @@ struct srp_rport *srp_rport_add(struct Scsi_Host *shost,
 {
 	struct srp_rport *rport;
 	struct device *parent = &shost->shost_gendev;
+	struct srp_internal *i = to_srp_internal(shost->transportt);
 	int id, ret;
 
 	rport = kzalloc(sizeof(*rport), GFP_KERNEL);
 	if (!rport)
 		return ERR_PTR(-ENOMEM);
 
+	mutex_init(&rport->mutex);
+
 	device_initialize(&rport->dev);
 
 	rport->dev.parent = get_device(parent);
@@ -228,6 +687,17 @@ struct srp_rport *srp_rport_add(struct Scsi_Host *shost,
 	memcpy(rport->port_id, ids->port_id, sizeof(rport->port_id));
 	rport->roles = ids->roles;
 
+	if (i->f->reconnect)
+		rport->reconnect_delay = i->f->reconnect_delay ?
+			*i->f->reconnect_delay : 10;
+	INIT_DELAYED_WORK(&rport->reconnect_work, srp_reconnect_work);
+	rport->fast_io_fail_tmo = i->f->fast_io_fail_tmo ?
+		*i->f->fast_io_fail_tmo : 15;
+	rport->dev_loss_tmo = i->f->dev_loss_tmo ? *i->f->dev_loss_tmo : 600;
+	INIT_DELAYED_WORK(&rport->fast_io_fail_work,
+			  rport_fast_io_fail_timedout);
+	INIT_DELAYED_WORK(&rport->dev_loss_work, rport_dev_loss_timedout);
+
 	id = atomic_inc_return(&to_srp_host_attrs(shost)->next_port_id);
 	dev_set_name(&rport->dev, "port-%d:%d", shost->host_no, id);
 
@@ -277,6 +747,12 @@ void srp_rport_del(struct srp_rport *rport)
 	transport_remove_device(dev);
 	device_del(dev);
 	transport_destroy_device(dev);
+
+	mutex_lock(&rport->mutex);
+	if (rport->state == SRP_RPORT_BLOCKED)
+		__rport_fast_io_fail_timedout(rport);
+	mutex_unlock(&rport->mutex);
+
 	put_device(dev);
 }
 EXPORT_SYMBOL_GPL(srp_rport_del);
@@ -328,6 +804,8 @@ srp_attach_transport(struct srp_function_template *ft)
 	if (!i)
 		return NULL;
 
+	i->t.eh_timed_out = srp_timed_out;
+
 	i->t.tsk_mgmt_response = srp_tsk_mgmt_response;
 	i->t.it_nexus_response = srp_it_nexus_response;
 
@@ -345,6 +823,15 @@ srp_attach_transport(struct srp_function_template *ft)
 	count = 0;
 	i->rport_attrs[count++] = &dev_attr_port_id;
 	i->rport_attrs[count++] = &dev_attr_roles;
+	if (ft->has_rport_state) {
+		i->rport_attrs[count++] = &dev_attr_state;
+		i->rport_attrs[count++] = &dev_attr_fast_io_fail_tmo;
+		i->rport_attrs[count++] = &dev_attr_dev_loss_tmo;
+	}
+	if (ft->reconnect) {
+		i->rport_attrs[count++] = &dev_attr_reconnect_delay;
+		i->rport_attrs[count++] = &dev_attr_failed_reconnects;
+	}
 	if (ft->rport_delete)
 		i->rport_attrs[count++] = &dev_attr_delete;
 	i->rport_attrs[count++] = NULL;
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index 5a2d2d1..e077496 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -13,6 +13,18 @@ struct srp_rport_identifiers {
 	u8 roles;
 };
 
+enum srp_rport_state {
+	SRP_RPORT_RUNNING,
+	SRP_RPORT_BLOCKED,
+	SRP_RPORT_FAIL_FAST,
+	SRP_RPORT_LOST,
+};
+
+/**
+ * struct srp_rport
+ * @mutex:   Protects against concurrent rport fast_io_fail / dev_loss_tmo /
+ *   reconnect activity.
+ */
 struct srp_rport {
 	/* for initiator and target drivers */
 
@@ -23,11 +35,27 @@ struct srp_rport {
 
 	/* for initiator drivers */
 
-	void *lld_data;	/* LLD private data */
+	void			*lld_data;	/* LLD private data */
+
+	struct mutex		mutex;
+	enum srp_rport_state	state;
+	int			reconnect_delay;
+	int			failed_reconnects;
+	struct delayed_work	reconnect_work;
+	int			fast_io_fail_tmo;
+	int			dev_loss_tmo;
+	struct delayed_work	fast_io_fail_work;
+	struct delayed_work	dev_loss_work;
 };
 
 struct srp_function_template {
 	/* for initiator drivers */
+	bool has_rport_state;
+	int *reconnect_delay;
+	int *fast_io_fail_tmo;
+	int *dev_loss_tmo;
+	int (*reconnect)(struct srp_rport *rport);
+	void (*terminate_rport_io)(struct srp_rport *rport);
 	void (*rport_delete)(struct srp_rport *rport);
 	/* for target drivers */
 	int (* tsk_mgmt_response)(struct Scsi_Host *, u64, u64, int);
@@ -43,7 +71,29 @@ extern void srp_rport_put(struct srp_rport *rport);
 extern struct srp_rport *srp_rport_add(struct Scsi_Host *,
 				       struct srp_rport_identifiers *);
 extern void srp_rport_del(struct srp_rport *);
-
+extern int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo);
+extern int srp_reconnect_rport(struct srp_rport *rport);
+extern void srp_start_tl_fail_timers(struct srp_rport *rport);
 extern void srp_remove_host(struct Scsi_Host *);
 
+/**
+ * srp_chkready() - evaluate the transport layer state before I/O
+ *
+ * Returns a SCSI result code that can be returned by the LLDD. The role of
+ * this function is similar to that of fc_remote_port_chkready().
+ */
+static inline int srp_chkready(struct srp_rport *rport)
+{
+	switch (rport->state) {
+	case SRP_RPORT_RUNNING:
+	case SRP_RPORT_BLOCKED:
+	default:
+		return 0;
+	case SRP_RPORT_FAIL_FAST:
+		return DID_TRANSPORT_FAILFAST << 16;
+	case SRP_RPORT_LOST:
+		return DID_NO_CONNECT << 16;
+	}
+}
+
 #endif
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 08/14] IB/srp: Add srp_terminate_io()
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (5 preceding siblings ...)
  2013-06-12 13:26   ` [PATCH 06/14] IB/srp: Keep rport as long as the IB transport layer Bart Van Assche
@ 2013-06-12 13:29   ` Bart Van Assche
  2013-06-12 13:30   ` [PATCH 09/14] IB/srp: Use SRP transport layer error recovery Bart Van Assche
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:29 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Finish all outstanding I/O requests after fast_io_fail_tmo expired,
which speeds up failover in a multipath setup. This patch is a
reworked version of a patch from Sebastian Riemer.

Reported-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2bd0587..ffda0ca 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -716,17 +716,29 @@ static void srp_free_req(struct srp_target_port *target,
 	spin_unlock_irqrestore(&target->lock, flags);
 }
 
-static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
+static void srp_finish_req(struct srp_target_port *target,
+			   struct srp_request *req, int result)
 {
 	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
 
 	if (scmnd) {
 		srp_free_req(target, req, scmnd, 0);
-		scmnd->result = DID_RESET << 16;
+		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 	}
 }
 
+static void srp_terminate_io(struct srp_rport *rport)
+{
+	struct srp_target_port *target = rport->lld_data;
+	int i;
+
+	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+		struct srp_request *req = &target->req_ring[i];
+		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
+	}
+}
+
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct Scsi_Host *shost = target->scsi_host;
@@ -753,8 +765,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
 
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		if (req->scmnd)
-			srp_reset_req(target, req);
+		srp_finish_req(target, req, DID_RESET << 16);
 	}
 
 	INIT_LIST_HEAD(&target->free_tx);
@@ -1809,7 +1820,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd && req->scmnd->device == scmnd->device)
-			srp_reset_req(target, req);
+			srp_finish_req(target, req, DID_RESET << 16);
 	}
 
 	return SUCCESS;
@@ -2589,6 +2600,7 @@ static void srp_remove_one(struct ib_device *device)
 
 static struct srp_function_template ib_srp_transport_functions = {
 	.rport_delete		 = srp_rport_delete,
+	.terminate_rport_io	 = srp_terminate_io,
 };
 
 static int __init srp_init_module(void)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 09/14] IB/srp: Use SRP transport layer error recovery
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (6 preceding siblings ...)
  2013-06-12 13:29   ` [PATCH 08/14] IB/srp: Add srp_terminate_io() Bart Van Assche
@ 2013-06-12 13:30   ` Bart Van Assche
  2013-06-12 13:31   ` [PATCH 10/14] IB/srp: Start timers if a transport layer error occurs Bart Van Assche
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:30 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB
SRP initiator.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |  123 +++++++++++++++++++++++++----------
 drivers/infiniband/ulp/srp/ib_srp.h |    1 -
 2 files changed, 88 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index ffda0ca..6f3e0e5 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -86,6 +86,31 @@ module_param(topspin_workarounds, int, 0444);
 MODULE_PARM_DESC(topspin_workarounds,
 		 "Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
 
+static int srp_reconnect_delay = 10;
+module_param_named(reconnect_delay, srp_reconnect_delay, int, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(reconnect_delay, "Time between successive reconnect attempts");
+
+static struct kernel_param_ops srp_tmo_ops;
+
+static int srp_fast_io_fail_tmo = 15;
+module_param_cb(fast_io_fail_tmo, &srp_tmo_ops, &srp_fast_io_fail_tmo,
+		S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(fast_io_fail_tmo,
+		 "Number of seconds between the observation of a transport"
+		 " layer error and failing all I/O. \"off\" means that this"
+		 " functionality is disabled.");
+
+static int srp_dev_loss_tmo = 600;
+module_param_cb(dev_loss_tmo, &srp_tmo_ops, &srp_dev_loss_tmo,
+		S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(dev_loss_tmo,
+		 "Maximum number of seconds that the SRP transport should"
+		 " insulate transport layer errors. After this time has been"
+		 " exceeded the SCSI target is removed. Should be"
+		 " between 1 and " __stringify(SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+		 " if fast_io_fail_tmo has not been set. \"off\" means that"
+		 " this functionality is disabled.");
+
 static void srp_add_one(struct ib_device *device);
 static void srp_remove_one(struct ib_device *device);
 static void srp_recv_completion(struct ib_cq *cq, void *target_ptr);
@@ -102,6 +127,44 @@ static struct ib_client srp_client = {
 
 static struct ib_sa_client srp_sa_client;
 
+static int srp_tmo_get(char *buffer, const struct kernel_param *kp)
+{
+	int tmo = *(int *)kp->arg;
+
+	if (tmo >= 0)
+		return sprintf(buffer, "%d", tmo);
+	else
+		return sprintf(buffer, "off");
+}
+
+static int srp_tmo_set(const char *val, const struct kernel_param *kp)
+{
+	int tmo, res;
+
+	if (strcmp(val, "off") != 0) {
+		res = kstrtoint(val, 0, &tmo);
+		if (res)
+			goto out;
+	} else {
+		tmo = -1;
+	}
+	if (kp->arg == &srp_fast_io_fail_tmo)
+		res = srp_tmo_valid(tmo, srp_dev_loss_tmo);
+	else
+		res = srp_tmo_valid(srp_fast_io_fail_tmo, tmo);
+	if (res)
+		goto out;
+	*(int *)kp->arg = tmo;
+
+out:
+	return res;
+}
+
+static struct kernel_param_ops srp_tmo_ops = {
+	.get = srp_tmo_get,
+	.set = srp_tmo_set,
+};
+
 static inline struct srp_target_port *host_to_target(struct Scsi_Host *host)
 {
 	return (struct srp_target_port *) host->hostdata;
@@ -739,13 +802,20 @@ static void srp_terminate_io(struct srp_rport *rport)
 	}
 }
 
-static int srp_reconnect_target(struct srp_target_port *target)
+/*
+ * It is up to the caller to ensure that srp_rport_reconnect() calls are
+ * serialized and that no concurrent srp_queuecommand(), srp_abort(),
+ * srp_reset_device() or srp_reset_host() calls will occur while this function
+ * is in progress. One way to realize that is not to call this function
+ * directly but to call srp_reconnect_rport() instead since that last function
+ * serializes calls of this function via rport->mutex and also blocks
+ * srp_queuecommand() calls before invoking this function.
+ */
+static int srp_rport_reconnect(struct srp_rport *rport)
 {
-	struct Scsi_Host *shost = target->scsi_host;
+	struct srp_target_port *target = rport->lld_data;
 	int i, ret;
 
-	scsi_target_block(&shost->shost_gendev);
-
 	srp_disconnect_target(target);
 	/*
 	 * Now get a new local CM ID so that we avoid confusing the target in
@@ -775,28 +845,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	if (ret == 0)
 		ret = srp_connect_target(target);
 
-	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
-			    SDEV_TRANSPORT_OFFLINE);
-	target->transport_offline = !!ret;
-
-	if (ret)
-		goto err;
-
-	shost_printk(KERN_INFO, target->scsi_host, PFX "reconnect succeeded\n");
-
-	return ret;
-
-err:
-	shost_printk(KERN_ERR, target->scsi_host,
-		     PFX "reconnect failed (%d), removing target port.\n", ret);
-
-	/*
-	 * We couldn't reconnect, so kill our target port off.
-	 * However, we have to defer the real removal because we
-	 * are in the context of the SCSI error handler now, which
-	 * will deadlock if we call scsi_remove_host().
-	 */
-	srp_queue_remove_work(target);
+	if (ret == 0)
+		shost_printk(KERN_INFO, target->scsi_host,
+			     PFX "reconnect succeeded\n");
 
 	return ret;
 }
@@ -1395,10 +1446,11 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	struct srp_cmd *cmd;
 	struct ib_device *dev;
 	unsigned long flags;
-	int len;
+	int len, result;
 
-	if (unlikely(target->transport_offline)) {
-		scmnd->result = DID_NO_CONNECT << 16;
+	result = srp_chkready(target->rport);
+	if (unlikely(result)) {
+		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 		return 0;
 	}
@@ -1793,7 +1845,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 		return FAILED;
 	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			      SRP_TSK_ABORT_TASK) == 0 ||
-	    target->transport_offline)
+	    srp_chkready(target->rport) != 0)
 		ret = SUCCESS;
 	else
 		ret = FAILED;
@@ -1829,14 +1881,10 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 static int srp_reset_host(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
-	int ret = FAILED;
 
 	shost_printk(KERN_ERR, target->scsi_host, PFX "SRP reset_host called\n");
 
-	if (!srp_reconnect_target(target))
-		ret = SUCCESS;
-
-	return ret;
+	return srp_reconnect_rport(target->rport) == 0 ? SUCCESS : FAILED;
 }
 
 static int srp_slave_configure(struct scsi_device *sdev)
@@ -2599,6 +2647,11 @@ static void srp_remove_one(struct ib_device *device)
 }
 
 static struct srp_function_template ib_srp_transport_functions = {
+	.has_rport_state	 = true,
+	.reconnect_delay	 = &srp_reconnect_delay,
+	.fast_io_fail_tmo	 = &srp_fast_io_fail_tmo,
+	.dev_loss_tmo		 = &srp_dev_loss_tmo,
+	.reconnect		 = srp_rport_reconnect,
 	.rport_delete		 = srp_rport_delete,
 	.terminate_rport_io	 = srp_terminate_io,
 };
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 1817ed5..fda82f7 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -140,7 +140,6 @@ struct srp_target_port {
 	unsigned int		cmd_sg_cnt;
 	unsigned int		indirect_size;
 	bool			allow_ext_sg;
-	bool			transport_offline;
 
 	/* Everything above this point is used in the hot path of
 	 * command processing. Try to keep them packed into cachelines.
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 10/14] IB/srp: Start timers if a transport layer error occurs
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (7 preceding siblings ...)
  2013-06-12 13:30   ` [PATCH 09/14] IB/srp: Use SRP transport layer error recovery Bart Van Assche
@ 2013-06-12 13:31   ` Bart Van Assche
  2013-06-12 13:33   ` [PATCH 11/14] IB/srp: Fail SCSI commands silently Bart Van Assche
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:31 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Start the reconnect timer, fast_io_fail timer and dev_loss timer
if a transport layer error occurs.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   19 +++++++++++++++++++
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 6f3e0e5..7cf44e1 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -595,6 +595,7 @@ static void srp_remove_target(struct srp_target_port *target)
 	srp_disconnect_target(target);
 	ib_destroy_cm_id(target->cm_id);
 	srp_free_target_ib(target);
+	cancel_work_sync(&target->tl_err_work);
 	srp_rport_put(target->rport);
 	srp_free_req_data(target);
 	scsi_host_put(target->scsi_host);
@@ -1394,6 +1395,21 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 			     PFX "Recv failed with error code %d\n", res);
 }
 
+/**
+ * srp_tl_err_work() - handle a transport layer error
+ *
+ * Note: This function may get invoked before the rport has been created,
+ * hence the target->rport test.
+ */
+static void srp_tl_err_work(struct work_struct *work)
+{
+	struct srp_target_port *target;
+
+	target = container_of(work, struct srp_target_port, tl_err_work);
+	if (target->rport)
+		srp_start_tl_fail_timers(target->rport);
+}
+
 static void srp_handle_qp_err(enum ib_wc_status wc_status,
 			      enum ib_wc_opcode wc_opcode,
 			      struct srp_target_port *target)
@@ -1403,6 +1419,7 @@ static void srp_handle_qp_err(enum ib_wc_status wc_status,
 			     PFX "failed %s status %d\n",
 			     wc_opcode & IB_WC_RECV ? "receive" : "send",
 			     wc_status);
+		queue_work(system_long_wq, &target->tl_err_work);
 	}
 	target->qp_in_error = true;
 }
@@ -1763,6 +1780,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 		if (ib_send_cm_drep(cm_id, NULL, 0))
 			shost_printk(KERN_ERR, target->scsi_host,
 				     PFX "Sending CM DREP failed\n");
+		queue_work(system_long_wq, &target->tl_err_work);
 		break;
 
 	case IB_CM_TIMEWAIT_EXIT:
@@ -2374,6 +2392,7 @@ static ssize_t srp_create_target(struct device *dev,
 			     sizeof (struct srp_indirect_buf) +
 			     target->cmd_sg_cnt * sizeof (struct srp_direct_buf);
 
+	INIT_WORK(&target->tl_err_work, srp_tl_err_work);
 	INIT_WORK(&target->remove_work, srp_remove_work);
 	spin_lock_init(&target->lock);
 	INIT_LIST_HEAD(&target->free_tx);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index fda82f7..e45d9d0 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -175,6 +175,7 @@ struct srp_target_port {
 	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
 	struct srp_request	req_ring[SRP_CMD_SQ_SIZE];
 
+	struct work_struct	tl_err_work;
 	struct work_struct	remove_work;
 
 	struct list_head	list;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 11/14] IB/srp: Fail SCSI commands silently
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (8 preceding siblings ...)
  2013-06-12 13:31   ` [PATCH 10/14] IB/srp: Start timers if a transport layer error occurs Bart Van Assche
@ 2013-06-12 13:33   ` Bart Van Assche
  2013-06-12 13:35   ` [PATCH 12/14] IB/srp: Make HCA completion vector configurable Bart Van Assche
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:33 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

From: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>

Avoid that path failover in a multipath setup causes the SCSI layer
to generate kernel messages about SCSI command failures. This patch
speeds up SRP initiator operation significantly when monitoring
kernel messages over a serial port.

[bvanassche: Changed patch description]
Signed-off-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 7cf44e1..dbedac9 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -787,6 +787,7 @@ static void srp_finish_req(struct srp_target_port *target,
 
 	if (scmnd) {
 		srp_free_req(target, req, scmnd, 0);
+		scmnd->request->cmd_flags |= REQ_QUIET;
 		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 	}
@@ -1467,6 +1468,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 
 	result = srp_chkready(target->rport);
 	if (unlikely(result)) {
+		scmnd->request->cmd_flags |= REQ_QUIET;
 		scmnd->result = result;
 		scmnd->scsi_done(scmnd);
 		return 0;
@@ -1868,6 +1870,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 	else
 		ret = FAILED;
 	srp_free_req(target, req, scmnd, 0);
+	scmnd->request->cmd_flags |= REQ_QUIET;
 	scmnd->result = DID_ABORT << 16;
 	scmnd->scsi_done(scmnd);
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 12/14] IB/srp: Make HCA completion vector configurable
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (9 preceding siblings ...)
  2013-06-12 13:33   ` [PATCH 11/14] IB/srp: Fail SCSI commands silently Bart Van Assche
@ 2013-06-12 13:35   ` Bart Van Assche
       [not found]     ` <51B87904.1070803-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:36   ` [PATCH 13/14] IB/srp: Make transport layer retry count configurable Bart Van Assche
  2013-06-12 13:37   ` [PATCH 14/14] IB/srp: Bump driver version and release date Bart Van Assche
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:35 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

Several InfiniBand HCA's allow to configure the completion vector
per queue pair. This allows to spread the workload created by IB
completion interrupts over multiple MSI-X vectors and hence over
multiple CPU cores. In other words, configuring the completion
vector properly not only allows to reduce latency on an initiator
connected to multiple SRP targets but also allows to improve
throughput.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   26 ++++++++++++++++++++++++--
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index dbedac9..49d8464 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -294,14 +294,16 @@ static int srp_create_target_ib(struct srp_target_port *target)
 		return -ENOMEM;
 
 	recv_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_recv_completion, NULL, target, SRP_RQ_SIZE, 0);
+			       srp_recv_completion, NULL, target, SRP_RQ_SIZE,
+			       target->comp_vector);
 	if (IS_ERR(recv_cq)) {
 		ret = PTR_ERR(recv_cq);
 		goto err;
 	}
 
 	send_cq = ib_create_cq(target->srp_host->srp_dev->dev,
-			       srp_send_completion, NULL, target, SRP_SQ_SIZE, 0);
+			       srp_send_completion, NULL, target, SRP_SQ_SIZE,
+			       target->comp_vector);
 	if (IS_ERR(send_cq)) {
 		ret = PTR_ERR(send_cq);
 		goto err_recv_cq;
@@ -2006,6 +2008,14 @@ static ssize_t show_local_ib_device(struct device *dev,
 	return sprintf(buf, "%s\n", target->srp_host->srp_dev->dev->name);
 }
 
+static ssize_t show_comp_vector(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct srp_target_port *target = host_to_target(class_to_shost(dev));
+
+	return sprintf(buf, "%d\n", target->comp_vector);
+}
+
 static ssize_t show_cmd_sg_entries(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
@@ -2032,6 +2042,7 @@ static DEVICE_ATTR(req_lim,         S_IRUGO, show_req_lim,         NULL);
 static DEVICE_ATTR(zero_req_lim,    S_IRUGO, show_zero_req_lim,	   NULL);
 static DEVICE_ATTR(local_ib_port,   S_IRUGO, show_local_ib_port,   NULL);
 static DEVICE_ATTR(local_ib_device, S_IRUGO, show_local_ib_device, NULL);
+static DEVICE_ATTR(comp_vector,     S_IRUGO, show_comp_vector,     NULL);
 static DEVICE_ATTR(cmd_sg_entries,  S_IRUGO, show_cmd_sg_entries,  NULL);
 static DEVICE_ATTR(allow_ext_sg,    S_IRUGO, show_allow_ext_sg,    NULL);
 
@@ -2046,6 +2057,7 @@ static struct device_attribute *srp_host_attrs[] = {
 	&dev_attr_zero_req_lim,
 	&dev_attr_local_ib_port,
 	&dev_attr_local_ib_device,
+	&dev_attr_comp_vector,
 	&dev_attr_cmd_sg_entries,
 	&dev_attr_allow_ext_sg,
 	NULL
@@ -2140,6 +2152,7 @@ enum {
 	SRP_OPT_CMD_SG_ENTRIES	= 1 << 9,
 	SRP_OPT_ALLOW_EXT_SG	= 1 << 10,
 	SRP_OPT_SG_TABLESIZE	= 1 << 11,
+	SRP_OPT_COMP_VECTOR	= 1 << 12,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -2160,6 +2173,7 @@ static const match_table_t srp_opt_tokens = {
 	{ SRP_OPT_CMD_SG_ENTRIES,	"cmd_sg_entries=%u"	},
 	{ SRP_OPT_ALLOW_EXT_SG,		"allow_ext_sg=%u"	},
 	{ SRP_OPT_SG_TABLESIZE,		"sg_tablesize=%u"	},
+	{ SRP_OPT_COMP_VECTOR,		"comp_vector=%u"	},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -2315,6 +2329,14 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			target->sg_tablesize = token;
 			break;
 
+		case SRP_OPT_COMP_VECTOR:
+			if (match_int(args, &token) || token < 0) {
+				pr_warn("bad comp_vector parameter '%s'\n", p);
+				goto out;
+			}
+			target->comp_vector = token;
+			break;
+
 		default:
 			pr_warn("unknown parameter or missing value '%s' in target creation request\n",
 				p);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e45d9d0..cbc0b14 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -156,6 +156,7 @@ struct srp_target_port {
 	char			target_name[32];
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
+	int			comp_vector;
 
 	struct ib_sa_path_rec	path;
 	__be16			orig_dgid[8];
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 13/14] IB/srp: Make transport layer retry count configurable
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (10 preceding siblings ...)
  2013-06-12 13:35   ` [PATCH 12/14] IB/srp: Make HCA completion vector configurable Bart Van Assche
@ 2013-06-12 13:36   ` Bart Van Assche
       [not found]     ` <51B8794F.6050003-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:37   ` [PATCH 14/14] IB/srp: Bump driver version and release date Bart Van Assche
  12 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:36 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Allow the InfiniBand RC retry count to be configured by the user
as an option in the target login string. The transport layer
timeout in nanoseconds is computed as follows from the retry count:

rc_timeout = rc_retry_count * 4 * 4096 * (1 << qp->timeout)

The default value for tl_retry_count is changed from 7 into 2.
Hence with a qp->timedout value of 19 this patch reduces the
default transport layer timeout from about 60s to about 17s. The
purpose of this patch is to reduce the time needed for SCSI error
handling significantly and at the same time to avoid activating
the SCSI error handler on an IB path with a regular BER or due to
brief IB network congestion.

[bvanassche: Rewrote patch description]
Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   24 +++++++++++++++++++++++-
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 49d8464..44ee93a 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -453,7 +453,7 @@ static int srp_send_req(struct srp_target_port *target)
 	req->param.responder_resources	      = 4;
 	req->param.remote_cm_response_timeout = 20;
 	req->param.local_cm_response_timeout  = 20;
-	req->param.retry_count 		      = 7;
+	req->param.retry_count                = target->tl_retry_count;
 	req->param.rnr_retry_count 	      = 7;
 	req->param.max_cm_retries 	      = 15;
 
@@ -2016,6 +2016,14 @@ static ssize_t show_comp_vector(struct device *dev,
 	return sprintf(buf, "%d\n", target->comp_vector);
 }
 
+static ssize_t show_tl_retry_count(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct srp_target_port *target = host_to_target(class_to_shost(dev));
+
+	return sprintf(buf, "%d\n", target->tl_retry_count);
+}
+
 static ssize_t show_cmd_sg_entries(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
@@ -2043,6 +2051,7 @@ static DEVICE_ATTR(zero_req_lim,    S_IRUGO, show_zero_req_lim,	   NULL);
 static DEVICE_ATTR(local_ib_port,   S_IRUGO, show_local_ib_port,   NULL);
 static DEVICE_ATTR(local_ib_device, S_IRUGO, show_local_ib_device, NULL);
 static DEVICE_ATTR(comp_vector,     S_IRUGO, show_comp_vector,     NULL);
+static DEVICE_ATTR(tl_retry_count,  S_IRUGO, show_tl_retry_count,  NULL);
 static DEVICE_ATTR(cmd_sg_entries,  S_IRUGO, show_cmd_sg_entries,  NULL);
 static DEVICE_ATTR(allow_ext_sg,    S_IRUGO, show_allow_ext_sg,    NULL);
 
@@ -2058,6 +2067,7 @@ static struct device_attribute *srp_host_attrs[] = {
 	&dev_attr_local_ib_port,
 	&dev_attr_local_ib_device,
 	&dev_attr_comp_vector,
+	&dev_attr_tl_retry_count,
 	&dev_attr_cmd_sg_entries,
 	&dev_attr_allow_ext_sg,
 	NULL
@@ -2153,6 +2163,7 @@ enum {
 	SRP_OPT_ALLOW_EXT_SG	= 1 << 10,
 	SRP_OPT_SG_TABLESIZE	= 1 << 11,
 	SRP_OPT_COMP_VECTOR	= 1 << 12,
+	SRP_OPT_TL_RETRY_COUNT	= 1 << 13,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -2174,6 +2185,7 @@ static const match_table_t srp_opt_tokens = {
 	{ SRP_OPT_ALLOW_EXT_SG,		"allow_ext_sg=%u"	},
 	{ SRP_OPT_SG_TABLESIZE,		"sg_tablesize=%u"	},
 	{ SRP_OPT_COMP_VECTOR,		"comp_vector=%u"	},
+	{ SRP_OPT_TL_RETRY_COUNT,	"tl_retry_count=%u"	},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -2337,6 +2349,15 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			target->comp_vector = token;
 			break;
 
+		case SRP_OPT_TL_RETRY_COUNT:
+			if (match_int(args, &token) || token < 2 || token > 7) {
+				pr_warn("bad tl_retry_count parameter '%s' (must be a number between 2 and 7)\n",
+					p);
+				goto out;
+			}
+			target->tl_retry_count = token;
+			break;
+
 		default:
 			pr_warn("unknown parameter or missing value '%s' in target creation request\n",
 				p);
@@ -2391,6 +2412,7 @@ static ssize_t srp_create_target(struct device *dev,
 	target->cmd_sg_cnt	= cmd_sg_entries;
 	target->sg_tablesize	= indirect_sg_entries ? : cmd_sg_entries;
 	target->allow_ext_sg	= allow_ext_sg;
+	target->tl_retry_count	= 2;
 
 	ret = srp_parse_options(buf, target);
 	if (ret)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index cbc0b14..446b045 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -157,6 +157,7 @@ struct srp_target_port {
 	unsigned int		scsi_id;
 	unsigned int		sg_tablesize;
 	int			comp_vector;
+	int			tl_retry_count;
 
 	struct ib_sa_path_rec	path;
 	__be16			orig_dgid[8];
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 14/14] IB/srp: Bump driver version and release date
       [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
                     ` (11 preceding siblings ...)
  2013-06-12 13:36   ` [PATCH 13/14] IB/srp: Make transport layer retry count configurable Bart Van Assche
@ 2013-06-12 13:37   ` Bart Van Assche
  12 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:37 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma

From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Signed-off-by: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 44ee93a..dd4b1a7 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -53,8 +53,8 @@
 
 #define DRV_NAME	"ib_srp"
 #define PFX		DRV_NAME ": "
-#define DRV_VERSION	"0.2"
-#define DRV_RELDATE	"November 1, 2005"
+#define DRV_VERSION	"1.0"
+#define DRV_RELDATE	"June 1, 2013"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand SCSI RDMA Protocol initiator "
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found]     ` <51B875A4.7040903-HInyCGIudOg@public.gmane.org>
@ 2013-06-12 13:38       ` Bart Van Assche
       [not found]         ` <51B879CF.1080802-HInyCGIudOg@public.gmane.org>
  2013-06-27 21:01       ` David Dillow
  1 sibling, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 13:38 UTC (permalink / raw)
  Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer,
	linux-rdma, Dotan Barak, Eli Cohen

On 06/12/13 15:20, Bart Van Assche wrote:
> If the add_one callback fails during driver load no resources are
> allocated so there isn't a need to release any resources. Trying
> to clean the resource may lead to the following kernel panic:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffffa0132331>] srp_remove_one+0x31/0x240 [ib_srp]
> RIP: 0010:[<ffffffffa0132331>]  [<ffffffffa0132331>] srp_remove_one+0x31/0x240 [ib_srp]
> Process rmmod (pid: 4562, threadinfo ffff8800dd738000, task ffff8801167e60c0)
> Call Trace:
>   [<ffffffffa024500e>] ib_unregister_client+0x4e/0x120 [ib_core]
>   [<ffffffffa01361bd>] srp_cleanup_module+0x15/0x71 [ib_srp]
>   [<ffffffff810ac6a4>] sys_delete_module+0x194/0x260
>   [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
>
> [bvanassche: Shortened patch description]
> Signed-off-by: Dotan Barak <dotanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
> Reviewed-by: Eli Cohen <eli-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c |    2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 7ccf328..368d160 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device *device)
>   	struct srp_target_port *target;
>
>   	srp_dev = ib_get_client_data(device, &srp_client);
> +	if (!srp_dev)
> +		return;
>
>   	list_for_each_entry_safe(host, tmp_host, &srp_dev->dev_list, list) {
>   		device_unregister(&host->dev);
>

Please note that this patch was authored by Dotan Barak, so I should 
have mentioned:

From: Dotan Barak <dotanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found]         ` <51B879CF.1080802-HInyCGIudOg@public.gmane.org>
@ 2013-06-12 14:24           ` Sebastian Riemer
  0 siblings, 0 replies; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-12 14:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma, Dotan Barak, Eli Cohen

On 12.06.2013 15:38, Bart Van Assche wrote:
> On 06/12/13 15:20, Bart Van Assche wrote:
>> If the add_one callback fails during driver load no resources are
>> allocated so there isn't a need to release any resources. Trying
>> to clean the resource may lead to the following kernel panic:
>>
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<ffffffffa0132331>] srp_remove_one+0x31/0x240 [ib_srp]
>> RIP: 0010:[<ffffffffa0132331>]  [<ffffffffa0132331>]
>> srp_remove_one+0x31/0x240 [ib_srp]
>> Process rmmod (pid: 4562, threadinfo ffff8800dd738000, task
>> ffff8801167e60c0)
>> Call Trace:
>>   [<ffffffffa024500e>] ib_unregister_client+0x4e/0x120 [ib_core]
>>   [<ffffffffa01361bd>] srp_cleanup_module+0x15/0x71 [ib_srp]
>>   [<ffffffff810ac6a4>] sys_delete_module+0x194/0x260
>>   [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
>>
>> [bvanassche: Shortened patch description]
>> Signed-off-by: Dotan Barak <dotanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
>> Reviewed-by: Eli Cohen <eli-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c |    2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
>> b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 7ccf328..368d160 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device
>> *device)
>>       struct srp_target_port *target;
>>
>>       srp_dev = ib_get_client_data(device, &srp_client);
>> +    if (!srp_dev)
>> +        return;
>>
>>       list_for_each_entry_safe(host, tmp_host, &srp_dev->dev_list,
>> list) {
>>           device_unregister(&host->dev);
>>
> 
> Please note that this patch was authored by Dotan Barak, so I should
> have mentioned:
> 
> From: Dotan Barak <dotanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>

Acked-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
       [not found]     ` <51B875EE.3030702-HInyCGIudOg@public.gmane.org>
@ 2013-06-12 14:58       ` Sebastian Riemer
       [not found]         ` <51B88C7C.4030209-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-12 14:58 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

Wait a minute, so you've changed this commit to also hold that target
lock in the following functions in error case:

srp_unmap_data(),
srp_put_tx_iu()

This is different from:
https://github.com/bvanassche/ib_srp-backport/commit/6ce0e30dbb69973926df84292239f0c20f6a2d6c

srp_unmap_data() calls ib_fmr_pool_unmap() which uses an own spin lock
(pool->pool_lock).

srp_put_tx_iu() acquires the target lock as well (target->lock). That's
spin lock in spin lock. I would say that this dead locks.

I like the other version more.

Cheers,
Sebastian


On 12.06.2013 15:21, Bart Van Assche wrote:
> Avoid that srp_claim_command() can claim a command while
> srp_queuecommand() is still busy queueing the same command.
> Found this via source reading.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 368d160..9c638dd 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1367,7 +1367,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>  
>  	req = list_first_entry(&target->free_reqs, struct srp_request, list);
>  	list_del(&req->list);
> -	spin_unlock_irqrestore(&target->lock, flags);
>  
>  	dev = target->srp_host->srp_dev->dev;
>  	ib_dma_sync_single_for_cpu(dev, iu->dma, target->max_iu_len,
> @@ -1401,6 +1400,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>  		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
>  		goto err_unmap;
>  	}
> +	spin_unlock_irqrestore(&target->lock, flags);
>  
>  	return 0;
>  
> @@ -1409,8 +1409,6 @@ err_unmap:
>  
>  err_iu:
>  	srp_put_tx_iu(target, iu, SRP_IU_CMD);
> -
> -	spin_lock_irqsave(&target->lock, flags);
>  	list_add(&req->list, &target->free_reqs);
>  
>  err_unlock:
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
       [not found]         ` <51B88C7C.4030209-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-12 15:14           ` Bart Van Assche
       [not found]             ` <51B8903E.3000609-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 15:14 UTC (permalink / raw)
  To: Sebastian Riemer; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

On 06/12/13 16:58, Sebastian Riemer wrote:
> Wait a minute, so you've changed this commit to also hold that target
> lock in the following functions in error case:
>
> srp_unmap_data(),
> srp_put_tx_iu()
>
> This is different from:
> https://github.com/bvanassche/ib_srp-backport/commit/6ce0e30dbb69973926df84292239f0c20f6a2d6c
>
> srp_unmap_data() calls ib_fmr_pool_unmap() which uses an own spin lock
> (pool->pool_lock).
>
> srp_put_tx_iu() acquires the target lock as well (target->lock). That's
> spin lock in spin lock. I would say that this dead locks.
>
> I like the other version more.

Not sure how I missed that ... I will drop this version and replace it 
with the proper fix.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error
       [not found]     ` <51B87638.50102-HInyCGIudOg@public.gmane.org>
@ 2013-06-13  9:30       ` Sebastian Riemer
       [not found]         ` <51B99120.9000503-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
  2013-06-27 21:03       ` David Dillow
  1 sibling, 1 reply; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-13  9:30 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

On 12.06.2013 15:23, Bart Van Assche wrote:
> The SCSI error handler assumes that the transport layer is
> operational if an eh_abort_handler() returns SUCCESS. Hence let
> srp_abort() only return SUCCESS if sending the ABORT TASK task
> management function succeeded. This patch avoids that the SCSI
> error handler skips the srp_reset_host() call after a transport
> layer error.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |   11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 9c638dd..fb37b47 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1742,18 +1742,23 @@ static int srp_abort(struct scsi_cmnd *scmnd)
>  {
>  	struct srp_target_port *target = host_to_target(scmnd->device->host);
>  	struct srp_request *req = (struct srp_request *) scmnd->host_scribble;
> +	int ret;
>  
>  	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
>  
>  	if (!req || !srp_claim_req(target, req, scmnd))
>  		return FAILED;
> -	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
> -			  SRP_TSK_ABORT_TASK);
> +	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
> +			      SRP_TSK_ABORT_TASK) == 0 ||
> +	    target->transport_offline)
> +		ret = SUCCESS;

Here you try to hide a little trick. Returning success upon
(target->transport_offline == true) is perhaps not the best way. I guess
you try to fail IO fast here but up to this point
"target->transport_offline = true" is only set in srp_reset_host().

Please explain for what you need that in this patch!

Furthermore, returning "FAST_IO_FAIL" sounds better to me in this situation.

> +	else
> +		ret = FAILED;
>  	srp_free_req(target, req, scmnd, 0);
>  	scmnd->result = DID_ABORT << 16;
>  	scmnd->scsi_done(scmnd);
>  
> -	return SUCCESS;
> +	return ret;
>  }
>  
>  static int srp_reset_device(struct scsi_cmnd *scmnd)
> 

The rest is okay.

Cheers,
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/14] IB/srp: Skip host settle delay
       [not found]     ` <51B87689.8030806-HInyCGIudOg@public.gmane.org>
@ 2013-06-13  9:53       ` Sebastian Riemer
       [not found]         ` <51B996A1.6080604-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
  2013-06-27 21:04       ` David Dillow
  1 sibling, 1 reply; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-13  9:53 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma, Christoph Hellwig

On 12.06.2013 15:24, Bart Van Assche wrote:
> The SRP initiator implements host reset by reconnecting to the SRP
> target. That means that communication with the target is possible
> as soon as host reset finished. Hence skip the host settle delay.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index fb37b47..be12780 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1949,6 +1949,7 @@ static struct scsi_host_template srp_template = {
>  	.eh_abort_handler		= srp_abort,
>  	.eh_device_reset_handler	= srp_reset_device,
>  	.eh_host_reset_handler		= srp_reset_host,
> +	.skip_settle_delay		= true,
>  	.sg_tablesize			= SRP_DEF_SG_TABLESIZE,
>  	.can_queue			= SRP_CMD_SQ_SIZE,
>  	.this_id			= -1,
> 

Signed-off-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Acked-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Tested-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Reviewed-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Reviewed-by: Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>

Choose something, I totally agree. Adding Christoph in CC as he has
reviewed this as well.

Cheers,
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error
       [not found]         ` <51B99120.9000503-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-13  9:57           ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-13  9:57 UTC (permalink / raw)
  To: Sebastian Riemer
  Cc: Bart Van Assche, Roland Dreier, David Dillow, Vu Pham, linux-rdma

On 06/13/13 11:30, Sebastian Riemer wrote:
> On 12.06.2013 15:23, Bart Van Assche wrote:
>> The SCSI error handler assumes that the transport layer is
>> operational if an eh_abort_handler() returns SUCCESS. Hence let
>> srp_abort() only return SUCCESS if sending the ABORT TASK task
>> management function succeeded. This patch avoids that the SCSI
>> error handler skips the srp_reset_host() call after a transport
>> layer error.
>>
>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c |   11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 9c638dd..fb37b47 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -1742,18 +1742,23 @@ static int srp_abort(struct scsi_cmnd *scmnd)
>>   {
>>   	struct srp_target_port *target = host_to_target(scmnd->device->host);
>>   	struct srp_request *req = (struct srp_request *) scmnd->host_scribble;
>> +	int ret;
>>
>>   	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
>>
>>   	if (!req || !srp_claim_req(target, req, scmnd))
>>   		return FAILED;
>> -	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
>> -			  SRP_TSK_ABORT_TASK);
>> +	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
>> +			      SRP_TSK_ABORT_TASK) == 0 ||
>> +	    target->transport_offline)
>> +		ret = SUCCESS;
>
> Here you try to hide a little trick. Returning success upon
> (target->transport_offline == true) is perhaps not the best way. I guess
> you try to fail IO fast here but up to this point
> "target->transport_offline = true" is only set in srp_reset_host().
>
> Please explain for what you need that in this patch!
>
> Furthermore, returning "FAST_IO_FAIL" sounds better to me in this situation.

Hello Sebastian,

Sorry, forgot to explain that bit in the patch description. The purpose 
of the "transport_offline" test is indeed to fail I/O fast in case no IB 
RC connection to the target is available. You suggestion to return 
FAST_IO_FAIL in that case makes sense to me. I will move that change to 
a separate patch.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/14] IB/srp: Skip host settle delay
       [not found]         ` <51B996A1.6080604-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-13 13:06           ` Or Gerlitz
  0 siblings, 0 replies; 64+ messages in thread
From: Or Gerlitz @ 2013-06-13 13:06 UTC (permalink / raw)
  To: Sebastian Riemer, Bart Van Assche
  Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma, Christoph Hellwig

On 13/06/2013 12:53, Sebastian Riemer wrote:
> On 12.06.2013 15:24, Bart Van Assche wrote:
>> The SRP initiator implements host reset by reconnecting to the SRP
>> target. That means that communication with the target is possible
>> as soon as host reset finished. Hence skip the host settle delay.
>>
>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c |    1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index fb37b47..be12780 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -1949,6 +1949,7 @@ static struct scsi_host_template srp_template = {
>>   	.eh_abort_handler		= srp_abort,
>>   	.eh_device_reset_handler	= srp_reset_device,
>>   	.eh_host_reset_handler		= srp_reset_host,
>> +	.skip_settle_delay		= true,
>>   	.sg_tablesize			= SRP_DEF_SG_TABLESIZE,
>>   	.can_queue			= SRP_CMD_SQ_SIZE,
>>   	.this_id			= -1,
>>
> Signed-off-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> Acked-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> Tested-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> Reviewed-by: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> Reviewed-by: Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
>
> Choose something,

yes, but too many things... else we will end up with one liner patch 
that has ten yyy-by: credit lines...

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]     ` <51B876BF.4070400-HInyCGIudOg@public.gmane.org>
@ 2013-06-13 13:57       ` Sebastian Riemer
       [not found]         ` <51B9CFC3.8080008-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
  2013-06-13 17:50       ` Vu Pham
  2013-06-27 21:10       ` David Dillow
  2 siblings, 1 reply; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-13 13:57 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

Hi Bart,

thanks for picking up the idea not to use this 'add_target' file for
manual reconnects. I have only small remarks but basically you've got my
Acked-by and Tested-by.

Please find the remarks in-line.

Cheers,
Sebastian

On 12.06.2013 15:25, Bart Van Assche wrote:
> An SRP target is required to maintain a single connection between
> initiator and target. This means that if the 'add_target' attribute
> is used to create a second connection to a target that the first
> connection will be logged out and that the SCSI error handler will
> kick in. The SCSI error handler will cause the SRP initiator to
> reconnect, which will cause I/O over the second connection to fail.
> Avoid such ping-pong behavior by disabling relogins. Note: if
> reconnecting manually is necessary, that is possible by deleting
> and recreating an rport via sysfs.
> 
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |   38 +++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index be12780..1a73b24 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -556,6 +556,36 @@ static void srp_rport_delete(struct srp_rport *rport)
>  	srp_queue_remove_work(target);
>  }
>  
> +/**
> + * srp_conn_unique() - check whether the connection to a target is unique
> + */
> +static bool srp_conn_unique(struct srp_host *host,
> +			    struct srp_target_port *target)
> +{
> +	struct srp_target_port *t;
> +	bool ret = false;
> +
> +	if (target->state == SRP_TARGET_REMOVED)
> +		goto out;
> +
> +	ret = true;
> +
> +	spin_lock(&host->target_lock);
> +	list_for_each_entry(t, &host->target_list, list) {
> +		if (t != target &&
> +		    target->id_ext == t->id_ext &&
> +		    target->ioc_guid == t->ioc_guid &&
> +		    target->initiator_ext == t->initiator_ext) {
> +			ret = false;
> +			break;
> +		}
> +	}
> +	spin_unlock(&host->target_lock);
> +
> +out:
> +	return ret;
> +}
> +

You've only changed the style of this function. Functionality is still
the same. Fine for me.

But why do you put it that high in the source code?
Do you (still) need it for something else?

I would put it directly in front of srp_create_target() or even in front
of that option parsing stuff for correct bottom-up.

>  static int srp_connect_target(struct srp_target_port *target)
>  {
>  	int retries = 3;
> @@ -2261,6 +2291,14 @@ static ssize_t srp_create_target(struct device *dev,
>  	if (ret)
>  		goto err;
>  
> +	if (!srp_conn_unique(target->srp_host, target)) {
> +		shost_printk(KERN_INFO, target->scsi_host,
> +			     PFX "Already connected to target port %.*s\n",
> +			     (int)count, buf);
> +		ret = -EEXIST;
> +		goto err;
> +	}
> +

Yes, this looks good! Nice idea to print the connection string!
Would be even cooler without trailing '\n' from within 'buf' but that's
okay.

I was a little bit afraid of overflows here so I did security testing.
But srp_parse_options() already rejected my evil connection strings. :-)

I've tried things like this:
id_ext=0002c903004ed0b2,\
ioc_guid=0002c903004ed0b2,\
dgid=fe800000000000000002c903004ed0b4,\
pkey=ffff,service_id=0002c903004ed0b2,\
xxxxxxxxxxxxxxxxxxxxxxxxx... until 4096 chars

id_ext=0002c903004ed0b2,\
ioc_guid=0002c903004ed0b2,\
dgid=fe800000000000000002c903004ed0b4,\
pkey=ffff,service_id=0002c903004ed0b2,\
id_ext=0002c903004ed0b2,\
ioc_guid=0002c903004ed0b2,\
dgid=fe800000000000000002c903004ed0b4,\
pkey=ffff,service_id=0002c903004ed0b2,\
... until 4096 chars

This string looked kind of funny. Also in the kernel message it was a
little bit longer than usual but the parsing detected that I have too
many parameters. So everything is fine in terms of security.

>  	if (!host->srp_dev->fmr_pool && !target->allow_ext_sg &&
>  				target->cmd_sg_cnt < target->sg_tablesize) {
>  		pr_warn("No FMR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n");
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]         ` <51B9CFC3.8080008-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-13 15:07           ` Bart Van Assche
       [not found]             ` <51B9E046.3030008-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-13 15:07 UTC (permalink / raw)
  To: Sebastian Riemer; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

On 06/13/13 15:57, Sebastian Riemer wrote:
> You've only changed the style of this function. Functionality is still
> the same. Fine for me.
> 
> But why do you put it that high in the source code?
> Do you (still) need it for something else?
> 
> I would put it directly in front of srp_create_target() or even in front
> of that option parsing stuff for correct bottom-up.

Good idea. Will move the definition of that function down.
 
>>   static int srp_connect_target(struct srp_target_port *target)
>>   {
>>   	int retries = 3;
>> @@ -2261,6 +2291,14 @@ static ssize_t srp_create_target(struct device *dev,
>>   	if (ret)
>>   		goto err;
>>   
>> +	if (!srp_conn_unique(target->srp_host, target)) {
>> +		shost_printk(KERN_INFO, target->scsi_host,
>> +			     PFX "Already connected to target port %.*s\n",
>> +			     (int)count, buf);
>> +		ret = -EEXIST;
>> +		goto err;
>> +	}
>> +
> 
> Yes, this looks good! Nice idea to print the connection string!
> Would be even cooler without trailing '\n' from within 'buf' but that's
> okay.
> 
> I was a little bit afraid of overflows here so I did security testing.
> But srp_parse_options() already rejected my evil connection strings. :-)
> 
> I've tried things like this:
> id_ext=0002c903004ed0b2,\
> ioc_guid=0002c903004ed0b2,\
> dgid=fe800000000000000002c903004ed0b4,\
> pkey=ffff,service_id=0002c903004ed0b2,\
> xxxxxxxxxxxxxxxxxxxxxxxxx... until 4096 chars
> 
> id_ext=0002c903004ed0b2,\
> ioc_guid=0002c903004ed0b2,\
> dgid=fe800000000000000002c903004ed0b4,\
> pkey=ffff,service_id=0002c903004ed0b2,\
> id_ext=0002c903004ed0b2,\
> ioc_guid=0002c903004ed0b2,\
> dgid=fe800000000000000002c903004ed0b4,\
> pkey=ffff,service_id=0002c903004ed0b2,\
> ... until 4096 chars
> 
> This string looked kind of funny. Also in the kernel message it was a
> little bit longer than usual but the parsing detected that I have too
> many parameters. So everything is fine in terms of security.

The "%.*s" should only copy the data provided by the user, even if it
is not '\0' terminated. Stripping the trailing newline is probably
possible with something like the (untested) code below (will only work
if there is only one newline in the input string and if it's at the
end):
		shost_printk(KERN_INFO, target->scsi_host,
			     PFX "Already connected to target port %.*s\n",
			     (int)count - (memchr(buf, '\n', count) ==
					   buf + count - 1), buf);

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]             ` <51B9E046.3030008-HInyCGIudOg@public.gmane.org>
@ 2013-06-13 15:35               ` Sebastian Riemer
  0 siblings, 0 replies; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-13 15:35 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Vu Pham, linux-rdma

On 13.06.2013 17:07, Bart Van Assche wrote:
[...]
> The "%.*s" should only copy the data provided by the user, even if it
> is not '\0' terminated. Stripping the trailing newline is probably
> possible with something like the (untested) code below (will only work
> if there is only one newline in the input string and if it's at the
> end):
> 		shost_printk(KERN_INFO, target->scsi_host,
> 			     PFX "Already connected to target port %.*s\n",
> 			     (int)count - (memchr(buf, '\n', count) ==
> 					   buf + count - 1), buf);

I thought more like this existing message (as the input string by the
user is possibly long with a lot of configuration options):

shost_printk(KERN_DEBUG, target->scsi_host, PFX
	     "new target: id_ext %016llx ioc_guid %016llx pkey %04x "
	     "service_id %016llx dgid %pI6\n",
	(unsigned long long) be64_to_cpu(target->id_ext),
	(unsigned long long) be64_to_cpu(target->ioc_guid),
	be16_to_cpu(target->path.pkey),
	(unsigned long long) be64_to_cpu(target->service_id),
	target->path.dgid.raw);

But this thing takes a lot of code lines. Perhaps this string formatting
should be put into a macro/inline function then.

Cheers,
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]     ` <51B876BF.4070400-HInyCGIudOg@public.gmane.org>
  2013-06-13 13:57       ` Sebastian Riemer
@ 2013-06-13 17:50       ` Vu Pham
       [not found]         ` <51BA0655.6090707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-06-27 21:10       ` David Dillow
  2 siblings, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-13 17:50 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma

Hello Bart,

> +/**
> + * srp_conn_unique() - check whether the connection to a target is unique
> + */
> +static bool srp_conn_unique(struct srp_host *host,
> +			    struct srp_target_port *target)
> +{
> +	struct srp_target_port *t;
> +	bool ret = false;
> +
> +	if (target->state == SRP_TARGET_REMOVED)
> +		goto out;
> +
> +	ret = true;
> +
> +	spin_lock(&host->target_lock);
> +	list_for_each_entry(t, &host->target_list, list) {
> +		if (t != target &&
> +		    target->id_ext == t->id_ext &&
>   

Targets may advertise/expose on different pkeys
You can have multiple connections  (or paths/scsi hosts) to same target 
with different pkeys.
We need extra check to detect the uniqueness:
                                             target->path.pkey == 
t->path.pkey &&

-vu
> +		    target->ioc_guid == t->ioc_guid &&
> +		    target->initiator_ext == t->initiator_ext) {
> +			ret = false;
> +			break;
> +		}
> +	}
> +	spin_unlock(&host->target_lock);
> +
> +out:
> +	return ret;
> +}
> +
>  static int srp_connect_target(struct srp_target_port *target)
>  {
>  	int retries = 3;
> @@ -2261,6 +2291,14 @@ static ssize_t srp_create_target(struct device *dev,
>  	if (ret)
>  		goto err;
>  
> +	if (!srp_conn_unique(target->srp_host, target)) {
> +		shost_printk(KERN_INFO, target->scsi_host,
> +			     PFX "Already connected to target port %.*s\n",
> +			     (int)count, buf);
> +		ret = -EEXIST;
> +		goto err;
> +	}
> +
>  	if (!host->srp_dev->fmr_pool && !target->allow_ext_sg &&
>  				target->cmd_sg_cnt < target->sg_tablesize) {
>  		pr_warn("No FMR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n");
>   

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]         ` <51BA0655.6090707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-13 18:25           ` Bart Van Assche
       [not found]             ` <51BA0E8F.3030104-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-13 18:25 UTC (permalink / raw)
  To: Vu Pham; +Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma

On 06/13/13 19:50, Vu Pham wrote:
> Hello Bart,
>
>> +/**
>> + * srp_conn_unique() - check whether the connection to a target is
>> unique
>> + */
>> +static bool srp_conn_unique(struct srp_host *host,
>> +                struct srp_target_port *target)
>> +{
>> +    struct srp_target_port *t;
>> +    bool ret = false;
>> +
>> +    if (target->state == SRP_TARGET_REMOVED)
>> +        goto out;
>> +
>> +    ret = true;
>> +
>> +    spin_lock(&host->target_lock);
>> +    list_for_each_entry(t, &host->target_list, list) {
>> +        if (t != target &&
>> +            target->id_ext == t->id_ext &&
>
> Targets may advertise/expose on different pkeys
> You can have multiple connections  (or paths/scsi hosts) to same target
> with different pkeys.
> We need extra check to detect the uniqueness:
>                                              target->path.pkey ==
> t->path.pkey &&

Hello Vu,

Thanks for the feedback. This is something I have already thinking about 
myself. Unfortunately I have not found any requirements in the T10 SRP 
standard with regard to InfiniBand partitions. However, in that document 
there is a section about single RDMA channel operation. In that section 
it is explained that an SRP target must log out established sessions 
upon receipt of a new login request. What I'm not sure about is whether 
only sessions with the same P_Key must be logged out or all established 
sessions if a new login request is received. I assume the latter since 
otherwise that would mean that an SRP target would be required to 
maintain multiple sessions if it allows connections with more than one 
P_Key to a target port ? My concern about adding a pkey comparison in 
the function srp_conn_unique() is that if a target allows an initiator 
to choose which partition to use when logging in, that this could result 
in the undesired SRP initiator ping-pong effect this patch tries to avoid.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]   ` <51B8777B.5050201-HInyCGIudOg@public.gmane.org>
@ 2013-06-13 19:43     ` Vu Pham
  2013-06-14 13:19       ` Bart Van Assche
  0 siblings, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-13 19:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

Hello Bart,

>  
> +What:		/sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
> +Date:		September 1, 2013
> +KernelVersion:	3.11
> +Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> +Description:	Number of seconds the SCSI layer will wait after a transport
> +		layer error has been observed before removing a target port.
> +		Zero means immediate removal.
>   
A negative value will disable the target port removal.
> +
> +What:		/sys/class/srp_remote_ports/port-<h>:<n>/fast_io_fail_tmo
> +Date:		September 1, 2013
> +KernelVersion:	3.11
> +Contact:	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> +Description:	Number of seconds the SCSI layer will wait after a transport
> +		layer error has been observed before failing I/O. Zero means
> +		immediate removal. A negative value will disable this
> +		behavior.
>
>   
<snip>
> +
> +/**
> + * srp_tmo_valid() - check timeout combination validity
> + *
> + * If no fast I/O fail timeout has been configured then the device loss timeout
> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
> + * been configured then it must be below the device loss timeout.
> + */
> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
> +{
> +	return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
> +		dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
> +		|| (0 <= fast_io_fail_tmo &&
> +		    (dev_loss_tmo < 0 ||
> +		     (fast_io_fail_tmo < dev_loss_tmo &&
> +		      dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
> +}
> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>   
fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with negative 
value
dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with negative 
value

<snip>
>
> + * srp_reconnect_rport - reconnect by invoking srp_function_template.reconnect()
> + *
> + * Blocks SCSI command queueing before invoking reconnect() such that the
> + * scsi_host_template.queuecommand() won't be invoked concurrently with
> + * reconnect(). This is important since a reconnect() implementation may
> + * reallocate resources needed by queuecommand(). Please note that this
> + * function neither waits until outstanding requests have finished nor tries
> + * to abort these. It is the responsibility of the reconnect() function to
> + * finish outstanding commands before reconnecting to the target port.
> + */
> +int srp_reconnect_rport(struct srp_rport *rport)
> +{
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +	struct srp_internal *i = to_srp_internal(shost->transportt);
> +	struct scsi_device *sdev;
> +	int res;
> +
> +	pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev));
> +
> +	res = mutex_lock_interruptible(&rport->mutex);
> +	if (res) {
> +		pr_debug("%s: mutex_lock_interruptible() returned %d\n",
> +			 dev_name(&shost->shost_gendev), res);
> +		goto out;
> +	}
> +
> +	spin_lock_irq(shost->host_lock);
> +	scsi_block_requests(shost);
> +	spin_unlock_irq(shost->host_lock);
> +
>   
In scsi_block_requests() definition, no locks are assumed held.

If rport's state is already SRP_RPORT_BLOCKED, I don't think we need to 
do extra block with scsi_block_requests()

Shouldn't we avoid using both scsi_block/unblock_requests() and 
scsi_target_block/unblock()?
ie. in srp_start_tl_fail_timers()  call scsi_block_requests(), in 
__rport_fast_io_fail_timedout() and rport_dev_loss_timedout() call 
scsi_unblock_requests()
and using scsi_block/unblock_requests in this fuction.
> +	while (scsi_request_fn_active(shost))
> +		msleep(20);
> +
> +	res = i->f->reconnect(rport);
> +	scsi_unblock_requests(shost);
> +	pr_debug("%s (state %d): transport.reconnect() returned %d\n",
> +		 dev_name(&shost->shost_gendev), rport->state, res);
> +	if (res == 0) {
> +		cancel_delayed_work(&rport->fast_io_fail_work);
> +		cancel_delayed_work(&rport->dev_loss_work);
> +		rport->failed_reconnects = 0;
> +		scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING);
> +		srp_rport_set_state(rport, SRP_RPORT_RUNNING);
> +		/*
> +		 * It can occur that after fast_io_fail_tmo expired and before
> +		 * dev_loss_tmo expired that the SCSI error handler has
> +		 * offlined one or more devices. scsi_target_unblock() doesn't
> +		 * change the state of these devices into running, so do that
> +		 * explicitly.
> +		 */
> +		spin_lock_irq(shost->host_lock);
> +		__shost_for_each_device(sdev, shost)
> +			if (sdev->sdev_state == SDEV_OFFLINE)
> +				sdev->sdev_state = SDEV_RUNNING;
> +		spin_unlock_irq(shost->host_lock);
> +	} else if (rport->state == SRP_RPORT_RUNNING) {
> +		srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
> +		scsi_target_unblock(&shost->shost_gendev,
> +				    SDEV_TRANSPORT_OFFLINE);
>   
Would this unblock override the fast_io_fail_tmo functionality?

> +	}
> +	mutex_unlock(&rport->mutex);
> +
> +out:
> +	return res;
> +}
> +EXPORT_SYMBOL(srp_reconnect_rport);
> +
> +static void srp_reconnect_work(struct work_struct *work)
> +{
> +	struct srp_rport *rport = container_of(to_delayed_work(work),
> +					struct srp_rport, reconnect_work);
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +	int delay, res;
> +
> +	res = srp_reconnect_rport(rport);
> +	if (res != 0) {
> +		shost_printk(KERN_ERR, shost,
> +			     "reconnect attempt %d failed (%d)\n",
> +			     ++rport->failed_reconnects, res);
> +		delay = rport->reconnect_delay *
> +			min(100, max(1, rport->failed_reconnects - 10));
> +		if (delay > 0)
> +			queue_delayed_work(system_long_wq,
> +					   &rport->reconnect_work, delay * HZ);
> +	}
> +}
> +
> +static void __rport_fast_io_fail_timedout(struct srp_rport *rport)
> +{
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +	struct srp_internal *i;
> +
> +	lockdep_assert_held(&rport->mutex);
> +
> +	if (srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST))
> +		return;
> +	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
> +
> +	/* Involve the LLDD if possible to terminate all io on the rport. */
> +	i = to_srp_internal(shost->transportt);
> +	if (i->f->terminate_rport_io)
> +		i->f->terminate_rport_io(rport);
> +}
> +
> +/**
> + * rport_fast_io_fail_timedout() - fast I/O failure timeout handler
> + *
> + * Unblocks the SCSI host.
> + */
> +static void rport_fast_io_fail_timedout(struct work_struct *work)
> +{
> +	struct srp_rport *rport = container_of(to_delayed_work(work),
> +					struct srp_rport, fast_io_fail_work);
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +
> +	pr_debug("fast_io_fail_tmo expired for %s.\n",
> +		 dev_name(&shost->shost_gendev));
> +
> +	mutex_lock(&rport->mutex);
> +	__rport_fast_io_fail_timedout(rport);
> +	mutex_unlock(&rport->mutex);
> +}
> +
> +/**
> + * rport_dev_loss_timedout() - device loss timeout handler
> + *
> + * Note: rport->ft->rport_delete must either unblock the SCSI host or schedule
> + * SCSI host removal.
> + */
> +static void rport_dev_loss_timedout(struct work_struct *work)
> +{
> +	struct srp_rport *rport = container_of(to_delayed_work(work),
> +					struct srp_rport, dev_loss_work);
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +	struct srp_internal *i = to_srp_internal(shost->transportt);
> +
> +	pr_err("dev_loss_tmo expired for %s.\n",
> +	       dev_name(&shost->shost_gendev));
> +
> +	mutex_lock(&rport->mutex);
> +	WARN_ON(srp_rport_set_state(rport, SRP_RPORT_LOST) != 0);
> +	scsi_target_unblock(rport->dev.parent, SDEV_TRANSPORT_OFFLINE);
> +	mutex_unlock(&rport->mutex);
> +
> +	i->f->rport_delete(rport);
> +}
> +
> +/**
> + * srp_start_tl_fail_timers() - start the transport layer failure timers
> + *
> + * Start the transport layer fast I/O failure and device loss timers. Do not
> + * modify a timer that was already started.
> + */
> +void srp_start_tl_fail_timers(struct srp_rport *rport)
> +{
> +	struct Scsi_Host *shost = rport_to_shost(rport);
> +	int delay;
> +
> +	mutex_lock(&rport->mutex);
> +	pr_debug("%s current state: %d\n", dev_name(&shost->shost_gendev),
> +		 rport->state);
> +	if (srp_rport_set_state(rport, SRP_RPORT_BLOCKED) != 0)
> +		goto out_unlock;
> +	pr_debug("%s new state: %d\n", dev_name(&shost->shost_gendev),
> +		 rport->state);
> +	scsi_target_block(&shost->shost_gendev);
> +
> +	if (rport->fast_io_fail_tmo >= 0)
> +		queue_delayed_work(system_long_wq, &rport->fast_io_fail_work,
> +				   1UL * rport->fast_io_fail_tmo * HZ);
> +	if (rport->dev_loss_tmo >= 0)
> +		queue_delayed_work(system_long_wq, &rport->dev_loss_work,
> +				   1UL * rport->dev_loss_tmo * HZ);
> +	delay = rport->reconnect_delay;
> +	if (delay > 0)
> +		queue_delayed_work(system_long_wq, &rport->reconnect_work,
> +				   1UL * delay * HZ);
> +out_unlock:
> +	mutex_unlock(&rport->mutex);
> +}
> +EXPORT_SYMBOL(srp_start_tl_fail_timers);
> +
> +/**
> + * srp_timed_out - SRP transport intercept of the SCSI timeout EH
> + *
> + * If a timeout occurs while an rport is in the blocked state, ask the SCSI
> + * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
> + * handle the timeout (BLK_EH_NOT_HANDLED).
> + *
> + * Note: This function is called from soft-IRQ context and with the request
> + * queue lock held.
> + */
> +static enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
> +{
> +	struct scsi_device *sdev = scmd->device;
> +
> +	pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
> +	return scsi_device_blocked(sdev) ? BLK_EH_RESET_TIMER :
> +		BLK_EH_NOT_HANDLED;
>   
if one turn off fast_io_fail_tmo, rport_fast_io_fail_timedout() won't be 
called, and reconnect is failed, scsi_device_blocked remains true, error 
recovery cannot happen
I think it should be
return (scsi_device_blocked(sdev) && rport->fast_io_fail_tmo > 0) ? 
BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;

-vu
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]             ` <51BA0E8F.3030104-HInyCGIudOg@public.gmane.org>
@ 2013-06-13 23:27               ` Vu Pham
       [not found]                 ` <51BA555F.9060807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-13 23:27 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma

Bart Van Assche wrote:
> On 06/13/13 19:50, Vu Pham wrote:
>> Hello Bart,
>>
>>> +/**
>>> + * srp_conn_unique() - check whether the connection to a target is
>>> unique
>>> + */
>>> +static bool srp_conn_unique(struct srp_host *host,
>>> +                struct srp_target_port *target)
>>> +{
>>> +    struct srp_target_port *t;
>>> +    bool ret = false;
>>> +
>>> +    if (target->state == SRP_TARGET_REMOVED)
>>> +        goto out;
>>> +
>>> +    ret = true;
>>> +
>>> +    spin_lock(&host->target_lock);
>>> +    list_for_each_entry(t, &host->target_list, list) {
>>> +        if (t != target &&
>>> +            target->id_ext == t->id_ext &&
>>
>> Targets may advertise/expose on different pkeys
>> You can have multiple connections  (or paths/scsi hosts) to same target
>> with different pkeys.
>> We need extra check to detect the uniqueness:
>>                                              target->path.pkey ==
>> t->path.pkey &&
>
> Hello Vu,
>
> Thanks for the feedback. This is something I have already thinking 
> about myself. Unfortunately I have not found any requirements in the 
> T10 SRP standard with regard to InfiniBand partitions. However, in 
> that document there is a section about single RDMA channel operation. 
> In that section it is explained that an SRP target must log out 
> established sessions upon receipt of a new login request. What I'm not 
> sure about is whether only sessions with the same P_Key must be logged 
> out or all established sessions if a new login request is received. I 
> assume the latter since otherwise that would mean that an SRP target 
> would be required to maintain multiple sessions if it allows 
> connections with more than one P_Key to a target port ? My concern 
> about adding a pkey comparison in the function srp_conn_unique() is 
> that if a target allows an initiator to choose which partition to use 
> when logging in, that this could result in the undesired SRP initiator 
> ping-pong effect this patch tries to avoid.
>
> Bart.
>
Hello Bart,

Yes, you pointed out the unclear/undefined area.

If we stick to single RDMA channel per IT Nexus with unique tuple 
Initiator Port Identier - Target Port Indentifier then newly created 
connection with same tuple (I_port_id, T_port_id) but with different 
P_Key or different DGID is not unique.

Sticking to this rule by excluding P_Key and DGID  out of rdma channel 
indentity, your srp_conn_unique() checking is ok; however, some SRP 
target implementations may include DGID as part of rdma channel 
identifier. I'm not sure about different p_key part.

-vu
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]                 ` <51BA555F.9060807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-14  9:38                   ` Sebastian Riemer
       [not found]                     ` <51BAE482.1050304-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-14  9:38 UTC (permalink / raw)
  To: Vu Pham; +Cc: Bart Van Assche, Roland Dreier, David Dillow, linux-rdma

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 14.06.2013 01:27, Vu Pham wrote:
> Bart Van Assche wrote:
>> On 06/13/13 19:50, Vu Pham wrote:
>>> Hello Bart,
>>> 
>>>> +/** + * srp_conn_unique() - check whether the connection to
>>>> a target is unique + */ +static bool srp_conn_unique(struct
>>>> srp_host *host, +                struct srp_target_port
>>>> *target) +{ +    struct srp_target_port *t; +    bool ret =
>>>> false; + +    if (target->state == SRP_TARGET_REMOVED) +
>>>> goto out; + +    ret = true; + +
>>>> spin_lock(&host->target_lock); +    list_for_each_entry(t,
>>>> &host->target_list, list) { +        if (t != target && +
>>>> target->id_ext == t->id_ext &&
>>> 
>>> Targets may advertise/expose on different pkeys You can have
>>> multiple connections  (or paths/scsi hosts) to same target with
>>> different pkeys. We need extra check to detect the uniqueness: 
>>> target->path.pkey == t->path.pkey &&
>> 
>> Hello Vu,
>> 
>> Thanks for the feedback. This is something I have already
>> thinking about myself. Unfortunately I have not found any
>> requirements in the T10 SRP standard with regard to InfiniBand
>> partitions. However, in that document there is a section about
>> single RDMA channel operation. In that section it is explained
>> that an SRP target must log out established sessions upon receipt
>> of a new login request. What I'm not sure about is whether only
>> sessions with the same P_Key must be logged out or all
>> established sessions if a new login request is received. I assume
>> the latter since otherwise that would mean that an SRP target 
>> would be required to maintain multiple sessions if it allows 
>> connections with more than one P_Key to a target port ? My
>> concern about adding a pkey comparison in the function
>> srp_conn_unique() is that if a target allows an initiator to
>> choose which partition to use when logging in, that this could
>> result in the undesired SRP initiator ping-pong effect this patch
>> tries to avoid.
>> 
>> Bart.
>> 
> Hello Bart,
> 
> Yes, you pointed out the unclear/undefined area.
> 
> If we stick to single RDMA channel per IT Nexus with unique tuple 
> Initiator Port Identier - Target Port Indentifier then newly
> created connection with same tuple (I_port_id, T_port_id) but with
> different P_Key or different DGID is not unique.
> 
> Sticking to this rule by excluding P_Key and DGID  out of rdma
> channel indentity, your srp_conn_unique() checking is ok; however,
> some SRP target implementations may include DGID as part of rdma
> channel identifier. I'm not sure about different p_key part.
> 
> -vu

Hi Vu!

For what do you need the same target with multiple pkeys on the same
local SRP port?

Which other SRP targets exist?

I only know SCST, Solaris COMSTAR and that broken LIO stuff.
Does SCST still not support to set the pkey?

Why should we check the dgid?

Doesn't make any sense to me to connect both target ports to the same
local port. If you do so, the multipath-tools will crash. Note: This
function is called per local SRP port. Perhaps, a note should be added
to that function that it only has to be called per local SRP port.

Cheers,
Sebastian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRuuSCAAoJEH4DRb7WXajZcFcH+gKsSs64Js/CUqMSyPeFPQ7u
7jKHvLr2wqHqSMIg5rEeZxZJpE9rL+wi8k5TMAMBrV+Povdwr8tWHgdq7mh5N1xO
V517YTgdzrwPIFy9e2uktxx4VYpsFGrV8iw3rdAzXRmcYa5U8feXhiD1VZyKjs4p
3//wvGAR0po7Pm0WgU9Q+h0arQos8CmeHkpoaNp/nNINXpXlTX21WVvHjwQrMFhC
Kr8zoCOTd0Sn+WoSs+CT/7Y4oTknukwR5vh6wfKgz2W74YkMKpD658QZozlafyK/
rwdajV19YYvi8YRTjUXuY5TN0qshYOGDxJDtNFkRGbx+IxIqFkGyyFCp0LPCfto=
=nlf2
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-13 19:43     ` Vu Pham
@ 2013-06-14 13:19       ` Bart Van Assche
       [not found]         ` <51BB1857.7040802-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-14 13:19 UTC (permalink / raw)
  To: Vu Pham
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

On 06/13/13 21:43, Vu Pham wrote:
> Hello Bart,
> 
>>
>> +What:        /sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
>> +Date:        September 1, 2013
>> +KernelVersion:    3.11
>> +Contact:    linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org
>> +Description:    Number of seconds the SCSI layer will wait after a 
>> transport
>> +        layer error has been observed before removing a target port.
>> +        Zero means immediate removal.
> A negative value will disable the target port removal.
>
> <snip>
>> +
>> +/**
>> + * srp_tmo_valid() - check timeout combination validity
>> + *
>> + * If no fast I/O fail timeout has been configured then the device 
>> loss timeout
>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail 
>> timeout has
>> + * been configured then it must be below the device loss timeout.
>> + */
>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>> +{
>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>> +        || (0 <= fast_io_fail_tmo &&
>> +            (dev_loss_tmo < 0 ||
>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>> +}
>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with negative 
> value
> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with negative 
> value

OK, will update the documentation such that it correctly refers to "off" instead of a negative value and I will also mention that dev_loss_tmo can now be disabled.

> <snip>
>>
>> + * srp_reconnect_rport - reconnect by invoking 
>> srp_function_template.reconnect()
>> + *
>> + * Blocks SCSI command queueing before invoking reconnect() such that 
>> the
>> + * scsi_host_template.queuecommand() won't be invoked concurrently with
>> + * reconnect(). This is important since a reconnect() implementation may
>> + * reallocate resources needed by queuecommand(). Please note that this
>> + * function neither waits until outstanding requests have finished 
>> nor tries
>> + * to abort these. It is the responsibility of the reconnect() 
>> function to
>> + * finish outstanding commands before reconnecting to the target port.
>> + */
>> +int srp_reconnect_rport(struct srp_rport *rport)
>> +{
>> +    struct Scsi_Host *shost = rport_to_shost(rport);
>> +    struct srp_internal *i = to_srp_internal(shost->transportt);
>> +    struct scsi_device *sdev;
>> +    int res;
>> +
>> +    pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev));
>> +
>> +    res = mutex_lock_interruptible(&rport->mutex);
>> +    if (res) {
>> +        pr_debug("%s: mutex_lock_interruptible() returned %d\n",
>> +             dev_name(&shost->shost_gendev), res);
>> +        goto out;
>> +    }
>> +
>> +    spin_lock_irq(shost->host_lock);
>> +    scsi_block_requests(shost);
>> +    spin_unlock_irq(shost->host_lock);
>> +
> In scsi_block_requests() definition, no locks are assumed held.

Good catch :-) However, if you look around in drivers/scsi you will see that several SCSI LLD drivers invoke scsi_block_requests() with the host lock held. I'm not sure whether these LLDs or the scsi_block_requests() documentation is incorrect. Anyway, I'll leave the locking statements out since these are not necessary around this call of scsi_block_requests().

> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need to 
> do extra block with scsi_block_requests()

Please keep in mind that srp_reconnect_rport() can be called from two different contexts: that function can not only be called from inside the SRP transport layer but also from inside the SCSI error handler (see also the srp_reset_device() modifications in a later patch in this series). If this function is invoked from the context of the SCSI error handler the chance is high that the SCSI device will have another state than SDEV_BLOCK. Hence the scsi_block_requests() call in this function.

> Shouldn't we avoid using both scsi_block/unblock_requests() and 
> scsi_target_block/unblock()?
> ie. in srp_start_tl_fail_timers()  call scsi_block_requests(), in 
> __rport_fast_io_fail_timedout() and rport_dev_loss_timedout() call 
> scsi_unblock_requests()
> and using scsi_block/unblock_requests in this function.

I agree that using only one of these two mechanisms would be more elegant. However, the advantage of using scsi_target_block() as long as fast_io_fail_tmo has not expired is that this functions set the SDEV_BLOCK state which is recognized by multipathd. The reason scsi_block_requests() is used in the above function is to prevent that a user can reenable requests via sysfs while a new InfiniBand RC connection is being established. A call of srp_queuecommand() concurrently with srp_create_target_ib() can namely result in a kernel crash.

>> +    while (scsi_request_fn_active(shost))
>> +        msleep(20);
>> +
>> +    res = i->f->reconnect(rport);
>> +    scsi_unblock_requests(shost);
>> +    pr_debug("%s (state %d): transport.reconnect() returned %d\n",
>> +         dev_name(&shost->shost_gendev), rport->state, res);
>> +    if (res == 0) {
>> +        cancel_delayed_work(&rport->fast_io_fail_work);
>> +        cancel_delayed_work(&rport->dev_loss_work);
>> +        rport->failed_reconnects = 0;
>> +        scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING);
>> +        srp_rport_set_state(rport, SRP_RPORT_RUNNING);
>> +        /*
>> +         * It can occur that after fast_io_fail_tmo expired and before
>> +         * dev_loss_tmo expired that the SCSI error handler has
>> +         * offlined one or more devices. scsi_target_unblock() doesn't
>> +         * change the state of these devices into running, so do that
>> +         * explicitly.
>> +         */
>> +        spin_lock_irq(shost->host_lock);
>> +        __shost_for_each_device(sdev, shost)
>> +            if (sdev->sdev_state == SDEV_OFFLINE)
>> +                sdev->sdev_state = SDEV_RUNNING;
>> +        spin_unlock_irq(shost->host_lock);
>> +    } else if (rport->state == SRP_RPORT_RUNNING) {
>> +        srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
>> +        scsi_target_unblock(&shost->shost_gendev,
>> +                    SDEV_TRANSPORT_OFFLINE);
> Would this unblock override the fast_io_fail_tmo functionality?

Sorry but I don't think so. If something goes wrong with the transport layer then the function srp_start_tl_fail_timers() gets invoked. That function starts with changing the state from SRP_RPORT_RUNNING into SRP_RPORT_BLOCKED. In other words, the code below "if (rport->state == SRP_RPORT_RUNNING)" can only be reached if srp_reconnect_rport() is invoked from the context of the SCSI error handler and not if it is called from inside one of the new SRP transport layer times. In other words, whether or not that code will be reached depends on whether fast_io_fail_tmo is below or above the SCSI command timeout.

>> +static enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
>> +{
>> +    struct scsi_device *sdev = scmd->device;
>> +
>> +    pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
>> +    return scsi_device_blocked(sdev) ? BLK_EH_RESET_TIMER :
>> +        BLK_EH_NOT_HANDLED;
> if one turn off fast_io_fail_tmo, rport_fast_io_fail_timedout() won't be 
> called, and reconnect is failed, scsi_device_blocked remains true, error 
> recovery cannot happen
> I think it should be
> return (scsi_device_blocked(sdev) && rport->fast_io_fail_tmo > 0) ? 
> BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;

How about the following alternative ?

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 53b34b3..84574fb 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -550,11 +550,12 @@ void srp_start_tl_fail_timers(struct srp_rport *rport)
                goto out_unlock;
        pr_debug("%s new state: %d\n", dev_name(&shost->shost_gendev),
                 rport->state);
-       scsi_target_block(&shost->shost_gendev);
 
-       if (rport->fast_io_fail_tmo >= 0)
+       if (rport->fast_io_fail_tmo >= 0) {
+               scsi_target_block(&shost->shost_gendev);
                queue_delayed_work(system_long_wq, &rport->fast_io_fail_work,
                                   1UL * rport->fast_io_fail_tmo * HZ);
+       }
        if (rport->dev_loss_tmo >= 0)
                queue_delayed_work(system_long_wq, &rport->dev_loss_work,
                                   1UL * rport->dev_loss_tmo * HZ);

Bart.

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]                     ` <51BAE482.1050304-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-14 17:07                       ` Vu Pham
       [not found]                         ` <51BB4DBB.4070800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-14 17:07 UTC (permalink / raw)
  To: Sebastian Riemer; +Cc: Bart Van Assche, Roland Dreier, David Dillow, linux-rdma

Hello Sebastian,

> On 14.06.2013 01:27, Vu Pham wrote:
>   
>> Bart Van Assche wrote:
>>     
>>> On 06/13/13 19:50, Vu Pham wrote:
>>>       
>>>> Hello Bart,
>>>>
>>>>         
>>>>> +/** + * srp_conn_unique() - check whether the connection to
>>>>> a target is unique + */ +static bool srp_conn_unique(struct
>>>>> srp_host *host, +                struct srp_target_port
>>>>> *target) +{ +    struct srp_target_port *t; +    bool ret =
>>>>> false; + +    if (target->state == SRP_TARGET_REMOVED) +
>>>>> goto out; + +    ret = true; + +
>>>>> spin_lock(&host->target_lock); +    list_for_each_entry(t,
>>>>> &host->target_list, list) { +        if (t != target && +
>>>>> target->id_ext == t->id_ext &&
>>>>>           
>>>> Targets may advertise/expose on different pkeys You can have
>>>> multiple connections  (or paths/scsi hosts) to same target with
>>>> different pkeys. We need extra check to detect the uniqueness: 
>>>> target->path.pkey == t->path.pkey &&
>>>>         
>>> Hello Vu,
>>>
>>> Thanks for the feedback. This is something I have already
>>> thinking about myself. Unfortunately I have not found any
>>> requirements in the T10 SRP standard with regard to InfiniBand
>>> partitions. However, in that document there is a section about
>>> single RDMA channel operation. In that section it is explained
>>> that an SRP target must log out established sessions upon receipt
>>> of a new login request. What I'm not sure about is whether only
>>> sessions with the same P_Key must be logged out or all
>>> established sessions if a new login request is received. I assume
>>> the latter since otherwise that would mean that an SRP target 
>>> would be required to maintain multiple sessions if it allows 
>>> connections with more than one P_Key to a target port ? My
>>> concern about adding a pkey comparison in the function
>>> srp_conn_unique() is that if a target allows an initiator to
>>> choose which partition to use when logging in, that this could
>>> result in the undesired SRP initiator ping-pong effect this patch
>>> tries to avoid.
>>>
>>> Bart.
>>>
>>>       
>> Hello Bart,
>>
>> Yes, you pointed out the unclear/undefined area.
>>
>> If we stick to single RDMA channel per IT Nexus with unique tuple 
>> Initiator Port Identier - Target Port Indentifier then newly
>> created connection with same tuple (I_port_id, T_port_id) but with
>> different P_Key or different DGID is not unique.
>>
>> Sticking to this rule by excluding P_Key and DGID  out of rdma
>> channel indentity, your srp_conn_unique() checking is ok; however,
>> some SRP target implementations may include DGID as part of rdma
>> channel identifier. I'm not sure about different p_key part.
>>
>> -vu
>>     
>
> Hi Vu!
>
> For what do you need the same target with multiple pkeys on the same
> local SRP port?
>   
There is no need, it's just a gray area that you can choose to have 
multiple connections to same target using different pkeys (same as dgid)
> Which other SRP targets exist?
>   

Netapp/LSI/Engenio, DDN, TexasMemorySystem Ramsan (IBM), Nimbus, Violin 
Memory, StreamScale
The last three may be derived from SCST base target.

> I only know SCST, Solaris COMSTAR and that broken LIO stuff.
> Does SCST still not support to set the pkey?
>
>   
Yes, I think so

> Why should we check the dgid?
>
>   
If you want to have multiple connections/qps to same target, but as I 
said above, it's a gray area.

> Doesn't make any sense to me to connect both target ports to the same
> local port. 
What if a target always expose single consistent and unique SRP port 
with tuple <id_ext, ioc_guid>, the ioc_guid part is not derived from any 
of its local HCA's GUID, then you can connect to this target thru 
different HCA ports (different dgid) as different paths to same target.

-vu
> If you do so, the multipath-tools will crash. Note: This
> function is called per local SRP port. Perhaps, a note should be added
> to that function that it only has to be called per local SRP port.
>
> Cheers,
> Sebastian
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQEcBAEBAgAGBQJRuuSCAAoJEH4DRb7WXajZcFcH+gKsSs64Js/CUqMSyPeFPQ7u
> 7jKHvLr2wqHqSMIg5rEeZxZJpE9rL+wi8k5TMAMBrV+Povdwr8tWHgdq7mh5N1xO
> V517YTgdzrwPIFy9e2uktxx4VYpsFGrV8iw3rdAzXRmcYa5U8feXhiD1VZyKjs4p
> 3//wvGAR0po7Pm0WgU9Q+h0arQos8CmeHkpoaNp/nNINXpXlTX21WVvHjwQrMFhC
> Kr8zoCOTd0Sn+WoSs+CT/7Y4oTknukwR5vh6wfKgz2W74YkMKpD658QZozlafyK/
> rwdajV19YYvi8YRTjUXuY5TN0qshYOGDxJDtNFkRGbx+IxIqFkGyyFCp0LPCfto=
> =nlf2
> -----END PGP SIGNATURE-----
>   

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]         ` <51BB1857.7040802-HInyCGIudOg@public.gmane.org>
@ 2013-06-14 17:59           ` Vu Pham
       [not found]             ` <51BB5A04.3080901-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-14 17:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

Hello Bart,

> On 06/13/13 21:43, Vu Pham wrote:
>   
>> Hello Bart,
>>
>>     
>>> +What:        /sys/class/srp_remote_ports/port-<h>:<n>/dev_loss_tmo
>>> +Date:        September 1, 2013
>>> +KernelVersion:    3.11
>>> +Contact:    linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>> +Description:    Number of seconds the SCSI layer will wait after a 
>>> transport
>>> +        layer error has been observed before removing a target port.
>>> +        Zero means immediate removal.
>>>       
>> A negative value will disable the target port removal.
>>
>> <snip>
>>     
>>> +
>>> +/**
>>> + * srp_tmo_valid() - check timeout combination validity
>>> + *
>>> + * If no fast I/O fail timeout has been configured then the device 
>>> loss timeout
>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail 
>>> timeout has
>>> + * been configured then it must be below the device loss timeout.
>>> + */
>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>> +{
>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>> +        || (0 <= fast_io_fail_tmo &&
>>> +            (dev_loss_tmo < 0 ||
>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>> +}
>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>       
>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with negative 
>> value
>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with negative 
>> value
>>     
>
> OK, will update the documentation such that it correctly refers to "off" instead of a negative value and I will also mention that dev_loss_tmo can now be disabled.
>
>   
It's not only the documentation but also the code logic, you cannot turn 
dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa 
with the return statement above.
>> <snip>
>>     
>>> + * srp_reconnect_rport - reconnect by invoking 
>>> srp_function_template.reconnect()
>>> + *
>>> + * Blocks SCSI command queueing before invoking reconnect() such that 
>>> the
>>> + * scsi_host_template.queuecommand() won't be invoked concurrently with
>>> + * reconnect(). This is important since a reconnect() implementation may
>>> + * reallocate resources needed by queuecommand(). Please note that this
>>> + * function neither waits until outstanding requests have finished 
>>> nor tries
>>> + * to abort these. It is the responsibility of the reconnect() 
>>> function to
>>> + * finish outstanding commands before reconnecting to the target port.
>>> + */
>>> +int srp_reconnect_rport(struct srp_rport *rport)
>>> +{
>>> +    struct Scsi_Host *shost = rport_to_shost(rport);
>>> +    struct srp_internal *i = to_srp_internal(shost->transportt);
>>> +    struct scsi_device *sdev;
>>> +    int res;
>>> +
>>> +    pr_debug("SCSI host %s\n", dev_name(&shost->shost_gendev));
>>> +
>>> +    res = mutex_lock_interruptible(&rport->mutex);
>>> +    if (res) {
>>> +        pr_debug("%s: mutex_lock_interruptible() returned %d\n",
>>> +             dev_name(&shost->shost_gendev), res);
>>> +        goto out;
>>> +    }
>>> +
>>> +    spin_lock_irq(shost->host_lock);
>>> +    scsi_block_requests(shost);
>>> +    spin_unlock_irq(shost->host_lock);
>>> +
>>>       
>> In scsi_block_requests() definition, no locks are assumed held.
>>     
>
> Good catch :-) However, if you look around in drivers/scsi you will see that several SCSI LLD drivers invoke scsi_block_requests() with the host lock held. I'm not sure whether these LLDs or the scsi_block_requests() documentation is incorrect. Anyway, I'll leave the locking statements out since these are not necessary around this call of scsi_block_requests().
>
>   
>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need to 
>> do extra block with scsi_block_requests()
>>     
>
> Please keep in mind that srp_reconnect_rport() can be called from two different contexts: that function can not only be called from inside the SRP transport layer but also from inside the SCSI error handler (see also the srp_reset_device() modifications in a later patch in this series). If this function is invoked from the context of the SCSI error handler the chance is high that the SCSI device will have another state than SDEV_BLOCK. Hence the scsi_block_requests() call in this function.
>   
Yes, srp_reconnect_rport() can be called from two contexts; however, it 
deals with same rport & rport's state.
I'm thinking something like this:

    if (rport->state != SRP_RPORT_BLOCKED) {
	scsi_block_requests(shost);

    while (scsi_request_fn_active(shost))
        msleep(20);

    res = i->f->reconnect(rport);

    pr_debug("%s (state %d): transport.reconnect() returned %d\n",
         dev_name(&shost->shost_gendev), rport->state, res);
    if (res == 0) {
        cancel_delayed_work(&rport->fast_io_fail_work);
        cancel_delayed_work(&rport->dev_loss_work);
        rport->failed_reconnects = 0;
        scsi_unblock_requests(shost);
        srp_rport_set_state(rport, SRP_RPORT_RUNNING);
        /*
         * It can occur that after fast_io_fail_tmo expired and before
         * dev_loss_tmo expired that the SCSI error handler has
         * offlined one or more devices. scsi_target_unblock() doesn't
         * change the state of these devices into running, so do that
         * explicitly.
         */
        spin_lock_irq(shost->host_lock);
        __shost_for_each_device(sdev, shost)
            if (sdev->sdev_state == SDEV_OFFLINE)
                sdev->sdev_state = SDEV_RUNNING;
        spin_unlock_irq(shost->host_lock);
    } else if (rport->state != SRP_RPORT_BLOCKED) {
        scsi_unblock_requests(shost);
	mutex_unlock(&rport->mutex);
	srp_start_tl_fail_timers(rport);  /* we should consider some elapse time already passed ie. scsi command timedout seconds)
	mutex_lock(&rport->mutex);
    }

>   
>> Shouldn't we avoid using both scsi_block/unblock_requests() and 
>> scsi_target_block/unblock()?
>> ie. in srp_start_tl_fail_timers()  call scsi_block_requests(), in 
>> __rport_fast_io_fail_timedout() and rport_dev_loss_timedout() call 
>> scsi_unblock_requests()
>> and using scsi_block/unblock_requests in this function.
>>     
>
> I agree that using only one of these two mechanisms would be more elegant. However, the advantage of using scsi_target_block() as long as fast_io_fail_tmo has not expired is that this functions set the SDEV_BLOCK state which is recognized by multipathd.
Could you point out the advantage of multipathd recognizing the 
SDEV_BLOCK state while fast_io_fail_tmo has not expired?

>  The reason scsi_block_requests() is used in the above function is to prevent that a user can reenable requests via sysfs while a new InfiniBand RC connection is being established. A call of srp_queuecommand() concurrently with srp_create_target_ib() can namely result in a kernel crash.
>
>   
Agreed.
I think that we can use only the pair 
scsi_block_requests()/scsi_unblock_requests() unless the advantage of 
multipathd recognizing the SDEV_BLOCK is noticeable.
>>> +    while (scsi_request_fn_active(shost))
>>> +        msleep(20);
>>> +
>>> +    res = i->f->reconnect(rport);
>>> +    scsi_unblock_requests(shost);
>>> +    pr_debug("%s (state %d): transport.reconnect() returned %d\n",
>>> +         dev_name(&shost->shost_gendev), rport->state, res);
>>> +    if (res == 0) {
>>> +        cancel_delayed_work(&rport->fast_io_fail_work);
>>> +        cancel_delayed_work(&rport->dev_loss_work);
>>> +        rport->failed_reconnects = 0;
>>> +        scsi_target_unblock(&shost->shost_gendev, SDEV_RUNNING);
>>> +        srp_rport_set_state(rport, SRP_RPORT_RUNNING);
>>> +        /*
>>> +         * It can occur that after fast_io_fail_tmo expired and before
>>> +         * dev_loss_tmo expired that the SCSI error handler has
>>> +         * offlined one or more devices. scsi_target_unblock() doesn't
>>> +         * change the state of these devices into running, so do that
>>> +         * explicitly.
>>> +         */
>>> +        spin_lock_irq(shost->host_lock);
>>> +        __shost_for_each_device(sdev, shost)
>>> +            if (sdev->sdev_state == SDEV_OFFLINE)
>>> +                sdev->sdev_state = SDEV_RUNNING;
>>> +        spin_unlock_irq(shost->host_lock);
>>> +    } else if (rport->state == SRP_RPORT_RUNNING) {
>>> +        srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
>>> +        scsi_target_unblock(&shost->shost_gendev,
>>> +                    SDEV_TRANSPORT_OFFLINE);
>>>       
>> Would this unblock override the fast_io_fail_tmo functionality?
>>     
>
> Sorry but I don't think so. If something goes wrong with the transport layer then the function srp_start_tl_fail_timers() gets invoked. That function starts with changing the state from SRP_RPORT_RUNNING into SRP_RPORT_BLOCKED. In other words, the code below "if (rport->state == SRP_RPORT_RUNNING)" can only be reached if srp_reconnect_rport() is invoked from the context of the SCSI error handler and not if it is called from inside one of the new SRP transport layer times. In other words, whether or not that code will be reached depends on whether fast_io_fail_tmo is below or above the SCSI command timeout.
>
>   
Yes, if one set RC retry_count=7 and fast_io_fail_tmo > scsi command 
timed out, the scsi command timed out may happen before transport error 
detected, so srp_reconnect_rport() will called in scsi error handler 
context and the else if statement above will override the 
fast_io_fail_tmo functionality.
Please look back to my reference code above.
>>> +static enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
>>> +{
>>> +    struct scsi_device *sdev = scmd->device;
>>> +
>>> +    pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
>>> +    return scsi_device_blocked(sdev) ? BLK_EH_RESET_TIMER :
>>> +        BLK_EH_NOT_HANDLED;
>>>       
>> if one turn off fast_io_fail_tmo, rport_fast_io_fail_timedout() won't be 
>> called, and reconnect is failed, scsi_device_blocked remains true, error 
>> recovery cannot happen
>> I think it should be
>> return (scsi_device_blocked(sdev) && rport->fast_io_fail_tmo > 0) ? 
>> BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
>>     
>
> How about the following alternative ?
>   
It look good. Let me do some testing

-vu

> diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
> index 53b34b3..84574fb 100644
> --- a/drivers/scsi/scsi_transport_srp.c
> +++ b/drivers/scsi/scsi_transport_srp.c
> @@ -550,11 +550,12 @@ void srp_start_tl_fail_timers(struct srp_rport *rport)
>                 goto out_unlock;
>         pr_debug("%s new state: %d\n", dev_name(&shost->shost_gendev),
>                  rport->state);
> -       scsi_target_block(&shost->shost_gendev);
>  
> -       if (rport->fast_io_fail_tmo >= 0)
> +       if (rport->fast_io_fail_tmo >= 0) {
> +               scsi_target_block(&shost->shost_gendev);
>                 queue_delayed_work(system_long_wq, &rport->fast_io_fail_work,
>                                    1UL * rport->fast_io_fail_tmo * HZ);
> +       }
>         if (rport->dev_loss_tmo >= 0)
>                 queue_delayed_work(system_long_wq, &rport->dev_loss_work,
>                                    1UL * rport->dev_loss_tmo * HZ);
>
> Bart.
>   

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]             ` <51BB5A04.3080901-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-15  9:52               ` Bart Van Assche
       [not found]                 ` <51BC3945.9030900-HInyCGIudOg@public.gmane.org>
  2013-06-18 16:59                 ` Vu Pham
  0 siblings, 2 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-15  9:52 UTC (permalink / raw)
  To: Vu Pham
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

On 06/14/13 19:59, Vu Pham wrote:
>> On 06/13/13 21:43, Vu Pham wrote:
>>>> +/**
>>>> + * srp_tmo_valid() - check timeout combination validity
>>>> + *
>>>> + * If no fast I/O fail timeout has been configured then the device
>>>> loss timeout
>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail
>>>> timeout has
>>>> + * been configured then it must be below the device loss timeout.
>>>> + */
>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>> +{
>>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>> +        || (0 <= fast_io_fail_tmo &&
>>>> +            (dev_loss_tmo < 0 ||
>>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>> negative value
>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>> negative value
>>
>> OK, will update the documentation such that it correctly refers to
>> "off" instead of a negative value and I will also mention that
>> dev_loss_tmo can now be disabled.
>>
> It's not only the documentation but also the code logic, you cannot turn
> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa
> with the return statement above.

Does this mean that you think it would be useful to disable both the 
fast_io_fail and the dev_loss mechanisms, and hence rely on the user to 
remove remote ports that have disappeared and on the SCSI command 
timeout to detect path failures ? I'll start testing this to see whether 
that combination does not trigger any adverse behavior.

>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
>>> to do extra block with scsi_block_requests()
>>
>> Please keep in mind that srp_reconnect_rport() can be called from two
>> different contexts: that function can not only be called from inside
>> the SRP transport layer but also from inside the SCSI error handler
>> (see also the srp_reset_device() modifications in a later patch in
>> this series). If this function is invoked from the context of the SCSI
>> error handler the chance is high that the SCSI device will have
>> another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
>> this function.
> Yes, srp_reconnect_rport() can be called from two contexts; however, it
> deals with same rport & rport's state.
> I'm thinking something like this:
>
>     if (rport->state != SRP_RPORT_BLOCKED) {
>      scsi_block_requests(shost);

Sorry but I'm afraid that that approach would still allow the user to 
unblock one or more SCSI devices via sysfs during the 
i->f->reconnect(rport) call, something we do not want.

> I think that we can use only the pair
> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
> multipathd recognizing the SDEV_BLOCK is noticeable.

I think the advantage of multipathd recognizing the SDEV_BLOCK state 
before the fast_io_fail_tmo timer has expired is important. Multipathd 
does not queue I/O to paths that are in the SDEV_BLOCK state so setting 
that state helps I/O to fail over more quickly, especially for large 
values of fast_io_fail_tmo.

Hope this helps,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]                 ` <51BC3945.9030900-HInyCGIudOg@public.gmane.org>
@ 2013-06-17  6:18                   ` Hannes Reinecke
  2013-06-17  7:04                     ` Bart Van Assche
  0 siblings, 1 reply; 64+ messages in thread
From: Hannes Reinecke @ 2013-06-17  6:18 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, Roland Dreier, David Dillow, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/15/2013 11:52 AM, Bart Van Assche wrote:
> On 06/14/13 19:59, Vu Pham wrote:
>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>> +/**
>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>> + *
>>>>> + * If no fast I/O fail timeout has been configured then the
>>>>> device
>>>>> loss timeout
>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O
>>>>> fail
>>>>> timeout has
>>>>> + * been configured then it must be below the device loss timeout.
>>>>> + */
>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>> +{
>>>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>> +        || (0 <= fast_io_fail_tmo &&
>>>>> +            (dev_loss_tmo < 0 ||
>>>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>>> negative value
>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>>> negative value
>>>
>>> OK, will update the documentation such that it correctly refers to
>>> "off" instead of a negative value and I will also mention that
>>> dev_loss_tmo can now be disabled.
>>>
>> It's not only the documentation but also the code logic, you
>> cannot turn
>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice
>> versa
>> with the return statement above.
> 
> Does this mean that you think it would be useful to disable both the
> fast_io_fail and the dev_loss mechanisms, and hence rely on the user
> to remove remote ports that have disappeared and on the SCSI command
> timeout to detect path failures ? I'll start testing this to see
> whether that combination does not trigger any adverse behavior.
> 
>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we
>>>> need
>>>> to do extra block with scsi_block_requests()
>>>
>>> Please keep in mind that srp_reconnect_rport() can be called from
>>> two
>>> different contexts: that function can not only be called from inside
>>> the SRP transport layer but also from inside the SCSI error handler
>>> (see also the srp_reset_device() modifications in a later patch in
>>> this series). If this function is invoked from the context of the
>>> SCSI
>>> error handler the chance is high that the SCSI device will have
>>> another state than SDEV_BLOCK. Hence the scsi_block_requests()
>>> call in
>>> this function.
>> Yes, srp_reconnect_rport() can be called from two contexts;
>> however, it
>> deals with same rport & rport's state.
>> I'm thinking something like this:
>>
>>     if (rport->state != SRP_RPORT_BLOCKED) {
>>      scsi_block_requests(shost);
> 
> Sorry but I'm afraid that that approach would still allow the user
> to unblock one or more SCSI devices via sysfs during the
> i->f->reconnect(rport) call, something we do not want.
> 
>> I think that we can use only the pair
>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>> multipathd recognizing the SDEV_BLOCK is noticeable.
> 
> I think the advantage of multipathd recognizing the SDEV_BLOCK state
> before the fast_io_fail_tmo timer has expired is important.
> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
> state so setting that state helps I/O to fail over more quickly,
> especially for large values of fast_io_fail_tmo.
> 
Sadly it doesn't work that way.

SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
path, but there still will be I/O queued on that path.
For these multipath _has_ to wait for I/O completion.
And as it turns out, in most cases the application itself will wait
for completion on these I/O before continue sending more I/O.
So in effect multipath would queue new I/O to other paths, but won't
_receive_ new I/O as the upper layers are still waiting for
completion of the queued I/O.

The only way to excite fast failover with multipathing is to set
fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
the outstanding I/Os.

Large values of fast_io_fail will almost guarantee sluggish I/O
failover.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare-l3A5Bk7waGM@public.gmane.org			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-17  6:18                   ` Hannes Reinecke
@ 2013-06-17  7:04                     ` Bart Van Assche
  2013-06-17  7:14                       ` Hannes Reinecke
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-17  7:04 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Vu Pham, Roland Dreier, David Dillow, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/17/13 08:18, Hannes Reinecke wrote:
> On 06/15/2013 11:52 AM, Bart Van Assche wrote:
>> On 06/14/13 19:59, Vu Pham wrote:
>>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>>> +/**
>>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>>> + *
>>>>>> + * If no fast I/O fail timeout has been configured then the
>>>>>> device
>>>>>> loss timeout
>>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O
>>>>>> fail
>>>>>> timeout has
>>>>>> + * been configured then it must be below the device loss timeout.
>>>>>> + */
>>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>>> +{
>>>>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>>> +        || (0 <= fast_io_fail_tmo &&
>>>>>> +            (dev_loss_tmo < 0 ||
>>>>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>>>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>>> +}
>>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>>>> negative value
>>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>>>> negative value
>>>>
>>>> OK, will update the documentation such that it correctly refers to
>>>> "off" instead of a negative value and I will also mention that
>>>> dev_loss_tmo can now be disabled.
>>>>
>>> It's not only the documentation but also the code logic, you
>>> cannot turn
>>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice
>>> versa
>>> with the return statement above.
>>
>> Does this mean that you think it would be useful to disable both the
>> fast_io_fail and the dev_loss mechanisms, and hence rely on the user
>> to remove remote ports that have disappeared and on the SCSI command
>> timeout to detect path failures ? I'll start testing this to see
>> whether that combination does not trigger any adverse behavior.
>>
>>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we
>>>>> need
>>>>> to do extra block with scsi_block_requests()
>>>>
>>>> Please keep in mind that srp_reconnect_rport() can be called from
>>>> two
>>>> different contexts: that function can not only be called from inside
>>>> the SRP transport layer but also from inside the SCSI error handler
>>>> (see also the srp_reset_device() modifications in a later patch in
>>>> this series). If this function is invoked from the context of the
>>>> SCSI
>>>> error handler the chance is high that the SCSI device will have
>>>> another state than SDEV_BLOCK. Hence the scsi_block_requests()
>>>> call in
>>>> this function.
>>> Yes, srp_reconnect_rport() can be called from two contexts;
>>> however, it
>>> deals with same rport & rport's state.
>>> I'm thinking something like this:
>>>
>>>      if (rport->state != SRP_RPORT_BLOCKED) {
>>>       scsi_block_requests(shost);
>>
>> Sorry but I'm afraid that that approach would still allow the user
>> to unblock one or more SCSI devices via sysfs during the
>> i->f->reconnect(rport) call, something we do not want.
>>
>>> I think that we can use only the pair
>>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>>> multipathd recognizing the SDEV_BLOCK is noticeable.
>>
>> I think the advantage of multipathd recognizing the SDEV_BLOCK state
>> before the fast_io_fail_tmo timer has expired is important.
>> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
>> state so setting that state helps I/O to fail over more quickly,
>> especially for large values of fast_io_fail_tmo.
>>
> Sadly it doesn't work that way.
>
> SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
> path, but there still will be I/O queued on that path.
> For these multipath _has_ to wait for I/O completion.
> And as it turns out, in most cases the application itself will wait
> for completion on these I/O before continue sending more I/O.
> So in effect multipath would queue new I/O to other paths, but won't
> _receive_ new I/O as the upper layers are still waiting for
> completion of the queued I/O.
>
> The only way to excite fast failover with multipathing is to set
> fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
> the outstanding I/Os.
>
> Large values of fast_io_fail will almost guarantee sluggish I/O
> failover.

Hello Hannes,

I agree that the value of fast_io_fail_tmo should be kept small. 
Although as you explained changing the SCSI device state into SDEV_BLOCK 
doesn't help for I/O that has already been queued on a failed path, I 
think it's still useful for I/O that is queued after the fast_io_fail 
timer has been started and before that timer has expired.

Bart.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-17  7:04                     ` Bart Van Assche
@ 2013-06-17  7:14                       ` Hannes Reinecke
  2013-06-17  7:29                         ` Bart Van Assche
  0 siblings, 1 reply; 64+ messages in thread
From: Hannes Reinecke @ 2013-06-17  7:14 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, Roland Dreier, David Dillow, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/17/2013 09:04 AM, Bart Van Assche wrote:
> On 06/17/13 08:18, Hannes Reinecke wrote:
>> On 06/15/2013 11:52 AM, Bart Van Assche wrote:
[ .. ]
>>>
>>> I think the advantage of multipathd recognizing the SDEV_BLOCK state
>>> before the fast_io_fail_tmo timer has expired is important.
>>> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
>>> state so setting that state helps I/O to fail over more quickly,
>>> especially for large values of fast_io_fail_tmo.
>>>
>> Sadly it doesn't work that way.
>>
>> SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
>> path, but there still will be I/O queued on that path.
>> For these multipath _has_ to wait for I/O completion.
>> And as it turns out, in most cases the application itself will wait
>> for completion on these I/O before continue sending more I/O.
>> So in effect multipath would queue new I/O to other paths, but won't
>> _receive_ new I/O as the upper layers are still waiting for
>> completion of the queued I/O.
>>
>> The only way to excite fast failover with multipathing is to set
>> fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
>> the outstanding I/Os.
>>
>> Large values of fast_io_fail will almost guarantee sluggish I/O
>> failover.
> 
> Hello Hannes,
> 
> I agree that the value of fast_io_fail_tmo should be kept small.
> Although as you explained changing the SCSI device state into
> SDEV_BLOCK doesn't help for I/O that has already been queued on a
> failed path, I think it's still useful for I/O that is queued after
> the fast_io_fail timer has been started and before that timer has
> expired.
> 
Why, but of course.

The typical scenario would be:
-> detect link-loss
-> call scsi_block_request()
-> start dev_loss_tmo and fast_io_fail_tmo

-> When fast_io_fail_tmo triggers:
   -> Abort all outstanding requests

-> When dev_loss_tmo triggers:
   -> Abort all outstanding requests
   -> Remove/disable the I_T nexus
   -> call scsi_unblock_request()

However, if and whether multipath detects SDEV_BLOCK doesn't
guarantee a fast failover; in fact is was only added rather recently
as it's not a big win in most cases.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-17  7:14                       ` Hannes Reinecke
@ 2013-06-17  7:29                         ` Bart Van Assche
       [not found]                           ` <51BEBAEA.4080202-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-17  7:29 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Vu Pham, Roland Dreier, David Dillow, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/17/13 09:14, Hannes Reinecke wrote:
> On 06/17/2013 09:04 AM, Bart Van Assche wrote:
>> I agree that the value of fast_io_fail_tmo should be kept small.
>> Although as you explained changing the SCSI device state into
>> SDEV_BLOCK doesn't help for I/O that has already been queued on a
>> failed path, I think it's still useful for I/O that is queued after
>> the fast_io_fail timer has been started and before that timer has
>> expired.
>
> Why, but of course.
>
> The typical scenario would be:
> -> detect link-loss
> -> call scsi_block_request()
> -> start dev_loss_tmo and fast_io_fail_tmo
>
> -> When fast_io_fail_tmo triggers:
>     -> Abort all outstanding requests
>
> -> When dev_loss_tmo triggers:
>     -> Abort all outstanding requests
>     -> Remove/disable the I_T nexus
>     -> call scsi_unblock_request()
>
> However, if and whether multipath detects SDEV_BLOCK doesn't
> guarantee a fast failover; in fact is was only added rather recently
> as it's not a big win in most cases.

Even if setting the state SDEV_BLOCK doesn't help much with improving 
failover time, it still has the advantage over using 
scsi_block_requests() that it can be overridden by a user via sysfs.

Bart.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]                           ` <51BEBAEA.4080202-HInyCGIudOg@public.gmane.org>
@ 2013-06-17  8:10                             ` Hannes Reinecke
  2013-06-17 10:13                             ` Sebastian Riemer
  1 sibling, 0 replies; 64+ messages in thread
From: Hannes Reinecke @ 2013-06-17  8:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, Roland Dreier, David Dillow, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/17/2013 09:29 AM, Bart Van Assche wrote:
> On 06/17/13 09:14, Hannes Reinecke wrote:
>> On 06/17/2013 09:04 AM, Bart Van Assche wrote:
>>> I agree that the value of fast_io_fail_tmo should be kept small.
>>> Although as you explained changing the SCSI device state into
>>> SDEV_BLOCK doesn't help for I/O that has already been queued on a
>>> failed path, I think it's still useful for I/O that is queued after
>>> the fast_io_fail timer has been started and before that timer has
>>> expired.
>>
>> Why, but of course.
>>
>> The typical scenario would be:
>> -> detect link-loss
>> -> call scsi_block_request()
>> -> start dev_loss_tmo and fast_io_fail_tmo
>>
>> -> When fast_io_fail_tmo triggers:
>>     -> Abort all outstanding requests
>>
>> -> When dev_loss_tmo triggers:
>>     -> Abort all outstanding requests
>>     -> Remove/disable the I_T nexus
>>     -> call scsi_unblock_request()
>>
>> However, if and whether multipath detects SDEV_BLOCK doesn't
>> guarantee a fast failover; in fact is was only added rather recently
>> as it's not a big win in most cases.
> 
> Even if setting the state SDEV_BLOCK doesn't help much with
> improving failover time, it still has the advantage over using
> scsi_block_requests() that it can be overridden by a user via sysfs.
> 
Argl. I meant scsi_internal_device_block(), not
scsi_block_requests(). Of course.

So we're arguing a non-existing point :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare-l3A5Bk7waGM@public.gmane.org			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]                         ` <51BB4DBB.4070800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-17  9:41                           ` Sebastian Riemer
  0 siblings, 0 replies; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-17  9:41 UTC (permalink / raw)
  To: Vu Pham; +Cc: Bart Van Assche, Roland Dreier, David Dillow, linux-rdma

On 14.06.2013 19:07, Vu Pham wrote:
[...]
>> For what do you need the same target with multiple pkeys on the same
>> local SRP port?
>>   
> There is no need, it's just a gray area that you can choose to have
> multiple connections to same target using different pkeys (same as dgid)
>> Which other SRP targets exist?
>>   
> 
> Netapp/LSI/Engenio, DDN, TexasMemorySystem Ramsan (IBM), Nimbus, Violin
> Memory, StreamScale
> The last three may be derived from SCST base target.
> 
>> I only know SCST, Solaris COMSTAR and that broken LIO stuff.
>> Does SCST still not support to set the pkey?
>>
>>   
> Yes, I think so
> 
>> Why should we check the dgid?
>>
>>   
> If you want to have multiple connections/qps to same target, but as I
> said above, it's a gray area.
> 
>> Doesn't make any sense to me to connect both target ports to the same
>> local port. 
> What if a target always expose single consistent and unique SRP port
> with tuple <id_ext, ioc_guid>, the ioc_guid part is not derived from any
> of its local HCA's GUID, then you can connect to this target thru
> different HCA ports (different dgid) as different paths to same target.
>
Do you have an example for a target which does it like this or a use
case where this makes sense?

I guess you're proposing here to use a driver global list of target
connections instead of handling this per local SRP port. This would
result in bigger changes which I wouldn't do without a good reason.

>> If you do so, the multipath-tools will crash. Note: This
>> function is called per local SRP port. Perhaps, a note should be added
>> to that function that it only has to be called per local SRP port.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]                           ` <51BEBAEA.4080202-HInyCGIudOg@public.gmane.org>
  2013-06-17  8:10                             ` Hannes Reinecke
@ 2013-06-17 10:13                             ` Sebastian Riemer
  1 sibling, 0 replies; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-17 10:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Hannes Reinecke, Vu Pham, Roland Dreier, David Dillow,
	linux-rdma, linux-scsi, James Bottomley

On 17.06.2013 09:29, Bart Van Assche wrote:
> On 06/17/13 09:14, Hannes Reinecke wrote:
>> On 06/17/2013 09:04 AM, Bart Van Assche wrote:
>>> I agree that the value of fast_io_fail_tmo should be kept small.
>>> Although as you explained changing the SCSI device state into
>>> SDEV_BLOCK doesn't help for I/O that has already been queued on a
>>> failed path, I think it's still useful for I/O that is queued after
>>> the fast_io_fail timer has been started and before that timer has
>>> expired.
>>
>> Why, but of course.
>>
>> The typical scenario would be:
>> -> detect link-loss
>> -> call scsi_block_request()
>> -> start dev_loss_tmo and fast_io_fail_tmo
>>
>> -> When fast_io_fail_tmo triggers:
>>     -> Abort all outstanding requests
>>
>> -> When dev_loss_tmo triggers:
>>     -> Abort all outstanding requests
>>     -> Remove/disable the I_T nexus
>>     -> call scsi_unblock_request()
>>
>> However, if and whether multipath detects SDEV_BLOCK doesn't
>> guarantee a fast failover; in fact is was only added rather recently
>> as it's not a big win in most cases.
> 
> Even if setting the state SDEV_BLOCK doesn't help much with improving
> failover time, it still has the advantage over using
> scsi_block_requests() that it can be overridden by a user via sysfs.

In my opinion that SDEV_BLOCK can help the reconnect. The only reason
for high fast_io_fail_tmo is that you don't use multipath at all and
hope that the connection becomes available again before that timeout.
You place the reconnects in between so that there is a chance that the
reconnect succeeds and the transport layer error work can be canceled.

But I have to look at all of your patches first to see how you
implemented the big picture.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-15  9:52               ` Bart Van Assche
       [not found]                 ` <51BC3945.9030900-HInyCGIudOg@public.gmane.org>
@ 2013-06-18 16:59                 ` Vu Pham
       [not found]                   ` <51C09202.2040503-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 64+ messages in thread
From: Vu Pham @ 2013-06-18 16:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

Bart Van Assche wrote:
> On 06/14/13 19:59, Vu Pham wrote:
>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>> +/**
>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>> + *
>>>>> + * If no fast I/O fail timeout has been configured then the device
>>>>> loss timeout
>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail
>>>>> timeout has
>>>>> + * been configured then it must be below the device loss timeout.
>>>>> + */
>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>> +{
>>>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>> +        || (0 <= fast_io_fail_tmo &&
>>>>> +            (dev_loss_tmo < 0 ||
>>>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>>> negative value
>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>>> negative value
>>>
>>> OK, will update the documentation such that it correctly refers to
>>> "off" instead of a negative value and I will also mention that
>>> dev_loss_tmo can now be disabled.
>>>
>> It's not only the documentation but also the code logic, you cannot turn
>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa
>> with the return statement above.
>
> Does this mean that you think it would be useful to disable both the 
> fast_io_fail and the dev_loss mechanisms, and hence rely on the user 
> to remove remote ports that have disappeared and on the SCSI command 
> timeout to detect path failures ?
Yes.

> I'll start testing this to see whether that combination does not 
> trigger any adverse behavior.
>

Ok

>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
>>>> to do extra block with scsi_block_requests()
>>>
>>> Please keep in mind that srp_reconnect_rport() can be called from two
>>> different contexts: that function can not only be called from inside
>>> the SRP transport layer but also from inside the SCSI error handler
>>> (see also the srp_reset_device() modifications in a later patch in
>>> this series). If this function is invoked from the context of the SCSI
>>> error handler the chance is high that the SCSI device will have
>>> another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
>>> this function.
>> Yes, srp_reconnect_rport() can be called from two contexts; however, it
>> deals with same rport & rport's state.
>> I'm thinking something like this:
>>
>>     if (rport->state != SRP_RPORT_BLOCKED) {
>>      scsi_block_requests(shost);
>
> Sorry but I'm afraid that that approach would still allow the user to 
> unblock one or more SCSI devices via sysfs during the 
> i->f->reconnect(rport) call, something we do not want.
>
I don't think that user can unblock scsi device(s) via sysfs if you use 
scsi_block_requests(shost) in srp_start_tl_fail_timers().

-vu

>> I think that we can use only the pair
>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>> multipathd recognizing the SDEV_BLOCK is noticeable.
>
> I think the advantage of multipathd recognizing the SDEV_BLOCK state 
> before the fast_io_fail_tmo timer has expired is important. Multipathd 
> does not queue I/O to paths that are in the SDEV_BLOCK state so 
> setting that state helps I/O to fail over more quickly, especially for 
> large values of fast_io_fail_tmo.
>
> Hope this helps,
>
> Bart.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]                   ` <51C09202.2040503-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-19 13:00                     ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-19 13:00 UTC (permalink / raw)
  To: Vu Pham
  Cc: Roland Dreier, David Dillow, Sebastian Riemer, linux-rdma,
	linux-scsi, James Bottomley

On 06/18/13 18:59, Vu Pham wrote:
> Bart Van Assche wrote:
>> On 06/14/13 19:59, Vu Pham wrote:
>>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
>>>>> to do extra block with scsi_block_requests()
>>>>
>>>> Please keep in mind that srp_reconnect_rport() can be called from two
>>>> different contexts: that function can not only be called from inside
>>>> the SRP transport layer but also from inside the SCSI error handler
>>>> (see also the srp_reset_device() modifications in a later patch in
>>>> this series). If this function is invoked from the context of the SCSI
>>>> error handler the chance is high that the SCSI device will have
>>>> another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
>>>> this function.
>>> Yes, srp_reconnect_rport() can be called from two contexts; however, it
>>> deals with same rport & rport's state.
>>> I'm thinking something like this:
>>>
>>>     if (rport->state != SRP_RPORT_BLOCKED) {
>>>      scsi_block_requests(shost);
>>
>> Sorry but I'm afraid that that approach would still allow the user to
>> unblock one or more SCSI devices via sysfs during the
>> i->f->reconnect(rport) call, something we do not want.
>>
> I don't think that user can unblock scsi device(s) via sysfs if you use
> scsi_block_requests(shost) in srp_start_tl_fail_timers().

Hello Vu,

If scsi_block_requests() would be used in srp_start_tl_fail_timers() 
instead of scsi_target_block() then multipathd would no longer be able 
to notice that a path is blocked after the fast_io_fail and dev_loss 
timers started and hence wouldn't be able to use the optimization where 
blocked paths are skipped when queueing a new I/O request.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
  2013-06-12 13:28 ` [PATCH 07/14] scsi_transport_srp: Add transport layer error handling Bart Van Assche
       [not found]   ` <51B8777B.5050201-HInyCGIudOg@public.gmane.org>
@ 2013-06-23 21:13   ` Mike Christie
       [not found]     ` <51C764FB.6070207-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
  1 sibling, 1 reply; 64+ messages in thread
From: Mike Christie @ 2013-06-23 21:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/12/2013 08:28 AM, Bart Van Assche wrote:
> +		/*
> +		 * It can occur that after fast_io_fail_tmo expired and before
> +		 * dev_loss_tmo expired that the SCSI error handler has
> +		 * offlined one or more devices. scsi_target_unblock() doesn't
> +		 * change the state of these devices into running, so do that
> +		 * explicitly.
> +		 */
> +		spin_lock_irq(shost->host_lock);
> +		__shost_for_each_device(sdev, shost)
> +			if (sdev->sdev_state == SDEV_OFFLINE)
> +				sdev->sdev_state = SDEV_RUNNING;
> +		spin_unlock_irq(shost->host_lock);

Is it possible for this to race with scsi_eh_offline_sdevs? Can it be
looping over cmds offlining devices while this is looping over devices
onlining them?

It seems this can also happen for all transports/drivers. Maybe a a scsi
eh/lib helper function that syncrhonizes with the scsi eh completion
would be better.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
       [not found]     ` <51C764FB.6070207-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
@ 2013-06-24  7:37       ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-24  7:37 UTC (permalink / raw)
  To: Mike Christie
  Cc: Roland Dreier, David Dillow, Vu Pham, Sebastian Riemer,
	linux-rdma, linux-scsi, James Bottomley

On 06/23/13 23:13, Mike Christie wrote:
> On 06/12/2013 08:28 AM, Bart Van Assche wrote:
>> +		/*
>> +		 * It can occur that after fast_io_fail_tmo expired and before
>> +		 * dev_loss_tmo expired that the SCSI error handler has
>> +		 * offlined one or more devices. scsi_target_unblock() doesn't
>> +		 * change the state of these devices into running, so do that
>> +		 * explicitly.
>> +		 */
>> +		spin_lock_irq(shost->host_lock);
>> +		__shost_for_each_device(sdev, shost)
>> +			if (sdev->sdev_state == SDEV_OFFLINE)
>> +				sdev->sdev_state = SDEV_RUNNING;
>> +		spin_unlock_irq(shost->host_lock);
> 
> Is it possible for this to race with scsi_eh_offline_sdevs? Can it be
> looping over cmds offlining devices while this is looping over devices
> onlining them?
> 
> It seems this can also happen for all transports/drivers. Maybe a a scsi
> eh/lib helper function that syncrhonizes with the scsi eh completion
> would be better.

I'm not sure it's possible to avoid such a race without introducing
a new mutex. How about something like the (untested) SCSI core patch
below, and invoking scsi_block_eh() and scsi_unblock_eh() around any
reconnect activity not initiated from the SCSI EH thread ?

[PATCH] Add scsi_block_eh() and scsi_unblock_eh()

---
 drivers/scsi/hosts.c      |    1 +
 drivers/scsi/scsi_error.c |   10 ++++++++++
 include/scsi/scsi_host.h  |    1 +
 3 files changed, 12 insertions(+)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 17e2ccb..0df3ec8 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -360,6 +360,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	init_waitqueue_head(&shost->host_wait);
 
 	mutex_init(&shost->scan_mutex);
+	mutex_init(&shost->block_eh_mutex);
 
 	/*
 	 * subtract one because we increment first then return, but we need to
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ab16930..566daaa 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -551,6 +551,10 @@ static int scsi_begin_eh(struct Scsi_Host *host)
 {
 	int res;
 
+	res = mutex_lock_interruptible(&host->block_eh_mutex);
+	if (res)
+		goto out;
+
 	spin_lock_irq(host->host_lock);
 	switch (host->shost_state) {
 	case SHOST_DEL:
@@ -565,6 +569,10 @@ static int scsi_begin_eh(struct Scsi_Host *host)
 	}
 	spin_unlock_irq(host->host_lock);
 
+	if (res)
+		mutex_unlock(&host->block_eh_mutex);
+
+out:
 	return res;
 }
 
@@ -579,6 +587,8 @@ static void scsi_end_eh(struct Scsi_Host *host)
 	if (host->eh_active == 0)
 		wake_up(&host->host_wait);
 	spin_unlock_irq(host->host_lock);
+
+	mutex_unlock(&host->block_eh_mutex);
 }
 
 /**
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 9785e51..d7ce065 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -573,6 +573,7 @@ struct Scsi_Host {
 	spinlock_t		*host_lock;
 
 	struct mutex		scan_mutex;/* serialize scanning activity */
+	struct mutex		block_eh_mutex; /* block ML LLD EH calls */
 
 	struct list_head	eh_cmd_q;
 	struct task_struct    * ehandler;  /* Error recovery thread. */
-- 
1.7.10.4


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found]     ` <51B875A4.7040903-HInyCGIudOg@public.gmane.org>
  2013-06-12 13:38       ` Bart Van Assche
@ 2013-06-27 21:01       ` David Dillow
       [not found]         ` <1372366870.32164.30.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  1 sibling, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:01 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma,
	Dotan Barak, Eli Cohen

On Wed, 2013-06-12 at 15:20 +0200, Bart Van Assche wrote:
> If the add_one callback fails during driver load no resources are
> allocated so there isn't a need to release any resources. Trying
> to clean the resource may lead to the following kernel panic:

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
       [not found]             ` <51B8903E.3000609-HInyCGIudOg@public.gmane.org>
@ 2013-06-27 21:02               ` David Dillow
       [not found]                 ` <1372366945.32164.32.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:02 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Sebastian Riemer, Roland Dreier, Vu Pham, linux-rdma

On Wed, 2013-06-12 at 17:14 +0200, Bart Van Assche wrote:
> Not sure how I missed that ... I will drop this version and replace it 
> with the proper fix.

I expect the merge window will be opening soon; I should be able to make
time on Sunday to review anything you post by then -- do you think the
next version will be out then?

I'll keep pushing through the series in the meantime.
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error
       [not found]     ` <51B87638.50102-HInyCGIudOg@public.gmane.org>
  2013-06-13  9:30       ` Sebastian Riemer
@ 2013-06-27 21:03       ` David Dillow
  1 sibling, 0 replies; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:03 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Wed, 2013-06-12 at 15:23 +0200, Bart Van Assche wrote:
> The SCSI error handler assumes that the transport layer is
> operational if an eh_abort_handler() returns SUCCESS. Hence let
> srp_abort() only return SUCCESS if sending the ABORT TASK task
> management function succeeded. This patch avoids that the SCSI
> error handler skips the srp_reset_host() call after a transport
> layer error.

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/14] IB/srp: Skip host settle delay
       [not found]     ` <51B87689.8030806-HInyCGIudOg@public.gmane.org>
  2013-06-13  9:53       ` Sebastian Riemer
@ 2013-06-27 21:04       ` David Dillow
  1 sibling, 0 replies; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:04 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Wed, 2013-06-12 at 15:24 +0200, Bart Van Assche wrote:
> The SRP initiator implements host reset by reconnecting to the SRP
> target. That means that communication with the target is possible
> as soon as host reset finished. Hence skip the host settle delay.

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]     ` <51B876BF.4070400-HInyCGIudOg@public.gmane.org>
  2013-06-13 13:57       ` Sebastian Riemer
  2013-06-13 17:50       ` Vu Pham
@ 2013-06-27 21:10       ` David Dillow
       [not found]         ` <1372367432.32164.36.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  2 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:10 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Wed, 2013-06-12 at 15:25 +0200, Bart Van Assche wrote:
> An SRP target is required to maintain a single connection between
> initiator and target. This means that if the 'add_target' attribute
> is used to create a second connection to a target that the first
> connection will be logged out and that the SCSI error handler will
> kick in. The SCSI error handler will cause the SRP initiator to
> reconnect, which will cause I/O over the second connection to fail.
> Avoid such ping-pong behavior by disabling relogins. Note: if
> reconnecting manually is necessary, that is possible by deleting
> and recreating an rport via sysfs.

Will wait for the second version of this as well.

Was an agreement reached on the pkey issue?
I also would rather we mimic the current messages rather than
regurgitate the user's login string back at them.

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/14] IB/srp: Make transport layer retry count configurable
       [not found]     ` <51B8794F.6050003-HInyCGIudOg@public.gmane.org>
@ 2013-06-27 21:22       ` David Dillow
       [not found]         ` <1372368138.32164.40.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:22 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Wed, 2013-06-12 at 15:36 +0200, Bart Van Assche wrote:
> From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> Allow the InfiniBand RC retry count to be configured by the user
> as an option in the target login string. The transport layer
> timeout in nanoseconds is computed as follows from the retry count:
> 
> rc_timeout = rc_retry_count * 4 * 4096 * (1 << qp->timeout)
> 
> The default value for tl_retry_count is changed from 7 into 2.

I'm not sure you should change the default, but I've not thought about
it much.

It should probably also have a default set via module parameter similar
to the indirect_command_descriptors, etc. Unless srp_daemon is patched,
this won't be usable without that. I'm not tied to this position, though
-- we should be moving in the direction of more control via the login
string.

Otherwise,

Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>


-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] IB/srp: Make HCA completion vector configurable
       [not found]     ` <51B87904.1070803-HInyCGIudOg@public.gmane.org>
@ 2013-06-27 21:24       ` David Dillow
       [not found]         ` <1372368256.32164.41.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-27 21:24 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Wed, 2013-06-12 at 15:35 +0200, Bart Van Assche wrote:
> Several InfiniBand HCA's allow to configure the completion vector
> per queue pair. This allows to spread the workload created by IB
> completion interrupts over multiple MSI-X vectors and hence over
> multiple CPU cores. In other words, configuring the completion
> vector properly not only allows to reduce latency on an initiator
> connected to multiple SRP targets but also allows to improve
> throughput.

Shouldn't there be some documentation about how to set this? What
happens when you try to set it on an HCA that doesn't support this how
does the user know what's wrong?

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found]         ` <1372366870.32164.30.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-06-27 23:45           ` Roland Dreier
       [not found]             ` <CAL1RGDWVgAKSL-GNZCkP1FEt9r_y5QWp+74NzDcga6+tcvWpXw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Roland Dreier @ 2013-06-27 23:45 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, Vu Pham, Sebastian Riemer, linux-rdma,
	Dotan Barak, Eli Cohen

On Thu, Jun 27, 2013 at 2:01 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> On Wed, 2013-06-12 at 15:20 +0200, Bart Van Assche wrote:
>> If the add_one callback fails during driver load no resources are
>> allocated so there isn't a need to release any resources. Trying
>> to clean the resource may lead to the following kernel panic:
>
> Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>

Thanks, I've queued up the 1, 3, and 4/14 patches that Dave acked so far.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
       [not found]                 ` <1372366945.32164.32.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-06-28  7:36                   ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28  7:36 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, Sebastian Riemer, Roland Dreier, Vu Pham, linux-rdma

On 06/27/13 23:02, David Dillow wrote:
> On Wed, 2013-06-12 at 17:14 +0200, Bart Van Assche wrote:
>> Not sure how I missed that ... I will drop this version and replace it
>> with the proper fix.
>
> I expect the merge window will be opening soon; I should be able to make
> time on Sunday to review anything you post by then -- do you think the
> next version will be out then?
>
> I'll keep pushing through the series in the meantime.

Hello Dave,

I'll do my best to post a new version later today.

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found]         ` <1372367432.32164.36.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-06-28  7:40           ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28  7:40 UTC (permalink / raw)
  To: David Dillow; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On 06/27/13 23:10, David Dillow wrote:
> On Wed, 2013-06-12 at 15:25 +0200, Bart Van Assche wrote:
>> An SRP target is required to maintain a single connection between
>> initiator and target. This means that if the 'add_target' attribute
>> is used to create a second connection to a target that the first
>> connection will be logged out and that the SCSI error handler will
>> kick in. The SCSI error handler will cause the SRP initiator to
>> reconnect, which will cause I/O over the second connection to fail.
>> Avoid such ping-pong behavior by disabling relogins. Note: if
>> reconnecting manually is necessary, that is possible by deleting
>> and recreating an rport via sysfs.
>
> Will wait for the second version of this as well.
>
> Was an agreement reached on the pkey issue?
> I also would rather we mimic the current messages rather than
> regurgitate the user's login string back at them.

Hello Dave,

My proposal is to leave out the pkey check.

I will update the error message printed when a login gets rejected.

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
       [not found]             ` <CAL1RGDWVgAKSL-GNZCkP1FEt9r_y5QWp+74NzDcga6+tcvWpXw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-06-28  7:41               ` Sebastian Riemer
  0 siblings, 0 replies; 64+ messages in thread
From: Sebastian Riemer @ 2013-06-28  7:41 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David Dillow, Bart Van Assche, Vu Pham, linux-rdma, Dotan Barak,
	Eli Cohen

On 28.06.2013 01:45, Roland Dreier wrote:
> On Thu, Jun 27, 2013 at 2:01 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
>> On Wed, 2013-06-12 at 15:20 +0200, Bart Van Assche wrote:
>>> If the add_one callback fails during driver load no resources are
>>> allocated so there isn't a need to release any resources. Trying
>>> to clean the resource may lead to the following kernel panic:
>>
>> Acked-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
> 
> Thanks, I've queued up the 1, 3, and 4/14 patches that Dave acked so far.

Hi Roland,

did you queue 3 without the target->transport_offline check?

Otherwise I can't agree on that.
1 and 4 are also for me okay.

Cheers,
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] IB/srp: Make HCA completion vector configurable
       [not found]         ` <1372368256.32164.41.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-06-28  8:18           ` Bart Van Assche
       [not found]             ` <51CD46F0.60301-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28  8:18 UTC (permalink / raw)
  To: David Dillow; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On 06/27/13 23:24, David Dillow wrote:
> On Wed, 2013-06-12 at 15:35 +0200, Bart Van Assche wrote:
>> Several InfiniBand HCA's allow to configure the completion vector
>> per queue pair. This allows to spread the workload created by IB
>> completion interrupts over multiple MSI-X vectors and hence over
>> multiple CPU cores. In other words, configuring the completion
>> vector properly not only allows to reduce latency on an initiator
>> connected to multiple SRP targets but also allows to improve
>> throughput.
>
> Shouldn't there be some documentation about how to set this? What
> happens when you try to set it on an HCA that doesn't support this how
> does the user know what's wrong?

Hello Dave,

HCA drivers that support only one completion vector ignore the 
comp_vector parameter. See e.g. c2_create_cq() in 
drivers/infiniband/hw/amso1100/c2_provider.c for an example. The third 
argument of that function, called "vector", is ignored.

What will happen if an out-of-range value is chosen for this parameter 
depends on the HCA driver. This is how the mlx4 driver handles the 
comp_vector parameter:
	vector = dev->eq_table[vector % ibdev->num_comp_vectors];
So an out-of-range value for comp_vector won't cause a failure. And if 
ever a HCA driver would be added to the kernel tree that causes 
create_cq() to fail if the comp_vector parameter is out of range then 
the ib_srp driver will propagate the error code returned by 
ib_create_cq() to the process that wrote into the add_target sysfs 
attribute.

The procedure for choosing a value for this parameter is as follows:
* Look up the interrupt numbers that have been assigned to each HCA
   completion vector, e.g. via grep mlx4-ib- /proc/interrupts.
* Decide which CPU to bind each interrupt to and update
   /proc/irq/$irq/smp_affinity.
* Decide which CPU and completion vector should be used for processing
   SRP completions and specify that completion vector number in the SRP
   login string.

To me this looks like a standard procedure that applies to any HCA that 
supports multiple MSI-X vectors ?

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/14] IB/srp: Make transport layer retry count configurable
       [not found]         ` <1372368138.32164.40.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
@ 2013-06-28  8:28           ` Bart Van Assche
       [not found]             ` <51CD4933.5080709-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28  8:28 UTC (permalink / raw)
  To: David Dillow; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On 06/27/13 23:22, David Dillow wrote:
> On Wed, 2013-06-12 at 15:36 +0200, Bart Van Assche wrote:
>> From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> Allow the InfiniBand RC retry count to be configured by the user
>> as an option in the target login string. The transport layer
>> timeout in nanoseconds is computed as follows from the retry count:
>>
>> rc_timeout = rc_retry_count * 4 * 4096 * (1 << qp->timeout)
>>
>> The default value for tl_retry_count is changed from 7 into 2.
>
> I'm not sure you should change the default, but I've not thought about
> it much.
>
> It should probably also have a default set via module parameter similar
> to the indirect_command_descriptors, etc. Unless srp_daemon is patched,
> this won't be usable without that. I'm not tied to this position, though
> -- we should be moving in the direction of more control via the login
> string.

Hello Dave,

Are you perhaps aware of any IB networks where it is necessary to set 
the RC retry count above 2 ? Anyway, I can prepare an srp_daemon patch 
that makes it easy to configure this parameter if you want.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] IB/srp: Make HCA completion vector configurable
       [not found]             ` <51CD46F0.60301-HInyCGIudOg@public.gmane.org>
@ 2013-06-28 12:04               ` David Dillow
       [not found]                 ` <1372421041.28740.14.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-28 12:04 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Fri, 2013-06-28 at 10:18 +0200, Bart Van Assche wrote:
> On 06/27/13 23:24, David Dillow wrote:
> > Shouldn't there be some documentation about how to set this? What
> > happens when you try to set it on an HCA that doesn't support this how
> > does the user know what's wrong?
> 
> Hello Dave,
> 
> HCA drivers that support only one completion vector ignore the 
> comp_vector parameter. See e.g. c2_create_cq() in 
> drivers/infiniband/hw/amso1100/c2_provider.c for an example. The third 
> argument of that function, called "vector", is ignored.
> 
> What will happen if an out-of-range value is chosen for this parameter 
> depends on the HCA driver. This is how the mlx4 driver handles the 
> comp_vector parameter:
> 	vector = dev->eq_table[vector % ibdev->num_comp_vectors];
> So an out-of-range value for comp_vector won't cause a failure. And if 
> ever a HCA driver would be added to the kernel tree that causes 
> create_cq() to fail if the comp_vector parameter is out of range then 
> the ib_srp driver will propagate the error code returned by 
> ib_create_cq() to the process that wrote into the add_target sysfs 
> attribute.

I understand that the error code will unwind to the process, at which
point the user gets an EINVAL without any indication as to why. Our
error handling here sucks, but that's for another day; it shouldn't hold
up the patch -- I was more wondering if there was a specific error code
that would let us know we screwed up the vector, or if it wasn't
supported, and it appears the answer is no. Somewhat of a bummer, that.

> The procedure for choosing a value for this parameter is as follows:
> * Look up the interrupt numbers that have been assigned to each HCA
>    completion vector, e.g. via grep mlx4-ib- /proc/interrupts.
> * Decide which CPU to bind each interrupt to and update
>    /proc/irq/$irq/smp_affinity.
> * Decide which CPU and completion vector should be used for processing
>    SRP completions and specify that completion vector number in the SRP
>    login string.
> 
> To me this looks like a standard procedure that applies to any HCA that 
> supports multiple MSI-X vectors ?

To you and I, sure, this is relatively standard. And now that google has
it, newcomers will be able to find it. However, this changes the login
string and so Documentation/ABI/stable/sysfs-driver-ib_srp needs to be
updated, with at least minimal text.

Thanks,
Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/14] IB/srp: Make transport layer retry count configurable
       [not found]             ` <51CD4933.5080709-HInyCGIudOg@public.gmane.org>
@ 2013-06-28 12:07               ` David Dillow
       [not found]                 ` <1372421227.28740.17.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 64+ messages in thread
From: David Dillow @ 2013-06-28 12:07 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On Fri, 2013-06-28 at 10:28 +0200, Bart Van Assche wrote:
> On 06/27/13 23:22, David Dillow wrote:
> > On Wed, 2013-06-12 at 15:36 +0200, Bart Van Assche wrote:
> >> From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>
> >> Allow the InfiniBand RC retry count to be configured by the user
> >> as an option in the target login string. The transport layer
> >> timeout in nanoseconds is computed as follows from the retry count:
> >>
> >> rc_timeout = rc_retry_count * 4 * 4096 * (1 << qp->timeout)
> >>
> >> The default value for tl_retry_count is changed from 7 into 2.
> >
> > I'm not sure you should change the default, but I've not thought about
> > it much.
> >
> > It should probably also have a default set via module parameter similar
> > to the indirect_command_descriptors, etc. Unless srp_daemon is patched,
> > this won't be usable without that. I'm not tied to this position, though
> > -- we should be moving in the direction of more control via the login
> > string.
> 
> Hello Dave,
> 
> Are you perhaps aware of any IB networks where it is necessary to set 
> the RC retry count above 2 ? 

I'm not aware of all IB networks in the world. The default has been 7
since the beginning of time, and I'm reluctant to change it without good
reasons, especially when it is now configurable and a user can drop it
as needed.

> Anyway, I can prepare an srp_daemon patch 
> that makes it easy to configure this parameter if you want.

That's a separate issue, but we've been accumulating options so it needs
to be done eventually, by someone. I won't hold up a patch for it,
though.

This does need to be documented in
Documentation/ABI/stable/sysfs-driver-ib_srp, though.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/14] IB/srp: Make HCA completion vector configurable
       [not found]                 ` <1372421041.28740.14.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2013-06-28 12:29                   ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28 12:29 UTC (permalink / raw)
  To: David Dillow; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On 06/28/13 14:04, David Dillow wrote:
> On Fri, 2013-06-28 at 10:18 +0200, Bart Van Assche wrote:
>> On 06/27/13 23:24, David Dillow wrote:
>> To me this looks like a standard procedure that applies to any HCA that
>> supports multiple MSI-X vectors ?
>
> To you and I, sure, this is relatively standard. And now that google has
> it, newcomers will be able to find it. However, this changes the login
> string and so Documentation/ABI/stable/sysfs-driver-ib_srp needs to be
> updated, with at least minimal text.

OK, will do.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/14] IB/srp: Make transport layer retry count configurable
       [not found]                 ` <1372421227.28740.17.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2013-06-28 12:30                   ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-28 12:30 UTC (permalink / raw)
  To: David Dillow; +Cc: Roland Dreier, Vu Pham, Sebastian Riemer, linux-rdma

On 06/28/13 14:07, David Dillow wrote:
> This does need to be documented in
> Documentation/ABI/stable/sysfs-driver-ib_srp, though.

OK, I will update that document in v2 of this patch series.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
       [not found] ` <51B885C7.1020701-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
@ 2013-06-12 15:15   ` Bart Van Assche
  0 siblings, 0 replies; 64+ messages in thread
From: Bart Van Assche @ 2013-06-12 15:15 UTC (permalink / raw)
  To: Jack Wang
  Cc: David Dillow, Vu Pham, Sebastian Riemer, linux-rdma, Roland Dreier

On 06/12/13 16:29, Jack Wang wrote:
>> An SRP target is required to maintain a single connection between
>> initiator and target. This means that if the 'add_target' attribute
>> is used to create a second connection to a target that the first
>> connection will be logged out and that the SCSI error handler will
>> kick in. The SCSI error handler will cause the SRP initiator to
>> reconnect, which will cause I/O over the second connection to fail.
>> Avoid such ping-pong behavior by disabling relogins. Note: if
>> reconnecting manually is necessary, that is possible by deleting
>> and recreating an rport via sysfs.
>>
>> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
>> Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>> Cc: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
>> Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Cc: Sebastian Riemer <sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
>
> Thanks Bart for refreshing this new patch set. I think you should add
> Signed-off-by for Sebastian ?

This patch differs slightly from what Sebastian had posted. But if 
Sebastian agrees with this version, I'll be happy to add his signed-off-by.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2013-06-28 12:30 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-12 13:17 [PATCH 0/14] IB SRP initiator patches for kernel 3.11 Bart Van Assche
2013-06-12 13:28 ` [PATCH 07/14] scsi_transport_srp: Add transport layer error handling Bart Van Assche
     [not found]   ` <51B8777B.5050201-HInyCGIudOg@public.gmane.org>
2013-06-13 19:43     ` Vu Pham
2013-06-14 13:19       ` Bart Van Assche
     [not found]         ` <51BB1857.7040802-HInyCGIudOg@public.gmane.org>
2013-06-14 17:59           ` Vu Pham
     [not found]             ` <51BB5A04.3080901-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-15  9:52               ` Bart Van Assche
     [not found]                 ` <51BC3945.9030900-HInyCGIudOg@public.gmane.org>
2013-06-17  6:18                   ` Hannes Reinecke
2013-06-17  7:04                     ` Bart Van Assche
2013-06-17  7:14                       ` Hannes Reinecke
2013-06-17  7:29                         ` Bart Van Assche
     [not found]                           ` <51BEBAEA.4080202-HInyCGIudOg@public.gmane.org>
2013-06-17  8:10                             ` Hannes Reinecke
2013-06-17 10:13                             ` Sebastian Riemer
2013-06-18 16:59                 ` Vu Pham
     [not found]                   ` <51C09202.2040503-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-19 13:00                     ` Bart Van Assche
2013-06-23 21:13   ` Mike Christie
     [not found]     ` <51C764FB.6070207-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
2013-06-24  7:37       ` Bart Van Assche
     [not found] ` <51B87501.4070005-HInyCGIudOg@public.gmane.org>
2013-06-12 13:20   ` [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion Bart Van Assche
     [not found]     ` <51B875A4.7040903-HInyCGIudOg@public.gmane.org>
2013-06-12 13:38       ` Bart Van Assche
     [not found]         ` <51B879CF.1080802-HInyCGIudOg@public.gmane.org>
2013-06-12 14:24           ` Sebastian Riemer
2013-06-27 21:01       ` David Dillow
     [not found]         ` <1372366870.32164.30.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-06-27 23:45           ` Roland Dreier
     [not found]             ` <CAL1RGDWVgAKSL-GNZCkP1FEt9r_y5QWp+74NzDcga6+tcvWpXw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-28  7:41               ` Sebastian Riemer
2013-06-12 13:21   ` [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req() Bart Van Assche
     [not found]     ` <51B875EE.3030702-HInyCGIudOg@public.gmane.org>
2013-06-12 14:58       ` Sebastian Riemer
     [not found]         ` <51B88C7C.4030209-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-12 15:14           ` Bart Van Assche
     [not found]             ` <51B8903E.3000609-HInyCGIudOg@public.gmane.org>
2013-06-27 21:02               ` David Dillow
     [not found]                 ` <1372366945.32164.32.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-06-28  7:36                   ` Bart Van Assche
2013-06-12 13:23   ` [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error Bart Van Assche
     [not found]     ` <51B87638.50102-HInyCGIudOg@public.gmane.org>
2013-06-13  9:30       ` Sebastian Riemer
     [not found]         ` <51B99120.9000503-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-13  9:57           ` Bart Van Assche
2013-06-27 21:03       ` David Dillow
2013-06-12 13:24   ` [PATCH 04/14] IB/srp: Skip host settle delay Bart Van Assche
     [not found]     ` <51B87689.8030806-HInyCGIudOg@public.gmane.org>
2013-06-13  9:53       ` Sebastian Riemer
     [not found]         ` <51B996A1.6080604-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-13 13:06           ` Or Gerlitz
2013-06-27 21:04       ` David Dillow
2013-06-12 13:25   ` [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus Bart Van Assche
     [not found]     ` <51B876BF.4070400-HInyCGIudOg@public.gmane.org>
2013-06-13 13:57       ` Sebastian Riemer
     [not found]         ` <51B9CFC3.8080008-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-13 15:07           ` Bart Van Assche
     [not found]             ` <51B9E046.3030008-HInyCGIudOg@public.gmane.org>
2013-06-13 15:35               ` Sebastian Riemer
2013-06-13 17:50       ` Vu Pham
     [not found]         ` <51BA0655.6090707-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 18:25           ` Bart Van Assche
     [not found]             ` <51BA0E8F.3030104-HInyCGIudOg@public.gmane.org>
2013-06-13 23:27               ` Vu Pham
     [not found]                 ` <51BA555F.9060807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-14  9:38                   ` Sebastian Riemer
     [not found]                     ` <51BAE482.1050304-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-14 17:07                       ` Vu Pham
     [not found]                         ` <51BB4DBB.4070800-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-17  9:41                           ` Sebastian Riemer
2013-06-27 21:10       ` David Dillow
     [not found]         ` <1372367432.32164.36.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-06-28  7:40           ` Bart Van Assche
2013-06-12 13:26   ` [PATCH 06/14] IB/srp: Keep rport as long as the IB transport layer Bart Van Assche
2013-06-12 13:29   ` [PATCH 08/14] IB/srp: Add srp_terminate_io() Bart Van Assche
2013-06-12 13:30   ` [PATCH 09/14] IB/srp: Use SRP transport layer error recovery Bart Van Assche
2013-06-12 13:31   ` [PATCH 10/14] IB/srp: Start timers if a transport layer error occurs Bart Van Assche
2013-06-12 13:33   ` [PATCH 11/14] IB/srp: Fail SCSI commands silently Bart Van Assche
2013-06-12 13:35   ` [PATCH 12/14] IB/srp: Make HCA completion vector configurable Bart Van Assche
     [not found]     ` <51B87904.1070803-HInyCGIudOg@public.gmane.org>
2013-06-27 21:24       ` David Dillow
     [not found]         ` <1372368256.32164.41.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-06-28  8:18           ` Bart Van Assche
     [not found]             ` <51CD46F0.60301-HInyCGIudOg@public.gmane.org>
2013-06-28 12:04               ` David Dillow
     [not found]                 ` <1372421041.28740.14.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2013-06-28 12:29                   ` Bart Van Assche
2013-06-12 13:36   ` [PATCH 13/14] IB/srp: Make transport layer retry count configurable Bart Van Assche
     [not found]     ` <51B8794F.6050003-HInyCGIudOg@public.gmane.org>
2013-06-27 21:22       ` David Dillow
     [not found]         ` <1372368138.32164.40.camel-zHLflQxYYDO4Hhoo1DtQwJ9G+ZOsUmrO@public.gmane.org>
2013-06-28  8:28           ` Bart Van Assche
     [not found]             ` <51CD4933.5080709-HInyCGIudOg@public.gmane.org>
2013-06-28 12:07               ` David Dillow
     [not found]                 ` <1372421227.28740.17.camel-a7a0dvSY7KqLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2013-06-28 12:30                   ` Bart Van Assche
2013-06-12 13:37   ` [PATCH 14/14] IB/srp: Bump driver version and release date Bart Van Assche
2013-06-12 14:29 RE:[PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus Jack Wang
     [not found] ` <51B885C7.1020701-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
2013-06-12 15:15   ` [PATCH " Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.