All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
@ 2013-02-01 15:18 Bart Van Assche
       [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-01 15:18 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Dillow, Or Gerlitz,
	Vu Pham, Alex Turin

This patch series avoids that SCSI error handling triggers an endless 
loop and also restores reporting of QP errors in the kernel log.

Changes between v3 and v2:
- As proposed by Dave, added a patch that prevents sending of a task
   management function over a closed connection.

Changes between v2 and v1:
- Track connection state properly.
- Make srp_reset_host() reset requests even if reconnecting fails

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly
       [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
@ 2013-02-01 15:18   ` Bart Van Assche
  2013-02-01 15:19   ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2013-02-01 15:18 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Dillow, Or Gerlitz,
	Vu Pham, Alex Turin

Remove an assignment that incorrectly overwrites the connection
state update by srp_connect_target().

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index d5088ce..94f76b9 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1972,7 +1972,6 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	spin_unlock(&host->target_lock);
 
 	target->state = SRP_TARGET_LIVE;
-	target->connected = false;
 
 	scsi_scan_target(&target->scsi_host->shost_gendev,
 			 0, target->scsi_id, SCAN_WILD_CARD, 0);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly
       [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
  2013-02-01 15:18   ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche
@ 2013-02-01 15:19   ` Bart Van Assche
  2013-02-01 15:21   ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
  2013-02-04 21:11   ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
  3 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2013-02-01 15:19 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Dillow, Or Gerlitz,
	Vu Pham, Alex Turin, Roland Dreier

Do not send a task management function if sending will fail anyway
because either there is no RDMA/RC connection or the QP is in the
error state.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 94f76b9..2633258 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1695,6 +1695,9 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 	struct srp_iu *iu;
 	struct srp_tsk_mgmt *tsk_mgmt;
 
+	if (!target->connected || target->qp_in_error)
+		return -1;
+
 	init_completion(&target->tsk_mgmt_done);
 
 	spin_lock_irq(&target->lock);
@@ -1754,8 +1757,6 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n");
 
-	if (target->qp_in_error)
-		return FAILED;
 	if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun,
 			      SRP_TSK_LUN_RESET))
 		return FAILED;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop
       [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
  2013-02-01 15:18   ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche
  2013-02-01 15:19   ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
@ 2013-02-01 15:21   ` Bart Van Assche
  2013-02-04 21:11   ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
  3 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2013-02-01 15:21 UTC (permalink / raw)
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Dillow, Or Gerlitz,
	Vu Pham, Alex Turin, Roland Dreier

If a SCSI command times out it is passed to the SCSI error
handler. The SCSI error handler will try to abort the commands
that timed out. If aborting failed a device reset will be
attempted. If the device reset also fails a host reset will
be attempted. If the host reset also fails the whole procedure
will be repeated.

srp_abort() and srp_reset_device() fail for a QP in the error
state. srp_reset_host() fails after host removal has started.
Hence if the SCSI error handler gets invoked after host removal
has started and with the QP in the error state an endless loop
will be triggered.

Modify the SCSI error handling functions in ib_srp as follows:
- Abort SCSI commands properly even if the QP is in the error
  state.
- Make srp_reset_host() reset SCSI requests even after host
  removal has already started or if reconnecting fails.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Acked-by: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Alex Turin <alextu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2633258..8a7eb9f 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -700,23 +700,24 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	struct Scsi_Host *shost = target->scsi_host;
 	int i, ret;
 
-	if (target->state != SRP_TARGET_LIVE)
-		return -EAGAIN;
-
 	scsi_target_block(&shost->shost_gendev);
 
 	srp_disconnect_target(target);
 	/*
-	 * Now get a new local CM ID so that we avoid confusing the
-	 * target in case things are really fouled up.
+	 * Now get a new local CM ID so that we avoid confusing the target in
+	 * case things are really fouled up. Doing so also ensures that all CM
+	 * callbacks will have finished before a new QP is allocated.
 	 */
 	ret = srp_new_cm_id(target);
-	if (ret)
-		goto unblock;
-
-	ret = srp_create_target_ib(target);
-	if (ret)
-		goto unblock;
+	/*
+	 * Whether or not creating a new CM ID succeeded, create a new
+	 * QP. This guarantees that all completion callback function
+	 * invocations have finished before request resetting starts.
+	 */
+	if (ret == 0)
+		ret = srp_create_target_ib(target);
+	else
+		srp_create_target_ib(target);
 
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
@@ -728,9 +729,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	for (i = 0; i < SRP_SQ_SIZE; ++i)
 		list_add(&target->tx_ring[i]->list, &target->free_tx);
 
-	ret = srp_connect_target(target);
+	if (ret == 0)
+		ret = srp_connect_target(target);
 
-unblock:
 	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
 			    SDEV_TRANSPORT_OFFLINE);
 
@@ -1739,7 +1740,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (!req || target->qp_in_error || !srp_claim_req(target, req, scmnd))
+	if (!req || !srp_claim_req(target, req, scmnd))
 		return FAILED;
 	srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			  SRP_TSK_ABORT_TASK);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2013-02-01 15:21   ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
@ 2013-02-04 21:11   ` Or Gerlitz
       [not found]     ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  3 siblings, 1 reply; 23+ messages in thread
From: Or Gerlitz @ 2013-02-04 21:11 UTC (permalink / raw)
  To: Bart Van Assche, David Dillow, Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Vu Pham, Oren Duer

On Fri, Feb 1, 2013 at 5:18 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
> This patch series avoids that SCSI error handling triggers an endless loop
> and also restores reporting of QP errors in the kernel log.

Bart,

You wrote "resend" in the subject line, anything new, in these patches
OR in other patches merged through the SCSI tree for 3.8 vs. what you
had posted earlier on Dec 20th, 2012
(http://marc.info/?t=135592692800001&r=1&w=2)?! as I wrote there, it
didn't work for me.

Dave, Roland, as I wrote here
http://marc.info/?l=linux-rdma&m=135603401830703&w=2 I think what we
need now is 2nd opinion, and I asked if Dave can give the patches a
try, no more but no less... we need to know if it works

Or.

> Changes between v3 and v2:
> - As proposed by Dave, added a patch that prevents sending of a task
>   management function over a closed connection.
>
> Changes between v2 and v1:
> - Track connection state properly.
> - Make srp_reset_host() reset requests even if reconnecting fails
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]     ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-02-05 16:25       ` Bart Van Assche
       [not found]         ` <5111327F.6050402-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-05 16:25 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Vu Pham, Oren Duer

On 02/04/13 22:11, Or Gerlitz wrote:
> On Fri, Feb 1, 2013 at 5:18 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
>> This patch series avoids that SCSI error handling triggers an endless loop
>> and also restores reporting of QP errors in the kernel log.
>
> Bart,
>
> You wrote "resend" in the subject line, anything new, in these patches
> OR in other patches merged through the SCSI tree for 3.8 vs. what you
> had posted earlier on Dec 20th, 2012.

Hello Or,

I have just reposted my SCSI device removal patch series. There are 
several changes in v8 of that patch series compared to v7, including:
* Saving and restoring the host scribble field in the SCSI error
   handler.
* A robustness improvement for when the SCSI timeout is below the
   maximum LLD response processing time.

Version eight of the SCSI device removal patch series can be found here: 
http://github.com/bvanassche/linux/tree/device-removal-fixes.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]         ` <5111327F.6050402-HInyCGIudOg@public.gmane.org>
@ 2013-02-05 20:54           ` Or Gerlitz
       [not found]             ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Or Gerlitz @ 2013-02-05 20:54 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Vu Pham, Oren Duer

On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
> On 02/04/13 22:11, Or Gerlitz wrote:

>> You wrote "resend" in the subject line, anything new, in these patches
>> OR in other patches merged through the SCSI tree for 3.8 vs. what you
>> had posted earlier on Dec 20th, 2012.

> I have just reposted my SCSI device removal patch series. There are

Bart, I'd like to sharpen the point: could you please clarify if the
series posted to linux-rdma stands for itself in the sense that SRP HA
scheme X (please state it) now works/better when the patches applied
on top of the latest 3.8-rc cut? OR for X to do better/work, one needs
this series AND the one you posted to linux-scsi.

Or.

> several changes in v8 of that patch series compared to v7, including:
> * Saving and restoring the host scribble field in the SCSI error handler.
> * A robustness improvement for when the SCSI timeout is below the
>   maximum LLD response processing time.
>
> Version eight of the SCSI device removal patch series can be found here:
> http://github.com/bvanassche/linux/tree/device-removal-fixes.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]             ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-02-06  7:22               ` Bart Van Assche
       [not found]                 ` <5112049B.8030406-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-06  7:22 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Dillow, Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Or Gerlitz, Vu Pham, Oren Duer

On 02/05/13 21:54, Or Gerlitz wrote:
> On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
>> On 02/04/13 22:11, Or Gerlitz wrote:
> Bart, I'd like to sharpen the point: could you please clarify if the
> series posted to linux-rdma stands for itself in the sense that SRP HA
> scheme X (please state it) now works/better when the patches applied
> on top of the latest 3.8-rc cut? OR for X to do better/work, one needs
> this series AND the one you posted to linux-scsi.

Hello Or,

A huge number of patches have been taken upstream between 3.8-rc1 and 
3.8-rc6. I have retested these three patches with 3.8-rc6 and would 
appreciate if you would also repeat your tests.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                 ` <5112049B.8030406-HInyCGIudOg@public.gmane.org>
@ 2013-02-06  7:44                   ` Or Gerlitz
       [not found]                     ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-02-06 21:42                   ` Vu Pham
  1 sibling, 1 reply; 23+ messages in thread
From: Or Gerlitz @ 2013-02-06  7:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham, Oren Duer,
	Sagi Grimberg

On 06/02/2013 09:22, Bart Van Assche wrote:
>
> A huge number of patches have been taken upstream between 3.8-rc1 and 
> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would 
> appreciate if you would also repeat your tests.

not really... this is what I see on Linus tree for the relevant 
directories, anywhere else I need to look

linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/ 
drivers/block/ block/drivers/infiniband/ulp/srp
bdb0ae6 Merge branch 'x86-urgent-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
83e6818 efi: Make 'efi_enabled' a function to query EFI facilities
2263647 Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
8d85fce Drivers: block: remove __dev* attributes.
6f03979 Drivers: scsi: remove __dev* attributes.
f4953fe virtio-blk: Don't free ida when disk is in use

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                     ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-06  7:59                       ` Bart Van Assche
       [not found]                         ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-06  7:59 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham, Oren Duer,
	Sagi Grimberg

On 02/06/13 08:44, Or Gerlitz wrote:
> On 06/02/2013 09:22, Bart Van Assche wrote:
>>
>> A huge number of patches have been taken upstream between 3.8-rc1 and
>> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would
>> appreciate if you would also repeat your tests.
>
> not really... this is what I see on Linus tree for the relevant
> directories, anywhere else I need to look
>
> linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/
> drivers/block/ block/drivers/infiniband/ulp/srp
> bdb0ae6 Merge branch 'x86-urgent-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities
> 2263647 Merge tag 'fixes-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
> 8d85fce Drivers: block: remove __dev* attributes.
> 6f03979 Drivers: scsi: remove __dev* attributes.
> f4953fe virtio-blk: Don't free ida when disk is in use

Hello Or,

Nobody outside Mellanox has ever been able to reproduce the behavior 
reported by you. Something in your tests might have been specific to the 
Mellanox environment. Have you perhaps been running your tests with a 
firmware version that is not available to the general public ? I would 
appreciate it if you could check your test environment and repeat your 
tests.

Thanks for your understanding,

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                         ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org>
@ 2013-02-06  8:25                           ` Or Gerlitz
  0 siblings, 0 replies; 23+ messages in thread
From: Or Gerlitz @ 2013-02-06  8:25 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham, Oren Duer,
	Sagi Grimberg

On 06/02/2013 09:59, Bart Van Assche wrote:
> On 02/06/13 08:44, Or Gerlitz wrote:
>> On 06/02/2013 09:22, Bart Van Assche wrote:
>>>
>>> A huge number of patches have been taken upstream between 3.8-rc1 and
>>> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would
>>> appreciate if you would also repeat your tests.
>>
>> not really... this is what I see on Linus tree for the relevant
>> directories, anywhere else I need to look
>>
>> linux-2.6]# git log --oneline v3.8-rc1..v3.8-rc6 drivers/scsi/
>> drivers/block/ block/drivers/infiniband/ulp/srp
>> bdb0ae6 Merge branch 'x86-urgent-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> 83e6818 efi: Make 'efi_enabled' a function to query EFI facilities
>> 2263647 Merge tag 'fixes-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
>> 8d85fce Drivers: block: remove __dev* attributes.
>> 6f03979 Drivers: scsi: remove __dev* attributes.
>> f4953fe virtio-blk: Don't free ida when disk is in use
>
> Nobody outside Mellanox has ever been able to reproduce the behavior 
> reported by you. 

I have asked for 2nd opinion so we can get a quorum either way.

> Something in your tests might have been specific to the Mellanox 
> environment. Have you perhaps been running your tests with a firmware 
> version that is not available to the general public ? 

NO


> I would appreciate it if you could check your test environment and 
> repeat your tests.

We will repeat the tests, indeed.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                 ` <5112049B.8030406-HInyCGIudOg@public.gmane.org>
  2013-02-06  7:44                   ` Or Gerlitz
@ 2013-02-06 21:42                   ` Vu Pham
       [not found]                     ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Vu Pham @ 2013-02-06 21:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Oren Duer,
	Sagi Grimberg

[-- Attachment #1: Type: text/plain, Size: 3961 bytes --]

Bart Van Assche wrote:
> On 02/05/13 21:54, Or Gerlitz wrote:
>> On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> 
>> wrote:
>>> On 02/04/13 22:11, Or Gerlitz wrote:
>> Bart, I'd like to sharpen the point: could you please clarify if the
>> series posted to linux-rdma stands for itself in the sense that SRP HA
>> scheme X (please state it) now works/better when the patches applied
>> on top of the latest 3.8-rc cut? OR for X to do better/work, one needs
>> this series AND the one you posted to linux-scsi.
>
> Hello Or,
>
> A huge number of patches have been taken upstream between 3.8-rc1 and 
> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would 
> appreciate if you would also repeat your tests.
>
> Thanks,
>
> Bart.
Hello Bart,

I tested your 3.8 v3 patchset. I did the following:
- clone & checkout Roland's ib tree for-next branch
- applied Bart's 3.8 v3 patchset
- applied "save & restore host_scribble during error handling" patch - 
http://www.mail-archive.com/linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg17809.html

I have two paths to target thru port 1 & 2 (scsi_host host9 & host10)

- run I/Os
- disable port 1 @ 19:11:30
- error recovery for host9 kick in @ 19:12:04
- multipath remove the path, I/Os fail-over @ 19:12:51
- error recovery was still going on with host9 (sysfs entry for host9 
still intact)
- enable port 1 @19:15:00
- host9 reconnect to target thru error recovery, multipathd module 
re-instate the path in kernel; and then host9 is REMOVED, usermode 
"multipath -l" did not show re-instate path thru host9

Feb  6 19:15:04 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:05 vsa30 multipathd: overflow in attribute 
'/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host9/target9:0:0/9:0:0:2/state'
Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP reset_device called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: SRP reset_host called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: reconnect succeeded
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: sdd 
- tur checker reports path is up
Feb  6 19:15:26 vsa30 multipathd: 8:48: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: 
remaining active paths: 2
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: sdc 
- tur checker reports path is up
Feb  6 19:15:26 vsa30 multipathd: 8:32: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: 
remaining active paths: 2
Feb  6 19:15:26 vsa30 multipathd: sdc: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:80 1]
Feb  6 19:15:26 vsa30 multipathd: sdc: path removed from map 
3600144f0665c4400000050a522180002
Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:1: [sdc] Synchronizing SCSI cache
Feb  6 19:15:26 vsa30 multipathd: sdd: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:96 1]
Feb  6 19:15:26 vsa30 multipathd: sdd: path removed from map 
3600144f0665c4400000050a522180003
Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:2: [sdd] Synchronizing SCSI cache

- disable port 2 @19:22:50
- error recovery kicked in on host10 @ 19:23:40
- I/Os failed with NO path to target @ 19:24:27
- without enabling port 2, error recovery was still going on host10 
still 19:57:52 and stop.
- host10 was still in sysfs /sys/class/scsi_host/host10 & taking 
reference on ib_srp module
- enable port 2 - nothing happened.

Conclusion:
1. disable the port/path long enough >35 minutes, we have dangling scsi 
host.
2. enable the port within 30 minute, scsi host re-establish connection, 
path re-instate and then scsi_host was removed (no entry in sysfs)

I attached a log here to show what happened above.

thanks,
-vu

[-- Attachment #2: messages.bz2 --]
[-- Type: application/octet-stream, Size: 10661 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                     ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-07  9:05                       ` Bart Van Assche
       [not found]                         ` <51136E74.9090209-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-07  9:05 UTC (permalink / raw)
  To: Vu Pham
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Oren Duer,
	Sagi Grimberg

On 02/06/13 22:42, Vu Pham wrote:
> Conclusion:
> 1. disable the port/path long enough >35 minutes, we have dangling scsi
> host.
> 2. enable the port within 30 minute, scsi host re-establish connection,
> path re-instate and then scsi_host was removed (no entry in sysfs)
>
> I attached a log here to show what happened above.

Hello Vu,

I found the following in the attached logs:

[ ... ]
Feb  6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called
[ ... ]
Feb  6 19:25:28 vsa30 kernel: scsi host10: SRP abort called
[ ... ]

It is easy to see in patch 3/3 that srp_reset_host() invokes 
srp_reconnect_target() unconditionally and that that last function kills 
all outstanding requests via srp_reset_req(). So to me the above output 
means that the attached logs were generated by a kernel missing at least 
patch 3/3. This means that the above conclusions are invalid.

I think it is also worth mentioning here that I asked Mellanox two 
months ago via private e-mail to provide me access to a setup on which 
this issue can be reproduced and on which I can recompile the kernel 
myself. However, such access was never provided.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                         ` <51136E74.9090209-HInyCGIudOg@public.gmane.org>
@ 2013-02-07  9:41                           ` Or Gerlitz
       [not found]                             ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-02-07 18:20                           ` Vu Pham
  1 sibling, 1 reply; 23+ messages in thread
From: Or Gerlitz @ 2013-02-07  9:41 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Oren Duer, Sagi Grimberg

On 07/02/2013 11:05, Bart Van Assche wrote:
> On 02/06/13 22:42, Vu Pham wrote:
>> Conclusion:
>> 1. disable the port/path long enough >35 minutes, we have dangling scsi
>> host.
>> 2. enable the port within 30 minute, scsi host re-establish connection,
>> path re-instate and then scsi_host was removed (no entry in sysfs)
>>
>> I attached a log here to show what happened above.
>
> Hello Vu,
>
> I found the following in the attached logs:
>
> [ ... ]
> Feb  6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called
> [ ... ]
> Feb  6 19:25:28 vsa30 kernel: scsi host10: SRP abort called
> [ ... ]
>
> It is easy to see in patch 3/3 that srp_reset_host() invokes 
> srp_reconnect_target() unconditionally and that that last function 
> kills all outstanding requests via srp_reset_req(). So to me the above 
> output means that the attached logs were generated by a kernel missing 
> at least patch 3/3. This means that the above conclusions are invalid.

Hi Bart,

Please calm down, we truly think your work is great step in the right 
direction. We will double check the environment and the logs provided to 
you.
On the half side of the glass, I think Vu also saw things that work 
better with these patches (BTW - if the fourth patch that Vu used "save 
& restore host_scribble during error handling" is also needed, maybe you 
add  it to this series, so they are reviewed/accepted together).

We will set with you online session in the coming 2-3 working days to 
build the kernel and conduct the test together.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                             ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-07 10:15                               ` Bart Van Assche
  0 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2013-02-07 10:15 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Vu Pham, Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Oren Duer, Sagi Grimberg

On 02/07/13 10:41, Or Gerlitz wrote:
> (BTW - if the fourth patch that Vu used "save
> & restore host_scribble during error handling" is also needed, maybe you
> add  it to this series, so they are reviewed/accepted together).

Hello Or,

The three patches I posted guarantee timely host removal even without 
the host_scribble patch. Even if srp_abort() would not have aborted a 
request because the host_scribble field had been overwritten, 
srp_reset_host() will be invoked later on and will finish such a request 
properly.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
       [not found]                         ` <51136E74.9090209-HInyCGIudOg@public.gmane.org>
  2013-02-07  9:41                           ` Or Gerlitz
@ 2013-02-07 18:20                           ` Vu Pham
       [not found]                             ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: Vu Pham @ 2013-02-07 18:20 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Oren Duer,
	Sagi Grimberg

Bart Van Assche wrote:
> On 02/06/13 22:42, Vu Pham wrote:
>> Conclusion:
>> 1. disable the port/path long enough >35 minutes, we have dangling scsi
>> host.
>> 2. enable the port within 30 minute, scsi host re-establish connection,
>> path re-instate and then scsi_host was removed (no entry in sysfs)
>>
>> I attached a log here to show what happened above.
>
> Hello Vu,
>
> I found the following in the attached logs:
>
> [ ... ]
> Feb  6 19:24:25 vsa30 kernel: scsi host10: ib_srp: SRP reset_host called
> [ ... ]
> Feb  6 19:25:28 vsa30 kernel: scsi host10: SRP abort called
> [ ... ]
>
> It is easy to see in patch 3/3 that srp_reset_host() invokes 
> srp_reconnect_target() unconditionally and that that last function 
> kills all outstanding requests via srp_reset_req(). So to me the above 
> output means that the attached logs were generated by a kernel missing 
> at least patch 3/3. This means that the above conclusions are invalid.
>
> I think it is also worth mentioning here that I asked Mellanox two 
> months ago via private e-mail to provide me access to a setup on which 
> this issue can be reproduced and on which I can recompile the kernel 
> myself. However, such access was never provided.
>
> Bart.
Hello Bart,

srp_reconnect_target() kill all outstanding requests, fail to reconnect 
(port offline), queued to remove scsi_host --> srp_reset_host() return 
FAILED.

While scsi host has not been removed, multipath periodically still send 
TUR commands to check liveness of this path.
Current srp_queuecommand() still process these TUR commands; therefore, 
the next SRP aborts are for those tur checker commands

You can see these tur checker errors from multipath in the log.

Sagi from Mellanox will work with you on webex.

thanks,
-vu


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                             ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-15  9:39                               ` Bart Van Assche
       [not found]                                 ` <511E024E.70002-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-15  9:39 UTC (permalink / raw)
  To: Vu Pham
  Cc: Or Gerlitz, David Dillow, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Sagi Grimberg

If an SRP target is no longer reachable and srp_reset_host()
fails to reconnect then ib_srp will invoke scsi_remove_host().
That function will invoke __scsi_remove_device() for each LUN.
And that last function will change the device state from
SDEV_TRANSPORT_OFFLINE into SDEV_CANCEL. Certain user space
software, e.g. older versions of multipathd, continue queueing
I/O to SCSI devices that are in the SDEV_CANCEL state. If these
I/O requests are submitted as SG_IO that means that the
REQ_PREEMPT flag will be set and hence that these requests will
be passed to srp_queuecommand(). These requests will time out.
If new requests are queued fast enough from user space these
active requests will prevent __scsi_remove_device() to finish.
Avoid this by failing I/O requests in the SDEV_CANCEL state if
the transport is offline. Introduce a new variable to keep
track of the transport state instead of failing requests if
(!target->connected || target->qp_in_error) such that the SCSI
error handler has a chance to retry commands after a transport
layer failure occurred.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    7 +++++++
 drivers/infiniband/ulp/srp/ib_srp.h |    1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
 
 	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
 			    SDEV_TRANSPORT_OFFLINE);
+	target->transport_offline = ret != 0;
 
 	if (ret)
 		goto err;
@@ -1353,6 +1354,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	unsigned long flags;
 	int len;
 
+	if (unlikely(target->transport_offline)) {
+		scmnd->result = DID_NO_CONNECT << 16;
+		scmnd->scsi_done(scmnd);
+		return 0;
+	}
+
 	spin_lock_irqsave(&target->lock, flags);
 	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
 	if (!iu)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index de2d0b3..66fbedd 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -140,6 +140,7 @@ struct srp_target_port {
 	unsigned int		cmd_sg_cnt;
 	unsigned int		indirect_size;
 	bool			allow_ext_sg;
+	bool			transport_offline;
 
 	/* Everything above this point is used in the hot path of
 	 * command processing. Try to keep them packed into cachelines.
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                 ` <511E024E.70002-HInyCGIudOg@public.gmane.org>
@ 2013-02-18  4:06                                   ` David Dillow
       [not found]                                     ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: David Dillow @ 2013-02-18  4:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vu Pham, Or Gerlitz, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Sagi Grimberg

On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 8a7eb9f..b34752d 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
>  
>  	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
>  			    SDEV_TRANSPORT_OFFLINE);
> +	target->transport_offline = ret != 0;

Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
	target->transport_offline = !!ret;
or
	target->transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Or, Vu, does this solve the issue you've seen? I may have time to test
later this week, but not before.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                     ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2013-02-18  8:11                                       ` Sagi Grimberg
       [not found]                                         ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-02-21 16:10                                       ` Bart Van Assche
  1 sibling, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2013-02-18  8:11 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, Vu Pham, Or Gerlitz, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz

On 2/18/2013 6:06 AM, David Dillow wrote:
> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 8a7eb9f..b34752d 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
>>   
>>   	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
>>   			    SDEV_TRANSPORT_OFFLINE);
>> +	target->transport_offline = ret != 0;
> Minor nit, that line is hard to read; I keep thinking it needs parens
> around the conditional...
>
> Perhaps
> 	target->transport_offline = !!ret;
> or
> 	target->transport_offline = ret;
>
> gcc should do the right conversion since we're assigning to a bool.
>
>
> Or, Vu, does this solve the issue you've seen? I may have time to test
> later this week, but not before.
>

Hey David,

This indeed solve scsi_host removal issues.
Vu is on vacation, I'll perform some more failover tests...

-Sagi
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                     ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  2013-02-18  8:11                                       ` Sagi Grimberg
@ 2013-02-21 16:10                                       ` Bart Van Assche
  1 sibling, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2013-02-21 16:10 UTC (permalink / raw)
  To: David Dillow
  Cc: Vu Pham, Or Gerlitz, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Sagi Grimberg

On 02/18/13 05:06, David Dillow wrote:
> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 8a7eb9f..b34752d 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
>>
>>   	scsi_target_unblock(&shost->shost_gendev, ret == 0 ? SDEV_RUNNING :
>>   			    SDEV_TRANSPORT_OFFLINE);
>> +	target->transport_offline = ret != 0;
>
> Minor nit, that line is hard to read; I keep thinking it needs parens
> around the conditional...
>
> Perhaps
> 	target->transport_offline = !!ret;
> or
> 	target->transport_offline = ret;
>
> gcc should do the right conversion since we're assigning to a bool.

Personally I prefer changing the assignment into the first alternative 
since it's more explicit than the second alternative - it doesn't 
require the person who's reading the source code that the 
transport_offline variable has been declared as "bool".

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                         ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-24  8:09                                           ` Bart Van Assche
       [not found]                                             ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2013-02-24  8:09 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: David Dillow, Vu Pham, Or Gerlitz, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz

On 02/18/13 09:11, Sagi Grimberg wrote:
> On 2/18/2013 6:06 AM, David Dillow wrote:
>> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
>>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
>>> b/drivers/infiniband/ulp/srp/ib_srp.c
>>> index 8a7eb9f..b34752d 100644
>>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct
>>> srp_target_port *target)
>>>       scsi_target_unblock(&shost->shost_gendev, ret == 0 ?
>>> SDEV_RUNNING :
>>>                   SDEV_TRANSPORT_OFFLINE);
>>> +    target->transport_offline = ret != 0;
>> Minor nit, that line is hard to read; I keep thinking it needs parens
>> around the conditional...
>>
>> Perhaps
>>     target->transport_offline = !!ret;
>> or
>>     target->transport_offline = ret;
>>
>> gcc should do the right conversion since we're assigning to a bool.
>>
>>
>> Or, Vu, does this solve the issue you've seen? I may have time to test
>> later this week, but not before.
>
> This indeed solve scsi_host removal issues.
> Vu is on vacation, I'll perform some more failover tests...

Hello Sagi,

Since no further feedback was posted on the list I assume that means 
that all tests passed ?

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                             ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org>
@ 2013-02-24  8:59                                               ` Sagi Grimberg
       [not found]                                                 ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2013-02-24  8:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: David Dillow, Vu Pham, Or Gerlitz, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz

On 2/24/2013 10:09 AM, Bart Van Assche wrote:
> On 02/18/13 09:11, Sagi Grimberg wrote:
>> On 2/18/2013 6:06 AM, David Dillow wrote:
>>> On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
>>>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
>>>> b/drivers/infiniband/ulp/srp/ib_srp.c
>>>> index 8a7eb9f..b34752d 100644
>>>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>>>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>>>> @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct
>>>> srp_target_port *target)
>>>>       scsi_target_unblock(&shost->shost_gendev, ret == 0 ?
>>>> SDEV_RUNNING :
>>>>                   SDEV_TRANSPORT_OFFLINE);
>>>> +    target->transport_offline = ret != 0;
>>> Minor nit, that line is hard to read; I keep thinking it needs parens
>>> around the conditional...
>>>
>>> Perhaps
>>>     target->transport_offline = !!ret;
>>> or
>>>     target->transport_offline = ret;
>>>
>>> gcc should do the right conversion since we're assigning to a bool.
>>>
>>>
>>> Or, Vu, does this solve the issue you've seen? I may have time to test
>>> later this week, but not before.
>>
>> This indeed solve scsi_host removal issues.
>> Vu is on vacation, I'll perform some more failover tests...
>
> Hello Sagi,
>
> Since no further feedback was posted on the list I assume that means 
> that all tests passed ?
>
> Bart.
>

Hey Bart,
Sorry for the delay I was just about to reply...

 From my end, the related patchset seems solve the scsi_host removal 
issue and prevents the SCSI error handling loop.
Generally our tests passed, I still have some issue with long-term 
failover test but I'm not sure its SRP (perhaps might origin in IB layer).
So ack from me...

-Sagi
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline
       [not found]                                                 ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-02-24 14:42                                                   ` Or Gerlitz
  0 siblings, 0 replies; 23+ messages in thread
From: Or Gerlitz @ 2013-02-24 14:42 UTC (permalink / raw)
  To: Sagi Grimberg, David Dillow, Roland Dreier
  Cc: Bart Van Assche, Vu Pham, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz

On Sun, Feb 24, 2013 at 10:59 AM, Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 2/24/2013 10:09 AM, Bart Van Assche wrote:
>> On 02/18/13 09:11, Sagi Grimberg wrote:
>>> On 2/18/2013 6:06 AM, David Dillow wrote:

>>>> Or, Vu, does this solve the issue you've seen? I may have time to test
>>>> later this week, but not before.

>>> This indeed solve scsi_host removal issues.
>>> Vu is on vacation, I'll perform some more failover tests...

>> Hello Sagi, Since no further feedback
>> was posted on the list I assume that means that all tests passed ?

> Hey Bart, Sorry for the delay I was just about to reply...
> From my end, the related patchset seems solve the scsi_host removal issue
> and prevents the SCSI error handling loop.


> Generally our tests passed, I still have some issue with long-term
> failover test but I'm not sure its SRP (perhaps might origin in IB layer).
> So ack from me...

OK

Dave/Roland - sounds like we did managed to get real progress here...
lets get these fix in for 3.9 and 3.8-stable too,

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2013-02-24 14:42 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-01 15:18 [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Bart Van Assche
     [not found] ` <510BDCAA.204-HInyCGIudOg@public.gmane.org>
2013-02-01 15:18   ` [PATCH for 3.8 v3, resend 1/3] IB/srp: Track connection state properly Bart Van Assche
2013-02-01 15:19   ` [PATCH for 3.8 v3, resend 2/3] IB/srp: Avoid sending a task management function needlessly Bart Van Assche
2013-02-01 15:21   ` [PATCH for 3.8 v3, resend 3/3] IB/srp: Avoid endless SCSI error handling loop Bart Van Assche
2013-02-04 21:11   ` [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8 Or Gerlitz
     [not found]     ` <CAJZOPZLKQV0QvrW5sK8hQJf7AZc+1nUzp+5YCkZ3iVU4oTWbLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-05 16:25       ` Bart Van Assche
     [not found]         ` <5111327F.6050402-HInyCGIudOg@public.gmane.org>
2013-02-05 20:54           ` Or Gerlitz
     [not found]             ` <CAJZOPZ+-Zg=jnqg4ZmFL5Yo4_2DoWGcgy=3u6g3Rf9y80pXnpg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-06  7:22               ` Bart Van Assche
     [not found]                 ` <5112049B.8030406-HInyCGIudOg@public.gmane.org>
2013-02-06  7:44                   ` Or Gerlitz
     [not found]                     ` <511209E5.1010807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-06  7:59                       ` Bart Van Assche
     [not found]                         ` <51120D4F.2070102-HInyCGIudOg@public.gmane.org>
2013-02-06  8:25                           ` Or Gerlitz
2013-02-06 21:42                   ` Vu Pham
     [not found]                     ` <5112CE60.2030607-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-07  9:05                       ` Bart Van Assche
     [not found]                         ` <51136E74.9090209-HInyCGIudOg@public.gmane.org>
2013-02-07  9:41                           ` Or Gerlitz
     [not found]                             ` <511376C2.6050100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-07 10:15                               ` Bart Van Assche
2013-02-07 18:20                           ` Vu Pham
     [not found]                             ` <5113F056.4020501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-15  9:39                               ` [PATCH] IB/srp: Fail I/O requests if the transport is offline Bart Van Assche
     [not found]                                 ` <511E024E.70002-HInyCGIudOg@public.gmane.org>
2013-02-18  4:06                                   ` David Dillow
     [not found]                                     ` <1361160385.7415.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2013-02-18  8:11                                       ` Sagi Grimberg
     [not found]                                         ` <5121E217.3080003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-24  8:09                                           ` Bart Van Assche
     [not found]                                             ` <5129CAB6.5030506-HInyCGIudOg@public.gmane.org>
2013-02-24  8:59                                               ` Sagi Grimberg
     [not found]                                                 ` <5129D665.3070206-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-24 14:42                                                   ` Or Gerlitz
2013-02-21 16:10                                       ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.