* [PATCH v1 0/3] SRP initiator fixes for kernel 3.15
@ 2014-02-24 14:30 Sagi Grimberg
       [not found] ` <1393252218-30638-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-24 14:30 UTC (permalink / raw)
  To: bvanassche-HInyCGIudOg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hey Bart,

Here are three more fixes for kernel 3.15. These are the patches
Sebastian was referring to.

Changes from v0:
- Fix typos in commit message

Sagi Grimberg (2):
  IB/srp: Fix crash when unmapping data loop
  IB/srp: Check ib_query_gid return value

Vu Pham (1):
  IB/srp: Protect free_tx iu list from concurrent flows

 drivers/infiniband/ulp/srp/ib_srp.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found] ` <1393252218-30638-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-02-24 14:30   ` Sagi Grimberg
       [not found]     ` <1393252218-30638-2-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-02-24 14:30   ` [PATCH v1 2/3] IB/srp: Check ib_query_gid return value Sagi Grimberg
  2014-02-24 14:30   ` [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows Sagi Grimberg
  2 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-24 14:30 UTC (permalink / raw)
  To: bvanassche-HInyCGIudOg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

When unmapping request data, it is unsafe to automatically
decrement req->nfmr regardless of its value. This may
happen since the IO and reconnect flows may run concurrently,
resulting in req->nfmr = -1 and falsely calling ib_fmr_pool_unmap.

Fix the loop condition to require a value greater than zero (which
explicitly means that FMRs were used for this request)
and only decrement when needed.

This crash is easily reproducible with ConnectX VFs or
Connect-IB (where FMRs are not supported).
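The underflow the patch fixes can be reproduced in isolation; a
minimal plain-C sketch (a hypothetical counter standing in for
req->nfmr, a call counter standing in for ib_fmr_pool_unmap()):

```c
#include <assert.h>

/* Count how many unmap calls a post-decrement loop issues for a
 * given starting value of nfmr. With nfmr == 0 (FMRs not used),
 * `while (nfmr--)` evaluates the condition once and leaves nfmr
 * at -1; a second unmap pass then sees a truthy value and would
 * falsely call ib_fmr_pool_unmap(). */
static int unmaps_post_decrement(int nfmr, int *final)
{
	int calls = 0;

	while (nfmr--)
		calls++;	/* stands in for ib_fmr_pool_unmap() */
	*final = nfmr;
	return calls;
}

/* The patched form: only loop while the count is positive, and
 * never decrement below zero. */
static int unmaps_guarded(int nfmr, int *final)
{
	int calls = 0;

	while (nfmr > 0) {
		calls++;
		nfmr--;
	}
	*final = nfmr;
	return calls;
}
```

With nfmr == 0 the old form leaves the counter at -1 while the
guarded form leaves it at 0, which is what makes a repeated unmap
pass safe.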

Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 529b6bc..0e20bfb 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -766,8 +766,11 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 		return;
 
 	pfmr = req->fmr_list;
-	while (req->nfmr--)
+
+	while (req->nfmr > 0) {
 		ib_fmr_pool_unmap(*pfmr++);
+		req->nfmr--;
+	}
 
 	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
 			scmnd->sc_data_direction);
-- 
1.7.1


* [PATCH v1 2/3] IB/srp: Check ib_query_gid return value
       [not found] ` <1393252218-30638-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-02-24 14:30   ` [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop Sagi Grimberg
@ 2014-02-24 14:30   ` Sagi Grimberg
       [not found]     ` <1393252218-30638-3-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-02-24 14:30   ` [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows Sagi Grimberg
  2 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-24 14:30 UTC (permalink / raw)
  To: bvanassche-HInyCGIudOg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

Detected by Coverity.

Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 0e20bfb..b615135 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2652,7 +2652,9 @@ static ssize_t srp_create_target(struct device *dev,
 	if (ret)
 		goto err_free_mem;
 
-	ib_query_gid(ibdev, host->port, 0, &target->path.sgid);
+	ret = ib_query_gid(ibdev, host->port, 0, &target->path.sgid);
+	if (ret)
+		goto err_free_mem;
 
 	shost_printk(KERN_DEBUG, target->scsi_host, PFX
 		     "new target: id_ext %016llx ioc_guid %016llx pkey %04x "
-- 
1.7.1


* [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows
       [not found] ` <1393252218-30638-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-02-24 14:30   ` [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop Sagi Grimberg
  2014-02-24 14:30   ` [PATCH v1 2/3] IB/srp: Check ib_query_gid return value Sagi Grimberg
@ 2014-02-24 14:30   ` Sagi Grimberg
       [not found]     ` <1393252218-30638-4-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-24 14:30 UTC (permalink / raw)
  To: bvanassche-HInyCGIudOg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham

From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

srp_reconnect_rport() serializes calls of srp_rport_reconnect()
with srp_queuecommand(), srp_abort(), srp_reset_device() and
srp_reset_host() via rport->mutex, and also blocks srp_queuecommand();
however, it cannot block SCSI error handler commands (STU, TUR).
This may introduce corruption in the free_tx IU list and in the
IUs themselves.
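The fix follows the usual pattern of republishing a shared list only
under the lock its consumers take; a userspace sketch (a pthread
mutex standing in for spin_lock_irqsave, and simplified hypothetical
names in place of the driver's structures):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct iu {
	struct iu *next;	/* minimal singly linked free list */
};

struct target {
	pthread_mutex_t lock;	/* plays the role of target->lock */
	struct iu *free_tx;	/* plays the role of the free_tx list */
};

/* Rebuild the free list from the tx ring. Holding the lock for the
 * whole rebuild means a concurrent consumer (e.g. a SCSI EH command
 * grabbing an IU) can never observe a half-initialized list. */
static void reinit_free_tx(struct target *t, struct iu *ring, int n)
{
	pthread_mutex_lock(&t->lock);
	t->free_tx = NULL;
	for (int i = 0; i < n; i++) {
		ring[i].next = t->free_tx;
		t->free_tx = &ring[i];
	}
	pthread_mutex_unlock(&t->lock);
}

/* Consumer side: pop one IU under the same lock. */
static struct iu *get_tx_iu(struct target *t)
{
	pthread_mutex_lock(&t->lock);
	struct iu *iu = t->free_tx;
	if (iu)
		t->free_tx = iu->next;
	pthread_mutex_unlock(&t->lock);
	return iu;
}
```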

Signed-off-by: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index b615135..656602b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -859,6 +859,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 {
 	struct srp_target_port *target = rport->lld_data;
 	int i, ret;
+	unsigned long flags;
 
 	srp_disconnect_target(target);
 	/*
@@ -882,9 +883,11 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 		srp_finish_req(target, req, DID_RESET << 16);
 	}
 
+	spin_lock_irqsave(&target->lock, flags);
 	INIT_LIST_HEAD(&target->free_tx);
 	for (i = 0; i < target->queue_size; ++i)
 		list_add(&target->tx_ring[i]->list, &target->free_tx);
+	spin_unlock_irqrestore(&target->lock, flags);
 
 	if (ret == 0)
 		ret = srp_connect_target(target);
-- 
1.7.1


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]     ` <1393252218-30638-2-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-02-24 15:11       ` Sebastian Riemer
  2014-02-24 15:24       ` Bart Van Assche
  1 sibling, 0 replies; 24+ messages in thread
From: Sebastian Riemer @ 2014-02-24 15:11 UTC (permalink / raw)
  To: Sagi Grimberg, bvanassche-HInyCGIudOg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 24.02.2014 15:30, Sagi Grimberg wrote:
> When unmapping request data, it is unsafe automatically
> decrement req->nfmr regardless of it's value. This may
> happen since IO and reconnect flow may run concurrently
> resulting in req->nfmr = -1 and falsely call ib_fmr_pool_unmap.

Something is still strange here. What about the following:
"unsafe to decrement req->nfmr automatically"

"its" instead of "it's"

"and calling ib_fmr_pool_unmap falsely"

> Fix the loop condition to be greater than zero (which
> explicitly means that FMRs were used on this request)
> and only increment when needed.
> 
> This crash is easily reproduceable with ConnectX VFs OR
> Connect-IB (where FMRs are not supported)
> 
> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 529b6bc..0e20bfb 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -766,8 +766,11 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>  		return;
>  
>  	pfmr = req->fmr_list;
> -	while (req->nfmr--)
> +
> +	while (req->nfmr > 0) {
>  		ib_fmr_pool_unmap(*pfmr++);
> +		req->nfmr--;
> +	}
>  
>  	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
>  			scmnd->sc_data_direction);
> 


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]     ` <1393252218-30638-2-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-02-24 15:11       ` Sebastian Riemer
@ 2014-02-24 15:24       ` Bart Van Assche
       [not found]         ` <530B6444.1000805-HInyCGIudOg@public.gmane.org>
  1 sibling, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-02-24 15:24 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 02/24/14 15:30, Sagi Grimberg wrote:
> When unmapping request data, it is unsafe automatically
> decrement req->nfmr regardless of it's value. This may
> happen since IO and reconnect flow may run concurrently
> resulting in req->nfmr = -1 and falsely call ib_fmr_pool_unmap.
> 
> Fix the loop condition to be greater than zero (which
> explicitly means that FMRs were used on this request)
> and only increment when needed.
> 
> This crash is easily reproduceable with ConnectX VFs OR
> Connect-IB (where FMRs are not supported)
> 
> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 529b6bc..0e20bfb 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -766,8 +766,11 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>  		return;
>  
>  	pfmr = req->fmr_list;
> -	while (req->nfmr--)
> +
> +	while (req->nfmr > 0) {
>  		ib_fmr_pool_unmap(*pfmr++);
> +		req->nfmr--;
> +	}
>  
>  	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
>  			scmnd->sc_data_direction);
> 

Hello Sagi,

If srp_unmap_data() got invoked twice for the same request that means
that a race condition has been hit. I would like to see the root cause
of that race condition fixed instead of making it safe to invoke
srp_unmap_data() multiple times. It would be appreciated if you could
verify whether the following patch is a valid alternative for the above
patch:

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index b753a7a..2c75f95 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2141,8 +2141,7 @@ static int SRP_QUEUECOMMAND(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	cmd->tag    = req->index;
 	memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
 
-	req->scmnd    = scmnd;
-	req->cmd      = iu;
+	req->cmd = iu;
 
 	len = srp_map_data(scmnd, ch, req);
 	if (len < 0) {
@@ -2160,7 +2159,12 @@ static int SRP_QUEUECOMMAND(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
 				      DMA_TO_DEVICE);
 
-	if (srp_post_send(ch, iu, len)) {
+	spin_lock_irqsave(&ch->lock, flags);
+	req->scmnd = scmnd;
+	ret = srp_post_send(ch, iu, len) ? SCSI_MLQUEUE_HOST_BUSY : 0;
+	spin_unlock_irqrestore(&ch->lock, flags);
+
+	if (ret) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
 		goto err_unmap;
 	}

Thanks,

Bart.

* Re: [PATCH v1 2/3] IB/srp: Check ib_query_gid return value
       [not found]     ` <1393252218-30638-3-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-02-24 15:34       ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2014-02-24 15:34 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 02/24/14 15:30, Sagi Grimberg wrote:
> Detected by Coverity.
> 
> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 0e20bfb..b615135 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2652,7 +2652,9 @@ static ssize_t srp_create_target(struct device *dev,
>  	if (ret)
>  		goto err_free_mem;
>  
> -	ib_query_gid(ibdev, host->port, 0, &target->path.sgid);
> +	ret = ib_query_gid(ibdev, host->port, 0, &target->path.sgid);
> +	if (ret)
> +		goto err_free_mem;
>  
>  	shost_printk(KERN_DEBUG, target->scsi_host, PFX
>  		     "new target: id_ext %016llx ioc_guid %016llx pkey %04x "
> 

This patch looks fine to me, so

Acked-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>


* Re: [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows
       [not found]     ` <1393252218-30638-4-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-02-24 15:38       ` Bart Van Assche
       [not found]         ` <530B677A.1040508-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-02-24 15:38 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham

On 02/24/14 15:30, Sagi Grimberg wrote:
> From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> srp_reconnect_rport() serializes calls of srp_rport_reconnect()
> with srp_queuecommand(), srp_abort(), srp_reset_device(),
> srp_reset_host() via rport->mutex and also blocks srp_queuecommand();
> however, it cannot block scsi error handler commands (stu, tur).
> This may introduces corruption in free_tx IUs list and IU itself
> 
> Signed-off-by: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index b615135..656602b 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -859,6 +859,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>  {
>  	struct srp_target_port *target = rport->lld_data;
>  	int i, ret;
> +	unsigned long flags;
>  
>  	srp_disconnect_target(target);
>  	/*
> @@ -882,9 +883,11 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>  		srp_finish_req(target, req, DID_RESET << 16);
>  	}
>  
> +	spin_lock_irqsave(&target->lock, flags);
>  	INIT_LIST_HEAD(&target->free_tx);
>  	for (i = 0; i < target->queue_size; ++i)
>  		list_add(&target->tx_ring[i]->list, &target->free_tx);
> +	spin_unlock_irqrestore(&target->lock, flags);
>  
>  	if (ret == 0)
>  		ret = srp_connect_target(target);

Hello Sagi and Vu,

srp_rport_reconnect() should never get invoked concurrently with
srp_queuecommand() - see e.g. the "in_scsi_eh" variable in
srp_queuecommand(). Is the list corruption reproducible with the patch
mentioned in my reply to patch 1/3?

Thanks,

Bart.


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]         ` <530B6444.1000805-HInyCGIudOg@public.gmane.org>
@ 2014-02-27 11:32           ` Sagi Grimberg
       [not found]             ` <530F225E.5070500-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-27 11:32 UTC (permalink / raw)
  To: Bart Van Assche, Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 2/24/2014 5:24 PM, Bart Van Assche wrote:
> On 02/24/14 15:30, Sagi Grimberg wrote:
>> When unmapping request data, it is unsafe automatically
>> decrement req->nfmr regardless of it's value. This may
>> happen since IO and reconnect flow may run concurrently
>> resulting in req->nfmr = -1 and falsely call ib_fmr_pool_unmap.
>>
>> Fix the loop condition to be greater than zero (which
>> explicitly means that FMRs were used on this request)
>> and only increment when needed.
>>
>> This crash is easily reproduceable with ConnectX VFs OR
>> Connect-IB (where FMRs are not supported)
>>
>> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c |    5 ++++-
>>   1 files changed, 4 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 529b6bc..0e20bfb 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -766,8 +766,11 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>>   		return;
>>   
>>   	pfmr = req->fmr_list;
>> -	while (req->nfmr--)
>> +
>> +	while (req->nfmr > 0) {
>>   		ib_fmr_pool_unmap(*pfmr++);
>> +		req->nfmr--;
>> +	}
>>   
>>   	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
>>   			scmnd->sc_data_direction);
>>
> Hello Sagi,
>
> If srp_unmap_data() got invoked twice for the same request that means
> that a race condition has been hit. I would like to see the root cause
> of that race condition fixed instead of making it safe to invoke
> srp_unmap_data() multiple times. It would be appreciated if you could
> verify whether the following patch is a valid alternative for the above
> patch:

Hey Bart, sorry for the late response.

It's been a while since I did this, but as I recall the problem
wasn't in the srp_post_send() failure path. Moreover, I don't think
a spinlock around srp_post_send() is a good idea unless it is
absolutely needed (and I don't think it is).

As I recall (I need to re-confirm this), the problem was that under
unstable port conditions srp_abort() is called, invoking
srp_free_req() (unmap call #1), while the reconnect flow (or
reset_device or terminate_io) finishes all requests in the request
ring (unmap call #2). When FMRs are used, nfmr goes to zero the
first time, but when FMRs are not supported nfmr goes from 0 to -1,
causing the crash since the nfmr loop condition is not safe.

> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index b753a7a..2c75f95 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2141,8 +2141,7 @@ static int SRP_QUEUECOMMAND(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	cmd->tag    = req->index;
>   	memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
>   
> -	req->scmnd    = scmnd;
> -	req->cmd      = iu;
> +	req->cmd = iu;
>   
>   	len = srp_map_data(scmnd, ch, req);
>   	if (len < 0) {
> @@ -2160,7 +2159,12 @@ static int SRP_QUEUECOMMAND(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
>   				      DMA_TO_DEVICE);
>   
> -	if (srp_post_send(ch, iu, len)) {
> +	spin_lock_irqsave(&ch->lock, flags);
> +	req->scmnd = scmnd;
> +	ret = srp_post_send(ch, iu, len) ? SCSI_MLQUEUE_HOST_BUSY : 0;
> +	spin_unlock_irqrestore(&ch->lock, flags);
> +
> +	if (ret) {
>   		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
>   		goto err_unmap;
>   	}
>
> Thanks,
>
> Bart.

* Re: [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows
       [not found]         ` <530B677A.1040508-HInyCGIudOg@public.gmane.org>
@ 2014-02-27 11:51           ` Sagi Grimberg
       [not found]             ` <530F26CE.5070404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-02-27 11:51 UTC (permalink / raw)
  To: Bart Van Assche, Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham

On 2/24/2014 5:38 PM, Bart Van Assche wrote:
> On 02/24/14 15:30, Sagi Grimberg wrote:
>> From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> srp_reconnect_rport() serializes calls of srp_rport_reconnect()
>> with srp_queuecommand(), srp_abort(), srp_reset_device(),
>> srp_reset_host() via rport->mutex and also blocks srp_queuecommand();
>> however, it cannot block scsi error handler commands (stu, tur).
>> This may introduces corruption in free_tx IUs list and IU itself
>>
>> Signed-off-by: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
>>   1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index b615135..656602b 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -859,6 +859,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>>   {
>>   	struct srp_target_port *target = rport->lld_data;
>>   	int i, ret;
>> +	unsigned long flags;
>>   
>>   	srp_disconnect_target(target);
>>   	/*
>> @@ -882,9 +883,11 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>>   		srp_finish_req(target, req, DID_RESET << 16);
>>   	}
>>   
>> +	spin_lock_irqsave(&target->lock, flags);
>>   	INIT_LIST_HEAD(&target->free_tx);
>>   	for (i = 0; i < target->queue_size; ++i)
>>   		list_add(&target->tx_ring[i]->list, &target->free_tx);
>> +	spin_unlock_irqrestore(&target->lock, flags);
>>   
>>   	if (ret == 0)
>>   		ret = srp_connect_target(target);
> Hello Sagi and Vu,
>
> srp_rport_reconnect() should never get invoked concurrently with
> srp_queuecommand() - see e.g. the "in_scsi_eh" variable in
> srp_queuecommand(). Is the list corruption reproducible with the patch
> mentioned in my reply to patch 1/3 ?
>
> Thanks,
>
> Bart.

I need to re-test this.

Regarding in_scsi_eh: can you still end up posting a send if you
are in interrupt context? It's just that we have a *very* rare case
(not easy to reproduce) in RH6.5 where we end up posting on a
just-destroyed QP (a race right in between destroy_qp and the
assignment of the new QP in srp_create_target_ib). We tested it
with the in_scsi_eh patch and it still happened.

As I see it, SRP's problems come in a distinct period when the
rport is in the BLOCKED state. On one hand, all request processing
is allowed (commands are not failed), and on the other, the
reconnect flow may be running concurrently. Would it be acceptable
to take the rport mutex in queue_command when the rport is in the
BLOCKED state?

Thoughts?

Sagi.

* Re: [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows
       [not found]             ` <530F26CE.5070404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-02-27 14:22               ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2014-02-27 14:22 UTC (permalink / raw)
  To: Sagi Grimberg, Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Vu Pham

On 02/27/14 12:51, Sagi Grimberg wrote:
> Regarding in_scsi_eh, can you end-up still posting a send if you are in
> an interrupt context?
> it's just that we have a *very* rare case (not easy to reproduce) in
> RH6.5 where we end-up posting on a just destroyed QP
> (race right in between destroy_qp and assignment of new qp in
> srp_create_target_ib).
> We tested it with in_scsi_eh patch and it still happened.
> 
> As I see it, SRP problems comes in a distinct period when rport is in
> state BLOCKED.
> On one hand, all request processing are allowed (not failing commands),
> and on the other reconnect flow may be running in concurrently.
> Will it be acceptable to take the rport_mutex in queue_command if rport
> is in BLOCKED state?

Hello Sagi,

The issue you described is probably specific to the RHEL backport of the
SRP initiator and does not affect the upstream SRP initiator. The
function scsi_request_fn_active() in drivers/scsi/scsi_transport_srp.c
is used by srp_reconnect_rport() to wait until ongoing
srp_queuecommand() calls have finished after the SCSI host state has
been changed into BLOCKED. That function relies on a member variable of
struct request_queue that has been introduced upstream in kernel 3.7.0.
In other words, that function needs attention when porting the SRP
initiator to RHEL 6. A small delay is probably sufficient to wait until
ongoing srp_queuecommand() calls have finished.
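For a backport without the request_queue accounting Bart mentions,
the drain he describes can be approximated by counting in-flight
queuecommand calls; a hypothetical userspace sketch (all names
invented, C11 atomics standing in for the kernel primitives, and a
bounded poll standing in for a small msleep() delay loop):

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical in-flight counter for srp_queuecommand() calls. */
static atomic_int active_calls;

static void queuecommand_enter(void)
{
	atomic_fetch_add(&active_calls, 1);
}

static void queuecommand_exit(void)
{
	atomic_fetch_sub(&active_calls, 1);
}

/* After the SCSI host state has been changed to BLOCKED, poll until
 * ongoing calls have drained (a kernel version would sleep between
 * polls rather than spin). Returns 0 once drained, -1 on timeout. */
static int wait_for_queuecommand(int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&active_calls) == 0)
			return 0;
	}
	return -1;
}
```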

Bart.

* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]             ` <530F225E.5070500-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-06  9:10               ` Bart Van Assche
       [not found]                 ` <53183B71.4090609-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-06  9:10 UTC (permalink / raw)
  To: Sagi Grimberg, Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 02/27/14 12:32, Sagi Grimberg wrote:
> As I recall  (need to re-confirm this), the problem was that in unstable
> ports condition srp_abort is
> called invoking srp_free_req (unmap call #1) and reconnect process (or
> reset_device or terminate_io)
> finishes all requests in the request ring (unmap call #2). when FMRs are
> used then nfmr goes to zero
> the first time, but when FMRs are not supported nfmr goes from 0 to -1
> causing the crash since nfmr condition
> is not safe.

Hello Sagi,

Thinking further about this, it looks to me like the only way to
avoid srp_terminate_io() triggering a race condition with the error
path in srp_queuecommand() is to set req->scmnd only after
srp_post_send() has succeeded. How about the patch below? That
patch should ensure that srp_unmap_data() is only called once for
each request and hence should avoid triggering the crash you
mentioned.

Thanks,

Bart.

[PATCH] IB/srp: Fix a race condition between fast failing I/O and I/O queueing

Avoid that srp_unmap_data() can get invoked concurrently by
srp_terminate_io() and from the error path in srp_queuecommand()
if srp_post_send() fails.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 46 +++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 7858240..85eb59f 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -783,6 +783,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
  * srp_claim_req - Take ownership of the scmnd associated with a request.
  * @target: SRP target port.
  * @req: SRP request.
+ * @sdev: If not NULL, only take ownership for this SCSI device.
  * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
  *         ownership of @req->scmnd if it equals @scmnd.
  *
@@ -791,16 +792,17 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
  */
 static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
 				       struct srp_request *req,
+				       struct scsi_device *sdev,
 				       struct scsi_cmnd *scmnd)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&target->lock, flags);
-	if (!scmnd) {
+	if (req->scmnd &&
+	    (!sdev || req->scmnd->device == sdev) &&
+	    (!scmnd || req->scmnd == scmnd)) {
 		scmnd = req->scmnd;
 		req->scmnd = NULL;
-	} else if (req->scmnd == scmnd) {
-		req->scmnd = NULL;
 	} else {
 		scmnd = NULL;
 	}
@@ -827,9 +829,10 @@ static void srp_free_req(struct srp_target_port *target,
 }
 
 static void srp_finish_req(struct srp_target_port *target,
-			   struct srp_request *req, int result)
+			   struct srp_request *req, struct scsi_device *sdev,
+			   int result)
 {
-	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
+	struct scsi_cmnd *scmnd = srp_claim_req(target, req, sdev, NULL);
 
 	if (scmnd) {
 		srp_free_req(target, req, scmnd, 0);
@@ -845,7 +848,7 @@ static void srp_terminate_io(struct srp_rport *rport)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
+		srp_finish_req(target, req, NULL, DID_TRANSPORT_FAILFAST << 16);
 	}
 }
 
@@ -882,7 +885,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		srp_finish_req(target, req, DID_RESET << 16);
+		srp_finish_req(target, req, NULL, DID_RESET << 16);
 	}
 
 	INIT_LIST_HEAD(&target->free_tx);
@@ -1290,7 +1293,7 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		complete(&target->tsk_mgmt_done);
 	} else {
 		req = &target->req_ring[rsp->tag];
-		scmnd = srp_claim_req(target, req, NULL);
+		scmnd = srp_claim_req(target, req, NULL, NULL);
 		if (!scmnd) {
 			shost_printk(KERN_ERR, target->scsi_host,
 				     "Null scmnd for RSP w/tag %016llx\n",
@@ -1509,7 +1512,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	struct srp_cmd *cmd;
 	struct ib_device *dev;
 	unsigned long flags;
-	int len, result;
+	int len, result, ret = 0;
 	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
 
 	/*
@@ -1552,8 +1555,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	cmd->tag    = req->index;
 	memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
 
-	req->scmnd    = scmnd;
-	req->cmd      = iu;
+	req->cmd = iu;
 
 	len = srp_map_data(scmnd, target, req);
 	if (len < 0) {
@@ -1565,7 +1567,14 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 	ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
 				      DMA_TO_DEVICE);
 
-	if (srp_post_send(target, iu, len)) {
+	spin_lock_irqsave(&target->lock, flags);
+	if (srp_post_send(target, iu, len) == 0)
+		req->scmnd = scmnd;
+	else
+		ret = SCSI_MLQUEUE_HOST_BUSY;
+	spin_unlock_irqrestore(&target->lock, flags);
+
+	if (ret) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
 		goto err_unmap;
 	}
@@ -1574,7 +1583,7 @@ unlock_rport:
 	if (in_scsi_eh)
 		mutex_unlock(&rport->mutex);
 
-	return 0;
+	return ret;
 
 err_unmap:
 	srp_unmap_data(scmnd, target, req);
@@ -1588,10 +1597,8 @@ err_iu:
 err_unlock:
 	spin_unlock_irqrestore(&target->lock, flags);
 
-	if (in_scsi_eh)
-		mutex_unlock(&rport->mutex);
-
-	return SCSI_MLQUEUE_HOST_BUSY;
+	ret = SCSI_MLQUEUE_HOST_BUSY;
+	goto unlock_rport;
 }
 
 /*
@@ -2008,7 +2015,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (!req || !srp_claim_req(target, req, scmnd))
+	if (!req || !srp_claim_req(target, req, NULL, scmnd))
 		return SUCCESS;
 	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			      SRP_TSK_ABORT_TASK) == 0)
@@ -2039,8 +2046,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		if (req->scmnd && req->scmnd->device == scmnd->device)
-			srp_finish_req(target, req, DID_RESET << 16);
+		srp_finish_req(target, req, scmnd->device, DID_RESET << 16);
 	}
 
 	return SUCCESS;
-- 
1.8.4.5



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                 ` <53183B71.4090609-HInyCGIudOg@public.gmane.org>
@ 2014-03-06 15:32                   ` sagi grimberg
       [not found]                     ` <53189526.2010704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: sagi grimberg @ 2014-03-06 15:32 UTC (permalink / raw)
  To: Bart Van Assche, Sagi Grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/6/2014 11:10 AM, Bart Van Assche wrote:
> On 02/27/14 12:32, Sagi Grimberg wrote:
>> As I recall (I need to re-confirm this), the problem was that under
>> unstable port conditions srp_abort is called, invoking srp_free_req
>> (unmap call #1), while the reconnect process (or reset_device or
>> terminate_io) finishes all requests in the request ring (unmap call #2).
>> When FMRs are used, req->nfmr goes to zero the first time, but when FMRs
>> are not supported nfmr goes from 0 to -1, causing the crash, since the
>> nfmr check is not safe against a double unmap.
> Hello Sagi,
>
> Thinking further about this, it looks to me like the only way to avoid
> that srp_terminate_io() triggers a race condition with the error path
> in srp_queuecommand() is to set req->scmnd only if srp_post_send() has
> succeeded. How about the patch below? That patch should ensure that
> srp_unmap_data() is only called once for each request and hence should
> avoid triggering the crash you mentioned.

Hey Bart, thanks for taking the time to think about this issue.

I see your point, and it might work. Let me try it out (if I don't get
to it today I'll do it on Sunday) and reply if that resolves the issue.

Sagi.

> Thanks,
>
> Bart.
>
> [PATCH] IB/srp: Fix a race condition between fast failing I/O and I/O queueing
>
> Avoid that srp_unmap_data() can get invoked concurrently by
> srp_terminate_io() and from the error path in srp_queuecommand()
> if srp_post_send() fails.
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 46 +++++++++++++++++++++----------------
>   1 file changed, 26 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 7858240..85eb59f 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -783,6 +783,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>    * srp_claim_req - Take ownership of the scmnd associated with a request.
>    * @target: SRP target port.
>    * @req: SRP request.
> + * @sdev: If not NULL, only take ownership for this SCSI device.
>    * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
>    *         ownership of @req->scmnd if it equals @scmnd.
>    *
> @@ -791,16 +792,17 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>    */
>   static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
>   				       struct srp_request *req,
> +				       struct scsi_device *sdev,
>   				       struct scsi_cmnd *scmnd)
>   {
>   	unsigned long flags;
>   
>   	spin_lock_irqsave(&target->lock, flags);
> -	if (!scmnd) {
> +	if (req->scmnd &&
> +	    (!sdev || req->scmnd->device == sdev) &&
> +	    (!scmnd || req->scmnd == scmnd)) {
>   		scmnd = req->scmnd;
>   		req->scmnd = NULL;
> -	} else if (req->scmnd == scmnd) {
> -		req->scmnd = NULL;
>   	} else {
>   		scmnd = NULL;
>   	}
> @@ -827,9 +829,10 @@ static void srp_free_req(struct srp_target_port *target,
>   }
>   
>   static void srp_finish_req(struct srp_target_port *target,
> -			   struct srp_request *req, int result)
> +			   struct srp_request *req, struct scsi_device *sdev,
> +			   int result)
>   {
> -	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
> +	struct scsi_cmnd *scmnd = srp_claim_req(target, req, sdev, NULL);
>   
>   	if (scmnd) {
>   		srp_free_req(target, req, scmnd, 0);
> @@ -845,7 +848,7 @@ static void srp_terminate_io(struct srp_rport *rport)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
> +		srp_finish_req(target, req, NULL, DID_TRANSPORT_FAILFAST << 16);
>   	}
>   }
>   
> @@ -882,7 +885,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		srp_finish_req(target, req, DID_RESET << 16);
> +		srp_finish_req(target, req, NULL, DID_RESET << 16);
>   	}
>   
>   	INIT_LIST_HEAD(&target->free_tx);
> @@ -1290,7 +1293,7 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
>   		complete(&target->tsk_mgmt_done);
>   	} else {
>   		req = &target->req_ring[rsp->tag];
> -		scmnd = srp_claim_req(target, req, NULL);
> +		scmnd = srp_claim_req(target, req, NULL, NULL);
>   		if (!scmnd) {
>   			shost_printk(KERN_ERR, target->scsi_host,
>   				     "Null scmnd for RSP w/tag %016llx\n",
> @@ -1509,7 +1512,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	struct srp_cmd *cmd;
>   	struct ib_device *dev;
>   	unsigned long flags;
> -	int len, result;
> +	int len, result, ret = 0;
>   	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
>   
>   	/*
> @@ -1552,8 +1555,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	cmd->tag    = req->index;
>   	memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
>   
> -	req->scmnd    = scmnd;
> -	req->cmd      = iu;
> +	req->cmd = iu;
>   
>   	len = srp_map_data(scmnd, target, req);
>   	if (len < 0) {
> @@ -1565,7 +1567,14 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>   	ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
>   				      DMA_TO_DEVICE);
>   
> -	if (srp_post_send(target, iu, len)) {
> +	spin_lock_irqsave(&target->lock, flags);
> +	if (srp_post_send(target, iu, len) == 0)
> +		req->scmnd = scmnd;
> +	else
> +		ret = SCSI_MLQUEUE_HOST_BUSY;
> +	spin_unlock_irqrestore(&target->lock, flags);
> +
> +	if (ret) {
>   		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
>   		goto err_unmap;
>   	}
> @@ -1574,7 +1583,7 @@ unlock_rport:
>   	if (in_scsi_eh)
>   		mutex_unlock(&rport->mutex);
>   
> -	return 0;
> +	return ret;
>   
>   err_unmap:
>   	srp_unmap_data(scmnd, target, req);
> @@ -1588,10 +1597,8 @@ err_iu:
>   err_unlock:
>   	spin_unlock_irqrestore(&target->lock, flags);
>   
> -	if (in_scsi_eh)
> -		mutex_unlock(&rport->mutex);
> -
> -	return SCSI_MLQUEUE_HOST_BUSY;
> +	ret = SCSI_MLQUEUE_HOST_BUSY;
> +	goto unlock_rport;
>   }
>   
>   /*
> @@ -2008,7 +2015,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
>   
>   	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
>   
> -	if (!req || !srp_claim_req(target, req, scmnd))
> +	if (!req || !srp_claim_req(target, req, NULL, scmnd))
>   		return SUCCESS;
>   	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
>   			      SRP_TSK_ABORT_TASK) == 0)
> @@ -2039,8 +2046,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		if (req->scmnd && req->scmnd->device == scmnd->device)
> -			srp_finish_req(target, req, DID_RESET << 16);
> +		srp_finish_req(target, req, scmnd->device, DID_RESET << 16);
>   	}
>   
>   	return SUCCESS;


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                     ` <53189526.2010704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-03-06 16:10                       ` Sagi Grimberg
       [not found]                         ` <53189DDE.7090003-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-06 16:10 UTC (permalink / raw)
  To: sagi grimberg, Bart Van Assche
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/6/2014 5:32 PM, sagi grimberg wrote:
> On 3/6/2014 11:10 AM, Bart Van Assche wrote:
>> On 02/27/14 12:32, Sagi Grimberg wrote:
>>> As I recall (I need to re-confirm this), the problem was that under
>>> unstable port conditions srp_abort is called, invoking srp_free_req
>>> (unmap call #1), while the reconnect process (or reset_device or
>>> terminate_io) finishes all requests in the request ring (unmap call
>>> #2). When FMRs are used, req->nfmr goes to zero the first time, but
>>> when FMRs are not supported nfmr goes from 0 to -1, causing the
>>> crash, since the nfmr check is not safe against a double unmap.
>> Hello Sagi,
>>
>> Thinking further about this, it looks to me like the only way to avoid
>> that srp_terminate_io() triggers a race condition with the error path
>> in srp_queuecommand() is to set req->scmnd only if srp_post_send() has
>> succeeded. How about the patch below ? That patch should ensure that
>> srp_unmap_data() is only called once for each request and hence should
>> avoid triggering the crash you mentioned.
>
> Hey Bart, Thanks for taking the time to think on this issue.
>
> I see your point, and it might work. let me try it out (if I won't get 
> to it today I'll do it on Sunday)
> and reply if that resolves the issue.
>
> Sagi.
>

Hey Bart (replying to my own email...),

So I took Roland's latest 3.14-rc1 and tried to reproduce this issue
using an HCA with no FMR support, and was *NOT* able to reproduce it.
The issue reproduced for me on the RHEL 6 backported SRP driver, and I
can't tell where the delta is at the moment. Perhaps Sebastian can share
the scenario that reproduces this issue and was solved by the patch I
proposed.

I suggest trying to reproduce it with your ib_srp-backport code over
RHEL 6; worth a look, right?

If this is a false alarm, then I'm sorry for the bother...

Sagi.



* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                         ` <53189DDE.7090003-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-07  8:13                           ` Bart Van Assche
       [not found]                             ` <53197FC5.4090102-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-07  8:13 UTC (permalink / raw)
  To: Sagi Grimberg, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/06/14 17:10, Sagi Grimberg wrote:
> So I took Roland's latest 3.14-rc1 and tried to reproduce this issue
> using an HCA with no FMR support, and was *NOT* able to reproduce it.
> The issue reproduced for me on the RHEL 6 backported SRP driver, and I
> can't tell where the delta is at the moment. Perhaps Sebastian can
> share the scenario that reproduces this issue and was solved by the
> patch I proposed.
> 
> I suggest trying to reproduce it with your ib_srp-backport code over
> RHEL 6; worth a look, right?

Hello Sagi,

It would help if you could explain why you are concerned about holding a
spin lock during the srp_post_send() call. According to
Documentation/infiniband/core_locking.txt, ib_post_send() is not allowed
to sleep. And before the SCSI host lock push-down, the SCSI host lock
was held around the srp_queuecommand() invocation. In other words,
holding a spin lock around the srp_post_send() call in
srp_queuecommand() is neither new nor disallowed by the RDMA API
documentation. And if there are any (legacy?) RDMA HCA drivers that are
not upstream and for which it is not safe to invoke ib_post_send() with
a spin lock held, how about invoking srp_post_send() without holding
target->lock only in the distros that support those drivers?

The patch I posted fixes a race between srp_finish_req() and the error
path in srp_queuecommand() by setting req->scmnd after the
srp_post_send() call instead of before. Holding target->lock around the
srp_post_send() call and the req->scmnd assignment is only needed to
avoid the (theoretical) race where processing of an SRP reply could
already have started before req->scmnd has been assigned by
srp_queuecommand().

Bart.


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                             ` <53197FC5.4090102-HInyCGIudOg@public.gmane.org>
@ 2014-03-11 13:38                               ` Sagi Grimberg
       [not found]                                 ` <531F11B8.5030202-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-11 13:38 UTC (permalink / raw)
  To: Bart Van Assche, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/7/2014 10:13 AM, Bart Van Assche wrote:
> On 03/06/14 17:10, Sagi Grimberg wrote:
>> So I took Roland latest 3.14-rc1 and tried to reproduce this issue using
>> HCA with no FMRs support and was *NOT* able to reproduce this issue. This
>> issue reproduced for me on RH6 backported srp and I can't tell where is
>> the delta at the moment. Perhaps Sebastian can share his scenario that
>> reproduces this issue and was solved by the patch I proposed.
>>
>> I suggest to try and reproduce it with your ib_srp-backport code over
>> RH6, worth a look right?
> Hello Sagi,

Hey Bart,

A bit late but I hope not too late...

> It would help if you could explain why you are concerned about holding a
> spin lock during the srp_post_send() call. According to
> Documentation/infiniband/core_locking.txt ib_post_send() is not allowed
> to sleep.

Correct, taking the spinlock around ib_post_send() is allowed, and I
don't think any HCA driver will ever sleep there.

>   And before the SCSI host lock push-down the SCSI host lock was
> held around the srp_queuecommand() invocation. In other words, holding a
> spin lock around the srp_post_send() call in srp_queuecommand() is
> neither new nor disallowed by the RDMA API documentation.

I just meant that I think it should be avoided for performance reasons.
We are moving towards scsi-mq, and I suspect fast-path spinlocks will
become much more expensive in that era, so we should try to avoid them
rather than take them.

>   And if there
> would be any (legacy?) RDMA HCA drivers that are not upstream and for
> which it is not safe to invoke ib_post_send() with a spin lock held, how
> about invoking srp_post_send() in distro's that support these drivers
> without holding target->lock ?

I don't think we need to worry about that.

>   The patch I posted fixes a race between
> srp_finish_req() and the error path in srp_queuecommand by setting
> req->scmnd after the srp_post_send() call instead of before.

Yes, I agree this would help.

>   Holding
> target->lock around the srp_post_send() call and the req->scmnd
> assignment is only needed to avoid the (theoretical) race where
> processing of an SRP reply would already have been started before
> req->scmnd has been assigned by srp_queuecommand().

I see... would assigning req->scmnd = NULL only if srp_post_send()
fails, and restoring the assignment to where it was before, help?

Sagi.

* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                 ` <531F11B8.5030202-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-11 13:51                                   ` Bart Van Assche
       [not found]                                     ` <531F14D0.7010608-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-11 13:51 UTC (permalink / raw)
  To: Sagi Grimberg, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/11/14 14:38, Sagi Grimberg wrote:
>> And before the SCSI host lock push-down the SCSI host lock was
>> held around the srp_queuecommand() invocation. In other words, holding a
>> spin lock around the srp_post_send() call in srp_queuecommand() is
>> neither new nor disallowed by the RDMA API documentation.
> 
> I just noted that I think it should be avoided due to performance reasons.
> We are going towards scsi-mq and I suspect fast-path spinlocks will
> become much
> more expensive in this era, and we should try to avoid them rather than
> take them.

Once the SRP initiator driver supports multiple queues, the lock around
srp_post_send() will be per queue instead of per target, so if the
number of queues is large enough that lock won't be a contention point.
I have experimental code ready that follows this approach.

>> Holding
>> target->lock around the srp_post_send() call and the req->scmnd
>> assignment is only needed to avoid the (theoretical) race where
>> processing of an SRP reply would already have been started before
>> req->scmnd has been assigned by srp_queuecommand().
> 
> I see... will assigning scmnd = NULL only if srp_post_send will fail,
> and restore the assignment as before help?

Sorry, but I'm not enthusiastic about that approach because it could
result in resources being freed while srp_post_send() is in progress. I
prefer to avoid this kind of race condition.

Bart.


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                     ` <531F14D0.7010608-HInyCGIudOg@public.gmane.org>
@ 2014-03-11 14:07                                       ` Sagi Grimberg
       [not found]                                         ` <531F18A7.1000907-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-11 14:07 UTC (permalink / raw)
  To: Bart Van Assche, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/11/2014 3:51 PM, Bart Van Assche wrote:
> On 03/11/14 14:38, Sagi Grimberg wrote:
>>> And before the SCSI host lock push-down the SCSI host lock was
>>> held around the srp_queuecommand() invocation. In other words, holding a
>>> spin lock around the srp_post_send() call in srp_queuecommand() is
>>> neither new nor disallowed by the RDMA API documentation.
>> I just noted that I think it should be avoided due to performance reasons.
>> We are going towards scsi-mq and I suspect fast-path spinlocks will
>> become much
>> more expensive in this era, and we should try to avoid them rather than
>> take them.
> Once the SRP initiator driver will support multiple queues the lock
> around srp_post_send() will be per queue instead of per target so if the
> number of queues is large enough that lock won't be a contention point.
> I have experimental code ready that follows this approach.

This sounds right to me.

>>> Holding
>>> target->lock around the srp_post_send() call and the req->scmnd
>>> assignment is only needed to avoid the (theoretical) race where
>>> processing of an SRP reply would already have been started before
>>> req->scmnd has been assigned by srp_queuecommand().
>> I see... will assigning scmnd = NULL only if srp_post_send will fail,
>> and restore the assignment as before help?
> Sorry but I'm not enthusiast about that approach because it could result
> in resources being freed while srp_post_send() is in progress. I prefer
> to avoid this kind of race conditions.

Are you referring to srp_create_target_ib() racing with srp_post_send()?
If not, what resources are you referring to?

Sagi.

* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                         ` <531F18A7.1000907-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-11 14:29                                           ` Bart Van Assche
       [not found]                                             ` <531F1DBD.9070505-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-11 14:29 UTC (permalink / raw)
  To: Sagi Grimberg, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/11/14 15:07, Sagi Grimberg wrote:
> On 3/11/2014 3:51 PM, Bart Van Assche wrote:
>> On 03/11/14 14:38, Sagi Grimberg wrote:
>>> I see... will assigning scmnd = NULL only if srp_post_send will fail,
>>> and restore the assignment as before help?
>> Sorry but I'm not enthusiast about that approach because it could result
>> in resources being freed while srp_post_send() is in progress. I prefer
>> to avoid this kind of race conditions.
> 
> Are you referring to srp_create_target_ib racing with srp_post_send? if
> not, what resources are you referring to?

I was referring to a potential race between srp_terminate_io() and
srp_post_send(). srp_claim_req() can grab a SCSI command as soon as
req->scmnd has been set, so srp_terminate_io() can also be invoked as
soon as req->scmnd has been set. I'd like to avoid srp_post_send()
reading information from a struct srp_iu that has already been freed
and perhaps already been reused for another request. Hence my
preference to set req->scmnd after srp_post_send(), since that avoids
triggering a race between srp_post_send() and srp_terminate_io().

Bart.
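[Editor's note: the ordering Bart describes can be sketched as a toy,
single-threaded model. All names below are illustrative, not the real
driver API, and the pointer swap that claim_req() does happens under
target->lock in the actual driver; the point is only the ordering:
req->scmnd is published *after* the send is posted, so a terminate or
completion path can never claim a command whose iu srp_post_send() is
still reading.]

```c
#include <stddef.h>

struct scmnd { int result; };

struct req {
	struct scmnd *scmnd;	/* NULL until the request is "live" */
};

/* Mirrors srp_claim_req(): take ownership of req->scmnd, if any
 * (under target->lock in the real driver). */
static struct scmnd *claim_req(struct req *req)
{
	struct scmnd *c = req->scmnd;

	req->scmnd = NULL;	/* at most one caller wins */
	return c;
}

/* Mirrors the proposed srp_queuecommand() ordering: post the send
 * first, publish ownership last. */
static int queuecommand(struct req *req, struct scmnd *c, int post_ok)
{
	if (!post_ok)
		return -1;	/* the iu never became claimable */
	req->scmnd = c;		/* only now can terminate_io() see it */
	return 0;
}
```

With this ordering a failed post leaves req->scmnd NULL, so nothing is
ever claimable while the send is still in flight.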


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                             ` <531F1DBD.9070505-HInyCGIudOg@public.gmane.org>
@ 2014-03-11 14:48                                               ` Sagi Grimberg
       [not found]                                                 ` <531F2221.2090403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-11 14:48 UTC (permalink / raw)
  To: Bart Van Assche, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/11/2014 4:29 PM, Bart Van Assche wrote:
> On 03/11/14 15:07, Sagi Grimberg wrote:
>> On 3/11/2014 3:51 PM, Bart Van Assche wrote:
>>> On 03/11/14 14:38, Sagi Grimberg wrote:
>>>> I see... will assigning scmnd = NULL only if srp_post_send will fail,
>>>> and restore the assignment as before help?
>>> Sorry but I'm not enthusiast about that approach because it could result
>>> in resources being freed while srp_post_send() is in progress. I prefer
>>> to avoid this kind of race conditions.
>> Are you referring to srp_create_target_ib racing with srp_post_send? if
>> not, what resources are you referring to?
> I was referring to a potential race between srp_terminate_io() and
> srp_post_send(). The function srp_claim_req() can grab a SCSI command as
> soon as req->scmnd has been set so srp_terminate_io() can also get
> invoked as soon as req->scmnd has been set. I'd like to avoid that
> srp_post_send() reads information from a struct srp_iu that has already
> been freed and that maybe has already been reused for another request.
> Hence my preference to set req->scmnd after srp_post_send() since that
> avoids triggering a race condition between srp_post_send() and
> srp_terminate_io().

I see your point, but I think this lock can be taken conditionally. In
the sunny-day scenario we don't expect srp_terminate_io() to race with
srp_post_send(). That can happen only when the rport state transitions
to FAIL_FAST, right? And also in some strange case where the rport is
in state RUNNING but the sdev is offline and queuecommand is not even
invoked. Do you think that acquiring target->lock only when the rport
state is FAIL_FAST suffices?

Sagi.

* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                                 ` <531F2221.2090403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-11 15:00                                                   ` Bart Van Assche
       [not found]                                                     ` <531F2501.9090005-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-11 15:00 UTC (permalink / raw)
  To: Sagi Grimberg, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/11/14 15:48, Sagi Grimberg wrote:
> But I think this lock can be taken conditionally. I mean in the
> sunny-day scenario we don't expect
> srp_terminate_io to race with srp_post_send. That can happen only when
> rport state transitions
> to FAIL_FAST right? and also in some strange case when the rport is in
> state RUNNING but the
> sdev is offline and queuecommand is not even invoked. Do you think that
> acquiring the target->lock
> in case rport state is FAIL_FAST suffices?

That would be tricky to implement since the rport state can change after
having checked it and while srp_post_send() is in progress.

Bart.
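[Editor's note: Bart's objection is the classic check-then-act
(TOCTOU) hazard. A minimal sketch with illustrative types, not the
real transport code; the "concurrent" state change is simulated by a
callback invoked between the check and the act.]

```c
/* Simplified stand-ins for the SRP transport rport state machine. */
enum rport_state { RPORT_RUNNING, RPORT_BLOCKED, RPORT_FAIL_FAST };

struct rport { enum rport_state state; };

/* Returns 1 when the no-lock decision was wrong in hindsight, i.e.
 * the rport reached FAIL_FAST after the check but before the act. */
static int decision_went_stale(struct rport *rp,
			       void (*racing_transition)(struct rport *))
{
	int need_lock = (rp->state == RPORT_FAIL_FAST);	/* check */

	racing_transition(rp);	/* state changes "concurrently" */

	/* act: the earlier check may no longer reflect reality */
	return !need_lock && rp->state == RPORT_FAIL_FAST;
}

static void go_fail_fast(struct rport *rp) { rp->state = RPORT_FAIL_FAST; }
static void no_transition(struct rport *rp) { (void)rp; }
```

Unless something else serializes the transition against the send path,
the stale decision above is exactly the window Bart describes.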


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                                     ` <531F2501.9090005-HInyCGIudOg@public.gmane.org>
@ 2014-03-11 15:30                                                       ` Sagi Grimberg
       [not found]                                                         ` <531F2C2C.7080306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-11 15:30 UTC (permalink / raw)
  To: Bart Van Assche, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/11/2014 5:00 PM, Bart Van Assche wrote:
> On 03/11/14 15:48, Sagi Grimberg wrote:
>> But I think this lock can be taken conditionally. I mean in the
>> sunny-day scenario we don't expect
>> srp_terminate_io to race with srp_post_send. That can happen only when
>> rport state transitions
>> to FAIL_FAST right? and also in some strange case when the rport is in
>> state RUNNING but the
>> sdev is offline and queuecommand is not even invoked. Do you think that
>> acquiring the target->lock
>> in case rport state is FAIL_FAST suffices?
> That would be tricky to implement since the rport state can change after
> having checked it and while srp_post_send() is in progress.

Yes, I agree. Just one more comment before I give up on this.

State FAIL_FAST must come *after* state BLOCKED. Do you think that
taking the lock once the rport transitions to state BLOCKED suffices?
I'm aiming to avoid this lock in the sunny-day flow. Always taking
this lock to protect against an error flow that might occur feels
somewhat wrong to me.

Sagi.


* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                                         ` <531F2C2C.7080306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-03-12 13:16                                                           ` Bart Van Assche
       [not found]                                                             ` <53205E1E.3070304-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2014-03-12 13:16 UTC (permalink / raw)
  To: Sagi Grimberg, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/11/14 16:30, Sagi Grimberg wrote:
> State FAIL_FAST must come *after* state BLOCKED. Do you think that
> taking the lock
> once the rport transitions to state BLOCKED suffices? I'm aiming to
> avoid this lock in
> the sunny-day flow. Taking this lock always to protect against some
> error flow
> that might occur feels somewhat wrong to me.

Hello Sagi,

I agree that today the SRP initiator only invokes srp_terminate_io()
after having quiesced I/O first, so from that point of view it is not
necessary to add more locking in srp_queuecommand(). However, since
this is nontrivial, I'd like to trigger a kernel warning if
srp_terminate_io() is ever invoked concurrently with
srp_queuecommand(). Additionally, I think the code in
srp_reset_device() can trigger a race with the I/O completion path.
How about addressing all this with the patch below?

Thanks,

Bart.

[PATCH] IB/srp: Fix a race condition between failing I/O and I/O completion

Avoid srp_terminate_io() accessing req->scmnd after it has been
cleared by the I/O completion code. Do this by protecting req->scmnd
accesses from srp_terminate_io() with locking.

Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index a64e469..66a908b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -783,6 +783,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
  * srp_claim_req - Take ownership of the scmnd associated with a request.
  * @target: SRP target port.
  * @req: SRP request.
+ * @sdev: If not NULL, only take ownership for this SCSI device.
  * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
  *         ownership of @req->scmnd if it equals @scmnd.
  *
@@ -791,16 +792,17 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
  */
 static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
 				       struct srp_request *req,
+				       struct scsi_device *sdev,
 				       struct scsi_cmnd *scmnd)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&target->lock, flags);
-	if (!scmnd) {
+	if (req->scmnd &&
+	    (!sdev || req->scmnd->device == sdev) &&
+	    (!scmnd || req->scmnd == scmnd)) {
 		scmnd = req->scmnd;
 		req->scmnd = NULL;
-	} else if (req->scmnd == scmnd) {
-		req->scmnd = NULL;
 	} else {
 		scmnd = NULL;
 	}
@@ -827,9 +829,10 @@ static void srp_free_req(struct srp_target_port *target,
 }
 
 static void srp_finish_req(struct srp_target_port *target,
-			   struct srp_request *req, int result)
+			   struct srp_request *req, struct scsi_device *sdev,
+			   int result)
 {
-	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
+	struct scsi_cmnd *scmnd = srp_claim_req(target, req, sdev, NULL);
 
 	if (scmnd) {
 		srp_free_req(target, req, scmnd, 0);
@@ -841,11 +844,20 @@ static void srp_finish_req(struct srp_target_port *target,
 static void srp_terminate_io(struct srp_rport *rport)
 {
 	struct srp_target_port *target = rport->lld_data;
+	struct Scsi_Host *shost = target->scsi_host;
+	struct scsi_device *sdev;
 	int i;
 
+	/*
+	 * Invoking srp_terminate_io() while srp_queuecommand() is running
+	 * is not safe. Hence the warning statement below.
+	 */
+	shost_for_each_device(sdev, shost)
+		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
+
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
+		srp_finish_req(target, req, NULL, DID_TRANSPORT_FAILFAST << 16);
 	}
 }
 
@@ -882,7 +894,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		srp_finish_req(target, req, DID_RESET << 16);
+		srp_finish_req(target, req, NULL, DID_RESET << 16);
 	}
 
 	INIT_LIST_HEAD(&target->free_tx);
@@ -1290,7 +1302,7 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		complete(&target->tsk_mgmt_done);
 	} else {
 		req = &target->req_ring[rsp->tag];
-		scmnd = srp_claim_req(target, req, NULL);
+		scmnd = srp_claim_req(target, req, NULL, NULL);
 		if (!scmnd) {
 			shost_printk(KERN_ERR, target->scsi_host,
 				     "Null scmnd for RSP w/tag %016llx\n",
@@ -2008,7 +2020,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (!req || !srp_claim_req(target, req, scmnd))
+	if (!req || !srp_claim_req(target, req, NULL, scmnd))
 		return SUCCESS;
 	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
 			      SRP_TSK_ABORT_TASK) == 0)
@@ -2039,8 +2051,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		struct srp_request *req = &target->req_ring[i];
-		if (req->scmnd && req->scmnd->device == scmnd->device)
-			srp_finish_req(target, req, DID_RESET << 16);
+		srp_finish_req(target, req, scmnd->device, DID_RESET << 16);
 	}
 
 	return SUCCESS;
-- 
1.8.4.5
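[Editor's note: for readers skimming the diff, the combined condition
in the reworked srp_claim_req() can be read as a plain predicate. A
minimal sketch with simplified stand-in types, not the real SCSI
structures: claim req->scmnd only if it is set, matches the optional
@sdev filter, and matches the optional @scmnd filter.]

```c
#include <stddef.h>

/* Simplified stand-ins for the real SCSI midlayer structures. */
struct scsi_device { int id; };
struct scsi_cmnd { struct scsi_device *device; };

/* The consolidated ownership test from the patch: one condition now
 * covers all four callers instead of the old two-branch if/else. */
static int may_claim(const struct scsi_cmnd *req_scmnd,
		     const struct scsi_device *sdev,
		     const struct scsi_cmnd *scmnd)
{
	return req_scmnd != NULL &&
	       (sdev == NULL || req_scmnd->device == sdev) &&
	       (scmnd == NULL || req_scmnd == scmnd);
}
```

Each caller maps to one filter combination: srp_terminate_io() and
srp_rport_reconnect() pass (NULL, NULL), srp_reset_device() passes
(scmnd->device, NULL), and srp_abort() passes (NULL, scmnd).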



* Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop
       [not found]                                                             ` <53205E1E.3070304-HInyCGIudOg@public.gmane.org>
@ 2014-03-13  7:42                                                               ` Sagi Grimberg
  0 siblings, 0 replies; 24+ messages in thread
From: Sagi Grimberg @ 2014-03-13  7:42 UTC (permalink / raw)
  To: Bart Van Assche, sagi grimberg
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, oren-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/12/2014 3:16 PM, Bart Van Assche wrote:
> On 03/11/14 16:30, Sagi Grimberg wrote:
>> State FAIL_FAST must come *after* state BLOCKED. Do you think that
>> taking the lock
>> once the rport transitions to state BLOCKED suffices? I'm aiming to
>> avoid this lock in
>> the sunny-day flow. Taking this lock always to protect against some
>> error flow
>> that might occur feels somewhat wrong to me.
> Hello Sagi,
>
> I agree that today the SRP initiator only invokes srp_terminate_io()
> after having quiesced I/O first so from that point of view it is not
> necessary to add more locking in srp_queuecommand(). However, since
> this is nontrivial I'd like to trigger a kernel warning if
> srp_terminate_io() is ever invoked concurrently with
> srp_queuecommand(). Additionally, I think the code in
> srp_reset_device() can trigger a race with the I/O completion path. How
> about addressing all this with the patch below ?
>
> Thanks,
>
> Bart.
>
> [PATCH] IB/srp: Fix a race condition between failing I/O and I/O completion
>
> Avoid that srp_terminate_io() can access req->scmnd after it has been
> cleared by the I/O completion code. Do this by protecting req->scmnd
> accesses from srp_terminate_io() via locking
>
> Signed-off-by: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 33 ++++++++++++++++++++++-----------
>   1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index a64e469..66a908b 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -783,6 +783,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>    * srp_claim_req - Take ownership of the scmnd associated with a request.
>    * @target: SRP target port.
>    * @req: SRP request.
> + * @sdev: If not NULL, only take ownership for this SCSI device.
>    * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
>    *         ownership of @req->scmnd if it equals @scmnd.
>    *
> @@ -791,16 +792,17 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
>    */
>   static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
>   				       struct srp_request *req,
> +				       struct scsi_device *sdev,
>   				       struct scsi_cmnd *scmnd)
>   {
>   	unsigned long flags;
>   
>   	spin_lock_irqsave(&target->lock, flags);
> -	if (!scmnd) {
> +	if (req->scmnd &&
> +	    (!sdev || req->scmnd->device == sdev) &&
> +	    (!scmnd || req->scmnd == scmnd)) {
>   		scmnd = req->scmnd;
>   		req->scmnd = NULL;
> -	} else if (req->scmnd == scmnd) {
> -		req->scmnd = NULL;
>   	} else {
>   		scmnd = NULL;
>   	}
> @@ -827,9 +829,10 @@ static void srp_free_req(struct srp_target_port *target,
>   }
>   
>   static void srp_finish_req(struct srp_target_port *target,
> -			   struct srp_request *req, int result)
> +			   struct srp_request *req, struct scsi_device *sdev,
> +			   int result)
>   {
> -	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
> +	struct scsi_cmnd *scmnd = srp_claim_req(target, req, sdev, NULL);
>   
>   	if (scmnd) {
>   		srp_free_req(target, req, scmnd, 0);
> @@ -841,11 +844,20 @@ static void srp_finish_req(struct srp_target_port *target,
>   static void srp_terminate_io(struct srp_rport *rport)
>   {
>   	struct srp_target_port *target = rport->lld_data;
> +	struct Scsi_Host *shost = target->scsi_host;
> +	struct scsi_device *sdev;
>   	int i;
>   
> +	/*
> +	 * Invoking srp_terminate_io() while srp_queuecommand() is running
> +	 * is not safe. Hence the warning statement below.
> +	 */
> +	shost_for_each_device(sdev, shost)
> +		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> +
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		srp_finish_req(target, req, DID_TRANSPORT_FAILFAST << 16);
> +		srp_finish_req(target, req, NULL, DID_TRANSPORT_FAILFAST << 16);
>   	}
>   }
>   
> @@ -882,7 +894,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		srp_finish_req(target, req, DID_RESET << 16);
> +		srp_finish_req(target, req, NULL, DID_RESET << 16);
>   	}
>   
>   	INIT_LIST_HEAD(&target->free_tx);
> @@ -1290,7 +1302,7 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
>   		complete(&target->tsk_mgmt_done);
>   	} else {
>   		req = &target->req_ring[rsp->tag];
> -		scmnd = srp_claim_req(target, req, NULL);
> +		scmnd = srp_claim_req(target, req, NULL, NULL);
>   		if (!scmnd) {
>   			shost_printk(KERN_ERR, target->scsi_host,
>   				     "Null scmnd for RSP w/tag %016llx\n",
> @@ -2008,7 +2020,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
>   
>   	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
>   
> -	if (!req || !srp_claim_req(target, req, scmnd))
> +	if (!req || !srp_claim_req(target, req, NULL, scmnd))
>   		return SUCCESS;
>   	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
>   			      SRP_TSK_ABORT_TASK) == 0)
> @@ -2039,8 +2051,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
>   
>   	for (i = 0; i < target->req_ring_size; ++i) {
>   		struct srp_request *req = &target->req_ring[i];
> -		if (req->scmnd && req->scmnd->device == scmnd->device)
> -			srp_finish_req(target, req, DID_RESET << 16);
> +		srp_finish_req(target, req, scmnd->device, DID_RESET << 16);
>   	}
>   
>   	return SUCCESS;

Acked-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

end of thread, other threads:[~2014-03-13  7:42 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-24 14:30 [PATCH v1 0/3] SRP initiator fixes for kernel 3.15 Sagi Grimberg
     [not found] ` <1393252218-30638-1-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-02-24 14:30   ` [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop Sagi Grimberg
     [not found]     ` <1393252218-30638-2-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-02-24 15:11       ` Sebastian Riemer
2014-02-24 15:24       ` Bart Van Assche
     [not found]         ` <530B6444.1000805-HInyCGIudOg@public.gmane.org>
2014-02-27 11:32           ` Sagi Grimberg
     [not found]             ` <530F225E.5070500-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-06  9:10               ` Bart Van Assche
     [not found]                 ` <53183B71.4090609-HInyCGIudOg@public.gmane.org>
2014-03-06 15:32                   ` sagi grimberg
     [not found]                     ` <53189526.2010704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-03-06 16:10                       ` Sagi Grimberg
     [not found]                         ` <53189DDE.7090003-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-07  8:13                           ` Bart Van Assche
     [not found]                             ` <53197FC5.4090102-HInyCGIudOg@public.gmane.org>
2014-03-11 13:38                               ` Sagi Grimberg
     [not found]                                 ` <531F11B8.5030202-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-11 13:51                                   ` Bart Van Assche
     [not found]                                     ` <531F14D0.7010608-HInyCGIudOg@public.gmane.org>
2014-03-11 14:07                                       ` Sagi Grimberg
     [not found]                                         ` <531F18A7.1000907-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-11 14:29                                           ` Bart Van Assche
     [not found]                                             ` <531F1DBD.9070505-HInyCGIudOg@public.gmane.org>
2014-03-11 14:48                                               ` Sagi Grimberg
     [not found]                                                 ` <531F2221.2090403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-11 15:00                                                   ` Bart Van Assche
     [not found]                                                     ` <531F2501.9090005-HInyCGIudOg@public.gmane.org>
2014-03-11 15:30                                                       ` Sagi Grimberg
     [not found]                                                         ` <531F2C2C.7080306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-03-12 13:16                                                           ` Bart Van Assche
     [not found]                                                             ` <53205E1E.3070304-HInyCGIudOg@public.gmane.org>
2014-03-13  7:42                                                               ` Sagi Grimberg
2014-02-24 14:30   ` [PATCH v1 2/3] IB/srp: Check ib_query_gid return value Sagi Grimberg
     [not found]     ` <1393252218-30638-3-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-02-24 15:34       ` Bart Van Assche
2014-02-24 14:30   ` [PATCH v1 3/3] IB/srp: Protect free_tx iu list from concurrent flows Sagi Grimberg
     [not found]     ` <1393252218-30638-4-git-send-email-sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-02-24 15:38       ` Bart Van Assche
     [not found]         ` <530B677A.1040508-HInyCGIudOg@public.gmane.org>
2014-02-27 11:51           ` Sagi Grimberg
     [not found]             ` <530F26CE.5070404-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-02-27 14:22               ` Bart Van Assche
