* [PATCH 0/3] nvmet/fcloop fixes
@ 2020-03-06 13:04 Hannes Reinecke
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-06 13:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chaitanya Kulkarni, Keith Busch, James Smart, linux-nvme,
	Hannes Reinecke, Sagi Grimberg

Hi James,

here are some fixes to be applied on top of your recent patchset
for nvme-fc (or even to be included with it). With these fixes my fcloop
blktest runs without issues and I can even unload the fcloop
module afterwards.

As usual, comments and reviews are welcome.

Hannes Reinecke (3):
  nvmet/fc: Sanitize tgtport reference counting for LS requests
  nvme/fcloop: short-circuit LS requests if no association is present
  nvme/fcloop: flush workqueue before calling
    nvme_fc_unregister_remoteport()

 drivers/nvme/target/fc.c     | 23 ++++++++++++-----------
 drivers/nvme/target/fcloop.c | 38 ++++++++++----------------------------
 2 files changed, 22 insertions(+), 39 deletions(-)

-- 
2.16.4


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests
  2020-03-06 13:04 [PATCH 0/3] nvmet/fcloop fixes Hannes Reinecke
@ 2020-03-06 13:04 ` Hannes Reinecke
  2020-03-06 22:11   ` Sagi Grimberg
                     ` (2 more replies)
  2020-03-06 13:04 ` [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present Hannes Reinecke
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-06 13:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chaitanya Kulkarni, Keith Busch, James Smart, linux-nvme,
	Hannes Reinecke, Sagi Grimberg

We need to validate the targetport upon first access; otherwise
we risk accessing an invalid targetport.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/fc.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 0a0f03b2faf3..aff959300ef3 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -358,6 +358,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
 
 	if (!lsop->req_queued) {
 		spin_unlock_irqrestore(&tgtport->lock, flags);
+		nvmet_fc_tgtport_put(tgtport);
 		return;
 	}
 
@@ -386,9 +387,6 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
 	if (!tgtport->ops->ls_req)
 		return -EOPNOTSUPP;
 
-	if (!nvmet_fc_tgtport_get(tgtport))
-		return -ESHUTDOWN;
-
 	lsreq->done = done;
 	lsop->req_queued = false;
 	INIT_LIST_HEAD(&lsop->lsreq_list);
@@ -396,10 +394,9 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
 	lsreq->rqstdma = fc_dma_map_single(tgtport->dev, lsreq->rqstaddr,
 				  lsreq->rqstlen + lsreq->rsplen,
 				  DMA_BIDIRECTIONAL);
-	if (fc_dma_mapping_error(tgtport->dev, lsreq->rqstdma)) {
-		ret = -EFAULT;
-		goto out_puttgtport;
-	}
+	if (fc_dma_mapping_error(tgtport->dev, lsreq->rqstdma))
+		return -EFAULT;
+
 	lsreq->rspdma = lsreq->rqstdma + lsreq->rqstlen;
 
 	spin_lock_irqsave(&tgtport->lock, flags);
@@ -426,9 +423,6 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
 	fc_dma_unmap_single(tgtport->dev, lsreq->rqstdma,
 				  (lsreq->rqstlen + lsreq->rsplen),
 				  DMA_BIDIRECTIONAL);
-out_puttgtport:
-	nvmet_fc_tgtport_put(tgtport);
-
 	return ret;
 }
 
@@ -482,14 +476,19 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
 	struct nvmefc_ls_req *lsreq;
 	int ret;
 
+	if (!nvmet_fc_tgtport_get(tgtport))
+		return;
+
 	/*
 	 * If ls_req is NULL or no hosthandle, it's an older lldd and no
 	 * message is normal. Otherwise, send unless the hostport has
 	 * already been invalidated by the lldd.
 	 */
 	if (!tgtport->ops->ls_req || !assoc->hostport ||
-	    assoc->hostport->invalid)
+	    assoc->hostport->invalid) {
+		nvmet_fc_tgtport_put(tgtport);
 		return;
+	}
 
 	lsop = kzalloc((sizeof(*lsop) +
 			sizeof(*discon_rqst) + sizeof(*discon_acc) +
@@ -498,6 +497,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
 		dev_info(tgtport->dev,
 			"{%d:%d} send Disconnect Association failed: ENOMEM\n",
 			tgtport->fc_target_port.port_num, assoc->a_id);
+		nvmet_fc_tgtport_put(tgtport);
 		return;
 	}
 
@@ -522,6 +522,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
 			"{%d:%d} XMT Disconnect Association failed: %d\n",
 			tgtport->fc_target_port.port_num, assoc->a_id, ret);
 		kfree(lsop);
+		nvmet_fc_tgtport_put(tgtport);
 	}
 }
 
-- 
2.16.4
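The rule the patch enforces, taking the tgtport reference once before issuing an LS request and dropping it on every exit path (early error returns included), can be sketched in userspace C. The names below are hypothetical, single-threaded stand-ins for the kernel's kref-based nvmet_fc_tgtport_get()/nvmet_fc_tgtport_put(), not the real API:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tgtport; the real code uses a kref. */
struct port {
    int ref;            /* single-threaded stand-in for a kref */
    bool dead;          /* set once the port is shutting down */
};

static bool port_get(struct port *p)
{
    if (p->dead)
        return false;   /* refuse new references during teardown */
    p->ref++;
    return true;
}

static void port_put(struct port *p)
{
    p->ref--;
}

/* Take one reference up front; every exit path drops it exactly once. */
static int send_ls_req(struct port *p, bool dma_map_fails)
{
    if (!port_get(p))
        return -ESHUTDOWN;      /* port already going away */
    if (dma_map_fails) {
        port_put(p);            /* error path must drop the ref */
        return -EFAULT;
    }
    /* ...request sent; the completion handler drops the reference. */
    port_put(p);                /* stand-in for that completion */
    return 0;
}
```

Whether the reference ends in balance is easy to check by asserting ref == 0 after both the success and the error path.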



* [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present
  2020-03-06 13:04 [PATCH 0/3] nvmet/fcloop fixes Hannes Reinecke
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
@ 2020-03-06 13:04 ` Hannes Reinecke
  2020-03-06 22:12   ` Sagi Grimberg
  2020-03-07  1:11   ` James Smart
  2020-03-06 13:04 ` [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport() Hannes Reinecke
  2020-03-31 14:24 ` [PATCH 0/3] nvmet/fcloop fixes Christoph Hellwig
  3 siblings, 2 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-06 13:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chaitanya Kulkarni, Keith Busch, James Smart, linux-nvme,
	Hannes Reinecke, Sagi Grimberg

If no association or targetport is present, shifting LS processing
to a workqueue is a bad idea: the targetport itself is about to be
removed, and we would deadlock with targetport deletion. So instead
return an error directly and treat it as a send error, avoiding the
deadlock.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/fcloop.c | 36 +++++++++---------------------------
 1 file changed, 9 insertions(+), 27 deletions(-)

diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index 3610b0bd12da..4f9435d9fa61 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -332,26 +332,17 @@ fcloop_h2t_ls_req(struct nvme_fc_local_port *localport,
 {
 	struct fcloop_lsreq *tls_req = lsreq->private;
 	struct fcloop_rport *rport = remoteport->private;
-	int ret = 0;
+
+	if (!rport->targetport)
+		return  -ECONNREFUSED;
 
 	tls_req->lsreq = lsreq;
 	INIT_LIST_HEAD(&tls_req->ls_list);
 
-	if (!rport->targetport) {
-		tls_req->status = -ECONNREFUSED;
-		spin_lock(&rport->lock);
-		list_add_tail(&rport->ls_list, &tls_req->ls_list);
-		spin_unlock(&rport->lock);
-		schedule_work(&rport->ls_work);
-		return ret;
-	}
-
 	tls_req->status = 0;
-	ret = nvmet_fc_rcv_ls_req(rport->targetport, rport,
+	return nvmet_fc_rcv_ls_req(rport->targetport, rport,
 				  &tls_req->ls_rsp,
 				  lsreq->rqstaddr, lsreq->rqstlen);
-
-	return ret;
 }
 
 static int
@@ -415,7 +406,9 @@ fcloop_t2h_ls_req(struct nvmet_fc_target_port *targetport, void *hosthandle,
 {
 	struct fcloop_lsreq *tls_req = lsreq->private;
 	struct fcloop_tport *tport = targetport->private;
-	int ret = 0;
+
+	if (!tport->remoteport)
+		return -ECONNREFUSED;
 
 	/*
 	 * hosthandle should be the dst.rport value.
@@ -425,20 +418,9 @@ fcloop_t2h_ls_req(struct nvmet_fc_target_port *targetport, void *hosthandle,
 	tls_req->lsreq = lsreq;
 	INIT_LIST_HEAD(&tls_req->ls_list);
 
-	if (!tport->remoteport) {
-		tls_req->status = -ECONNREFUSED;
-		spin_lock(&tport->lock);
-		list_add_tail(&tport->ls_list, &tls_req->ls_list);
-		spin_unlock(&tport->lock);
-		schedule_work(&tport->ls_work);
-		return ret;
-	}
-
 	tls_req->status = 0;
-	ret = nvme_fc_rcv_ls_req(tport->remoteport, &tls_req->ls_rsp,
-				 lsreq->rqstaddr, lsreq->rqstlen);
-
-	return ret;
+	return nvme_fc_rcv_ls_req(tport->remoteport, &tls_req->ls_rsp,
+				  lsreq->rqstaddr, lsreq->rqstlen);
 }
 
 static int
-- 
2.16.4
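The short-circuit above can be sketched in userspace C with hypothetical names (not the kernel code): when the peer targetport is already gone there is nothing safe to defer work against, so the LS request fails synchronously instead of being completed with -ECONNREFUSED later from a work item.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the fcloop remote port. */
struct rport {
    void *targetport;   /* NULL once the peer port is removed */
};

static int rcv_ls_req(void *targetport)
{
    (void)targetport;   /* stand-in for nvmet_fc_rcv_ls_req() */
    return 0;
}

static int h2t_ls_req(struct rport *rport)
{
    if (!rport->targetport)
        return -ECONNREFUSED;   /* short-circuit: no deferred work */
    return rcv_ls_req(rport->targetport);
}
```

The caller sees the failure in its own context, so no work item can race with teardown of the port it references.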



* [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport()
  2020-03-06 13:04 [PATCH 0/3] nvmet/fcloop fixes Hannes Reinecke
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
  2020-03-06 13:04 ` [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present Hannes Reinecke
@ 2020-03-06 13:04 ` Hannes Reinecke
  2020-03-06 22:12   ` Sagi Grimberg
  2020-03-07  1:12   ` James Smart
  2020-03-31 14:24 ` [PATCH 0/3] nvmet/fcloop fixes Christoph Hellwig
  3 siblings, 2 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-06 13:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chaitanya Kulkarni, Keith Busch, James Smart, linux-nvme,
	Hannes Reinecke, Sagi Grimberg

nvme_fc_unregister_remoteport() will be sending LS requests, which then
would end up on a workqueue for processing. This will deadlock with
fcloop_remoteport_delete() which would try to flush the very same queue.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/target/fcloop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index 4f9435d9fa61..0209da9bcb67 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -943,7 +943,6 @@ fcloop_remoteport_delete(struct nvme_fc_remote_port *remoteport)
 {
 	struct fcloop_rport *rport = remoteport->private;
 
-	flush_work(&rport->ls_work);
 	fcloop_nport_put(rport->nport);
 }
 
@@ -1278,6 +1277,7 @@ __remoteport_unreg(struct fcloop_nport *nport, struct fcloop_rport *rport)
 	if (!rport)
 		return -EALREADY;
 
+	flush_work(&rport->ls_work);
 	return nvme_fc_unregister_remoteport(rport->remoteport);
 }
 
-- 
2.16.4
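The deadlock being avoided is flush_work() running from inside the very work item it would flush, which can never make progress. A minimal single-threaded model (hypothetical names, detecting the self-flush instead of hanging) shows why moving the flush from remoteport_delete() to __remoteport_unreg() helps:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool in_ls_work;     /* true while the LS work item runs */

static int flush_ls_work(void)
{
    if (in_ls_work)
        return -EDEADLK;    /* flushing our own work item: would hang */
    return 0;               /* safe: called from process context */
}

/* Old code path: delete runs from the LS work item and flushes there. */
static int old_remoteport_delete(void)
{
    in_ls_work = true;
    int ret = flush_ls_work();
    in_ls_work = false;
    return ret;
}

/* New code path: flush first, from process context, then unregister. */
static int new_remoteport_unreg(void)
{
    int ret = flush_ls_work();
    /* ...then nvme_fc_unregister_remoteport(rport->remoteport)... */
    return ret;
}
```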



* Re: [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
@ 2020-03-06 22:11   ` Sagi Grimberg
  2020-03-07  1:00   ` James Smart
  2020-03-10 16:52   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Sagi Grimberg @ 2020-03-06 22:11 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, James Smart, Chaitanya Kulkarni

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present
  2020-03-06 13:04 ` [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present Hannes Reinecke
@ 2020-03-06 22:12   ` Sagi Grimberg
  2020-03-07  1:11   ` James Smart
  1 sibling, 0 replies; 15+ messages in thread
From: Sagi Grimberg @ 2020-03-06 22:12 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, James Smart, Chaitanya Kulkarni

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport()
  2020-03-06 13:04 ` [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport() Hannes Reinecke
@ 2020-03-06 22:12   ` Sagi Grimberg
  2020-03-07  1:12   ` James Smart
  1 sibling, 0 replies; 15+ messages in thread
From: Sagi Grimberg @ 2020-03-06 22:12 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, James Smart, Chaitanya Kulkarni

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
  2020-03-06 22:11   ` Sagi Grimberg
@ 2020-03-07  1:00   ` James Smart
  2020-03-10 16:52   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: James Smart @ 2020-03-07  1:00 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Sagi Grimberg, Chaitanya Kulkarni



On 3/6/2020 5:04 AM, Hannes Reinecke wrote:
> We need to validate the targetport upon first access; otherwise
> we risk accessing an invalid targetport.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/target/fc.c | 23 ++++++++++++-----------
>   1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
> index 0a0f03b2faf3..aff959300ef3 100644
> --- a/drivers/nvme/target/fc.c
> +++ b/drivers/nvme/target/fc.c
> @@ -358,6 +358,7 @@ __nvmet_fc_finish_ls_req(struct nvmet_fc_ls_req_op *lsop)
>   
>   	if (!lsop->req_queued) {
>   		spin_unlock_irqrestore(&tgtport->lock, flags);
> +		nvmet_fc_tgtport_put(tgtport);
>   		return;
>   	}
>   
> @@ -386,9 +387,6 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
>   	if (!tgtport->ops->ls_req)
>   		return -EOPNOTSUPP;
>   
> -	if (!nvmet_fc_tgtport_get(tgtport))
> -		return -ESHUTDOWN;
> -
>   	lsreq->done = done;
>   	lsop->req_queued = false;
>   	INIT_LIST_HEAD(&lsop->lsreq_list);
> @@ -396,10 +394,9 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
>   	lsreq->rqstdma = fc_dma_map_single(tgtport->dev, lsreq->rqstaddr,
>   				  lsreq->rqstlen + lsreq->rsplen,
>   				  DMA_BIDIRECTIONAL);
> -	if (fc_dma_mapping_error(tgtport->dev, lsreq->rqstdma)) {
> -		ret = -EFAULT;
> -		goto out_puttgtport;
> -	}
> +	if (fc_dma_mapping_error(tgtport->dev, lsreq->rqstdma))
> +		return -EFAULT;
> +
>   	lsreq->rspdma = lsreq->rqstdma + lsreq->rqstlen;
>   
>   	spin_lock_irqsave(&tgtport->lock, flags);
> @@ -426,9 +423,6 @@ __nvmet_fc_send_ls_req(struct nvmet_fc_tgtport *tgtport,
>   	fc_dma_unmap_single(tgtport->dev, lsreq->rqstdma,
>   				  (lsreq->rqstlen + lsreq->rsplen),
>   				  DMA_BIDIRECTIONAL);
> -out_puttgtport:
> -	nvmet_fc_tgtport_put(tgtport);
> -
>   	return ret;
>   }
>   
> @@ -482,14 +476,19 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>   	struct nvmefc_ls_req *lsreq;
>   	int ret;
>   
> +	if (!nvmet_fc_tgtport_get(tgtport))
> +		return;
> +
>   	/*
>   	 * If ls_req is NULL or no hosthandle, it's an older lldd and no
>   	 * message is normal. Otherwise, send unless the hostport has
>   	 * already been invalidated by the lldd.
>   	 */
>   	if (!tgtport->ops->ls_req || !assoc->hostport ||
> -	    assoc->hostport->invalid)
> +	    assoc->hostport->invalid) {
> +		nvmet_fc_tgtport_put(tgtport);
>   		return;
> +	}
>   
>   	lsop = kzalloc((sizeof(*lsop) +
>   			sizeof(*discon_rqst) + sizeof(*discon_acc) +
> @@ -498,6 +497,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>   		dev_info(tgtport->dev,
>   			"{%d:%d} send Disconnect Association failed: ENOMEM\n",
>   			tgtport->fc_target_port.port_num, assoc->a_id);
> +		nvmet_fc_tgtport_put(tgtport);
>   		return;
>   	}
>   
> @@ -522,6 +522,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>   			"{%d:%d} XMT Disconnect Association failed: %d\n",
>   			tgtport->fc_target_port.port_num, assoc->a_id, ret);
>   		kfree(lsop);
> +		nvmet_fc_tgtport_put(tgtport);
>   	}
>   }
>   

I guess I don't see anything wrong with this, but it seems a messier
way to handle the ref counting. I'd prefer the reference to be taken
on send and dropped on completion, localized to the common op
processing. It also creates a different implementation than what was
done on the host side, even though the rest is very similar.

Reviewed-by: James Smart <james.smart@broadcom.com>


* Re: [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present
  2020-03-06 13:04 ` [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present Hannes Reinecke
  2020-03-06 22:12   ` Sagi Grimberg
@ 2020-03-07  1:11   ` James Smart
  2020-03-07  8:39     ` Hannes Reinecke
  1 sibling, 1 reply; 15+ messages in thread
From: James Smart @ 2020-03-07  1:11 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Sagi Grimberg, Chaitanya Kulkarni

On 3/6/2020 5:04 AM, Hannes Reinecke wrote:
> If no association or targetport is present, shifting LS processing
> to a workqueue is a bad idea: the targetport itself is about to be
> removed, and we would deadlock with targetport deletion. So instead
> return an error directly and treat it as a send error, avoiding the
> deadlock.
So I don't quite agree on the targetport statement, especially relative 
to both the h2t and t2h paths.

Returning an error certainly works, but it avoids the asynchronous
completion path, which is far more common; it only exercises the
immediate-return-error case. When throwing it to a work queue, it'll
be a work queue specific to the port initiating the request, so it
shouldn't be touching the other side, which is non-existent.

So what you have is fine, but I'm not sure it's what should be here.

-- james




* Re: [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport()
  2020-03-06 13:04 ` [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport() Hannes Reinecke
  2020-03-06 22:12   ` Sagi Grimberg
@ 2020-03-07  1:12   ` James Smart
  2020-03-07  8:41     ` Hannes Reinecke
  1 sibling, 1 reply; 15+ messages in thread
From: James Smart @ 2020-03-07  1:12 UTC (permalink / raw)
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Sagi Grimberg, Chaitanya Kulkarni



On 3/6/2020 5:04 AM, Hannes Reinecke wrote:
> nvme_fc_unregister_remoteport() will be sending LS requests, which then
> would end up on a workqueue for processing. This will deadlock with
> fcloop_remoteport_delete() which would try to flush the very same queue.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/nvme/target/fcloop.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
> index 4f9435d9fa61..0209da9bcb67 100644
> --- a/drivers/nvme/target/fcloop.c
> +++ b/drivers/nvme/target/fcloop.c
> @@ -943,7 +943,6 @@ fcloop_remoteport_delete(struct nvme_fc_remote_port *remoteport)
>   {
>   	struct fcloop_rport *rport = remoteport->private;
>   
> -	flush_work(&rport->ls_work);
>   	fcloop_nport_put(rport->nport);
>   }
>   
> @@ -1278,6 +1277,7 @@ __remoteport_unreg(struct fcloop_nport *nport, struct fcloop_rport *rport)
>   	if (!rport)
>   		return -EALREADY;
>   
> +	flush_work(&rport->ls_work);
>   	return nvme_fc_unregister_remoteport(rport->remoteport);
>   }
>   

And this is what we really needed, although there's probably a similar
thing that should be done on the targetport as well.

-- james



* Re: [PATCH 2/3] nvme/fcloop: short-circuit LS requests if no association is present
  2020-03-07  1:11   ` James Smart
@ 2020-03-07  8:39     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-07  8:39 UTC (permalink / raw)
  To: James Smart, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Sagi Grimberg, Chaitanya Kulkarni

On 3/7/20 2:11 AM, James Smart wrote:
> On 3/6/2020 5:04 AM, Hannes Reinecke wrote:
>> If no association or targetport is present, shifting LS processing
>> to a workqueue is a bad idea: the targetport itself is about to be
>> removed, and we would deadlock with targetport deletion. So instead
>> return an error directly and treat it as a send error, avoiding the
>> deadlock.
> So I don't quite agree on the targetport statement, especially relative 
> to both the h2t and t2h paths.
> 
> Returning an error certainly works, but it avoids the asynchronous
> completion path, which is far more common; it only exercises the
> immediate-return-error case. When throwing it to a work queue, it'll
> be a work queue specific to the port initiating the request, so it
> shouldn't be touching the other side, which is non-existent.
> 
> So what you have is fine, but I'm not sure it's what should be here.
> 
Ah, you spotted it :-)

You are correct, it doesn't fix the underlying problem. But it does
allow targetport deletion to finish, so there is _some_ value in it.
But indeed, the real problem is the flush_work() statement in
fcloop_targetport_delete(). Thing is, fcloop_targetport_delete() is
called from the very same workqueue function when processing LS
requests, and those need to be flushed when tearing down the targetport:

[ 1488.211329] Workqueue: events fcloop_tport_lsrqst_work [nvme_fcloop]
[ 1488.211331] Call Trace:
[ 1488.369706]  schedule+0x40/0xb0
[ 1488.369708]  schedule_timeout+0x1dd/0x300
[ 1488.469700]  wait_for_completion+0xba/0x140
[ 1488.469705]  __flush_work+0x177/0x1b0
[ 1488.469708]  ? worker_detach_from_pool+0xa0/0xa0
[ 1488.610280]  fcloop_targetport_delete+0x13/0x20 [nvme_fcloop]
[ 1488.610284]  nvmet_fc_tgtport_put+0x150/0x190 [nvmet_fc]
[ 1488.706952]  nvmet_fc_disconnect_assoc_done+0x9a/0xe0 [nvmet_fc]
[ 1488.706954]  fcloop_tport_lsrqst_work+0x7a/0xa0 [nvme_fcloop]
[ 1488.706957]  process_one_work+0x208/0x400
[ 1488.706959]  worker_thread+0x2d/0x3e0

So that flush needs to be moved somewhere else, but I haven't followed
the flow of control closely enough to figure out _where_ it should go.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport()
  2020-03-07  1:12   ` James Smart
@ 2020-03-07  8:41     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-03-07  8:41 UTC (permalink / raw)
  To: James Smart, Christoph Hellwig
  Cc: Keith Busch, linux-nvme, Sagi Grimberg, Chaitanya Kulkarni

On 3/7/20 2:12 AM, James Smart wrote:
> 
> 
> On 3/6/2020 5:04 AM, Hannes Reinecke wrote:
>> nvme_fc_unregister_remoteport() will be sending LS requests, which then
>> would end up on a workqueue for processing. This will deadlock with
>> fcloop_remoteport_delete() which would try to flush the very same queue.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/nvme/target/fcloop.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
>> index 4f9435d9fa61..0209da9bcb67 100644
>> --- a/drivers/nvme/target/fcloop.c
>> +++ b/drivers/nvme/target/fcloop.c
>> @@ -943,7 +943,6 @@ fcloop_remoteport_delete(struct 
>> nvme_fc_remote_port *remoteport)
>>   {
>>       struct fcloop_rport *rport = remoteport->private;
>> -    flush_work(&rport->ls_work);
>>       fcloop_nport_put(rport->nport);
>>   }
>> @@ -1278,6 +1277,7 @@ __remoteport_unreg(struct fcloop_nport *nport, 
>> struct fcloop_rport *rport)
>>       if (!rport)
>>           return -EALREADY;
>> +    flush_work(&rport->ls_work);
>>       return nvme_fc_unregister_remoteport(rport->remoteport);
>>   }
> 
> And this is what we really needed, although there's probably a similar
> thing that should be done on the targetport as well.
> 
As mentioned in the previous mail: I tried, but that crashed even more
horribly. Might've been too stupid to do it correctly, though.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests
  2020-03-06 13:04 ` [PATCH 1/3] nvmet/fc: Sanitize tgtport reference counting for LS requests Hannes Reinecke
  2020-03-06 22:11   ` Sagi Grimberg
  2020-03-07  1:00   ` James Smart
@ 2020-03-10 16:52   ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-10 16:52 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Sagi Grimberg, Keith Busch, Chaitanya Kulkarni, James Smart,
	linux-nvme, Christoph Hellwig

> @@ -482,14 +476,19 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>  	struct nvmefc_ls_req *lsreq;
>  	int ret;
>  
> +	if (!nvmet_fc_tgtport_get(tgtport))
> +		return;
> +
>  	/*
>  	 * If ls_req is NULL or no hosthandle, it's an older lldd and no
>  	 * message is normal. Otherwise, send unless the hostport has
>  	 * already been invalidated by the lldd.
>  	 */
>  	if (!tgtport->ops->ls_req || !assoc->hostport ||
> -	    assoc->hostport->invalid)
> +	    assoc->hostport->invalid) {
> +		nvmet_fc_tgtport_put(tgtport);
>  		return;
> +	}
>  
>  	lsop = kzalloc((sizeof(*lsop) +
>  			sizeof(*discon_rqst) + sizeof(*discon_acc) +
> @@ -498,6 +497,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>  		dev_info(tgtport->dev,
>  			"{%d:%d} send Disconnect Association failed: ENOMEM\n",
>  			tgtport->fc_target_port.port_num, assoc->a_id);
> +		nvmet_fc_tgtport_put(tgtport);
>  		return;
>  	}
>  
> @@ -522,6 +522,7 @@ nvmet_fc_xmt_disconnect_assoc(struct nvmet_fc_tgt_assoc *assoc)
>  			"{%d:%d} XMT Disconnect Association failed: %d\n",
>  			tgtport->fc_target_port.port_num, assoc->a_id, ret);
>  		kfree(lsop);
> +		nvmet_fc_tgtport_put(tgtport);

Can you consolidate all the calls in this function using goto
unwinding?

Otherwise looks good.


* Re: [PATCH 0/3] nvmet/fcloop fixes
  2020-03-06 13:04 [PATCH 0/3] nvmet/fcloop fixes Hannes Reinecke
                   ` (2 preceding siblings ...)
  2020-03-06 13:04 ` [PATCH 3/3] nvme/fcloop: flush workqueue before calling nvme_fc_unregister_remoteport() Hannes Reinecke
@ 2020-03-31 14:24 ` Christoph Hellwig
  2020-04-16 10:37   ` Hannes Reinecke
  3 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-31 14:24 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Sagi Grimberg, Keith Busch, Chaitanya Kulkarni, James Smart,
	linux-nvme, Christoph Hellwig

Hannes,

are you going to respin this series?


* Re: [PATCH 0/3] nvmet/fcloop fixes
  2020-03-31 14:24 ` [PATCH 0/3] nvmet/fcloop fixes Christoph Hellwig
@ 2020-04-16 10:37   ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2020-04-16 10:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chaitanya Kulkarni, linux-nvme, Sagi Grimberg, Keith Busch, James Smart

On 3/31/20 4:24 PM, Christoph Hellwig wrote:
> Hannes,
> 
> are you going to respin this series?
> 
Yes, will be doing so shortly.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

