* [PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery
@ 2018-04-01 15:19 Israel Rukshin
From: Israel Rukshin @ 2018-04-01 15:19 UTC


When nvme_rdma_timeout() returns BLK_EH_HANDLED, the block layer
completes the request. Error recovery may also complete the same
request while aborting outstanding requests, resulting in a double
completion. Return BLK_EH_RESET_TIMER instead and let error recovery
complete the request.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/rdma.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 758537e..c1abfc8 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1595,10 +1595,7 @@ static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
 	/* queue error recovery */
 	nvme_rdma_error_recovery(req->queue->ctrl);
 
-	/* fail with DNR on cmd timeout */
-	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
-
-	return BLK_EH_HANDLED;
+	return BLK_EH_RESET_TIMER;
 }
 
 /*
-- 
1.8.3.1
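For reference, a minimal sketch of how nvme_rdma_timeout() reads with this
hunk applied. The function signature and the request-pdu lookup are
assumptions based on the 4.16-era driver; only the error-recovery call and
the new return value come from the patch itself.

static enum blk_eh_timer_return
nvme_rdma_timeout(struct request *rq, bool reserved)
{
	/* request pdu lookup, assumed from the surrounding driver code */
	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);

	/* queue error recovery */
	nvme_rdma_error_recovery(req->queue->ctrl);

	/*
	 * Do not complete the request here; error recovery cancels it after
	 * the RDMA queue has been torn down, so just rearm the timer.
	 */
	return BLK_EH_RESET_TIMER;
}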


* [PATCH 2/2] nvme-rdma: Fix race at error recovery
@ 2018-04-01 15:19 ` Israel Rukshin
From: Israel Rukshin @ 2018-04-01 15:19 UTC


We need to stop the RDMA queues before cancelling the requests.
This avoids the race between the error recovery work and RDMA
completions. The commit fixes a NULL dereference of a request MR in
nvme_rdma_process_nvme_rsp().

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/rdma.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c1abfc8..56fccac 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -734,7 +734,6 @@ static struct blk_mq_tag_set *nvme_rdma_alloc_tagset(struct nvme_ctrl *nctrl,
 static void nvme_rdma_destroy_admin_queue(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
-	nvme_rdma_stop_queue(&ctrl->queues[0]);
 	if (remove) {
 		blk_cleanup_queue(ctrl->ctrl.admin_q);
 		nvme_rdma_free_tagset(&ctrl->ctrl, ctrl->ctrl.admin_tagset);
@@ -817,7 +816,6 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
-	nvme_rdma_stop_io_queues(ctrl);
 	if (remove) {
 		blk_cleanup_queue(ctrl->ctrl.connect_q);
 		nvme_rdma_free_tagset(&ctrl->ctrl, ctrl->ctrl.tagset);
@@ -947,6 +945,7 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
 	return;
 
 destroy_admin:
+	nvme_rdma_stop_queue(&ctrl->queues[0]);
 	nvme_rdma_destroy_admin_queue(ctrl, false);
 requeue:
 	dev_info(ctrl->ctrl.device, "Failed reconnect attempt %d\n",
@@ -963,12 +962,14 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
+		nvme_rdma_stop_io_queues(ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
 					nvme_cancel_request, &ctrl->ctrl);
 		nvme_rdma_destroy_io_queues(ctrl, false);
 	}
 
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+	nvme_rdma_stop_queue(&ctrl->queues[0]);
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_cancel_request, &ctrl->ctrl);
 	nvme_rdma_destroy_admin_queue(ctrl, false);
@@ -1725,6 +1726,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
 {
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
+		nvme_rdma_stop_io_queues(ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
 					nvme_cancel_request, &ctrl->ctrl);
 		nvme_rdma_destroy_io_queues(ctrl, shutdown);
@@ -1736,6 +1738,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
 		nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);
 
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+	nvme_rdma_stop_queue(&ctrl->queues[0]);
 	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
 				nvme_cancel_request, &ctrl->ctrl);
 	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
@@ -2001,6 +2004,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 	return &ctrl->ctrl;
 
 out_remove_admin_queue:
+	nvme_rdma_stop_queue(&ctrl->queues[0]);
 	nvme_rdma_destroy_admin_queue(ctrl, true);
 out_kfree_queues:
 	kfree(ctrl->queues);
-- 
1.8.3.1
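Taken together, the hunks above give the following ordering in
nvme_rdma_error_recovery_work() (and, analogously, in
nvme_rdma_shutdown_ctrl()). This is a fragment assembled from the diff for
illustration, not the complete function:

	if (ctrl->ctrl.queue_count > 1) {
		nvme_stop_queues(&ctrl->ctrl);		/* quiesce blk-mq I/O queues */
		nvme_rdma_stop_io_queues(ctrl);		/* stop the RDMA queues first... */
		blk_mq_tagset_busy_iter(&ctrl->tag_set,
					nvme_cancel_request, &ctrl->ctrl);
							/* ...then cancel the requests */
		nvme_rdma_destroy_io_queues(ctrl, false);
	}

	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
	nvme_rdma_stop_queue(&ctrl->queues[0]);		/* same ordering for the admin queue */
	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
				nvme_cancel_request, &ctrl->ctrl);
	nvme_rdma_destroy_admin_queue(ctrl, false);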


* [PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery
@ 2018-04-04 12:32 ` Sagi Grimberg
From: Sagi Grimberg @ 2018-04-04 12:32 UTC



> When nvme_rdma_timeout() returns BLK_EH_HANDLED, the block layer
> completes the request. Error recovery may also complete the same
> request while aborting outstanding requests, resulting in a double
> completion. Return BLK_EH_RESET_TIMER instead and let error recovery
> complete the request.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
> ---
>   drivers/nvme/host/rdma.c | 5 +----
>   1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 758537e..c1abfc8 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1595,10 +1595,7 @@ static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
>   	/* queue error recovery */
>   	nvme_rdma_error_recovery(req->queue->ctrl);
>   
> -	/* fail with DNR on cmd timeout */
> -	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
> -
> -	return BLK_EH_HANDLED;
> +	return BLK_EH_RESET_TIMER;
>   }

These look awfully familiar...


* [PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery
@ 2018-04-05  8:55 ` Christoph Hellwig
From: Christoph Hellwig @ 2018-04-05  8:55 UTC


Please send an introduction cover letter explaining what issue you've
triggered and your overall design.


* [PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery
@ 2018-04-08 11:04   ` Sagi Grimberg
From: Sagi Grimberg @ 2018-04-08 11:04 UTC



> Please send an introduction cover letter explaining what issue you've
> triggered and your overall design.

The commit log is actually wrong... We don't complete the request in two
places; the issue is that we need to make sure to unmap the user buffer
before completing the request in case of a timeout. I sent this patch in
response to a bug report on the list, and this is what it is designed to do.

Given that we already schedule error recovery, we will fail the request
there, after we drain the queue pair, so the choice is to reset the
timer for it in the timeout handler.

We could alternatively invalidate the rkey in the timeout handler, but
that won't work with the unsafe rkey mode.
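Restating that reasoning as a rough sequence (a sketch of the intent with
both patches applied, not code taken from the series):

/*
 * nvme_rdma_timeout()
 *     nvme_rdma_error_recovery(ctrl);    schedule error recovery
 *     return BLK_EH_RESET_TIMER;         do not complete the request yet
 *
 * nvme_rdma_error_recovery_work()
 *     nvme_rdma_stop_io_queues(ctrl);    drain the queue pairs, so the device
 *                                        no longer touches the request buffers
 *     blk_mq_tagset_busy_iter(&ctrl->tag_set,
 *                             nvme_cancel_request, &ctrl->ctrl);
 *                                        only now is the request completed
 */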


* [PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery
@ 2018-04-08 14:07     ` Israel Rukshin
From: Israel Rukshin @ 2018-04-08 14:07 UTC


On 4/8/2018 2:04 PM, Sagi Grimberg wrote:
>
>> Please send an introduction cover letter explaining what issue you've
>> triggered and your overall design.
>
> The commit log is actually wrong... We don't complete the request in two
> places; the issue is that we need to make sure to unmap the user buffer

We have two bugs here that these patches fix.
What you said is one of them; the second is the one I described here.
I will include the call traces I got in my v2.

> before completing the request in case of a timeout. I sent this patch in
> response to a bug report on the list, and this is what it is designed to do.
>
> Given that we already schedule error recovery, we will fail the request
> there, after we drain the queue pair, so the choice is to reset the
> timer for it in the timeout handler.
>
> We could alternatively invalidate the rkey in the timeout handler, but
> that won't work with the unsafe rkey mode.
