linux-nvme.lists.infradead.org archive mirror
* [PATCH V2] nvmet-rdma: Fix double free of rdma queue
@ 2020-04-07 11:02 Israel Rukshin
  2020-04-07 15:45 ` Sagi Grimberg
  2020-04-07 16:34 ` Christoph Hellwig
  0 siblings, 2 replies; 3+ messages in thread
From: Israel Rukshin @ 2020-04-07 11:02 UTC (permalink / raw)
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Shlomi Nimrodi, Israel Rukshin, Max Gurtovoy

If rdma accept fails in nvmet_rdma_queue_connect(), release work is
scheduled. Because the cm_id is not destroyed at that point, a new RDMA CM
event may arrive later and call nvmet_rdma_queue_connect_fail(), which
schedules another release work. As a result, nvmet_rdma_free_queue() is
called twice. To fix this, return a non-zero code from the CM event
handler so that the cm_id is implicitly destroyed, which guarantees that
no new rdma_cm events arrive afterwards.
Also add a qp pointer to the nvmet_rdma_queue structure, so it can be used
when the cm_id pointer is NULL or the cm_id was destroyed.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
---
Changes from v1:
 - Implicitly destroy the cm_id with non-zero ret code
 - Add a qp pointer to nvmet_rdma_queue structure
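
The two changes above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the driver code: every
model_* name is hypothetical, the dispatcher is a simplified stand-in for
the RDMA CM core (it destroys the id when the handler returns non-zero and
delivers no further events for it), and the free is done synchronously.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events: the connect request and a later error event. */
enum model_event { MODEL_CONNECT_REQUEST, MODEL_LATE_ERROR };

struct model_qp { int destroyed; };

struct model_cm_id {
	struct model_qp *qp;
	int destroyed;
};

struct model_queue {
	struct model_cm_id *cm_id;	/* NULL: the CM core destroys the id */
	struct model_qp *qp;		/* cached, usable without cm_id */
	int free_count;			/* how often the queue was freed */
};

/* Teardown uses the cached qp and skips the cm_id when it is NULL. */
static void model_free_queue(struct model_queue *q)
{
	assert(q->qp != NULL);
	if (q->cm_id)
		q->cm_id->destroyed = 1;
	q->qp->destroyed = 1;
	q->free_count++;
}

/* Event handler: on accept failure, free once and return non-zero. */
static int model_handler(struct model_cm_id *id, struct model_queue *q,
			 enum model_event ev, int accept_ret)
{
	switch (ev) {
	case MODEL_CONNECT_REQUEST:
		q->cm_id = id;
		q->qp = id->qp;
		if (accept_ret) {
			q->cm_id = NULL;
			model_free_queue(q);
			return accept_ret;
		}
		return 0;
	case MODEL_LATE_ERROR:
		model_free_queue(q);
		return 0;
	}
	return 0;
}

/* Assumed CM core behaviour: non-zero return destroys the id. */
static void model_dispatch(struct model_cm_id *id, struct model_queue *q,
			   enum model_event ev, int accept_ret)
{
	if (id->destroyed)
		return;	/* no events are delivered for a destroyed id */
	if (model_handler(id, q, ev, accept_ret))
		id->destroyed = 1;
}

/* Accept fails, then a late event is attempted; returns free_count. */
int model_accept_failure_free_count(void)
{
	struct model_qp qp = { 0 };
	struct model_cm_id id = { &qp, 0 };
	struct model_queue q = { NULL, NULL, 0 };

	model_dispatch(&id, &q, MODEL_CONNECT_REQUEST, -1);
	model_dispatch(&id, &q, MODEL_LATE_ERROR, 0);
	return q.free_count;
}
```

In this model the late event is dropped because the id is already marked
destroyed, so the queue is freed once. If the handler returned 0 on accept
failure instead, the late event would reach model_free_queue() a second
time.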

 drivers/nvme/target/rdma.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 37d262a..54b840d 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -75,6 +75,7 @@ enum nvmet_rdma_queue_state {
 
 struct nvmet_rdma_queue {
 	struct rdma_cm_id	*cm_id;
+	struct ib_qp		*qp;
 	struct nvmet_port	*port;
 	struct ib_cq		*cq;
 	atomic_t		sq_wr_avail;
@@ -464,7 +465,7 @@ static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
 	if (ndev->srq)
 		ret = ib_post_srq_recv(ndev->srq, &cmd->wr, NULL);
 	else
-		ret = ib_post_recv(cmd->queue->cm_id->qp, &cmd->wr, NULL);
+		ret = ib_post_recv(cmd->queue->qp, &cmd->wr, NULL);
 
 	if (unlikely(ret))
 		pr_err("post_recv cmd failed\n");
@@ -503,7 +504,7 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 	atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
 
 	if (rsp->n_rdma) {
-		rdma_rw_ctx_destroy(&rsp->rw, queue->cm_id->qp,
+		rdma_rw_ctx_destroy(&rsp->rw, queue->qp,
 				queue->cm_id->port_num, rsp->req.sg,
 				rsp->req.sg_cnt, nvmet_data_dir(&rsp->req));
 	}
@@ -587,7 +588,7 @@ static void nvmet_rdma_read_data_done(struct ib_cq *cq, struct ib_wc *wc)
 
 	WARN_ON(rsp->n_rdma <= 0);
 	atomic_add(rsp->n_rdma, &queue->sq_wr_avail);
-	rdma_rw_ctx_destroy(&rsp->rw, queue->cm_id->qp,
+	rdma_rw_ctx_destroy(&rsp->rw, queue->qp,
 			queue->cm_id->port_num, rsp->req.sg,
 			rsp->req.sg_cnt, nvmet_data_dir(&rsp->req));
 	rsp->n_rdma = 0;
@@ -742,7 +743,7 @@ static bool nvmet_rdma_execute_command(struct nvmet_rdma_rsp *rsp)
 	}
 
 	if (nvmet_rdma_need_data_in(rsp)) {
-		if (rdma_rw_ctx_post(&rsp->rw, queue->cm_id->qp,
+		if (rdma_rw_ctx_post(&rsp->rw, queue->qp,
 				queue->cm_id->port_num, &rsp->read_cqe, NULL))
 			nvmet_req_complete(&rsp->req, NVME_SC_DATA_XFER_ERROR);
 	} else {
@@ -1025,6 +1026,7 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
 		pr_err("failed to create_qp ret= %d\n", ret);
 		goto err_destroy_cq;
 	}
+	queue->qp = queue->cm_id->qp;
 
 	atomic_set(&queue->sq_wr_avail, qp_attr.cap.max_send_wr);
 
@@ -1053,11 +1055,10 @@ static int nvmet_rdma_create_queue_ib(struct nvmet_rdma_queue *queue)
 
 static void nvmet_rdma_destroy_queue_ib(struct nvmet_rdma_queue *queue)
 {
-	struct ib_qp *qp = queue->cm_id->qp;
-
-	ib_drain_qp(qp);
-	rdma_destroy_id(queue->cm_id);
-	ib_destroy_qp(qp);
+	ib_drain_qp(queue->qp);
+	if (queue->cm_id)
+		rdma_destroy_id(queue->cm_id);
+	ib_destroy_qp(queue->qp);
 	ib_free_cq(queue->cq);
 }
 
@@ -1291,9 +1292,12 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
 
 	ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
 	if (ret) {
-		schedule_work(&queue->release_work);
-		/* Destroying rdma_cm id is not needed here */
-		return 0;
+		/*
+		 * Don't destroy the cm_id in free path, as we implicitly
+		 * destroy the cm_id here with non-zero ret code.
+		 */
+		queue->cm_id = NULL;
+		goto free_queue;
 	}
 
 	mutex_lock(&nvmet_rdma_queue_mutex);
@@ -1302,6 +1306,8 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
 
 	return 0;
 
+free_queue:
+	nvmet_rdma_free_queue(queue);
 put_device:
 	kref_put(&ndev->ref, nvmet_rdma_free_dev);
 
-- 
1.8.3.1


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* Re: [PATCH V2] nvmet-rdma: Fix double free of rdma queue
  2020-04-07 11:02 [PATCH V2] nvmet-rdma: Fix double free of rdma queue Israel Rukshin
@ 2020-04-07 15:45 ` Sagi Grimberg
  2020-04-07 16:34 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: Sagi Grimberg @ 2020-04-07 15:45 UTC (permalink / raw)
  To: Israel Rukshin, Linux-nvme, Christoph Hellwig
  Cc: Shlomi Nimrodi, Max Gurtovoy



On 4/7/20 4:02 AM, Israel Rukshin wrote:
> If rdma accept fails in nvmet_rdma_queue_connect(), release work is
> scheduled. Because the cm_id is not destroyed at that point, a new RDMA CM
> event may arrive later and call nvmet_rdma_queue_connect_fail(), which
> schedules another release work. As a result, nvmet_rdma_free_queue() is
> called twice. To fix this, return a non-zero code from the CM event
> handler so that the cm_id is implicitly destroyed, which guarantees that
> no new rdma_cm events arrive afterwards.
> Also add a qp pointer to the nvmet_rdma_queue structure, so it can be used
> when the cm_id pointer is NULL or the cm_id was destroyed.
> 
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Suggested-by: Sagi Grimberg <sagi@grimberg.me>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH V2] nvmet-rdma: Fix double free of rdma queue
  2020-04-07 11:02 [PATCH V2] nvmet-rdma: Fix double free of rdma queue Israel Rukshin
  2020-04-07 15:45 ` Sagi Grimberg
@ 2020-04-07 16:34 ` Christoph Hellwig
  1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2020-04-07 16:34 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Shlomi Nimrodi, Max Gurtovoy, Sagi Grimberg, Linux-nvme,
	Christoph Hellwig

Thanks,

applied to nvme-5.7.
