* [PATCH for-next 00/18] Misc update for rtrs
@ 2020-12-09 16:45 Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 01/18] RDMA/rtrs: Extend ibtrs_cq_qp_create Jack Wang
                   ` (18 more replies)
  0 siblings, 19 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma; +Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang

Hi Jason, hi Doug,

Please consider including the following changes in the next merge window.

The series contains a few bug fixes and cleanups.

The patches are based on rdma/for-next.

Thanks!

Guoqing Jiang (8):
  RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails
  RDMA/rtrs: Call kobject_put in the failure path
  RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files}
  RDMA/rtrs-clt: Kill wait_for_inflight_permits
  RDMA/rtrs-clt: Remove unnecessary 'goto out'
  RDMA/rtrs-clt: Kill rtrs_clt_change_state
  RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state
  RDMA/rtrs-clt: Refactor the failure cases in alloc_clt

Jack Wang (10):
  RDMA/rtrs: Extend ibtrs_cq_qp_create
  RDMA/rtrs-srv: Occasionally flush ongoing session closing
  RDMA/rtrs-srv: Release lock before call into close_sess
  RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect
  RDMA/rtrs-clt: Set mininum limit when create QP
  RDMA/rtrs-srv: Fix missing wr_cqe
  RDMA/rtrs: Do not signal for heartbeat
  RDMA/rtrs-clt: Use bitmask to check sess->flags
  RDMA/rtrs-srv: Do not signal REG_MR
  RDMA/rtrs-srv: Init wr_cnt as 1

 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c |  11 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 120 +++++++++----------
 drivers/infiniband/ulp/rtrs/rtrs-clt.h       |   3 +-
 drivers/infiniband/ulp/rtrs/rtrs-pri.h       |   5 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c |   5 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       |  24 ++--
 drivers/infiniband/ulp/rtrs/rtrs.c           |  18 +--
 7 files changed, 93 insertions(+), 93 deletions(-)

-- 
2.25.1



* [PATCH for-next 01/18] RDMA/rtrs: Extend ibtrs_cq_qp_create
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing Jack Wang
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Md Haris Iqbal

rtrs does not need to use the same limit for max_send_wr and
max_recv_wr. To allow the client and server to set different values,
pass them as separate parameters to rtrs_cq_qp_create.

Also fix the types accordingly: u32 should be used instead of u16.
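
A minimal sketch of the extended interface (names as in the diff below;
the concrete values are illustrative only):

    /* cq sized large enough to cover both queues */
    cq_size = max_send_wr + max_recv_wr;
    err = rtrs_cq_qp_create(&sess->s, &con->c, max_send_sge, cq_vector,
                            cq_size,
                            max_send_wr,   /* u32: send queue depth */
                            max_recv_wr,   /* u32: recv queue depth */
                            IB_POLL_SOFTIRQ);
    if (err)
        return err;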

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c |  4 ++--
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |  5 +++--
 drivers/infiniband/ulp/rtrs/rtrs-srv.c |  5 +++--
 drivers/infiniband/ulp/rtrs/rtrs.c     | 14 ++++++++------
 4 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 67f86c405a26..719254fc83a1 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1511,7 +1511,7 @@ static void destroy_con(struct rtrs_clt_con *con)
 static int create_con_cq_qp(struct rtrs_clt_con *con)
 {
 	struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
-	u16 wr_queue_size;
+	u32 wr_queue_size;
 	int err, cq_vector;
 	struct rtrs_msg_rkey_rsp *rsp;
 
@@ -1573,7 +1573,7 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 	cq_vector = con->cpu % sess->s.dev->ib_dev->num_comp_vectors;
 	err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
 				 cq_vector, wr_queue_size, wr_queue_size,
-				 IB_POLL_SOFTIRQ);
+				 wr_queue_size, IB_POLL_SOFTIRQ);
 	/*
 	 * In case of error we do not bother to clean previous allocations,
 	 * since destroy_con_cq_qp() must be called.
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index 3f2918671dbe..d5621e6fad1b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -303,8 +303,9 @@ int rtrs_post_rdma_write_imm_empty(struct rtrs_con *con, struct ib_cqe *cqe,
 				   struct ib_send_wr *head);
 
 int rtrs_cq_qp_create(struct rtrs_sess *rtrs_sess, struct rtrs_con *con,
-		      u32 max_send_sge, int cq_vector, u16 cq_size,
-		      u16 wr_queue_size, enum ib_poll_context poll_ctx);
+		      u32 max_send_sge, int cq_vector, int cq_size,
+		      u32 max_send_wr, u32 max_recv_wr,
+		      enum ib_poll_context poll_ctx);
 void rtrs_cq_qp_destroy(struct rtrs_con *con);
 
 void rtrs_init_hb(struct rtrs_sess *sess, struct ib_cqe *cqe,
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index c42fd470c4eb..ed4628f032bb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1586,7 +1586,7 @@ static int create_con(struct rtrs_srv_sess *sess,
 	struct rtrs_sess *s = &sess->s;
 	struct rtrs_srv_con *con;
 
-	u16 cq_size, wr_queue_size;
+	u32 cq_size, wr_queue_size;
 	int err, cq_vector;
 
 	con = kzalloc(sizeof(*con), GFP_KERNEL);
@@ -1630,7 +1630,8 @@ static int create_con(struct rtrs_srv_sess *sess,
 
 	/* TODO: SOFTIRQ can be faster, but be careful with softirq context */
 	err = rtrs_cq_qp_create(&sess->s, &con->c, 1, cq_vector, cq_size,
-				 wr_queue_size, IB_POLL_WORKQUEUE);
+				 wr_queue_size, wr_queue_size,
+				 IB_POLL_WORKQUEUE);
 	if (err) {
 		rtrs_err(s, "rtrs_cq_qp_create(), err: %d\n", err);
 		goto free_con;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index 2e3a849e0a77..df52427f1710 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -231,14 +231,14 @@ static int create_cq(struct rtrs_con *con, int cq_vector, u16 cq_size,
 }
 
 static int create_qp(struct rtrs_con *con, struct ib_pd *pd,
-		     u16 wr_queue_size, u32 max_sge)
+		     u32 max_send_wr, u32 max_recv_wr, u32 max_sge)
 {
 	struct ib_qp_init_attr init_attr = {NULL};
 	struct rdma_cm_id *cm_id = con->cm_id;
 	int ret;
 
-	init_attr.cap.max_send_wr = wr_queue_size;
-	init_attr.cap.max_recv_wr = wr_queue_size;
+	init_attr.cap.max_send_wr = max_send_wr;
+	init_attr.cap.max_recv_wr = max_recv_wr;
 	init_attr.cap.max_recv_sge = 1;
 	init_attr.event_handler = qp_event_handler;
 	init_attr.qp_context = con;
@@ -260,8 +260,9 @@ static int create_qp(struct rtrs_con *con, struct ib_pd *pd,
 }
 
 int rtrs_cq_qp_create(struct rtrs_sess *sess, struct rtrs_con *con,
-		       u32 max_send_sge, int cq_vector, u16 cq_size,
-		       u16 wr_queue_size, enum ib_poll_context poll_ctx)
+		       u32 max_send_sge, int cq_vector, int cq_size,
+		       u32 max_send_wr, u32 max_recv_wr,
+		       enum ib_poll_context poll_ctx)
 {
 	int err;
 
@@ -269,7 +270,8 @@ int rtrs_cq_qp_create(struct rtrs_sess *sess, struct rtrs_con *con,
 	if (err)
 		return err;
 
-	err = create_qp(con, sess->dev->ib_pd, wr_queue_size, max_send_sge);
+	err = create_qp(con, sess->dev->ib_pd, max_send_wr, max_recv_wr,
+			max_send_sge);
 	if (err) {
 		ib_free_cq(con->cq);
 		con->cq = NULL;
-- 
2.25.1



* [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 01/18] RDMA/rtrs: Extend ibtrs_cq_qp_create Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-10 14:56   ` Jinpu Wang
  2020-12-09 16:45 ` [PATCH for-next 03/18] RDMA/rtrs-srv: Release lock before call into close_sess Jack Wang
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Guoqing Jiang

If there are many session establishments/teardowns, we need to make
sure we do not consume too much system memory. Thus let ongoing
session closings finish before accepting a new connection.

Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index ed4628f032bb..0a2202c28b54 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1791,6 +1791,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 		err = -ENOMEM;
 		goto reject_w_err;
 	}
+	if (!cid) {
+		/* Let inflight session teardown complete */
+		flush_workqueue(rtrs_wq);
+	}
 	mutex_lock(&srv->paths_mutex);
 	sess = __find_sess(srv, &msg->sess_uuid);
 	if (sess) {
-- 
2.25.1



* [PATCH for-next 03/18] RDMA/rtrs-srv: Release lock before call into close_sess
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 01/18] RDMA/rtrs: Extend ibtrs_cq_qp_create Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 04/18] RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect Jack Wang
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Lutz Pogrell

In this error case, we don't need to hold the mutex while calling close_sess.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 0a2202c28b54..ef58fe021580 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1867,8 +1867,8 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 	return rtrs_rdma_do_reject(cm_id, -ECONNRESET);
 
 close_and_return_err:
-	close_sess(sess);
 	mutex_unlock(&srv->paths_mutex);
+	close_sess(sess);
 
 	return err;
 }
-- 
2.25.1



* [PATCH for-next 04/18] RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (2 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 03/18] RDMA/rtrs-srv: Release lock before call into close_sess Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 05/18] RDMA/rtrs-clt: Set mininum limit when create QP Jack Wang
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Lutz Pogrell

Remove the sysfs entry itself first to avoid a deadlock; we don't want
to use close_work to remove the sess sysfs files.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index d2edff3b8f0d..cca3a0acbabc 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -51,6 +51,8 @@ static ssize_t rtrs_srv_disconnect_store(struct kobject *kobj,
 	sockaddr_to_str((struct sockaddr *)&sess->s.dst_addr, str, sizeof(str));
 
 	rtrs_info(s, "disconnect for path %s requested\n", str);
+	/* first remove sysfs itself to avoid deadlock */
+	sysfs_remove_file_self(&sess->kobj, &attr->attr);
 	close_sess(sess);
 
 	return count;
-- 
2.25.1



* [PATCH for-next 05/18] RDMA/rtrs-clt: Set mininum limit when create QP
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (3 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 04/18] RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 06/18] RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails Jack Wang
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Md Haris Iqbal

Currently rtrs uses coarse (generally too large) numbers when creating
a QP, which makes the hardware allocate more resources than needed and
only wastes memory with no benefit.

Use tighter limits instead (a sketch of the resulting sizing follows
below):

- SERVICE con:
  for max_send_wr/max_recv_wr, it's 2 * SERVICE_CON_QUEUE_DEPTH + 2

- IO con:
  for max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1
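
A sketch of the resulting client-side sizing (max_qp_wr stands for
sess->s.dev->ib_dev->attrs.max_qp_wr, as in the diff below):

    if (con->c.cid == 0) {
        /* service connection: requests + responses + drain/heartbeat */
        max_send_wr = SERVICE_CON_QUEUE_DEPTH * 2 + 2;
        max_recv_wr = SERVICE_CON_QUEUE_DEPTH * 2 + 2;
    } else {
        /* IO connection: QD * (REQ + RSP + FR REGS or INVS) + drain */
        max_send_wr = min_t(int, max_qp_wr, sess->queue_depth * 3 + 1);
        max_recv_wr = min_t(int, max_qp_wr, sess->queue_depth * 3 + 1);
    }
    cq_size = max_send_wr + max_recv_wr;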

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 719254fc83a1..b3fb5fb93815 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1511,7 +1511,7 @@ static void destroy_con(struct rtrs_clt_con *con)
 static int create_con_cq_qp(struct rtrs_clt_con *con)
 {
 	struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
-	u32 wr_queue_size;
+	u32 max_send_wr, max_recv_wr, cq_size;
 	int err, cq_vector;
 	struct rtrs_msg_rkey_rsp *rsp;
 
@@ -1523,7 +1523,8 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 		 * + 2 for drain and heartbeat
 		 * in case qp gets into error state
 		 */
-		wr_queue_size = SERVICE_CON_QUEUE_DEPTH * 3 + 2;
+		max_send_wr = SERVICE_CON_QUEUE_DEPTH * 2 + 2;
+		max_recv_wr = SERVICE_CON_QUEUE_DEPTH * 2 + 2;
 		/* We must be the first here */
 		if (WARN_ON(sess->s.dev))
 			return -EINVAL;
@@ -1555,25 +1556,29 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 
 		/* Shared between connections */
 		sess->s.dev_ref++;
-		wr_queue_size =
+		max_send_wr =
 			min_t(int, sess->s.dev->ib_dev->attrs.max_qp_wr,
 			      /* QD * (REQ + RSP + FR REGS or INVS) + drain */
 			      sess->queue_depth * 3 + 1);
+		max_recv_wr =
+			min_t(int, sess->s.dev->ib_dev->attrs.max_qp_wr,
+			      sess->queue_depth * 3 + 1);
 	}
 	/* alloc iu to recv new rkey reply when server reports flags set */
 	if (sess->flags == RTRS_MSG_NEW_RKEY_F || con->c.cid == 0) {
-		con->rsp_ius = rtrs_iu_alloc(wr_queue_size, sizeof(*rsp),
+		con->rsp_ius = rtrs_iu_alloc(max_recv_wr, sizeof(*rsp),
 					      GFP_KERNEL, sess->s.dev->ib_dev,
 					      DMA_FROM_DEVICE,
 					      rtrs_clt_rdma_done);
 		if (!con->rsp_ius)
 			return -ENOMEM;
-		con->queue_size = wr_queue_size;
+		con->queue_size = max_recv_wr;
 	}
+	cq_size = max_send_wr + max_recv_wr;
 	cq_vector = con->cpu % sess->s.dev->ib_dev->num_comp_vectors;
 	err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
-				 cq_vector, wr_queue_size, wr_queue_size,
-				 wr_queue_size, IB_POLL_SOFTIRQ);
+				 cq_vector, cq_size, max_send_wr,
+				 max_recv_wr, IB_POLL_SOFTIRQ);
 	/*
 	 * In case of error we do not bother to clean previous allocations,
 	 * since destroy_con_cq_qp() must be called.
-- 
2.25.1



* [PATCH for-next 06/18] RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (4 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 05/18] RDMA/rtrs-clt: Set mininum limit when create QP Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 07/18] RDMA/rtrs: Call kobject_put in the failure path Jack Wang
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Guoqing Jiang, Gioh Kim

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

rtrs_iu_free is already called inside rtrs_iu_alloc when the allocation
fails, so we must not free the same iu again in the caller's error
path; jump to the dereg_mr label instead.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index ef58fe021580..27ac5a03259a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -651,7 +651,7 @@ static int map_cont_bufs(struct rtrs_srv_sess *sess)
 			if (!srv_mr->iu) {
 				err = -ENOMEM;
 				rtrs_err(ss, "rtrs_iu_alloc(), err: %d\n", err);
-				goto free_iu;
+				goto dereg_mr;
 			}
 		}
 		/* Eventually dma addr for each chunk can be cached */
@@ -667,7 +667,6 @@ static int map_cont_bufs(struct rtrs_srv_sess *sess)
 			srv_mr = &sess->mrs[mri];
 			sgt = &srv_mr->sgt;
 			mr = srv_mr->mr;
-free_iu:
 			rtrs_iu_free(srv_mr->iu, sess->s.dev->ib_dev, 1);
 dereg_mr:
 			ib_dereg_mr(mr);
-- 
2.25.1



* [PATCH for-next 07/18] RDMA/rtrs: Call kobject_put in the failure path
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (5 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 06/18] RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 08/18] RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files} Jack Wang
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Guoqing Jiang, Md Haris Iqbal, Gioh Kim

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Per the comment of kobject_init_and_add(), we need to free the memory
by calling kobject_put() when it fails.
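
For reference, the error-handling pattern that kobject_init_and_add()
requires (a generic sketch, not the rtrs code itself):

    err = kobject_init_and_add(kobj, ktype, parent, "%s", name);
    if (err) {
        /*
         * The kobject's refcount is already initialized at this
         * point, so even on failure the memory must be released
         * with kobject_put(), which frees the object through its
         * release() callback.
         */
        kobject_put(kobj);
        return err;
    }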

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 2 ++
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
index ba00f0de14ca..ad77659800cd 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
@@ -408,6 +408,7 @@ int rtrs_clt_create_sess_files(struct rtrs_clt_sess *sess)
 				   "%s", str);
 	if (err) {
 		pr_err("kobject_init_and_add: %d\n", err);
+		kobject_put(&sess->kobj);
 		return err;
 	}
 	err = sysfs_create_group(&sess->kobj, &rtrs_clt_sess_attr_group);
@@ -419,6 +420,7 @@ int rtrs_clt_create_sess_files(struct rtrs_clt_sess *sess)
 				   &sess->kobj, "stats");
 	if (err) {
 		pr_err("kobject_init_and_add: %d\n", err);
+		kobject_put(&sess->stats->kobj_stats);
 		goto remove_group;
 	}
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index cca3a0acbabc..0a3886629cae 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -236,6 +236,7 @@ static int rtrs_srv_create_stats_files(struct rtrs_srv_sess *sess)
 				   &sess->kobj, "stats");
 	if (err) {
 		rtrs_err(s, "kobject_init_and_add(): %d\n", err);
+		kobject_put(&sess->stats->kobj_stats);
 		return err;
 	}
 	err = sysfs_create_group(&sess->stats->kobj_stats,
@@ -292,8 +293,8 @@ int rtrs_srv_create_sess_files(struct rtrs_srv_sess *sess)
 	sysfs_remove_group(&sess->kobj, &rtrs_srv_sess_attr_group);
 put_kobj:
 	kobject_del(&sess->kobj);
-	kobject_put(&sess->kobj);
 destroy_root:
+	kobject_put(&sess->kobj);
 	rtrs_srv_destroy_once_sysfs_root_folders(sess);
 
 	return err;
-- 
2.25.1



* [PATCH for-next 08/18] RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files}
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (6 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 07/18] RDMA/rtrs: Call kobject_put in the failure path Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 09/18] RDMA/rtrs-clt: Kill wait_for_inflight_permits Jack Wang
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Guoqing Jiang

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Since the two functions are called together, let's consolidate them
into a new function, rtrs_clt_destroy_sysfs_root.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 9 +++------
 drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 6 ++----
 drivers/infiniband/ulp/rtrs/rtrs-clt.h       | 3 +--
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
index ad77659800cd..b6a0abf40589 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
@@ -471,15 +471,12 @@ int rtrs_clt_create_sysfs_root_files(struct rtrs_clt *clt)
 	return sysfs_create_group(&clt->dev.kobj, &rtrs_clt_attr_group);
 }
 
-void rtrs_clt_destroy_sysfs_root_folders(struct rtrs_clt *clt)
+void rtrs_clt_destroy_sysfs_root(struct rtrs_clt *clt)
 {
+	sysfs_remove_group(&clt->dev.kobj, &rtrs_clt_attr_group);
+
 	if (clt->kobj_paths) {
 		kobject_del(clt->kobj_paths);
 		kobject_put(clt->kobj_paths);
 	}
 }
-
-void rtrs_clt_destroy_sysfs_root_files(struct rtrs_clt *clt)
-{
-	sysfs_remove_group(&clt->dev.kobj, &rtrs_clt_attr_group);
-}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index b3fb5fb93815..99fc34950032 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -2707,8 +2707,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 		rtrs_clt_close_conns(sess, true);
 		kobject_put(&sess->kobj);
 	}
-	rtrs_clt_destroy_sysfs_root_files(clt);
-	rtrs_clt_destroy_sysfs_root_folders(clt);
+	rtrs_clt_destroy_sysfs_root(clt);
 	free_clt(clt);
 
 out:
@@ -2725,8 +2724,7 @@ void rtrs_clt_close(struct rtrs_clt *clt)
 	struct rtrs_clt_sess *sess, *tmp;
 
 	/* Firstly forbid sysfs access */
-	rtrs_clt_destroy_sysfs_root_files(clt);
-	rtrs_clt_destroy_sysfs_root_folders(clt);
+	rtrs_clt_destroy_sysfs_root(clt);
 
 	/* Now it is safe to iterate over all paths without locks */
 	list_for_each_entry_safe(sess, tmp, &clt->paths_list, s.entry) {
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index b8dbd701b3cb..a97a068c4c28 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -243,8 +243,7 @@ ssize_t rtrs_clt_reset_all_help(struct rtrs_clt_stats *stats,
 /* rtrs-clt-sysfs.c */
 
 int rtrs_clt_create_sysfs_root_files(struct rtrs_clt *clt);
-void rtrs_clt_destroy_sysfs_root_folders(struct rtrs_clt *clt);
-void rtrs_clt_destroy_sysfs_root_files(struct rtrs_clt *clt);
+void rtrs_clt_destroy_sysfs_root(struct rtrs_clt *clt);
 
 int rtrs_clt_create_sess_files(struct rtrs_clt_sess *sess);
 void rtrs_clt_destroy_sess_files(struct rtrs_clt_sess *sess,
-- 
2.25.1



* [PATCH for-next 09/18] RDMA/rtrs-clt: Kill wait_for_inflight_permits
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (7 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 08/18] RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files} Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 10/18] RDMA/rtrs-clt: Remove unnecessary 'goto out' Jack Wang
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Guoqing Jiang, Md Haris Iqbal

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Let's wait for the inflight permits to be returned before freeing them,
so a late permit release cannot touch freed memory.
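
The wait condition itself is simple: a permit is "in flight" while its
bit is set in clt->permits_map, so the session is fully drained once no
bit below queue_depth is set (illustrative helper, not code from this
patch):

    static bool all_permits_returned(struct rtrs_clt *clt)
    {
        size_t sz = clt->queue_depth;

        /* find_first_bit() returning sz means no bit in [0, sz) is set */
        return find_first_bit(clt->permits_map, sz) >= sz;
    }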

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 99fc34950032..6a5b72ad5ba1 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1318,6 +1318,12 @@ static int alloc_permits(struct rtrs_clt *clt)
 
 static void free_permits(struct rtrs_clt *clt)
 {
+	if (clt->permits_map) {
+		size_t sz = clt->queue_depth;
+
+		wait_event(clt->permits_wait,
+			   find_first_bit(clt->permits_map, sz) >= sz);
+	}
 	kfree(clt->permits_map);
 	clt->permits_map = NULL;
 	kfree(clt->permits);
@@ -2607,19 +2613,8 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 	return clt;
 }
 
-static void wait_for_inflight_permits(struct rtrs_clt *clt)
-{
-	if (clt->permits_map) {
-		size_t sz = clt->queue_depth;
-
-		wait_event(clt->permits_wait,
-			   find_first_bit(clt->permits_map, sz) >= sz);
-	}
-}
-
 static void free_clt(struct rtrs_clt *clt)
 {
-	wait_for_inflight_permits(clt);
 	free_permits(clt);
 	free_percpu(clt->pcpu_path);
 	mutex_destroy(&clt->paths_ev_mutex);
-- 
2.25.1



* [PATCH for-next 10/18] RDMA/rtrs-clt: Remove unnecessary 'goto out'
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (8 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 09/18] RDMA/rtrs-clt: Kill wait_for_inflight_permits Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 11/18] RDMA/rtrs-clt: Kill rtrs_clt_change_state Jack Wang
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Guoqing Jiang

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

The 'goto out' is not needed since the out label immediately follows it.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 6a5b72ad5ba1..d99fb1a1c194 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -2434,7 +2434,6 @@ static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
 			err = -ECONNRESET;
 		else
 			err = -ETIMEDOUT;
-		goto out;
 	}
 
 out:
-- 
2.25.1



* [PATCH for-next 11/18] RDMA/rtrs-clt: Kill rtrs_clt_change_state
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (9 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 10/18] RDMA/rtrs-clt: Remove unnecessary 'goto out' Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 12/18] RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state Jack Wang
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Guoqing Jiang

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

It is just a wrapper around rtrs_clt_change_state_get_old, so we can
use rtrs_clt_change_state_get_old directly by adding a check for
whether 'old_state' is valid (non-NULL).

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 27 ++++++++++----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index d99fb1a1c194..39dc8423d7df 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1359,21 +1359,14 @@ static bool rtrs_clt_change_state_get_old(struct rtrs_clt_sess *sess,
 	bool changed;
 
 	spin_lock_irq(&sess->state_wq.lock);
-	*old_state = sess->state;
+	if (old_state)
+		*old_state = sess->state;
 	changed = __rtrs_clt_change_state(sess, new_state);
 	spin_unlock_irq(&sess->state_wq.lock);
 
 	return changed;
 }
 
-static bool rtrs_clt_change_state(struct rtrs_clt_sess *sess,
-				   enum rtrs_clt_state new_state)
-{
-	enum rtrs_clt_state old_state;
-
-	return rtrs_clt_change_state_get_old(sess, new_state, &old_state);
-}
-
 static void rtrs_clt_hb_err_handler(struct rtrs_con *c)
 {
 	struct rtrs_clt_con *con = container_of(c, typeof(*con), c);
@@ -1799,7 +1792,7 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
 
 static void rtrs_clt_close_conns(struct rtrs_clt_sess *sess, bool wait)
 {
-	if (rtrs_clt_change_state(sess, RTRS_CLT_CLOSING))
+	if (rtrs_clt_change_state_get_old(sess, RTRS_CLT_CLOSING, NULL))
 		queue_work(rtrs_wq, &sess->close_work);
 	if (wait)
 		flush_work(&sess->close_work);
@@ -2185,7 +2178,7 @@ static void rtrs_clt_close_work(struct work_struct *work)
 
 	cancel_delayed_work_sync(&sess->reconnect_dwork);
 	rtrs_clt_stop_and_destroy_conns(sess);
-	rtrs_clt_change_state(sess, RTRS_CLT_CLOSED);
+	rtrs_clt_change_state_get_old(sess, RTRS_CLT_CLOSED, NULL);
 }
 
 static int init_conns(struct rtrs_clt_sess *sess)
@@ -2237,7 +2230,7 @@ static int init_conns(struct rtrs_clt_sess *sess)
 	 * doing rdma_resolve_addr(), switch to CONNECTION_ERR state
 	 * manually to keep reconnecting.
 	 */
-	rtrs_clt_change_state(sess, RTRS_CLT_CONNECTING_ERR);
+	rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING_ERR, NULL);
 
 	return err;
 }
@@ -2254,7 +2247,7 @@ static void rtrs_clt_info_req_done(struct ib_cq *cq, struct ib_wc *wc)
 	if (unlikely(wc->status != IB_WC_SUCCESS)) {
 		rtrs_err(sess->clt, "Sess info request send failed: %s\n",
 			  ib_wc_status_msg(wc->status));
-		rtrs_clt_change_state(sess, RTRS_CLT_CONNECTING_ERR);
+		rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING_ERR, NULL);
 		return;
 	}
 
@@ -2378,7 +2371,7 @@ static void rtrs_clt_info_rsp_done(struct ib_cq *cq, struct ib_wc *wc)
 out:
 	rtrs_clt_update_wc_stats(con);
 	rtrs_iu_free(iu, sess->s.dev->ib_dev, 1);
-	rtrs_clt_change_state(sess, state);
+	rtrs_clt_change_state_get_old(sess, state, NULL);
 }
 
 static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
@@ -2443,7 +2436,7 @@ static int rtrs_send_sess_info(struct rtrs_clt_sess *sess)
 		rtrs_iu_free(rx_iu, sess->s.dev->ib_dev, 1);
 	if (unlikely(err))
 		/* If we've never taken async path because of malloc problems */
-		rtrs_clt_change_state(sess, RTRS_CLT_CONNECTING_ERR);
+		rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING_ERR, NULL);
 
 	return err;
 }
@@ -2500,7 +2493,7 @@ static void rtrs_clt_reconnect_work(struct work_struct *work)
 	/* Stop everything */
 	rtrs_clt_stop_and_destroy_conns(sess);
 	msleep(RTRS_RECONNECT_BACKOFF);
-	if (rtrs_clt_change_state(sess, RTRS_CLT_CONNECTING)) {
+	if (rtrs_clt_change_state_get_old(sess, RTRS_CLT_CONNECTING, NULL)) {
 		err = init_sess(sess);
 		if (err)
 			goto reconnect_again;
@@ -2509,7 +2502,7 @@ static void rtrs_clt_reconnect_work(struct work_struct *work)
 	return;
 
 reconnect_again:
-	if (rtrs_clt_change_state(sess, RTRS_CLT_RECONNECTING)) {
+	if (rtrs_clt_change_state_get_old(sess, RTRS_CLT_RECONNECTING, NULL)) {
 		sess->stats->reconnects.fail_cnt++;
 		delay_ms = clt->reconnect_delay_sec * 1000;
 		queue_delayed_work(rtrs_wq, &sess->reconnect_dwork,
-- 
2.25.1



* [PATCH for-next 12/18] RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (10 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 11/18] RDMA/rtrs-clt: Kill rtrs_clt_change_state Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 13/18] RDMA/rtrs-srv: Fix missing wr_cqe Jack Wang
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Guoqing Jiang, Md Haris Iqbal

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Let's rename it to rtrs_clt_change_state since the previous function
of that name has been removed.

Also update the comment to make it clearer.

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 39dc8423d7df..3c90718f668d 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -178,18 +178,18 @@ struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
 }
 
 /**
- * __rtrs_clt_change_state() - change the session state through session state
+ * rtrs_clt_change_state() - change the session state through session state
  * machine.
  *
  * @sess: client session to change the state of.
  * @new_state: state to change to.
  *
- * returns true if successful, false if the requested state can not be set.
+ * returns true if sess's state is changed to new state, otherwise return false.
  *
  * Locks:
  * state_wq lock must be hold.
  */
-static bool __rtrs_clt_change_state(struct rtrs_clt_sess *sess,
+static bool rtrs_clt_change_state(struct rtrs_clt_sess *sess,
 				     enum rtrs_clt_state new_state)
 {
 	enum rtrs_clt_state old_state;
@@ -286,7 +286,7 @@ static bool rtrs_clt_change_state_from_to(struct rtrs_clt_sess *sess,
 
 	spin_lock_irq(&sess->state_wq.lock);
 	if (sess->state == old_state)
-		changed = __rtrs_clt_change_state(sess, new_state);
+		changed = rtrs_clt_change_state(sess, new_state);
 	spin_unlock_irq(&sess->state_wq.lock);
 
 	return changed;
@@ -1361,7 +1361,7 @@ static bool rtrs_clt_change_state_get_old(struct rtrs_clt_sess *sess,
 	spin_lock_irq(&sess->state_wq.lock);
 	if (old_state)
 		*old_state = sess->state;
-	changed = __rtrs_clt_change_state(sess, new_state);
+	changed = rtrs_clt_change_state(sess, new_state);
 	spin_unlock_irq(&sess->state_wq.lock);
 
 	return changed;
-- 
2.25.1



* [PATCH for-next 13/18] RDMA/rtrs-srv: Fix missing wr_cqe
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (11 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 12/18] RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 14/18] RDMA/rtrs-clt: Refactor the failure cases in alloc_clt Jack Wang
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Md Haris Iqbal, Guoqing Jiang

There were a few places where wr_cqe was not set, which could lead to
a NULL pointer dereference or general protection fault in the error
case.
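
For background, the shared CQ handling dispatches every completion,
including flush/error completions, through the wr_cqe the work request
was posted with, roughly like this simplified model (not the actual
core code):

    static void dispatch_completion(struct ib_cq *cq, struct ib_wc *wc)
    {
        struct ib_cqe *cqe = wc->wr_cqe;  /* taken from the posted WR */

        cqe->done(cq, wc);  /* unset wr_cqe -> NULL deref or GPF here */
    }

so every posted WR needs a valid wr_cqe even when we do not expect a
successful completion for it.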

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 27ac5a03259a..c742bb5d965b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -267,6 +267,7 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 		WARN_ON_ONCE(rkey != wr->rkey);
 
 	wr->wr.opcode = IB_WR_RDMA_WRITE;
+	wr->wr.wr_cqe   = &io_comp_cqe;
 	wr->wr.ex.imm_data = 0;
 	wr->wr.send_flags  = 0;
 
@@ -294,6 +295,7 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 		inv_wr.sg_list = NULL;
 		inv_wr.num_sge = 0;
 		inv_wr.opcode = IB_WR_SEND_WITH_INV;
+		inv_wr.wr_cqe   = &io_comp_cqe;
 		inv_wr.send_flags = 0;
 		inv_wr.ex.invalidate_rkey = rkey;
 	}
@@ -304,6 +306,7 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 
 		srv_mr = &sess->mrs[id->msg_id];
 		rwr.wr.opcode = IB_WR_REG_MR;
+		rwr.wr.wr_cqe = &local_reg_cqe;
 		rwr.wr.num_sge = 0;
 		rwr.mr = srv_mr->mr;
 		rwr.wr.send_flags = 0;
@@ -379,6 +382,7 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 
 		if (need_inval) {
 			if (likely(sg_cnt)) {
+				inv_wr.wr_cqe   = &io_comp_cqe;
 				inv_wr.sg_list = NULL;
 				inv_wr.num_sge = 0;
 				inv_wr.opcode = IB_WR_SEND_WITH_INV;
@@ -421,6 +425,7 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 		srv_mr = &sess->mrs[id->msg_id];
 		rwr.wr.next = &imm_wr;
 		rwr.wr.opcode = IB_WR_REG_MR;
+		rwr.wr.wr_cqe = &local_reg_cqe;
 		rwr.wr.num_sge = 0;
 		rwr.wr.send_flags = 0;
 		rwr.mr = srv_mr->mr;
-- 
2.25.1



* [PATCH for-next 14/18] RDMA/rtrs-clt: Refactor the failure cases in alloc_clt
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (12 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 13/18] RDMA/rtrs-srv: Fix missing wr_cqe Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 15/18] RDMA/rtrs: Do not signal for heartbeat Jack Wang
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Guoqing Jiang, Md Haris Iqbal

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Make all failure cases go through a common error path to avoid
duplicated code. This also fixes some pre-existing issues:

1. clt needs to be freed to avoid a memory leak.

2. Return ERR_PTR(-ENOMEM) if kobject_create_and_add fails, because
   rtrs_clt_open checks the return value with IS_ERR(clt).

Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 3c90718f668d..493f45a33b5e 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -2568,11 +2568,8 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 	clt->dev.class = rtrs_clt_dev_class;
 	clt->dev.release = rtrs_clt_dev_release;
 	err = dev_set_name(&clt->dev, "%s", sessname);
-	if (err) {
-		free_percpu(clt->pcpu_path);
-		kfree(clt);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto err;
 	/*
 	 * Suppress user space notification until
 	 * sysfs files are created
@@ -2580,29 +2577,31 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 	dev_set_uevent_suppress(&clt->dev, true);
 	err = device_register(&clt->dev);
 	if (err) {
-		free_percpu(clt->pcpu_path);
 		put_device(&clt->dev);
-		return ERR_PTR(err);
+		goto err;
 	}
 
 	clt->kobj_paths = kobject_create_and_add("paths", &clt->dev.kobj);
 	if (!clt->kobj_paths) {
-		free_percpu(clt->pcpu_path);
-		device_unregister(&clt->dev);
-		return NULL;
+		err = -ENOMEM;
+		goto err_dev;
 	}
 	err = rtrs_clt_create_sysfs_root_files(clt);
 	if (err) {
-		free_percpu(clt->pcpu_path);
 		kobject_del(clt->kobj_paths);
 		kobject_put(clt->kobj_paths);
-		device_unregister(&clt->dev);
-		return ERR_PTR(err);
+		goto err_dev;
 	}
 	dev_set_uevent_suppress(&clt->dev, false);
 	kobject_uevent(&clt->dev.kobj, KOBJ_ADD);
 
 	return clt;
+err_dev:
+	device_unregister(&clt->dev);
+err:
+	free_percpu(clt->pcpu_path);
+	kfree(clt);
+	return ERR_PTR(err);
 }
 
 static void free_clt(struct rtrs_clt *clt)
-- 
2.25.1



* [PATCH for-next 15/18] RDMA/rtrs: Do not signal for heartbeat
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (13 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 14/18] RDMA/rtrs-clt: Refactor the failure cases in alloc_clt Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 16/18] RDMA/rtrs-clt: Use bitmask to check sess->flags Jack Wang
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

For heartbeat (HB), there is no need to generate a signaled completion.

Also remove the now-stale "and hb" part of the related comments
accordingly.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reported-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 1 -
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 1 -
 drivers/infiniband/ulp/rtrs/rtrs.c     | 4 ++--
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 493f45a33b5e..2053bf99418a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -664,7 +664,6 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 	case IB_WC_RDMA_WRITE:
 		/*
 		 * post_send() RDMA write completions of IO reqs (read/write)
-		 * and hb
 		 */
 		break;
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index c742bb5d965b..5065bf31729d 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1242,7 +1242,6 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 	case IB_WC_SEND:
 		/*
 		 * post_send() RDMA write completions of IO reqs (read/write)
-		 * and hb
 		 */
 		atomic_add(srv->queue_depth, &con->sq_wr_avail);
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index df52427f1710..97af8f0bb806 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -310,7 +310,7 @@ void rtrs_send_hb_ack(struct rtrs_sess *sess)
 
 	imm = rtrs_to_imm(RTRS_HB_ACK_IMM, 0);
 	err = rtrs_post_rdma_write_imm_empty(usr_con, sess->hb_cqe, imm,
-					      IB_SEND_SIGNALED, NULL);
+					     0, NULL);
 	if (err) {
 		sess->hb_err_handler(usr_con);
 		return;
@@ -339,7 +339,7 @@ static void hb_work(struct work_struct *work)
 	}
 	imm = rtrs_to_imm(RTRS_HB_MSG_IMM, 0);
 	err = rtrs_post_rdma_write_imm_empty(usr_con, sess->hb_cqe, imm,
-					      IB_SEND_SIGNALED, NULL);
+					     0, NULL);
 	if (err) {
 		sess->hb_err_handler(usr_con);
 		return;
-- 
2.25.1



* [PATCH for-next 16/18] RDMA/rtrs-clt: Use bitmask to check sess->flags
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (14 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 15/18] RDMA/rtrs: Do not signal for heartbeat Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 17/18] RDMA/rtrs-srv: Do not signal REG_MR Jack Wang
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

We may want to add new flags later, so it's better to check sess->flags
with a bitmask instead of an equality test.
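
A small example of why the equality test is too strict (the second flag
and the values are hypothetical, only to illustrate the point):

    enum {
        RTRS_MSG_NEW_RKEY_F = 1 << 0,
        RTRS_MSG_SOME_NEW_F = 1 << 1,   /* hypothetical future flag */
    };

    u32 flags = RTRS_MSG_NEW_RKEY_F | RTRS_MSG_SOME_NEW_F;

    /* flags == RTRS_MSG_NEW_RKEY_F : false, rkey handling gets skipped */
    /* flags &  RTRS_MSG_NEW_RKEY_F : non-zero, still handled correctly */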

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 2053bf99418a..7644c3f627ca 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -494,7 +494,7 @@ static void rtrs_clt_recv_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 	int err;
 	struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
 
-	WARN_ON(sess->flags != RTRS_MSG_NEW_RKEY_F);
+	WARN_ON((sess->flags & RTRS_MSG_NEW_RKEY_F) == 0);
 	iu = container_of(wc->wr_cqe, struct rtrs_iu,
 			  cqe);
 	err = rtrs_iu_post_recv(&con->c, iu);
@@ -514,7 +514,7 @@ static void rtrs_clt_rkey_rsp_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 	u32 buf_id;
 	int err;
 
-	WARN_ON(sess->flags != RTRS_MSG_NEW_RKEY_F);
+	WARN_ON((sess->flags & RTRS_MSG_NEW_RKEY_F) == 0);
 
 	iu = container_of(wc->wr_cqe, struct rtrs_iu, cqe);
 
@@ -621,12 +621,12 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 		} else if (imm_type == RTRS_HB_MSG_IMM) {
 			WARN_ON(con->c.cid);
 			rtrs_send_hb_ack(&sess->s);
-			if (sess->flags == RTRS_MSG_NEW_RKEY_F)
+			if (sess->flags & RTRS_MSG_NEW_RKEY_F)
 				return  rtrs_clt_recv_done(con, wc);
 		} else if (imm_type == RTRS_HB_ACK_IMM) {
 			WARN_ON(con->c.cid);
 			sess->s.hb_missed_cnt = 0;
-			if (sess->flags == RTRS_MSG_NEW_RKEY_F)
+			if (sess->flags & RTRS_MSG_NEW_RKEY_F)
 				return  rtrs_clt_recv_done(con, wc);
 		} else {
 			rtrs_wrn(con->c.sess, "Unknown IMM type %u\n",
@@ -654,7 +654,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 		WARN_ON(!(wc->wc_flags & IB_WC_WITH_INVALIDATE ||
 			  wc->wc_flags & IB_WC_WITH_IMM));
 		WARN_ON(wc->wr_cqe->done != rtrs_clt_rdma_done);
-		if (sess->flags == RTRS_MSG_NEW_RKEY_F) {
+		if (sess->flags & RTRS_MSG_NEW_RKEY_F) {
 			if (wc->wc_flags & IB_WC_WITH_INVALIDATE)
 				return  rtrs_clt_recv_done(con, wc);
 
@@ -679,7 +679,7 @@ static int post_recv_io(struct rtrs_clt_con *con, size_t q_size)
 	struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
 
 	for (i = 0; i < q_size; i++) {
-		if (sess->flags == RTRS_MSG_NEW_RKEY_F) {
+		if (sess->flags & RTRS_MSG_NEW_RKEY_F) {
 			struct rtrs_iu *iu = &con->rsp_ius[i];
 
 			err = rtrs_iu_post_recv(&con->c, iu);
@@ -1563,7 +1563,7 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 			      sess->queue_depth * 3 + 1);
 	}
 	/* alloc iu to recv new rkey reply when server reports flags set */
-	if (sess->flags == RTRS_MSG_NEW_RKEY_F || con->c.cid == 0) {
+	if (sess->flags & RTRS_MSG_NEW_RKEY_F || con->c.cid == 0) {
 		con->rsp_ius = rtrs_iu_alloc(max_recv_wr, sizeof(*rsp),
 					      GFP_KERNEL, sess->s.dev->ib_dev,
 					      DMA_FROM_DEVICE,
-- 
2.25.1



* [PATCH for-next 17/18] RDMA/rtrs-srv: Do not signal REG_MR
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (15 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 16/18] RDMA/rtrs-clt: Use bitmask to check sess->flags Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-09 16:45 ` [PATCH for-next 18/18] RDMA/rtrs-srv: Init wr_cnt as 1 Jack Wang
  2020-12-11 19:48 ` [PATCH for-next 00/18] Misc update for rtrs Jason Gunthorpe
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

We do not need to wait for the REG_MR completion, so remove the
IB_SEND_SIGNALED flag.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 5065bf31729d..8363c407ab4b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -818,7 +818,7 @@ static int process_info_req(struct rtrs_srv_con *con,
 		rwr[mri].wr.opcode = IB_WR_REG_MR;
 		rwr[mri].wr.wr_cqe = &local_reg_cqe;
 		rwr[mri].wr.num_sge = 0;
-		rwr[mri].wr.send_flags = mri ? 0 : IB_SEND_SIGNALED;
+		rwr[mri].wr.send_flags = 0;
 		rwr[mri].mr = mr;
 		rwr[mri].key = mr->rkey;
 		rwr[mri].access = (IB_ACCESS_LOCAL_WRITE |
-- 
2.25.1



* [PATCH for-next 18/18] RDMA/rtrs-srv: Init wr_cnt as 1
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (16 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 17/18] RDMA/rtrs-srv: Do not signal REG_MR Jack Wang
@ 2020-12-09 16:45 ` Jack Wang
  2020-12-11 19:48 ` [PATCH for-next 00/18] Misc update for rtrs Jason Gunthorpe
  18 siblings, 0 replies; 34+ messages in thread
From: Jack Wang @ 2020-12-09 16:45 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

Fix up the wr_avail accounting: if wr_cnt is 0, the first wr is
signaled, and in its completion we add queue_depth back, which is not
right in terms of tracking the available wrs.

So fix it by initializing wr_cnt to 1.
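
A sketch of the accounting being fixed (paraphrasing the server-side
send path, not the literal driver code): a wr is signaled when the
counter hits a multiple of queue_depth, and each signaled completion
gives queue_depth send credits back:

    /* post path: consume one credit, signal every queue_depth-th wr */
    atomic_dec(&con->sq_wr_avail);
    flags = (atomic_fetch_inc(&con->wr_cnt) % srv->queue_depth) ?
            0 : IB_SEND_SIGNALED;

    /* completion handler (IB_WC_SEND / IB_WC_RDMA_WRITE): */
    atomic_add(srv->queue_depth, &con->sq_wr_avail);

With wr_cnt starting at 0 the very first wr is already signaled and
returns queue_depth credits although only one was consumed; starting
the counter at 1 defers the first signal until a full queue_depth of
wrs has actually been posted.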

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 8363c407ab4b..e1907c10c7e7 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1603,7 +1603,7 @@ static int create_con(struct rtrs_srv_sess *sess,
 	con->c.cm_id = cm_id;
 	con->c.sess = &sess->s;
 	con->c.cid = cid;
-	atomic_set(&con->wr_cnt, 0);
+	atomic_set(&con->wr_cnt, 1);
 
 	if (con->c.cid == 0) {
 		/*
-- 
2.25.1



* Re: [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-09 16:45 ` [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing Jack Wang
@ 2020-12-10 14:56   ` Jinpu Wang
  2020-12-11  2:33     ` Guoqing Jiang
  0 siblings, 1 reply; 34+ messages in thread
From: Jinpu Wang @ 2020-12-10 14:56 UTC (permalink / raw)
  To: linux-rdma
  Cc: Bart Van Assche, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis, Guoqing Jiang

On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
>
> If there are many establishments/teardowns, we need to make sure
> we do not consume too much system memory. Thus let on going
> session closing to finish before accepting new connection.
>
> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Please ignore this one, it could lead to a deadlock:
cma_ib_req_handler holds mutex_lock(&listen_id->handler_mutex) when
calling into rtrs_rdma_connect, where the new flush_workqueue waits
for close_work, which calls rdma_destroy_id, which could try to take
the same handler_mutex, so we would deadlock.

Sorry & thanks!

Jack

> ---
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index ed4628f032bb..0a2202c28b54 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -1791,6 +1791,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
>                 err = -ENOMEM;
>                 goto reject_w_err;
>         }
> +       if (!cid) {
> +               /* Let inflight session teardown complete */
> +               flush_workqueue(rtrs_wq);
> +       }
>         mutex_lock(&srv->paths_mutex);
>         sess = __find_sess(srv, &msg->sess_uuid);
>         if (sess) {
> --
> 2.25.1
>


* Re: [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-10 14:56   ` Jinpu Wang
@ 2020-12-11  2:33     ` Guoqing Jiang
  2020-12-11  6:50       ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Guoqing Jiang @ 2020-12-11  2:33 UTC (permalink / raw)
  To: Jinpu Wang, linux-rdma
  Cc: Bart Van Assche, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis



On 12/10/20 15:56, Jinpu Wang wrote:
> On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
>>
>> If there are many establishments/teardowns, we need to make sure
>> we do not consume too much system memory. Thus let on going
>> session closing to finish before accepting new connection.
>>
>> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
>> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
>> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> Please ignore this one, it could lead to deadlock, due to the fact
> cma_ib_req_handler is holding
> mutex_lock(&listen_id->handler_mutex) when calling into
> rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> which
> could try to hold the same handler_mutex, so deadlock.
> 

I am wondering whether nvmet-rdma has a similar issue; if so, maybe
introduce a locked version of rdma_destroy_id.

Thanks,
Guoqing


* Re: [PATCH for-next 02/18] RDMA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  2:33     ` Guoqing Jiang
@ 2020-12-11  6:50       ` Jinpu Wang
  2020-12-11  7:26         ` Leon Romanovsky
  0 siblings, 1 reply; 34+ messages in thread
From: Jinpu Wang @ 2020-12-11  6:50 UTC (permalink / raw)
  To: Guoqing Jiang
  Cc: linux-rdma, Bart Van Assche, Leon Romanovsky, Doug Ledford,
	Jason Gunthorpe, Danil Kipnis

On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang
<guoqing.jiang@cloud.ionos.com> wrote:
>
>
>
> On 12/10/20 15:56, Jinpu Wang wrote:
> > On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
> >>
> >> If there are many establishments/teardowns, we need to make sure
> >> we do not consume too much system memory. Thus let on going
> >> session closing to finish before accepting new connection.
> >>
> >> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> >> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> >> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> > Please ignore this one, it could lead to deadlock, due to the fact
> > cma_ib_req_handler is holding
> > mutex_lock(&listen_id->handler_mutex) when calling into
> > rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> > which
> > could try to hold the same handler_mutex, so deadlock.
> >
>
> I am wondering if nvmet-rdma has the similar issue or not, if so, maybe
> introduce a locked version of rdma_destroy_id.
>
> Thanks,
> Guoqing

No, I was wrong. I rechecked the code, and it is not a valid deadlock:
in cma_ib_req_handler, the conn_id is newly created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.

flush_workqueue will only flush close_work for any other cm_id, but not
for the newly created conn_id, which has not been associated with
anything yet.

The same applies to nvme-rdma, so it is a false alarm by lockdep.

Regards!
Jack

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  6:50       ` Jinpu Wang
@ 2020-12-11  7:26         ` Leon Romanovsky
  2020-12-11  7:53           ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Leon Romanovsky @ 2020-12-11  7:26 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: Guoqing Jiang, linux-rdma, Bart Van Assche, Doug Ledford,
	Jason Gunthorpe, Danil Kipnis

On Fri, Dec 11, 2020 at 07:50:13AM +0100, Jinpu Wang wrote:
> On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang
> <guoqing.jiang@cloud.ionos.com> wrote:
> >
> >
> >
> > On 12/10/20 15:56, Jinpu Wang wrote:
> > > On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
> > >>
> > >> If there are many establishments/teardowns, we need to make sure
> > >> we do not consume too much system memory. Thus let on going
> > >> session closing to finish before accepting new connection.
> > >>
> > >> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> > >> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > >> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> > > Please ignore this one, it could lead to deadlock, due to the fact
> > > cma_ib_req_handler is holding
> > > mutex_lock(&listen_id->handler_mutex) when calling into
> > > rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> > > which
> > > could try to hold the same handler_mutex, so deadlock.
> > >
> >
> > I am wondering if nvmet-rdma has the similar issue or not, if so, maybe
> > introduce a locked version of rdma_destroy_id.
> >
> > Thanks,
> > Guoqing
>
> No, I was wrong. I rechecked the code, it's not a valid deadlock, in
> cma_ib_req_handler, the conn_id is newly created in
> https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
>
> Flush_workqueue will only flush close_work for any other cm_id, but
> not the newly created one conn_id, it has not associated with anything
> yet.
>
> The same applies to nvme-rdma. so it's a false alarm by lockdep.

Leaving this without a fix (proper lock annotation) is not the right
thing to do, because everyone who runs the rtrs code with LOCKDEP on
will hit the same "false alarm".

So I recommend either not taking this patch or rewriting it so that it
does not trigger the LOCKDEP warning.

Thanks

>
> Regards!
> Jack

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  7:26         ` Leon Romanovsky
@ 2020-12-11  7:53           ` Jinpu Wang
  2020-12-11  7:58             ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Jinpu Wang @ 2020-12-11  7:53 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Guoqing Jiang, linux-rdma, Bart Van Assche, Doug Ledford,
	Jason Gunthorpe, Danil Kipnis

On Fri, Dec 11, 2020 at 8:26 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Dec 11, 2020 at 07:50:13AM +0100, Jinpu Wang wrote:
> > On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang
> > <guoqing.jiang@cloud.ionos.com> wrote:
> > >
> > >
> > >
> > > On 12/10/20 15:56, Jinpu Wang wrote:
> > > > On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
> > > >>
> > > >> If there are many establishments/teardowns, we need to make sure
> > > >> we do not consume too much system memory. Thus let on going
> > > >> session closing to finish before accepting new connection.
> > > >>
> > > >> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> > > >> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > >> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> > > > Please ignore this one, it could lead to deadlock, due to the fact
> > > > cma_ib_req_handler is holding
> > > > mutex_lock(&listen_id->handler_mutex) when calling into
> > > > rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> > > > which
> > > > could try to hold the same handler_mutex, so deadlock.
> > > >
> > >
> > > I am wondering if nvmet-rdma has the similar issue or not, if so, maybe
> > > introduce a locked version of rdma_destroy_id.
> > >
> > > Thanks,
> > > Guoqing
> >
> > No, I was wrong. I rechecked the code, it's not a valid deadlock, in
> > cma_ib_req_handler, the conn_id is newly created in
> > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
> >
> > Flush_workqueue will only flush close_work for any other cm_id, but
> > not the newly created one conn_id, it has not associated with anything
> > yet.
> >
> > The same applies to nvme-rdma. so it's a false alarm by lockdep.
>
> Leaving this without fix (proper lock annotation) is not right thing to
> do, because everyone who runs rtrs code with LOCKDEP on will have same
> "false alarm".
>
> So I recommend or not to take this patch or write it without LOCKDEP warning.
Hi Leon,

I'm thinking the same. Do you have a suggestion on how to teach LOCKDEP
that this is not really a deadlock? I do not know LOCKDEP well.

Thanks
>
> Thanks
>
> >
> > Regards!
> > Jack

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  7:53           ` Jinpu Wang
@ 2020-12-11  7:58             ` Jinpu Wang
  2020-12-11 13:45               ` Jason Gunthorpe
  2020-12-11 20:49               ` Leon Romanovsky
  0 siblings, 2 replies; 34+ messages in thread
From: Jinpu Wang @ 2020-12-11  7:58 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Guoqing Jiang, linux-rdma, Bart Van Assche, Doug Ledford,
	Jason Gunthorpe, Danil Kipnis

On Fri, Dec 11, 2020 at 8:53 AM Jinpu Wang <jinpu.wang@cloud.ionos.com> wrote:
>
> On Fri, Dec 11, 2020 at 8:26 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Dec 11, 2020 at 07:50:13AM +0100, Jinpu Wang wrote:
> > > On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang
> > > <guoqing.jiang@cloud.ionos.com> wrote:
> > > >
> > > >
> > > >
> > > > On 12/10/20 15:56, Jinpu Wang wrote:
> > > > > On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
> > > > >>
> > > > >> If there are many establishments/teardowns, we need to make sure
> > > > >> we do not consume too much system memory. Thus let on going
> > > > >> session closing to finish before accepting new connection.
> > > > >>
> > > > >> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> > > > >> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > >> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> > > > > Please ignore this one, it could lead to deadlock, due to the fact
> > > > > cma_ib_req_handler is holding
> > > > > mutex_lock(&listen_id->handler_mutex) when calling into
> > > > > rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> > > > > which
> > > > > could try to hold the same handler_mutex, so deadlock.
> > > > >
> > > >
> > > > I am wondering if nvmet-rdma has the similar issue or not, if so, maybe
> > > > introduce a locked version of rdma_destroy_id.
> > > >
> > > > Thanks,
> > > > Guoqing
> > >
> > > No, I was wrong. I rechecked the code, it's not a valid deadlock, in
> > > cma_ib_req_handler, the conn_id is newly created in
> > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
> > >
> > > Flush_workqueue will only flush close_work for any other cm_id, but
> > > not the newly created one conn_id, it has not associated with anything
> > > yet.
> > >
> > > The same applies to nvme-rdma. so it's a false alarm by lockdep.
> >
> > Leaving this without fix (proper lock annotation) is not right thing to
> > do, because everyone who runs rtrs code with LOCKDEP on will have same
> > "false alarm".
> >
> > So I recommend or not to take this patch or write it without LOCKDEP warning.
> Hi Leon,
>
> I'm thinking about the same, do you have a suggestion on how to teach
> LOCKDEP this is not really a deadlock,
> I do not know LOCKDEP well.
Found it myself: we can use lockdep_off(), as in

https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699
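
On top of the hunk in this patch it would look roughly like this (just
a sketch of the idea from the virtio_mem code above; whether silencing
lockdep here is acceptable is of course the open question):

  if (!cid) {
          /* Let inflight session teardown complete */
          lockdep_off();
          flush_workqueue(rtrs_wq);
          lockdep_on();
  }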

Thanks

>
> Thanks
> >
> > Thanks
> >
> > >
> > > Regards!
> > > Jack

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  7:58             ` Jinpu Wang
@ 2020-12-11 13:45               ` Jason Gunthorpe
       [not found]                 ` <CAD+HZHXso=S5c=MqgrmDMZpWi10FbPTinWPfLMTkMCCiosihCQ@mail.gmail.com>
  2020-12-11 20:49               ` Leon Romanovsky
  1 sibling, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2020-12-11 13:45 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: Leon Romanovsky, Guoqing Jiang, linux-rdma, Bart Van Assche,
	Doug Ledford, Danil Kipnis

On Fri, Dec 11, 2020 at 08:58:09AM +0100, Jinpu Wang wrote:
> > > > No, I was wrong. I rechecked the code, it's not a valid deadlock, in
> > > > cma_ib_req_handler, the conn_id is newly created in
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
> > > >
> > > > Flush_workqueue will only flush close_work for any other cm_id, but
> > > > not the newly created one conn_id, it has not associated with anything
> > > > yet.
> > > >
> > > > The same applies to nvme-rdma. so it's a false alarm by lockdep.
> > >
> > > Leaving this without fix (proper lock annotation) is not right thing to
> > > do, because everyone who runs rtrs code with LOCKDEP on will have same
> > > "false alarm".
> > >
> > > So I recommend or not to take this patch or write it without LOCKDEP warning.
> > Hi Leon,
> >
> > I'm thinking about the same, do you have a suggestion on how to teach
> > LOCKDEP this is not really a deadlock,
> > I do not know LOCKDEP well.
> Found it myself, we can use lockdep_off
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699

Gah, that is horrible.

I'm not completely sure it is false either: at this point two
handler_mutex's are locked (the listening ID's and the new ID's), and
there may be locking cycles that end up back on the listening ID, so it
is certainly not that simple.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
       [not found]                 ` <CAD+HZHXso=S5c=MqgrmDMZpWi10FbPTinWPfLMTkMCCiosihCQ@mail.gmail.com>
@ 2020-12-11 16:29                   ` Jason Gunthorpe
  2020-12-16 16:42                     ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2020-12-11 16:29 UTC (permalink / raw)
  To: Jack Wang
  Cc: Bart Van Assche, Danil Kipnis, Doug Ledford, Guoqing Jiang,
	Jinpu Wang, Leon Romanovsky, linux-rdma

On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
>    En, the lockdep was complaining about the new conn_id, I will
>    post the full log if needed next week.  let’s skip this patch for
>    now, will double check!

That is even more worrisome, as the new conn_id already has a different
lock class.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 00/18] Misc update for rtrs
  2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
                   ` (17 preceding siblings ...)
  2020-12-09 16:45 ` [PATCH for-next 18/18] RDMA/rtrs-srv: Init wr_cnt as 1 Jack Wang
@ 2020-12-11 19:48 ` Jason Gunthorpe
  18 siblings, 0 replies; 34+ messages in thread
From: Jason Gunthorpe @ 2020-12-11 19:48 UTC (permalink / raw)
  To: Jack Wang; +Cc: linux-rdma, bvanassche, leon, dledford, danil.kipnis

On Wed, Dec 09, 2020 at 05:45:24PM +0100, Jack Wang wrote:
> Hi Jason, hi Doug,
> 
> Please consider to include following changes to the next merge window.
> 
> It contains a few bugfix and cleanup.
> 
> The patches are created based on rdma/for-next.

Most of these are bug fixes, so they need Fixes: lines.

Please write a bit more in many of the commit messages; one line may
not explain things well enough.
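
For the record, the usual form is (the sha below is just a placeholder):

  Fixes: <first 12 chars of the sha> ("subject of the commit that introduced the bug")

and one way to generate it:

  git log -1 --pretty='Fixes: %h ("%s")' --abbrev=12 <offending commit>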

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11  7:58             ` Jinpu Wang
  2020-12-11 13:45               ` Jason Gunthorpe
@ 2020-12-11 20:49               ` Leon Romanovsky
  1 sibling, 0 replies; 34+ messages in thread
From: Leon Romanovsky @ 2020-12-11 20:49 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: Guoqing Jiang, linux-rdma, Bart Van Assche, Doug Ledford,
	Jason Gunthorpe, Danil Kipnis

On Fri, Dec 11, 2020 at 08:58:09AM +0100, Jinpu Wang wrote:
> On Fri, Dec 11, 2020 at 8:53 AM Jinpu Wang <jinpu.wang@cloud.ionos.com> wrote:
> >
> > On Fri, Dec 11, 2020 at 8:26 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Fri, Dec 11, 2020 at 07:50:13AM +0100, Jinpu Wang wrote:
> > > > On Fri, Dec 11, 2020 at 3:35 AM Guoqing Jiang
> > > > <guoqing.jiang@cloud.ionos.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 12/10/20 15:56, Jinpu Wang wrote:
> > > > > > On Wed, Dec 9, 2020 at 5:45 PM Jack Wang <jinpu.wang@cloud.ionos.com> wrote:
> > > > > >>
> > > > > >> If there are many establishments/teardowns, we need to make sure
> > > > > >> we do not consume too much system memory. Thus let on going
> > > > > >> session closing to finish before accepting new connection.
> > > > > >>
> > > > > >> Inspired by commit 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> > > > > >> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > >> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> > > > > > Please ignore this one, it could lead to deadlock, due to the fact
> > > > > > cma_ib_req_handler is holding
> > > > > > mutex_lock(&listen_id->handler_mutex) when calling into
> > > > > > rtrs_rdma_connect, we call close_work which will call rdma_destroy_id,
> > > > > > which
> > > > > > could try to hold the same handler_mutex, so deadlock.
> > > > > >
> > > > >
> > > > > I am wondering if nvmet-rdma has the similar issue or not, if so, maybe
> > > > > introduce a locked version of rdma_destroy_id.
> > > > >
> > > > > Thanks,
> > > > > Guoqing
> > > >
> > > > No, I was wrong. I rechecked the code, it's not a valid deadlock, in
> > > > cma_ib_req_handler, the conn_id is newly created in
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2185.
> > > >
> > > > Flush_workqueue will only flush close_work for any other cm_id, but
> > > > not the newly created one conn_id, it has not associated with anything
> > > > yet.
> > > >
> > > > The same applies to nvme-rdma. so it's a false alarm by lockdep.
> > >
> > > Leaving this without fix (proper lock annotation) is not right thing to
> > > do, because everyone who runs rtrs code with LOCKDEP on will have same
> > > "false alarm".
> > >
> > > So I recommend or not to take this patch or write it without LOCKDEP warning.
> > Hi Leon,
> >
> > I'm thinking about the same, do you have a suggestion on how to teach
> > LOCKDEP this is not really a deadlock,
> > I do not know LOCKDEP well.
> Found it myself, we can use lockdep_off
>
> https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mem.c#L699

My personal experience from internal/external reviews shows that claims
about false alarms in LOCKDEP warnings are almost always wrong.

Thanks

>
> Thanks
>
> >
> > Thanks
> > >
> > > Thanks
> > >
> > > >
> > > > Regards!
> > > > Jack

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-11 16:29                   ` Jason Gunthorpe
@ 2020-12-16 16:42                     ` Jinpu Wang
  2020-12-27  9:01                       ` Leon Romanovsky
  0 siblings, 1 reply; 34+ messages in thread
From: Jinpu Wang @ 2020-12-16 16:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jack Wang, Bart Van Assche, Danil Kipnis, Doug Ledford,
	Guoqing Jiang, Leon Romanovsky, linux-rdma

On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> >    En, the lockdep was complaining about the new conn_id, I will
> >    post the full log if needed next week.  let’s skip this patch for
> >    now, will double check!
>
> That is even more worrysome as the new conn_id already has a different
> lock class.
>
> Jason
This is the dmesg of the LOCKDEP warning; it is from kernel 5.4.77, but
the latest 5.10 behaves the same.

[  500.071552] ======================================================
[  500.071648] WARNING: possible circular locking dependency detected
[  500.071869] 5.4.77-storage+ #35 Tainted: G           O
[  500.071959] ------------------------------------------------------
[  500.072054] kworker/1:1/28 is trying to acquire lock:
[  500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at:
rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.072837]
               but task is already holding lock:
[  500.072938] ffff9d18800f7e80
((work_completion)(&sess->close_work)){+.+.}, at:
process_one_work+0x223/0x600
[  500.075642]
               which lock already depends on the new lock.

[  500.075759]
               the existing dependency chain (in reverse order) is:
[  500.075880]
               -> #3 ((work_completion)(&sess->close_work)){+.+.}:
[  500.076062]        process_one_work+0x278/0x600
[  500.076154]        worker_thread+0x2d/0x3d0
[  500.076225]        kthread+0x111/0x130
[  500.076290]        ret_from_fork+0x24/0x30
[  500.076370]
               -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
[  500.076482]        flush_workqueue+0xab/0x4b0
[  500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
[  500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
[  500.076770]        cm_process_work+0x22/0x140 [ib_cm]
[  500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
[  500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
[  500.077025]        process_one_work+0x29f/0x600
[  500.077097]        worker_thread+0x2d/0x3d0
[  500.077164]        kthread+0x111/0x130
[  500.077224]        ret_from_fork+0x24/0x30
[  500.077294]
               -> #1 (&id_priv->handler_mutex/1){+.+.}:
[  500.077409]        __mutex_lock+0x7e/0x950
[  500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
[  500.077582]        cm_process_work+0x22/0x140 [ib_cm]
[  500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
[  500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
[  500.077835]        process_one_work+0x29f/0x600
[  500.077907]        worker_thread+0x2d/0x3d0
[  500.077973]        kthread+0x111/0x130
[  500.078034]        ret_from_fork+0x24/0x30
[  500.078095]
               -> #0 (&id_priv->handler_mutex){+.+.}:
[  500.078196]        __lock_acquire+0x1166/0x1440
[  500.078267]        lock_acquire+0x90/0x170
[  500.078335]        __mutex_lock+0x7e/0x950
[  500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
[  500.078586]        process_one_work+0x29f/0x600
[  500.078662]        worker_thread+0x2d/0x3d0
[  500.078732]        kthread+0x111/0x130
[  500.078793]        ret_from_fork+0x24/0x30
[  500.078859]
               other info that might help us debug this:

[  500.078984] Chain exists of:
                 &id_priv->handler_mutex -->
(wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)

[  500.079207]  Possible unsafe locking scenario:

[  500.079293]        CPU0                    CPU1
[  500.079358]        ----                    ----
[  500.079358]   lock((work_completion)(&sess->close_work));
[  500.079358]
lock((wq_completion)rtrs_server_wq);
[  500.079358]
lock((work_completion)(&sess->close_work));
[  500.079358]   lock(&id_priv->handler_mutex);
[  500.079358]
                *** DEADLOCK ***

[  500.079358] 2 locks held by kworker/1:1/28:
[  500.079358]  #0: ffff99652d281f28
((wq_completion)rtrs_server_wq){+.+.}, at:
process_one_work+0x223/0x600
[  500.079358]  #1: ffff9d18800f7e80
((work_completion)(&sess->close_work)){+.+.}, at:
process_one_work+0x223/0x600
[  500.079358]
               stack backtrace:
[  500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O
     5.4.77-storage+ #35
[  500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1ubuntu1 04/01/2014
[  500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
[  500.079358] Call Trace:
[  500.079358]  dump_stack+0x71/0x9b
[  500.079358]  check_noncircular+0x17d/0x1a0
[  500.079358]  ? __lock_acquire+0x1166/0x1440
[  500.079358]  __lock_acquire+0x1166/0x1440
[  500.079358]  lock_acquire+0x90/0x170
[  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.079358]  __mutex_lock+0x7e/0x950
[  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.079358]  ? find_held_lock+0x2d/0x90
[  500.079358]  ? mark_held_locks+0x49/0x70
[  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
[  500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
[  500.079358]  process_one_work+0x29f/0x600
[  500.079358]  worker_thread+0x2d/0x3d0
[  500.079358]  ? process_one_work+0x600/0x600
[  500.079358]  kthread+0x111/0x130
[  500.079358]  ? kthread_park+0x90/0x90
[  500.079358]  ret_from_fork+0x24/0x30

According to my understanding, in cma_ib_req_handler the conn_id is
newly created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
and the rdma_cm_id associated with conn_id is passed to
rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.

In rtrs_rdma_connect, the flush_workqueue will only flush close_work
for any other cm_id, but not for the newly created conn_id, which has
not been associated with anything yet.

The same applies to nvme-rdma, so it is a false alarm by lockdep.

Regards!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-16 16:42                     ` Jinpu Wang
@ 2020-12-27  9:01                       ` Leon Romanovsky
  2021-01-04  8:06                         ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Leon Romanovsky @ 2020-12-27  9:01 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: Jason Gunthorpe, Jack Wang, Bart Van Assche, Danil Kipnis,
	Doug Ledford, Guoqing Jiang, linux-rdma

On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> > >    En, the lockdep was complaining about the new conn_id, I will
> > >    post the full log if needed next week.  let’s skip this patch for
> > >    now, will double check!
> >
> > That is even more worrysome as the new conn_id already has a different
> > lock class.
> >
> > Jason
> This is the dmesg of the LOCKDEP warning, it's on kernel 5.4.77, but
> the latest 5.10 behaves the same.
>
> [  500.071552] ======================================================
> [  500.071648] WARNING: possible circular locking dependency detected
> [  500.071869] 5.4.77-storage+ #35 Tainted: G           O
> [  500.071959] ------------------------------------------------------
> [  500.072054] kworker/1:1/28 is trying to acquire lock:
> [  500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at:
> rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.072837]
>                but task is already holding lock:
> [  500.072938] ffff9d18800f7e80
> ((work_completion)(&sess->close_work)){+.+.}, at:
> process_one_work+0x223/0x600
> [  500.075642]
>                which lock already depends on the new lock.
>
> [  500.075759]
>                the existing dependency chain (in reverse order) is:
> [  500.075880]
>                -> #3 ((work_completion)(&sess->close_work)){+.+.}:
> [  500.076062]        process_one_work+0x278/0x600
> [  500.076154]        worker_thread+0x2d/0x3d0
> [  500.076225]        kthread+0x111/0x130
> [  500.076290]        ret_from_fork+0x24/0x30
> [  500.076370]
>                -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
> [  500.076482]        flush_workqueue+0xab/0x4b0
> [  500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
> [  500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
> [  500.076770]        cm_process_work+0x22/0x140 [ib_cm]
> [  500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
> [  500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
> [  500.077025]        process_one_work+0x29f/0x600
> [  500.077097]        worker_thread+0x2d/0x3d0
> [  500.077164]        kthread+0x111/0x130
> [  500.077224]        ret_from_fork+0x24/0x30
> [  500.077294]
>                -> #1 (&id_priv->handler_mutex/1){+.+.}:
> [  500.077409]        __mutex_lock+0x7e/0x950
> [  500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
> [  500.077582]        cm_process_work+0x22/0x140 [ib_cm]
> [  500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
> [  500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
> [  500.077835]        process_one_work+0x29f/0x600
> [  500.077907]        worker_thread+0x2d/0x3d0
> [  500.077973]        kthread+0x111/0x130
> [  500.078034]        ret_from_fork+0x24/0x30
> [  500.078095]
>                -> #0 (&id_priv->handler_mutex){+.+.}:
> [  500.078196]        __lock_acquire+0x1166/0x1440
> [  500.078267]        lock_acquire+0x90/0x170
> [  500.078335]        __mutex_lock+0x7e/0x950
> [  500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> [  500.078586]        process_one_work+0x29f/0x600
> [  500.078662]        worker_thread+0x2d/0x3d0
> [  500.078732]        kthread+0x111/0x130
> [  500.078793]        ret_from_fork+0x24/0x30
> [  500.078859]
>                other info that might help us debug this:
>
> [  500.078984] Chain exists of:
>                  &id_priv->handler_mutex -->
> (wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)
>
> [  500.079207]  Possible unsafe locking scenario:
>
> [  500.079293]        CPU0                    CPU1
> [  500.079358]        ----                    ----
> [  500.079358]   lock((work_completion)(&sess->close_work));
> [  500.079358]
> lock((wq_completion)rtrs_server_wq);
> [  500.079358]
> lock((work_completion)(&sess->close_work));
> [  500.079358]   lock(&id_priv->handler_mutex);
> [  500.079358]
>                 *** DEADLOCK ***
>
> [  500.079358] 2 locks held by kworker/1:1/28:
> [  500.079358]  #0: ffff99652d281f28
> ((wq_completion)rtrs_server_wq){+.+.}, at:
> process_one_work+0x223/0x600
> [  500.079358]  #1: ffff9d18800f7e80
> ((work_completion)(&sess->close_work)){+.+.}, at:
> process_one_work+0x223/0x600
> [  500.079358]
>                stack backtrace:
> [  500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O
>      5.4.77-storage+ #35
> [  500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1ubuntu1 04/01/2014
> [  500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> [  500.079358] Call Trace:
> [  500.079358]  dump_stack+0x71/0x9b
> [  500.079358]  check_noncircular+0x17d/0x1a0
> [  500.079358]  ? __lock_acquire+0x1166/0x1440
> [  500.079358]  __lock_acquire+0x1166/0x1440
> [  500.079358]  lock_acquire+0x90/0x170
> [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.079358]  __mutex_lock+0x7e/0x950
> [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.079358]  ? find_held_lock+0x2d/0x90
> [  500.079358]  ? mark_held_locks+0x49/0x70
> [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
> [  500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> [  500.079358]  process_one_work+0x29f/0x600
> [  500.079358]  worker_thread+0x2d/0x3d0
> [  500.079358]  ? process_one_work+0x600/0x600
> [  500.079358]  kthread+0x111/0x130
> [  500.079358]  ? kthread_park+0x90/0x90
> [  500.079358]  ret_from_fork+0x24/0x30
>
> According to my understanding
> in cma_ib_req_handler, the conn_id is newly created in
> https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> And the rdma_cm_id associated with conn_id is passed to
> rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
>
> In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> for any other cm_id, but
> not the newly created one conn_id, it has not associated with anything yet.

How did you come to the conclusion that the rtrs handler was called
before cma_cm_event_handler()? I'm not so sure about that, and it would
explain the lockdep warning.

Thanks

>
> The same applies to nvme-rdma. so it's a false alarm by lockdep.
>
> Regards!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2020-12-27  9:01                       ` Leon Romanovsky
@ 2021-01-04  8:06                         ` Jinpu Wang
  2021-01-04  8:25                           ` Leon Romanovsky
  0 siblings, 1 reply; 34+ messages in thread
From: Jinpu Wang @ 2021-01-04  8:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, Jack Wang, Bart Van Assche, Danil Kipnis,
	Doug Ledford, Guoqing Jiang, linux-rdma

On Sun, Dec 27, 2020 at 10:01 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> > On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> > > >    En, the lockdep was complaining about the new conn_id, I will
> > > >    post the full log if needed next week.  let’s skip this patch for
> > > >    now, will double check!
> > >
> > > That is even more worrysome as the new conn_id already has a different
> > > lock class.
> > >
> > > Jason
> > This is the dmesg of the LOCKDEP warning, it's on kernel 5.4.77, but
> > the latest 5.10 behaves the same.
> >
> > [  500.071552] ======================================================
> > [  500.071648] WARNING: possible circular locking dependency detected
> > [  500.071869] 5.4.77-storage+ #35 Tainted: G           O
> > [  500.071959] ------------------------------------------------------
> > [  500.072054] kworker/1:1/28 is trying to acquire lock:
> > [  500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at:
> > rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.072837]
> >                but task is already holding lock:
> > [  500.072938] ffff9d18800f7e80
> > ((work_completion)(&sess->close_work)){+.+.}, at:
> > process_one_work+0x223/0x600
> > [  500.075642]
> >                which lock already depends on the new lock.
> >
> > [  500.075759]
> >                the existing dependency chain (in reverse order) is:
> > [  500.075880]
> >                -> #3 ((work_completion)(&sess->close_work)){+.+.}:
> > [  500.076062]        process_one_work+0x278/0x600
> > [  500.076154]        worker_thread+0x2d/0x3d0
> > [  500.076225]        kthread+0x111/0x130
> > [  500.076290]        ret_from_fork+0x24/0x30
> > [  500.076370]
> >                -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
> > [  500.076482]        flush_workqueue+0xab/0x4b0
> > [  500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
> > [  500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
> > [  500.076770]        cm_process_work+0x22/0x140 [ib_cm]
> > [  500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
> > [  500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > [  500.077025]        process_one_work+0x29f/0x600
> > [  500.077097]        worker_thread+0x2d/0x3d0
> > [  500.077164]        kthread+0x111/0x130
> > [  500.077224]        ret_from_fork+0x24/0x30
> > [  500.077294]
> >                -> #1 (&id_priv->handler_mutex/1){+.+.}:
> > [  500.077409]        __mutex_lock+0x7e/0x950
> > [  500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
> > [  500.077582]        cm_process_work+0x22/0x140 [ib_cm]
> > [  500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
> > [  500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > [  500.077835]        process_one_work+0x29f/0x600
> > [  500.077907]        worker_thread+0x2d/0x3d0
> > [  500.077973]        kthread+0x111/0x130
> > [  500.078034]        ret_from_fork+0x24/0x30
> > [  500.078095]
> >                -> #0 (&id_priv->handler_mutex){+.+.}:
> > [  500.078196]        __lock_acquire+0x1166/0x1440
> > [  500.078267]        lock_acquire+0x90/0x170
> > [  500.078335]        __mutex_lock+0x7e/0x950
> > [  500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > [  500.078586]        process_one_work+0x29f/0x600
> > [  500.078662]        worker_thread+0x2d/0x3d0
> > [  500.078732]        kthread+0x111/0x130
> > [  500.078793]        ret_from_fork+0x24/0x30
> > [  500.078859]
> >                other info that might help us debug this:
> >
> > [  500.078984] Chain exists of:
> >                  &id_priv->handler_mutex -->
> > (wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)
> >
> > [  500.079207]  Possible unsafe locking scenario:
> >
> > [  500.079293]        CPU0                    CPU1
> > [  500.079358]        ----                    ----
> > [  500.079358]   lock((work_completion)(&sess->close_work));
> > [  500.079358]
> > lock((wq_completion)rtrs_server_wq);
> > [  500.079358]
> > lock((work_completion)(&sess->close_work));
> > [  500.079358]   lock(&id_priv->handler_mutex);
> > [  500.079358]
> >                 *** DEADLOCK ***
> >
> > [  500.079358] 2 locks held by kworker/1:1/28:
> > [  500.079358]  #0: ffff99652d281f28
> > ((wq_completion)rtrs_server_wq){+.+.}, at:
> > process_one_work+0x223/0x600
> > [  500.079358]  #1: ffff9d18800f7e80
> > ((work_completion)(&sess->close_work)){+.+.}, at:
> > process_one_work+0x223/0x600
> > [  500.079358]
> >                stack backtrace:
> > [  500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O
> >      5.4.77-storage+ #35
> > [  500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS 1.10.2-1ubuntu1 04/01/2014
> > [  500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> > [  500.079358] Call Trace:
> > [  500.079358]  dump_stack+0x71/0x9b
> > [  500.079358]  check_noncircular+0x17d/0x1a0
> > [  500.079358]  ? __lock_acquire+0x1166/0x1440
> > [  500.079358]  __lock_acquire+0x1166/0x1440
> > [  500.079358]  lock_acquire+0x90/0x170
> > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.079358]  __mutex_lock+0x7e/0x950
> > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.079358]  ? find_held_lock+0x2d/0x90
> > [  500.079358]  ? mark_held_locks+0x49/0x70
> > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
> > [  500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > [  500.079358]  process_one_work+0x29f/0x600
> > [  500.079358]  worker_thread+0x2d/0x3d0
> > [  500.079358]  ? process_one_work+0x600/0x600
> > [  500.079358]  kthread+0x111/0x130
> > [  500.079358]  ? kthread_park+0x90/0x90
> > [  500.079358]  ret_from_fork+0x24/0x30
> >
> > According to my understanding
> > in cma_ib_req_handler, the conn_id is newly created in
> > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> > And the rdma_cm_id associated with conn_id is passed to
> > rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> >
> > In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> > for any other cm_id, but
> > not the newly created one conn_id, it has not associated with anything yet.
>
> How did you come to this conclusion that rtrs handler was called before
> cma_cm_event_handler()? I'm not so sure about that and it will explain
> the lockdep.
>
> Thanks
Hi Leon,
I never said that; the call chain here is:
cma_ib_req_handler->cma_cm_event_handler->rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
To repeat myself from the last email: in cma_ib_req_handler, the conn_id
is newly created in
https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222,
and the rdma_cm_id associated with conn_id is passed to
rtrs_rdma_connect.

In rtrs_rdma_connect, the flush_workqueue will only flush close_work for
any other cm_id, but not for the newly created conn_id; the rdma_cm_id
passed to rtrs_rdma_connect has not been associated with anything yet.

Hope this is now clear.

Happy New Year!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2021-01-04  8:06                         ` Jinpu Wang
@ 2021-01-04  8:25                           ` Leon Romanovsky
  2021-01-04 11:04                             ` Jinpu Wang
  0 siblings, 1 reply; 34+ messages in thread
From: Leon Romanovsky @ 2021-01-04  8:25 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: Jason Gunthorpe, Jack Wang, Bart Van Assche, Danil Kipnis,
	Doug Ledford, Guoqing Jiang, linux-rdma

On Mon, Jan 04, 2021 at 09:06:13AM +0100, Jinpu Wang wrote:
> On Sun, Dec 27, 2020 at 10:01 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> > > On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > >
> > > > On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> > > > >    En, the lockdep was complaining about the new conn_id, I will
> > > > >    post the full log if needed next week.  let’s skip this patch for
> > > > >    now, will double check!
> > > >
> > > > That is even more worrysome as the new conn_id already has a different
> > > > lock class.
> > > >
> > > > Jason
> > > This is the dmesg of the LOCKDEP warning, it's on kernel 5.4.77, but
> > > the latest 5.10 behaves the same.
> > >
> > > [  500.071552] ======================================================
> > > [  500.071648] WARNING: possible circular locking dependency detected
> > > [  500.071869] 5.4.77-storage+ #35 Tainted: G           O
> > > [  500.071959] ------------------------------------------------------
> > > [  500.072054] kworker/1:1/28 is trying to acquire lock:
> > > [  500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at:
> > > rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.072837]
> > >                but task is already holding lock:
> > > [  500.072938] ffff9d18800f7e80
> > > ((work_completion)(&sess->close_work)){+.+.}, at:
> > > process_one_work+0x223/0x600
> > > [  500.075642]
> > >                which lock already depends on the new lock.
> > >
> > > [  500.075759]
> > >                the existing dependency chain (in reverse order) is:
> > > [  500.075880]
> > >                -> #3 ((work_completion)(&sess->close_work)){+.+.}:
> > > [  500.076062]        process_one_work+0x278/0x600
> > > [  500.076154]        worker_thread+0x2d/0x3d0
> > > [  500.076225]        kthread+0x111/0x130
> > > [  500.076290]        ret_from_fork+0x24/0x30
> > > [  500.076370]
> > >                -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
> > > [  500.076482]        flush_workqueue+0xab/0x4b0
> > > [  500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
> > > [  500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
> > > [  500.076770]        cm_process_work+0x22/0x140 [ib_cm]
> > > [  500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
> > > [  500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > > [  500.077025]        process_one_work+0x29f/0x600
> > > [  500.077097]        worker_thread+0x2d/0x3d0
> > > [  500.077164]        kthread+0x111/0x130
> > > [  500.077224]        ret_from_fork+0x24/0x30
> > > [  500.077294]
> > >                -> #1 (&id_priv->handler_mutex/1){+.+.}:
> > > [  500.077409]        __mutex_lock+0x7e/0x950
> > > [  500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
> > > [  500.077582]        cm_process_work+0x22/0x140 [ib_cm]
> > > [  500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
> > > [  500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > > [  500.077835]        process_one_work+0x29f/0x600
> > > [  500.077907]        worker_thread+0x2d/0x3d0
> > > [  500.077973]        kthread+0x111/0x130
> > > [  500.078034]        ret_from_fork+0x24/0x30
> > > [  500.078095]
> > >                -> #0 (&id_priv->handler_mutex){+.+.}:
> > > [  500.078196]        __lock_acquire+0x1166/0x1440
> > > [  500.078267]        lock_acquire+0x90/0x170
> > > [  500.078335]        __mutex_lock+0x7e/0x950
> > > [  500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > > [  500.078586]        process_one_work+0x29f/0x600
> > > [  500.078662]        worker_thread+0x2d/0x3d0
> > > [  500.078732]        kthread+0x111/0x130
> > > [  500.078793]        ret_from_fork+0x24/0x30
> > > [  500.078859]
> > >                other info that might help us debug this:
> > >
> > > [  500.078984] Chain exists of:
> > >                  &id_priv->handler_mutex -->
> > > (wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)
> > >
> > > [  500.079207]  Possible unsafe locking scenario:
> > >
> > > [  500.079293]        CPU0                    CPU1
> > > [  500.079358]        ----                    ----
> > > [  500.079358]   lock((work_completion)(&sess->close_work));
> > > [  500.079358]
> > > lock((wq_completion)rtrs_server_wq);
> > > [  500.079358]
> > > lock((work_completion)(&sess->close_work));
> > > [  500.079358]   lock(&id_priv->handler_mutex);
> > > [  500.079358]
> > >                 *** DEADLOCK ***
> > >
> > > [  500.079358] 2 locks held by kworker/1:1/28:
> > > [  500.079358]  #0: ffff99652d281f28
> > > ((wq_completion)rtrs_server_wq){+.+.}, at:
> > > process_one_work+0x223/0x600
> > > [  500.079358]  #1: ffff9d18800f7e80
> > > ((work_completion)(&sess->close_work)){+.+.}, at:
> > > process_one_work+0x223/0x600
> > > [  500.079358]
> > >                stack backtrace:
> > > [  500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O
> > >      5.4.77-storage+ #35
> > > [  500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > BIOS 1.10.2-1ubuntu1 04/01/2014
> > > [  500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> > > [  500.079358] Call Trace:
> > > [  500.079358]  dump_stack+0x71/0x9b
> > > [  500.079358]  check_noncircular+0x17d/0x1a0
> > > [  500.079358]  ? __lock_acquire+0x1166/0x1440
> > > [  500.079358]  __lock_acquire+0x1166/0x1440
> > > [  500.079358]  lock_acquire+0x90/0x170
> > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.079358]  __mutex_lock+0x7e/0x950
> > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.079358]  ? find_held_lock+0x2d/0x90
> > > [  500.079358]  ? mark_held_locks+0x49/0x70
> > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > [  500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > > [  500.079358]  process_one_work+0x29f/0x600
> > > [  500.079358]  worker_thread+0x2d/0x3d0
> > > [  500.079358]  ? process_one_work+0x600/0x600
> > > [  500.079358]  kthread+0x111/0x130
> > > [  500.079358]  ? kthread_park+0x90/0x90
> > > [  500.079358]  ret_from_fork+0x24/0x30
> > >
> > > According to my understanding
> > > in cma_ib_req_handler, the conn_id is newly created in
> > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> > > And the rdma_cm_id associated with conn_id is passed to
> > > rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> > >
> > > In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> > > for any other cm_id, but
> > > not the newly created one conn_id, it has not associated with anything yet.
> >
> > How did you come to this conclusion that rtrs handler was called before
> > cma_cm_event_handler()? I'm not so sure about that and it will explain
> > the lockdep.
> >
> > Thanks
> Hi Leon,
> I never said that, the call chain here is:
> cma_ib_req_handler->cma_cm_event_handler->rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> Repeat myself in last email:
> in cma_ib_req_handler, the conn_id is newly created in
>  https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> And the rdma_cm_id associated with conn_id is passed to
> rtrs_rdma_connect.
>
> In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> for any other cm_id, but
> not the newly created one conn_id, the rdma_cm_id passed in
> rtrs_rdma_connect has not associated with anything yet.

This is exactly why I'm not so sure: after the rdma_cm_id returns from
RDMA/core, it will be in that flush_workqueue queue.

>
> Hope this is now clear.
>
> Happy New Year!

Happy New Year too :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing
  2021-01-04  8:25                           ` Leon Romanovsky
@ 2021-01-04 11:04                             ` Jinpu Wang
  0 siblings, 0 replies; 34+ messages in thread
From: Jinpu Wang @ 2021-01-04 11:04 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, Jack Wang, Bart Van Assche, Danil Kipnis,
	Doug Ledford, Guoqing Jiang, linux-rdma

On Mon, Jan 4, 2021 at 9:25 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Jan 04, 2021 at 09:06:13AM +0100, Jinpu Wang wrote:
> > On Sun, Dec 27, 2020 at 10:01 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Wed, Dec 16, 2020 at 05:42:17PM +0100, Jinpu Wang wrote:
> > > > On Fri, Dec 11, 2020 at 5:29 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > >
> > > > > On Fri, Dec 11, 2020 at 05:17:36PM +0100, Jack Wang wrote:
> > > > > >    En, the lockdep was complaining about the new conn_id, I will
> > > > > >    post the full log if needed next week.  let’s skip this patch for
> > > > > >    now, will double check!
> > > > >
> > > > > That is even more worrysome as the new conn_id already has a different
> > > > > lock class.
> > > > >
> > > > > Jason
> > > > This is the dmesg of the LOCKDEP warning, it's on kernel 5.4.77, but
> > > > the latest 5.10 behaves the same.
> > > >
> > > > [  500.071552] ======================================================
> > > > [  500.071648] WARNING: possible circular locking dependency detected
> > > > [  500.071869] 5.4.77-storage+ #35 Tainted: G           O
> > > > [  500.071959] ------------------------------------------------------
> > > > [  500.072054] kworker/1:1/28 is trying to acquire lock:
> > > > [  500.072200] ffff99653a624390 (&id_priv->handler_mutex){+.+.}, at:
> > > > rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.072837]
> > > >                but task is already holding lock:
> > > > [  500.072938] ffff9d18800f7e80
> > > > ((work_completion)(&sess->close_work)){+.+.}, at:
> > > > process_one_work+0x223/0x600
> > > > [  500.075642]
> > > >                which lock already depends on the new lock.
> > > >
> > > > [  500.075759]
> > > >                the existing dependency chain (in reverse order) is:
> > > > [  500.075880]
> > > >                -> #3 ((work_completion)(&sess->close_work)){+.+.}:
> > > > [  500.076062]        process_one_work+0x278/0x600
> > > > [  500.076154]        worker_thread+0x2d/0x3d0
> > > > [  500.076225]        kthread+0x111/0x130
> > > > [  500.076290]        ret_from_fork+0x24/0x30
> > > > [  500.076370]
> > > >                -> #2 ((wq_completion)rtrs_server_wq){+.+.}:
> > > > [  500.076482]        flush_workqueue+0xab/0x4b0
> > > > [  500.076565]        rtrs_srv_rdma_cm_handler+0x71d/0x1500 [rtrs_server]
> > > > [  500.076674]        cma_ib_req_handler+0x8c4/0x14f0 [rdma_cm]
> > > > [  500.076770]        cm_process_work+0x22/0x140 [ib_cm]
> > > > [  500.076857]        cm_req_handler+0x900/0xde0 [ib_cm]
> > > > [  500.076944]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > > > [  500.077025]        process_one_work+0x29f/0x600
> > > > [  500.077097]        worker_thread+0x2d/0x3d0
> > > > [  500.077164]        kthread+0x111/0x130
> > > > [  500.077224]        ret_from_fork+0x24/0x30
> > > > [  500.077294]
> > > >                -> #1 (&id_priv->handler_mutex/1){+.+.}:
> > > > [  500.077409]        __mutex_lock+0x7e/0x950
> > > > [  500.077488]        cma_ib_req_handler+0x787/0x14f0 [rdma_cm]
> > > > [  500.077582]        cm_process_work+0x22/0x140 [ib_cm]
> > > > [  500.077669]        cm_req_handler+0x900/0xde0 [ib_cm]
> > > > [  500.077755]        cm_work_handler+0x136/0x1af2 [ib_cm]
> > > > [  500.077835]        process_one_work+0x29f/0x600
> > > > [  500.077907]        worker_thread+0x2d/0x3d0
> > > > [  500.077973]        kthread+0x111/0x130
> > > > [  500.078034]        ret_from_fork+0x24/0x30
> > > > [  500.078095]
> > > >                -> #0 (&id_priv->handler_mutex){+.+.}:
> > > > [  500.078196]        __lock_acquire+0x1166/0x1440
> > > > [  500.078267]        lock_acquire+0x90/0x170
> > > > [  500.078335]        __mutex_lock+0x7e/0x950
> > > > [  500.078410]        rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.078498]        rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > > > [  500.078586]        process_one_work+0x29f/0x600
> > > > [  500.078662]        worker_thread+0x2d/0x3d0
> > > > [  500.078732]        kthread+0x111/0x130
> > > > [  500.078793]        ret_from_fork+0x24/0x30
> > > > [  500.078859]
> > > >                other info that might help us debug this:
> > > >
> > > > [  500.078984] Chain exists of:
> > > >                  &id_priv->handler_mutex -->
> > > > (wq_completion)rtrs_server_wq --> (work_completion)(&sess->close_work)
> > > >
> > > > [  500.079207]  Possible unsafe locking scenario:
> > > >
> > > > [  500.079293]        CPU0                    CPU1
> > > > [  500.079358]        ----                    ----
> > > > [  500.079358]   lock((work_completion)(&sess->close_work));
> > > > [  500.079358]
> > > > lock((wq_completion)rtrs_server_wq);
> > > > [  500.079358]
> > > > lock((work_completion)(&sess->close_work));
> > > > [  500.079358]   lock(&id_priv->handler_mutex);
> > > > [  500.079358]
> > > >                 *** DEADLOCK ***
> > > >
> > > > [  500.079358] 2 locks held by kworker/1:1/28:
> > > > [  500.079358]  #0: ffff99652d281f28
> > > > ((wq_completion)rtrs_server_wq){+.+.}, at:
> > > > process_one_work+0x223/0x600
> > > > [  500.079358]  #1: ffff9d18800f7e80
> > > > ((work_completion)(&sess->close_work)){+.+.}, at:
> > > > process_one_work+0x223/0x600
> > > > [  500.079358]
> > > >                stack backtrace:
> > > > [  500.079358] CPU: 1 PID: 28 Comm: kworker/1:1 Tainted: G           O
> > > >      5.4.77-storage+ #35
> > > > [  500.079358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > > BIOS 1.10.2-1ubuntu1 04/01/2014
> > > > [  500.079358] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
> > > > [  500.079358] Call Trace:
> > > > [  500.079358]  dump_stack+0x71/0x9b
> > > > [  500.079358]  check_noncircular+0x17d/0x1a0
> > > > [  500.079358]  ? __lock_acquire+0x1166/0x1440
> > > > [  500.079358]  __lock_acquire+0x1166/0x1440
> > > > [  500.079358]  lock_acquire+0x90/0x170
> > > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.079358]  __mutex_lock+0x7e/0x950
> > > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.079358]  ? find_held_lock+0x2d/0x90
> > > > [  500.079358]  ? mark_held_locks+0x49/0x70
> > > > [  500.079358]  ? rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.079358]  rdma_destroy_id+0x55/0x230 [rdma_cm]
> > > > [  500.079358]  rtrs_srv_close_work+0xf2/0x2d0 [rtrs_server]
> > > > [  500.079358]  process_one_work+0x29f/0x600
> > > > [  500.079358]  worker_thread+0x2d/0x3d0
> > > > [  500.079358]  ? process_one_work+0x600/0x600
> > > > [  500.079358]  kthread+0x111/0x130
> > > > [  500.079358]  ? kthread_park+0x90/0x90
> > > > [  500.079358]  ret_from_fork+0x24/0x30
> > > >
> > > > According to my understanding
> > > > in cma_ib_req_handler, the conn_id is newly created in
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> > > > And the rdma_cm_id associated with conn_id is passed to
> > > > rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> > > >
> > > > In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> > > > for any other cm_id, but
> > > > not the newly created one conn_id, it has not associated with anything yet.
> > >
> > > How did you come to this conclusion that rtrs handler was called before
> > > cma_cm_event_handler()? I'm not so sure about that and it will explain
> > > the lockdep.
> > >
> > > Thanks
> > Hi Leon,
> > I never said that, the call chain here is:
> > cma_ib_req_handler->cma_cm_event_handler->rtrs_srv_rdma_cm_handler->rtrs_rdma_connect.
> > Repeat myself in last email:
> > in cma_ib_req_handler, the conn_id is newly created in
> >  https://elixir.bootlin.com/linux/latest/source/drivers/infiniband/core/cma.c#L2222.
> > And the rdma_cm_id associated with conn_id is passed to
> > rtrs_rdma_connect.
> >
> > In rtrs_rdma_connect, we do flush_workqueue will only flush close_work
> > for any other cm_id, but
> > not the newly created one conn_id, the rdma_cm_id passed in
> > rtrs_rdma_connect has not associated with anything yet.
>
> This is exactly why I'm not so sure, after rdma_cm_id returns from
> RDMA/core, it will be in that flush_workqueue queue.

In rtrs_rdma_connect, we call flush_workqueue(rtrs_wq) at the
beginning, before we associate the rdma_cm_id (conn_id) with an
rtrs_srv_con via rtrs_srv_rdma_cm_handler -> rtrs_rdma_connect ->
create_con -> con->c.cm_id = cm_id. And it is only in
rtrs_srv_close_work that we do rdma_destroy_id(con->c.cm_id);

so at the time of the flush the rdma_cm_id is not reachable from any
queued close_work yet.
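
Roughly, the ordering being described (a hand-drawn outline
reconstructed from this discussion, not a verbatim copy of rtrs-srv.c):

  cma_ib_req_handler(listen_id)             /* holds the handler_mutex(es)      */
    conn_id = <newly created cm_id>
    rtrs_srv_rdma_cm_handler(conn_id, ...)
      rtrs_rdma_connect(conn_id, ...)
        flush_workqueue(rtrs_wq);           /* flushes close_work of OTHER      */
                                            /* sessions; conn_id not linked yet */
        create_con(sess, conn_id, cid)
          con->c.cm_id = conn_id;           /* only from this point on can      */
                                            /* rtrs_srv_close_work() ever reach */
                                            /* rdma_destroy_id(conn_id)         */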

Thanks!
>
> >
> > Hope this is now clear.
> >
> > Happy New Year!
>
> Happy New Year too :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2021-01-04 11:05 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09 16:45 [PATCH for-next 00/18] Misc update for rtrs Jack Wang
2020-12-09 16:45 ` [PATCH for-next 01/18] RDMA/rtrs: Extend ibtrs_cq_qp_create Jack Wang
2020-12-09 16:45 ` [PATCH for-next 02/18] RMDA/rtrs-srv: Occasionally flush ongoing session closing Jack Wang
2020-12-10 14:56   ` Jinpu Wang
2020-12-11  2:33     ` Guoqing Jiang
2020-12-11  6:50       ` Jinpu Wang
2020-12-11  7:26         ` Leon Romanovsky
2020-12-11  7:53           ` Jinpu Wang
2020-12-11  7:58             ` Jinpu Wang
2020-12-11 13:45               ` Jason Gunthorpe
     [not found]                 ` <CAD+HZHXso=S5c=MqgrmDMZpWi10FbPTinWPfLMTkMCCiosihCQ@mail.gmail.com>
2020-12-11 16:29                   ` Jason Gunthorpe
2020-12-16 16:42                     ` Jinpu Wang
2020-12-27  9:01                       ` Leon Romanovsky
2021-01-04  8:06                         ` Jinpu Wang
2021-01-04  8:25                           ` Leon Romanovsky
2021-01-04 11:04                             ` Jinpu Wang
2020-12-11 20:49               ` Leon Romanovsky
2020-12-09 16:45 ` [PATCH for-next 03/18] RDMA/rtrs-srv: Release lock before call into close_sess Jack Wang
2020-12-09 16:45 ` [PATCH for-next 04/18] RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect Jack Wang
2020-12-09 16:45 ` [PATCH for-next 05/18] RDMA/rtrs-clt: Set mininum limit when create QP Jack Wang
2020-12-09 16:45 ` [PATCH for-next 06/18] RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails Jack Wang
2020-12-09 16:45 ` [PATCH for-next 07/18] RDMA/rtrs: Call kobject_put in the failure path Jack Wang
2020-12-09 16:45 ` [PATCH for-next 08/18] RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files} Jack Wang
2020-12-09 16:45 ` [PATCH for-next 09/18] RDMA/rtrs-clt: Kill wait_for_inflight_permits Jack Wang
2020-12-09 16:45 ` [PATCH for-next 10/18] RDMA/rtrs-clt: Remove unnecessary 'goto out' Jack Wang
2020-12-09 16:45 ` [PATCH for-next 11/18] RDMA/rtrs-clt: Kill rtrs_clt_change_state Jack Wang
2020-12-09 16:45 ` [PATCH for-next 12/18] RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state Jack Wang
2020-12-09 16:45 ` [PATCH for-next 13/18] RDMA/rtrs-srv: Fix missing wr_cqe Jack Wang
2020-12-09 16:45 ` [PATCH for-next 14/18] RDMA/rtrs-clt: Refactor the failure cases in alloc_clt Jack Wang
2020-12-09 16:45 ` [PATCH for-next 15/18] RDMA/rtrs: Do not signal for heatbeat Jack Wang
2020-12-09 16:45 ` [PATCH for-next 16/18] RDMA/rtrs-clt: Use bitmask to check sess->flags Jack Wang
2020-12-09 16:45 ` [PATCH for-next 17/18] RDMA/rtrs-srv: Do not signal REG_MR Jack Wang
2020-12-09 16:45 ` [PATCH for-next 18/18] RDMA/rtrs-srv: Init wr_cnt as 1 Jack Wang
2020-12-11 19:48 ` [PATCH for-next 00/18] Misc update for rtrs Jason Gunthorpe
