All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next 0/4] A few bugfix for RTRS
@ 2021-02-11  6:55 Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds Jack Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jack Wang @ 2021-02-11  6:55 UTC (permalink / raw)
  To: linux-rdma; +Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang

Hi Jason, hi Doug,

Please consider to include follwing bugfix to next release.

One bugfix for KASAN splat due to we use wrong structure type when send
RDMA_WRITE_WITH_IMM opcode from me.

One bugfix for allowing addition of random path to exsition session from Haris.

2 bugfix for memory leak from Gioh.

Thanks!
Jack Wang

Gioh Kim (2):
  RDMA/rtrs-srv: fix memory leak by missing kobject free
  RDMA/rtrs-srv-sysfs: fix missing put_device

Jack Wang (1):
  RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds

Md Haris Iqbal (1):
  RDMA/rtrs: Only allow addition of path to an already established
    session

 drivers/infiniband/ulp/rtrs/rtrs-clt.c       |  5 ++
 drivers/infiniband/ulp/rtrs/rtrs-clt.h       |  1 +
 drivers/infiniband/ulp/rtrs/rtrs-pri.h       |  4 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c |  4 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 91 ++++++++++++--------
 5 files changed, 65 insertions(+), 40 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds
  2021-02-11  6:55 [PATCH for-next 0/4] A few bugfix for RTRS Jack Wang
@ 2021-02-11  6:55 ` Jack Wang
  2021-02-11  8:33   ` Leon Romanovsky
  2021-02-11  6:55 ` [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session Jack Wang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Jack Wang @ 2021-02-11  6:55 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

[  125.352752] ==================================================================
[  125.353122] BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
[  125.353337] Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565

[  125.353655] CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
[  125.353924] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
[  125.354144] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[  125.354317] Call Trace:
[  125.354531]  dump_stack+0x96/0xe0
[  125.354715]  print_address_description.constprop.4+0x1f/0x300
[  125.354943]  ? irq_work_claim+0x2e/0x50
[  125.355129]  __kasan_report.cold.8+0x78/0x92
[  125.355334]  ? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
[  125.355545]  kasan_report+0x10/0x20
[  125.355730]  _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
[  125.355930]  ? check_chain_key+0x1d7/0x2e0
[  125.356203]  ? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib]
[  125.356399]  ? lockdep_hardirqs_on+0x1a8/0x290
[  125.356595]  ? stack_depot_save+0x218/0x56e
[  125.356781]  ? do_profile_hits.isra.6.cold.13+0x1d/0x1d
[  125.356973]  ? check_chain_key+0x1d7/0x2e0
[  125.357174]  ? save_stack+0x4d/0x80
[  125.357374]  ? save_stack+0x19/0x80
[  125.357547]  ? __kasan_slab_free+0x125/0x170
[  125.357728]  ? kfree+0xe7/0x3b0
[  125.357899]  rdma_write_sg+0x5b0/0x950 [rtrs_server]

The problem is when we send imm_wr, the type should be ib_rdma_wr, so
hw driver like mlx4 can do rdma_wr(wr), so fix it by use the ib_rdma_wr
as type for imm_wr.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 64 +++++++++++++-------------
 1 file changed, 33 insertions(+), 31 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 918f1cf140cd..e13e91c2a44a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -222,7 +222,8 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 	dma_addr_t dma_addr = sess->dma_addr[id->msg_id];
 	struct rtrs_srv_mr *srv_mr;
 	struct rtrs_srv *srv = sess->srv;
-	struct ib_send_wr inv_wr, imm_wr;
+	struct ib_send_wr inv_wr;
+	struct ib_rdma_wr imm_wr;
 	struct ib_rdma_wr *wr = NULL;
 	enum ib_send_flags flags;
 	size_t sg_cnt;
@@ -274,15 +275,15 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 	if (need_inval && always_invalidate) {
 		wr->wr.next = &rwr.wr;
 		rwr.wr.next = &inv_wr;
-		inv_wr.next = &imm_wr;
+		inv_wr.next = &imm_wr.wr;
 	} else if (always_invalidate) {
 		wr->wr.next = &rwr.wr;
-		rwr.wr.next = &imm_wr;
+		rwr.wr.next = &imm_wr.wr;
 	} else if (need_inval) {
 		wr->wr.next = &inv_wr;
-		inv_wr.next = &imm_wr;
+		inv_wr.next = &imm_wr.wr;
 	} else {
-		wr->wr.next = &imm_wr;
+		wr->wr.next = &imm_wr.wr;
 	}
 	/*
 	 * From time to time we have to post signaled sends,
@@ -300,7 +301,7 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 		inv_wr.ex.invalidate_rkey = rkey;
 	}
 
-	imm_wr.next = NULL;
+	imm_wr.wr.next = NULL;
 	if (always_invalidate) {
 		struct rtrs_msg_rkey_rsp *msg;
 
@@ -321,22 +322,22 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 		list.addr   = srv_mr->iu->dma_addr;
 		list.length = sizeof(*msg);
 		list.lkey   = sess->s.dev->ib_pd->local_dma_lkey;
-		imm_wr.sg_list = &list;
-		imm_wr.num_sge = 1;
-		imm_wr.opcode = IB_WR_SEND_WITH_IMM;
+		imm_wr.wr.sg_list = &list;
+		imm_wr.wr.num_sge = 1;
+		imm_wr.wr.opcode = IB_WR_SEND_WITH_IMM;
 		ib_dma_sync_single_for_device(sess->s.dev->ib_dev,
 					      srv_mr->iu->dma_addr,
 					      srv_mr->iu->size, DMA_TO_DEVICE);
 	} else {
-		imm_wr.sg_list = NULL;
-		imm_wr.num_sge = 0;
-		imm_wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
+		imm_wr.wr.sg_list = NULL;
+		imm_wr.wr.num_sge = 0;
+		imm_wr.wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
 	}
-	imm_wr.send_flags = flags;
-	imm_wr.ex.imm_data = cpu_to_be32(rtrs_to_io_rsp_imm(id->msg_id,
+	imm_wr.wr.send_flags = flags;
+	imm_wr.wr.ex.imm_data = cpu_to_be32(rtrs_to_io_rsp_imm(id->msg_id,
 							     0, need_inval));
 
-	imm_wr.wr_cqe   = &io_comp_cqe;
+	imm_wr.wr.wr_cqe   = &io_comp_cqe;
 	ib_dma_sync_single_for_device(sess->s.dev->ib_dev, dma_addr,
 				      offset, DMA_BIDIRECTIONAL);
 
@@ -363,7 +364,8 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 {
 	struct rtrs_sess *s = con->c.sess;
 	struct rtrs_srv_sess *sess = to_srv_sess(s);
-	struct ib_send_wr inv_wr, imm_wr, *wr = NULL;
+	struct ib_send_wr inv_wr, *wr = NULL;
+	struct ib_rdma_wr imm_wr;
 	struct ib_reg_wr rwr;
 	struct rtrs_srv *srv = sess->srv;
 	struct rtrs_srv_mr *srv_mr;
@@ -400,15 +402,15 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 	if (need_inval && always_invalidate) {
 		wr = &inv_wr;
 		inv_wr.next = &rwr.wr;
-		rwr.wr.next = &imm_wr;
+		rwr.wr.next = &imm_wr.wr;
 	} else if (always_invalidate) {
 		wr = &rwr.wr;
-		rwr.wr.next = &imm_wr;
+		rwr.wr.next = &imm_wr.wr;
 	} else if (need_inval) {
 		wr = &inv_wr;
-		inv_wr.next = &imm_wr;
+		inv_wr.next = &imm_wr.wr;
 	} else {
-		wr = &imm_wr;
+		wr = &imm_wr.wr;
 	}
 	/*
 	 * From time to time we have to post signalled sends,
@@ -417,13 +419,13 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 	flags = (atomic_inc_return(&con->wr_cnt) % srv->queue_depth) ?
 		0 : IB_SEND_SIGNALED;
 	imm = rtrs_to_io_rsp_imm(id->msg_id, errno, need_inval);
-	imm_wr.next = NULL;
+	imm_wr.wr.next = NULL;
 	if (always_invalidate) {
 		struct ib_sge list;
 		struct rtrs_msg_rkey_rsp *msg;
 
 		srv_mr = &sess->mrs[id->msg_id];
-		rwr.wr.next = &imm_wr;
+		rwr.wr.next = &imm_wr.wr;
 		rwr.wr.opcode = IB_WR_REG_MR;
 		rwr.wr.wr_cqe = &local_reg_cqe;
 		rwr.wr.num_sge = 0;
@@ -440,21 +442,21 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 		list.addr   = srv_mr->iu->dma_addr;
 		list.length = sizeof(*msg);
 		list.lkey   = sess->s.dev->ib_pd->local_dma_lkey;
-		imm_wr.sg_list = &list;
-		imm_wr.num_sge = 1;
-		imm_wr.opcode = IB_WR_SEND_WITH_IMM;
+		imm_wr.wr.sg_list = &list;
+		imm_wr.wr.num_sge = 1;
+		imm_wr.wr.opcode = IB_WR_SEND_WITH_IMM;
 		ib_dma_sync_single_for_device(sess->s.dev->ib_dev,
 					      srv_mr->iu->dma_addr,
 					      srv_mr->iu->size, DMA_TO_DEVICE);
 	} else {
-		imm_wr.sg_list = NULL;
-		imm_wr.num_sge = 0;
-		imm_wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
+		imm_wr.wr.sg_list = NULL;
+		imm_wr.wr.num_sge = 0;
+		imm_wr.wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
 	}
-	imm_wr.send_flags = flags;
-	imm_wr.wr_cqe   = &io_comp_cqe;
+	imm_wr.wr.send_flags = flags;
+	imm_wr.wr.wr_cqe   = &io_comp_cqe;
 
-	imm_wr.ex.imm_data = cpu_to_be32(imm);
+	imm_wr.wr.ex.imm_data = cpu_to_be32(imm);
 
 	err = ib_post_send(id->con->c.qp, wr, NULL);
 	if (unlikely(err))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  6:55 [PATCH for-next 0/4] A few bugfix for RTRS Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds Jack Wang
@ 2021-02-11  6:55 ` Jack Wang
  2021-02-11  8:43   ` Leon Romanovsky
  2021-02-11  6:55 ` [PATCH for-next 3/4] RDMA/rtrs-srv: fix memory leak by missing kobject free Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device Jack Wang
  3 siblings, 1 reply; 12+ messages in thread
From: Jack Wang @ 2021-02-11  6:55 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang,
	Md Haris Iqbal, Lutz Pogrell

From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>

While adding a path from the client side to an already established
session, it was possible to provide the destination IP to a different
server. This is dangerous.

This commit adds an extra member to the rtrs_msg_conn_req structure, named
first_conn; which is supposed to notify if the connection request is the
first for that session or not.

On the server side, if a session does not exist but the first_conn
received inside the rtrs_msg_conn_req structure is 1, the connection
request is failed. This signifies that the connection request is for an
already existing session, and since the server did not find one, it is an
wrong connection request.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
 drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
 4 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 7644c3f627ca..a110e520b0a4 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -31,6 +31,8 @@
  */
 #define RTRS_RECONNECT_SEED 8
 
+#define FIRST_CONN 0x01
+
 MODULE_DESCRIPTION("RDMA Transport Client");
 MODULE_LICENSE("GPL");
 
@@ -1660,6 +1662,7 @@ static int rtrs_rdma_route_resolved(struct rtrs_clt_con *con)
 		.cid_num = cpu_to_le16(sess->s.con_num),
 		.recon_cnt = cpu_to_le16(sess->s.recon_cnt),
 	};
+	msg.first_conn = sess->for_new_clt ? (FIRST_CONN & 0xff) : 0;
 	uuid_copy(&msg.sess_uuid, &sess->s.uuid);
 	uuid_copy(&msg.paths_uuid, &clt->paths_uuid);
 
@@ -2662,6 +2665,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 			err = PTR_ERR(sess);
 			goto close_all_sess;
 		}
+		sess->for_new_clt = true;
 		list_add_tail_rcu(&sess->s.entry, &clt->paths_list);
 
 		err = init_sess(sess);
@@ -2913,6 +2917,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
 	if (IS_ERR(sess))
 		return PTR_ERR(sess);
 
+	sess->for_new_clt = false;
 	/*
 	 * It is totally safe to add path in CONNECTING state: coming
 	 * IO will never grab it.  Also it is very important to add
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index a97a068c4c28..3f1a05373470 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -143,6 +143,7 @@ struct rtrs_clt_sess {
 	int			max_send_sge;
 	u32			flags;
 	struct kobject		kobj;
+	bool			for_new_clt;
 	struct rtrs_clt_stats	*stats;
 	/* cache hca_port and hca_name to display in sysfs */
 	u8			hca_port;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index d5621e6fad1b..8caad0a2322b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -188,7 +188,9 @@ struct rtrs_msg_conn_req {
 	__le16		recon_cnt;
 	uuid_t		sess_uuid;
 	uuid_t		paths_uuid;
-	u8		reserved[12];
+	u8		first_conn : 1;
+	u8		reserved_bits : 7;
+	u8		reserved[11];
 };
 
 /**
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index e13e91c2a44a..2538a84fe5fc 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1333,10 +1333,12 @@ static void free_srv(struct rtrs_srv *srv)
 }
 
 static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
-					   const uuid_t *paths_uuid)
+					  const uuid_t *paths_uuid,
+					  bool first_conn)
 {
 	struct rtrs_srv *srv;
 	int i;
+	int err = -ENOMEM;
 
 	mutex_lock(&ctx->srv_mutex);
 	list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
@@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
 			return srv;
 		}
 	}
+	/*
+	 * If this request is not the first connection request from the
+	 * client for this session then fail and return error.
+	 */
+	if (!first_conn) {
+		err = -ENXIO;
+		goto err;
+	}
 
 	/* need to allocate a new srv */
 	srv = kzalloc(sizeof(*srv), GFP_KERNEL);
 	if  (!srv) {
 		mutex_unlock(&ctx->srv_mutex);
-		return NULL;
+		goto err;
 	}
 
 	INIT_LIST_HEAD(&srv->paths_list);
@@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
 
 err_free_srv:
 	kfree(srv);
-	return NULL;
+err:
+	return ERR_PTR(err);
 }
 
 static void put_srv(struct rtrs_srv *srv)
@@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 		goto reject_w_econnreset;
 	}
 	recon_cnt = le16_to_cpu(msg->recon_cnt);
-	srv = get_or_create_srv(ctx, &msg->paths_uuid);
+	srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
 	/*
 	 * "refcount == 0" happens if a previous thread calls get_or_create_srv
 	 * allocate srv, but chunks of srv are not allocated yet.
 	 */
-	if (!srv || refcount_read(&srv->refcount) == 0) {
+	if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
 		err = -ENOMEM;
 		goto reject_w_err;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next 3/4] RDMA/rtrs-srv: fix memory leak by missing kobject free
  2021-02-11  6:55 [PATCH for-next 0/4] A few bugfix for RTRS Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session Jack Wang
@ 2021-02-11  6:55 ` Jack Wang
  2021-02-11  6:55 ` [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device Jack Wang
  3 siblings, 0 replies; 12+ messages in thread
From: Jack Wang @ 2021-02-11  6:55 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

kmemleak reported an error as below:

unreferenced object 0xffff8880674b7640 (size 64):
  comm "kworker/4:1H", pid 113, jiffies 4296403507 (age 507.840s)
  hex dump (first 32 bytes):
    69 70 3a 31 39 32 2e 31 36 38 2e 31 32 32 2e 31  ip:192.168.122.1
    31 30 40 69 70 3a 31 39 32 2e 31 36 38 2e 31 32  10@ip:192.168.12
  backtrace:
    [<0000000054413611>] kstrdup+0x2e/0x60
    [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
    [<00000000ca2be3ee>] kobject_init_and_add+0xb0/0x120
    [<0000000062ba5e78>] rtrs_srv_create_sess_files+0x14c/0x314 [rtrs_server]
    [<00000000b45b7217>] rtrs_srv_info_req_done+0x5b1/0x800 [rtrs_server]
    [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
    [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
    [<00000000cfc376be>] process_one_work+0x4bc/0x980
    [<0000000016e5c96a>] worker_thread+0x78/0x5c0
    [<00000000c20b8be0>] kthread+0x191/0x1e0
    [<000000006c9c0003>] ret_from_fork+0x3a/0x50

It is caused by the not-freed kobject of rtrs_srv_sess.
The kobject embedded in rtrs_srv_sess has ref-counter 2 after calling
process_info_req(). Therefore it must call kobject_put twice.
Currently it calls kobject_put only once at rtrs_srv_destroy_sess_files
because kobject_del removes the state_in_sysfs flag and then
kobject_put in free_sess() is not called.

This patch moves kobject_del() into free_sess() so that the kobject
of rtrs_srv_sess can be freed. And also this patch adds the missing
call of sysfs_remove_group() to clean-up the sysfs directory.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 2 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 6 ++++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index 0a3886629cae..94e3f3290500 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -305,7 +305,7 @@ void rtrs_srv_destroy_sess_files(struct rtrs_srv_sess *sess)
 	if (sess->kobj.state_in_sysfs) {
 		kobject_del(&sess->stats->kobj_stats);
 		kobject_put(&sess->stats->kobj_stats);
-		kobject_del(&sess->kobj);
+		sysfs_remove_group(&sess->kobj, &rtrs_srv_sess_attr_group);
 		kobject_put(&sess->kobj);
 
 		rtrs_srv_destroy_once_sysfs_root_folders(sess);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 2538a84fe5fc..3f1b4ce3c055 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1477,10 +1477,12 @@ static bool __is_path_w_addr_exists(struct rtrs_srv *srv,
 
 static void free_sess(struct rtrs_srv_sess *sess)
 {
-	if (sess->kobj.state_in_sysfs)
+	if (sess->kobj.state_in_sysfs) {
+		kobject_del(&sess->kobj);
 		kobject_put(&sess->kobj);
-	else
+	} else {
 		kfree(sess);
+	}
 }
 
 static void rtrs_srv_close_work(struct work_struct *work)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device
  2021-02-11  6:55 [PATCH for-next 0/4] A few bugfix for RTRS Jack Wang
                   ` (2 preceding siblings ...)
  2021-02-11  6:55 ` [PATCH for-next 3/4] RDMA/rtrs-srv: fix memory leak by missing kobject free Jack Wang
@ 2021-02-11  6:55 ` Jack Wang
  2021-02-11  8:48   ` Leon Romanovsky
  3 siblings, 1 reply; 12+ messages in thread
From: Jack Wang @ 2021-02-11  6:55 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, danil.kipnis, jinpu.wang, Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

put_device() decreases the ref-count and then the device will
be cleaned-up, while at is also add missing put_device in
rtrs_srv_create_once_sysfs_root_folders

This patch solves a kmemleak error as below:

unreferenced object 0xffff88809a7a0710 (size 8):
  comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s)
  hex dump (first 8 bytes):
    62 6c 61 00 6b 6b 6b a5                          bla.kkk.
  backtrace:
    [<0000000054413611>] kstrdup+0x2e/0x60
    [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
    [<00000000f1a17a6b>] dev_set_name+0xab/0xe0
    [<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server]
    [<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server]
    [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
    [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
    [<00000000cfc376be>] process_one_work+0x4bc/0x980
    [<0000000016e5c96a>] worker_thread+0x78/0x5c0
    [<00000000c20b8be0>] kthread+0x191/0x1e0
    [<000000006c9c0003>] ret_from_fork+0x3a/0x50

Fixes: baa5b28b7a47 ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index 94e3f3290500..126a96e75c62 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -183,6 +183,7 @@ static int rtrs_srv_create_once_sysfs_root_folders(struct rtrs_srv_sess *sess)
 		err = -ENOMEM;
 		pr_err("kobject_create_and_add(): %d\n", err);
 		device_del(&srv->dev);
+		put_device(&srv->dev);
 		goto unlock;
 	}
 	dev_set_uevent_suppress(&srv->dev, false);
@@ -208,6 +209,7 @@ rtrs_srv_destroy_once_sysfs_root_folders(struct rtrs_srv_sess *sess)
 		kobject_put(srv->kobj_paths);
 		mutex_unlock(&srv->paths_mutex);
 		device_del(&srv->dev);
+		put_device(&srv->dev);
 	} else {
 		mutex_unlock(&srv->paths_mutex);
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds
  2021-02-11  6:55 ` [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds Jack Wang
@ 2021-02-11  8:33   ` Leon Romanovsky
  0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2021-02-11  8:33 UTC (permalink / raw)
  To: Jack Wang; +Cc: linux-rdma, bvanassche, dledford, jgg, danil.kipnis, Gioh Kim

On Thu, Feb 11, 2021 at 07:55:23AM +0100, Jack Wang wrote:
> [  125.352752] ==================================================================
> [  125.353122] BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
> [  125.353337] Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565
>
> [  125.353655] CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
> [  125.353924] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
> [  125.354144] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> [  125.354317] Call Trace:
> [  125.354531]  dump_stack+0x96/0xe0
> [  125.354715]  print_address_description.constprop.4+0x1f/0x300
> [  125.354943]  ? irq_work_claim+0x2e/0x50
> [  125.355129]  __kasan_report.cold.8+0x78/0x92
> [  125.355334]  ? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
> [  125.355545]  kasan_report+0x10/0x20
> [  125.355730]  _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
> [  125.355930]  ? check_chain_key+0x1d7/0x2e0
> [  125.356203]  ? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib]
> [  125.356399]  ? lockdep_hardirqs_on+0x1a8/0x290
> [  125.356595]  ? stack_depot_save+0x218/0x56e
> [  125.356781]  ? do_profile_hits.isra.6.cold.13+0x1d/0x1d
> [  125.356973]  ? check_chain_key+0x1d7/0x2e0
> [  125.357174]  ? save_stack+0x4d/0x80
> [  125.357374]  ? save_stack+0x19/0x80
> [  125.357547]  ? __kasan_slab_free+0x125/0x170
> [  125.357728]  ? kfree+0xe7/0x3b0
> [  125.357899]  rdma_write_sg+0x5b0/0x950 [rtrs_server]
>
> The problem is when we send imm_wr, the type should be ib_rdma_wr, so
> hw driver like mlx4 can do rdma_wr(wr), so fix it by use the ib_rdma_wr
> as type for imm_wr.
>
> Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 64 +++++++++++++-------------
>  1 file changed, 33 insertions(+), 31 deletions(-)
>

You have many lines with double space before "=".

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  6:55 ` [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session Jack Wang
@ 2021-02-11  8:43   ` Leon Romanovsky
  2021-02-11  9:23     ` Jinpu Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2021-02-11  8:43 UTC (permalink / raw)
  To: Jack Wang
  Cc: linux-rdma, bvanassche, dledford, jgg, danil.kipnis,
	Md Haris Iqbal, Lutz Pogrell

On Thu, Feb 11, 2021 at 07:55:24AM +0100, Jack Wang wrote:
> From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
>
> While adding a path from the client side to an already established
> session, it was possible to provide the destination IP to a different
> server. This is dangerous.
>
> This commit adds an extra member to the rtrs_msg_conn_req structure, named
> first_conn; which is supposed to notify if the connection request is the
> first for that session or not.
>
> On the server side, if a session does not exist but the first_conn
> received inside the rtrs_msg_conn_req structure is 1, the connection
> request is failed. This signifies that the connection request is for an
> already existing session, and since the server did not find one, it is an
> wrong connection request.
>
> Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
> Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
>  drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
>  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
>  4 files changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> index 7644c3f627ca..a110e520b0a4 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> @@ -31,6 +31,8 @@
>   */
>  #define RTRS_RECONNECT_SEED 8
>
> +#define FIRST_CONN 0x01
> +
>  MODULE_DESCRIPTION("RDMA Transport Client");
>  MODULE_LICENSE("GPL");
>
> @@ -1660,6 +1662,7 @@ static int rtrs_rdma_route_resolved(struct rtrs_clt_con *con)
>  		.cid_num = cpu_to_le16(sess->s.con_num),
>  		.recon_cnt = cpu_to_le16(sess->s.recon_cnt),
>  	};
> +	msg.first_conn = sess->for_new_clt ? (FIRST_CONN & 0xff) : 0;

Why do you need this "&"? You can directly set FIRST_CONN.

>  	uuid_copy(&msg.sess_uuid, &sess->s.uuid);
>  	uuid_copy(&msg.paths_uuid, &clt->paths_uuid);
>
> @@ -2662,6 +2665,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
>  			err = PTR_ERR(sess);
>  			goto close_all_sess;
>  		}
> +		sess->for_new_clt = true;
>  		list_add_tail_rcu(&sess->s.entry, &clt->paths_list);
>
>  		err = init_sess(sess);
> @@ -2913,6 +2917,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
>  	if (IS_ERR(sess))
>  		return PTR_ERR(sess);
>
> +	sess->for_new_clt = false;
>  	/*
>  	 * It is totally safe to add path in CONNECTING state: coming
>  	 * IO will never grab it.  Also it is very important to add
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> index a97a068c4c28..3f1a05373470 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> @@ -143,6 +143,7 @@ struct rtrs_clt_sess {
>  	int			max_send_sge;
>  	u32			flags;
>  	struct kobject		kobj;
> +	bool			for_new_clt;

Let's not add bool to structs.

>  	struct rtrs_clt_stats	*stats;
>  	/* cache hca_port and hca_name to display in sysfs */
>  	u8			hca_port;
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> index d5621e6fad1b..8caad0a2322b 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> @@ -188,7 +188,9 @@ struct rtrs_msg_conn_req {
>  	__le16		recon_cnt;
>  	uuid_t		sess_uuid;
>  	uuid_t		paths_uuid;
> -	u8		reserved[12];
> +	u8		first_conn : 1;
> +	u8		reserved_bits : 7;
> +	u8		reserved[11];
>  };
>
>  /**
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index e13e91c2a44a..2538a84fe5fc 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -1333,10 +1333,12 @@ static void free_srv(struct rtrs_srv *srv)
>  }
>
>  static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> -					   const uuid_t *paths_uuid)
> +					  const uuid_t *paths_uuid,
> +					  bool first_conn)
>  {
>  	struct rtrs_srv *srv;
>  	int i;
> +	int err = -ENOMEM;

You can avoid this "err" thing and return ERR or NULL directly and check
IS_ERR_OR_NULL() later. Or you shouldn't overwrite an error after
get_or_create_srv() call.

>
>  	mutex_lock(&ctx->srv_mutex);
>  	list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
> @@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
>  			return srv;
>  		}
>  	}
> +	/*
> +	 * If this request is not the first connection request from the
> +	 * client for this session then fail and return error.
> +	 */
> +	if (!first_conn) {
> +		err = -ENXIO;
> +		goto err;
> +	}

Are you sure that this check not racy?

Thanks

>
>  	/* need to allocate a new srv */
>  	srv = kzalloc(sizeof(*srv), GFP_KERNEL);
>  	if  (!srv) {
>  		mutex_unlock(&ctx->srv_mutex);
> -		return NULL;
> +		goto err;
>  	}
>
>  	INIT_LIST_HEAD(&srv->paths_list);
> @@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
>
>  err_free_srv:
>  	kfree(srv);
> -	return NULL;
> +err:
> +	return ERR_PTR(err);
>  }
>
>  static void put_srv(struct rtrs_srv *srv)
> @@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
>  		goto reject_w_econnreset;
>  	}
>  	recon_cnt = le16_to_cpu(msg->recon_cnt);
> -	srv = get_or_create_srv(ctx, &msg->paths_uuid);
> +	srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
>  	/*
>  	 * "refcount == 0" happens if a previous thread calls get_or_create_srv
>  	 * allocate srv, but chunks of srv are not allocated yet.
>  	 */
> -	if (!srv || refcount_read(&srv->refcount) == 0) {
> +	if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
>  		err = -ENOMEM;
>  		goto reject_w_err;
>  	}
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device
  2021-02-11  6:55 ` [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device Jack Wang
@ 2021-02-11  8:48   ` Leon Romanovsky
  0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2021-02-11  8:48 UTC (permalink / raw)
  To: Jack Wang; +Cc: linux-rdma, bvanassche, dledford, jgg, danil.kipnis, Gioh Kim

On Thu, Feb 11, 2021 at 07:55:26AM +0100, Jack Wang wrote:
> From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
>
> put_device() decreases the ref-count and then the device will
> be cleaned-up, while at is also add missing put_device in
> rtrs_srv_create_once_sysfs_root_folders
>
> This patch solves a kmemleak error as below:
>
> unreferenced object 0xffff88809a7a0710 (size 8):
>   comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s)
>   hex dump (first 8 bytes):
>     62 6c 61 00 6b 6b 6b a5                          bla.kkk.
>   backtrace:
>     [<0000000054413611>] kstrdup+0x2e/0x60
>     [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
>     [<00000000f1a17a6b>] dev_set_name+0xab/0xe0
>     [<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server]
>     [<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server]
>     [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
>     [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
>     [<00000000cfc376be>] process_one_work+0x4bc/0x980
>     [<0000000016e5c96a>] worker_thread+0x78/0x5c0
>     [<00000000c20b8be0>] kthread+0x191/0x1e0
>     [<000000006c9c0003>] ret_from_fork+0x3a/0x50
>
> Fixes: baa5b28b7a47 ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
> Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
>

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  8:43   ` Leon Romanovsky
@ 2021-02-11  9:23     ` Jinpu Wang
  2021-02-11  9:36       ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jinpu Wang @ 2021-02-11  9:23 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-rdma, Bart Van Assche, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis, Md Haris Iqbal, Lutz Pogrell

On Thu, Feb 11, 2021 at 9:43 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Feb 11, 2021 at 07:55:24AM +0100, Jack Wang wrote:
> > From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> >
> > While adding a path from the client side to an already established
> > session, it was possible to provide the destination IP to a different
> > server. This is dangerous.
> >
> > This commit adds an extra member to the rtrs_msg_conn_req structure, named
> > first_conn; which is supposed to notify if the connection request is the
> > first for that session or not.
> >
> > On the server side, if a session does not exist but the first_conn
> > received inside the rtrs_msg_conn_req structure is 1, the connection
> > request is failed. This signifies that the connection request is for an
> > already existing session, and since the server did not find one, it is an
> > wrong connection request.
> >
> > Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
> > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > ---
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
> >  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
> >  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
> >  4 files changed, 25 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > index 7644c3f627ca..a110e520b0a4 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > @@ -31,6 +31,8 @@
> >   */
> >  #define RTRS_RECONNECT_SEED 8
> >
> > +#define FIRST_CONN 0x01
> > +
> >  MODULE_DESCRIPTION("RDMA Transport Client");
> >  MODULE_LICENSE("GPL");
> >
> > @@ -1660,6 +1662,7 @@ static int rtrs_rdma_route_resolved(struct rtrs_clt_con *con)
> >               .cid_num = cpu_to_le16(sess->s.con_num),
> >               .recon_cnt = cpu_to_le16(sess->s.recon_cnt),
> >       };
> > +     msg.first_conn = sess->for_new_clt ? (FIRST_CONN & 0xff) : 0;
>
> Why do you need this "&"? You can directly set FIRST_CONN.
you're right, will change.
>
> >       uuid_copy(&msg.sess_uuid, &sess->s.uuid);
> >       uuid_copy(&msg.paths_uuid, &clt->paths_uuid);
> >
> > @@ -2662,6 +2665,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
> >                       err = PTR_ERR(sess);
> >                       goto close_all_sess;
> >               }
> > +             sess->for_new_clt = true;
> >               list_add_tail_rcu(&sess->s.entry, &clt->paths_list);
> >
> >               err = init_sess(sess);
> > @@ -2913,6 +2917,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
> >       if (IS_ERR(sess))
> >               return PTR_ERR(sess);
> >
> > +     sess->for_new_clt = false;
> >       /*
> >        * It is totally safe to add path in CONNECTING state: coming
> >        * IO will never grab it.  Also it is very important to add
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > index a97a068c4c28..3f1a05373470 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > @@ -143,6 +143,7 @@ struct rtrs_clt_sess {
> >       int                     max_send_sge;
> >       u32                     flags;
> >       struct kobject          kobj;
> > +     bool                    for_new_clt;
>
> Let's not add bool to structs.
ok, will change to u8.
>
> >       struct rtrs_clt_stats   *stats;
> >       /* cache hca_port and hca_name to display in sysfs */
> >       u8                      hca_port;
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > index d5621e6fad1b..8caad0a2322b 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > @@ -188,7 +188,9 @@ struct rtrs_msg_conn_req {
> >       __le16          recon_cnt;
> >       uuid_t          sess_uuid;
> >       uuid_t          paths_uuid;
> > -     u8              reserved[12];
> > +     u8              first_conn : 1;
> > +     u8              reserved_bits : 7;
> > +     u8              reserved[11];
> >  };
> >
> >  /**
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > index e13e91c2a44a..2538a84fe5fc 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > @@ -1333,10 +1333,12 @@ static void free_srv(struct rtrs_srv *srv)
> >  }
> >
> >  static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > -                                        const uuid_t *paths_uuid)
> > +                                       const uuid_t *paths_uuid,
> > +                                       bool first_conn)
> >  {
> >       struct rtrs_srv *srv;
> >       int i;
> > +     int err = -ENOMEM;
>
> You can avoid this "err" thing and return ERR or NULL directly and check
> IS_ERR_OR_NULL() later. Or you shouldn't overwrite an error after
> get_or_create_srv() call.
Sounds good, will change.

>
> >
> >       mutex_lock(&ctx->srv_mutex);
> >       list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
> > @@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> >                       return srv;
> >               }
> >       }
> > +     /*
> > +      * If this request is not the first connection request from the
> > +      * client for this session then fail and return error.
> > +      */
> > +     if (!first_conn) {
> > +             err = -ENXIO;
> > +             goto err;
> > +     }
>
> Are you sure that this check not racy?
I can't see how a function parameter check can be racy, can you elaborate?
>
> Thanks
Thanks for the review.!
>
> >
> >       /* need to allocate a new srv */
> >       srv = kzalloc(sizeof(*srv), GFP_KERNEL);
> >       if  (!srv) {
> >               mutex_unlock(&ctx->srv_mutex);
> > -             return NULL;
> > +             goto err;
> >       }
> >
> >       INIT_LIST_HEAD(&srv->paths_list);
> > @@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> >
> >  err_free_srv:
> >       kfree(srv);
> > -     return NULL;
> > +err:
> > +     return ERR_PTR(err);
> >  }
> >
> >  static void put_srv(struct rtrs_srv *srv)
> > @@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
> >               goto reject_w_econnreset;
> >       }
> >       recon_cnt = le16_to_cpu(msg->recon_cnt);
> > -     srv = get_or_create_srv(ctx, &msg->paths_uuid);
> > +     srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
> >       /*
> >        * "refcount == 0" happens if a previous thread calls get_or_create_srv
> >        * allocate srv, but chunks of srv are not allocated yet.
> >        */
> > -     if (!srv || refcount_read(&srv->refcount) == 0) {
> > +     if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
> >               err = -ENOMEM;
> >               goto reject_w_err;
> >       }
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  9:23     ` Jinpu Wang
@ 2021-02-11  9:36       ` Leon Romanovsky
  2021-02-11  9:48         ` Jinpu Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2021-02-11  9:36 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: linux-rdma, Bart Van Assche, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis, Md Haris Iqbal, Lutz Pogrell

On Thu, Feb 11, 2021 at 10:23:54AM +0100, Jinpu Wang wrote:
> On Thu, Feb 11, 2021 at 9:43 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Feb 11, 2021 at 07:55:24AM +0100, Jack Wang wrote:
> > > From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > >
> > > While adding a path from the client side to an already established
> > > session, it was possible to provide the destination IP to a different
> > > server. This is dangerous.
> > >
> > > This commit adds an extra member to the rtrs_msg_conn_req structure, named
> > > first_conn; which is supposed to notify if the connection request is the
> > > first for that session or not.
> > >
> > > On the server side, if a session does not exist but the first_conn
> > > received inside the rtrs_msg_conn_req structure is 1, the connection
> > > request is failed. This signifies that the connection request is for an
> > > already existing session, and since the server did not find one, it is an
> > > wrong connection request.
> > >
> > > Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
> > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > > Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > > Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
> > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > ---
> > >  drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
> > >  drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
> > >  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
> > >  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
> > >  4 files changed, 25 insertions(+), 6 deletions(-)

<...>

> > >
> > >       mutex_lock(&ctx->srv_mutex);
> > >       list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
> > > @@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > >                       return srv;
> > >               }
> > >       }
> > > +     /*
> > > +      * If this request is not the first connection request from the
> > > +      * client for this session then fail and return error.
> > > +      */
> > > +     if (!first_conn) {
> > > +             err = -ENXIO;
> > > +             goto err;
> > > +     }
> >
> > Are you sure that this check not racy?
> I can't see how a function parameter check can be racy, can you elaborate?

get_or_create_srv() itself is protected with mutex_lock, but it can be called
in parallel by rtrs_rdma_connect(), this is why I asked.

Thanks

> >
> > Thanks
> Thanks for the review.!
> >
> > >
> > >       /* need to allocate a new srv */
> > >       srv = kzalloc(sizeof(*srv), GFP_KERNEL);
> > >       if  (!srv) {
> > >               mutex_unlock(&ctx->srv_mutex);
> > > -             return NULL;
> > > +             goto err;
> > >       }
> > >
> > >       INIT_LIST_HEAD(&srv->paths_list);
> > > @@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > >
> > >  err_free_srv:
> > >       kfree(srv);
> > > -     return NULL;
> > > +err:
> > > +     return ERR_PTR(err);
> > >  }
> > >
> > >  static void put_srv(struct rtrs_srv *srv)
> > > @@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
> > >               goto reject_w_econnreset;
> > >       }
> > >       recon_cnt = le16_to_cpu(msg->recon_cnt);
> > > -     srv = get_or_create_srv(ctx, &msg->paths_uuid);
> > > +     srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
> > >       /*
> > >        * "refcount == 0" happens if a previous thread calls get_or_create_srv
> > >        * allocate srv, but chunks of srv are not allocated yet.
> > >        */
> > > -     if (!srv || refcount_read(&srv->refcount) == 0) {
> > > +     if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
> > >               err = -ENOMEM;
> > >               goto reject_w_err;
> > >       }
> > > --
> > > 2.25.1
> > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  9:36       ` Leon Romanovsky
@ 2021-02-11  9:48         ` Jinpu Wang
  2021-02-11  9:51           ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jinpu Wang @ 2021-02-11  9:48 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-rdma, Bart Van Assche, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis, Md Haris Iqbal, Lutz Pogrell

On Thu, Feb 11, 2021 at 10:37 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Feb 11, 2021 at 10:23:54AM +0100, Jinpu Wang wrote:
> > On Thu, Feb 11, 2021 at 9:43 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Thu, Feb 11, 2021 at 07:55:24AM +0100, Jack Wang wrote:
> > > > From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > > >
> > > > While adding a path from the client side to an already established
> > > > session, it was possible to provide the destination IP to a different
> > > > server. This is dangerous.
> > > >
> > > > This commit adds an extra member to the rtrs_msg_conn_req structure, named
> > > > first_conn; which is supposed to notify if the connection request is the
> > > > first for that session or not.
> > > >
> > > > On the server side, if a session does not exist but the first_conn
> > > > received inside the rtrs_msg_conn_req structure is 1, the connection
> > > > request is failed. This signifies that the connection request is for an
> > > > already existing session, and since the server did not find one, it is an
> > > > wrong connection request.
> > > >
> > > > Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
> > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > > > Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
> > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > ---
> > > >  drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
> > > >  drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
> > > >  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
> > > >  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
> > > >  4 files changed, 25 insertions(+), 6 deletions(-)
>
> <...>
>
> > > >
> > > >       mutex_lock(&ctx->srv_mutex);
> > > >       list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
> > > > @@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > > >                       return srv;
> > > >               }
> > > >       }
> > > > +     /*
> > > > +      * If this request is not the first connection request from the
> > > > +      * client for this session then fail and return error.
> > > > +      */
> > > > +     if (!first_conn) {
> > > > +             err = -ENXIO;
> > > > +             goto err;
> > > > +     }
> > >
> > > Are you sure that this check not racy?
> > I can't see how a function parameter check can be racy, can you elaborate?
>
> get_or_create_srv() itself is protected with mutex_lock, but it can be called
> in parallel by rtrs_rdma_connect(), this is why I asked.

I think again, still can't see how it could be racy.
Thanks!
>
> Thanks
>
> > >
> > > Thanks
> > Thanks for the review.!
> > >
> > > >
> > > >       /* need to allocate a new srv */
> > > >       srv = kzalloc(sizeof(*srv), GFP_KERNEL);
> > > >       if  (!srv) {
> > > >               mutex_unlock(&ctx->srv_mutex);
> > > > -             return NULL;
> > > > +             goto err;
> > > >       }
> > > >
> > > >       INIT_LIST_HEAD(&srv->paths_list);
> > > > @@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > > >
> > > >  err_free_srv:
> > > >       kfree(srv);
> > > > -     return NULL;
> > > > +err:
> > > > +     return ERR_PTR(err);
> > > >  }
> > > >
> > > >  static void put_srv(struct rtrs_srv *srv)
> > > > @@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
> > > >               goto reject_w_econnreset;
> > > >       }
> > > >       recon_cnt = le16_to_cpu(msg->recon_cnt);
> > > > -     srv = get_or_create_srv(ctx, &msg->paths_uuid);
> > > > +     srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
> > > >       /*
> > > >        * "refcount == 0" happens if a previous thread calls get_or_create_srv
> > > >        * allocate srv, but chunks of srv are not allocated yet.
> > > >        */
> > > > -     if (!srv || refcount_read(&srv->refcount) == 0) {
> > > > +     if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
> > > >               err = -ENOMEM;
> > > >               goto reject_w_err;
> > > >       }
> > > > --
> > > > 2.25.1
> > > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session
  2021-02-11  9:48         ` Jinpu Wang
@ 2021-02-11  9:51           ` Leon Romanovsky
  0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2021-02-11  9:51 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: linux-rdma, Bart Van Assche, Doug Ledford, Jason Gunthorpe,
	Danil Kipnis, Md Haris Iqbal, Lutz Pogrell

On Thu, Feb 11, 2021 at 10:48:48AM +0100, Jinpu Wang wrote:
> On Thu, Feb 11, 2021 at 10:37 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Feb 11, 2021 at 10:23:54AM +0100, Jinpu Wang wrote:
> > > On Thu, Feb 11, 2021 at 9:43 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Thu, Feb 11, 2021 at 07:55:24AM +0100, Jack Wang wrote:
> > > > > From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > > > >
> > > > > While adding a path from the client side to an already established
> > > > > session, it was possible to provide the destination IP to a different
> > > > > server. This is dangerous.
> > > > >
> > > > > This commit adds an extra member to the rtrs_msg_conn_req structure, named
> > > > > first_conn; which is supposed to notify if the connection request is the
> > > > > first for that session or not.
> > > > >
> > > > > On the server side, if a session does not exist but the first_conn
> > > > > received inside the rtrs_msg_conn_req structure is 1, the connection
> > > > > request is failed. This signifies that the connection request is for an
> > > > > already existing session, and since the server did not find one, it is an
> > > > > wrong connection request.
> > > > >
> > > > > Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
> > > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> > > > > Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
> > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > ---
> > > > >  drivers/infiniband/ulp/rtrs/rtrs-clt.c |  5 +++++
> > > > >  drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 +
> > > > >  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  4 +++-
> > > > >  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 21 ++++++++++++++++-----
> > > > >  4 files changed, 25 insertions(+), 6 deletions(-)
> >
> > <...>
> >
> > > > >
> > > > >       mutex_lock(&ctx->srv_mutex);
> > > > >       list_for_each_entry(srv, &ctx->srv_list, ctx_list) {
> > > > > @@ -1346,12 +1348,20 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > > > >                       return srv;
> > > > >               }
> > > > >       }
> > > > > +     /*
> > > > > +      * If this request is not the first connection request from the
> > > > > +      * client for this session then fail and return error.
> > > > > +      */
> > > > > +     if (!first_conn) {
> > > > > +             err = -ENXIO;
> > > > > +             goto err;
> > > > > +     }
> > > >
> > > > Are you sure that this check not racy?
> > > I can't see how a function parameter check can be racy, can you elaborate?
> >
> > get_or_create_srv() itself is protected with mutex_lock, but it can be called
> > in parallel by rtrs_rdma_connect(), this is why I asked.
>
> I think again, still can't see how it could be racy.

No problem, thanks.

> Thanks!
> >
> > Thanks
> >
> > > >
> > > > Thanks
> > > Thanks for the review.!
> > > >
> > > > >
> > > > >       /* need to allocate a new srv */
> > > > >       srv = kzalloc(sizeof(*srv), GFP_KERNEL);
> > > > >       if  (!srv) {
> > > > >               mutex_unlock(&ctx->srv_mutex);
> > > > > -             return NULL;
> > > > > +             goto err;
> > > > >       }
> > > > >
> > > > >       INIT_LIST_HEAD(&srv->paths_list);
> > > > > @@ -1386,7 +1396,8 @@ static struct rtrs_srv *get_or_create_srv(struct rtrs_srv_ctx *ctx,
> > > > >
> > > > >  err_free_srv:
> > > > >       kfree(srv);
> > > > > -     return NULL;
> > > > > +err:
> > > > > +     return ERR_PTR(err);
> > > > >  }
> > > > >
> > > > >  static void put_srv(struct rtrs_srv *srv)
> > > > > @@ -1787,12 +1798,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
> > > > >               goto reject_w_econnreset;
> > > > >       }
> > > > >       recon_cnt = le16_to_cpu(msg->recon_cnt);
> > > > > -     srv = get_or_create_srv(ctx, &msg->paths_uuid);
> > > > > +     srv = get_or_create_srv(ctx, &msg->paths_uuid, msg->first_conn);
> > > > >       /*
> > > > >        * "refcount == 0" happens if a previous thread calls get_or_create_srv
> > > > >        * allocate srv, but chunks of srv are not allocated yet.
> > > > >        */
> > > > > -     if (!srv || refcount_read(&srv->refcount) == 0) {
> > > > > +     if (IS_ERR(srv) || refcount_read(&srv->refcount) == 0) {
> > > > >               err = -ENOMEM;
> > > > >               goto reject_w_err;
> > > > >       }
> > > > > --
> > > > > 2.25.1
> > > > >

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-02-11  9:55 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-11  6:55 [PATCH for-next 0/4] A few bugfix for RTRS Jack Wang
2021-02-11  6:55 ` [PATCH for-next 1/4] RDMA/rtrs-srv: Fix BUG: KASAN: stack-out-of-bounds Jack Wang
2021-02-11  8:33   ` Leon Romanovsky
2021-02-11  6:55 ` [PATCH for-next 2/4] RDMA/rtrs: Only allow addition of path to an already established session Jack Wang
2021-02-11  8:43   ` Leon Romanovsky
2021-02-11  9:23     ` Jinpu Wang
2021-02-11  9:36       ` Leon Romanovsky
2021-02-11  9:48         ` Jinpu Wang
2021-02-11  9:51           ` Leon Romanovsky
2021-02-11  6:55 ` [PATCH for-next 3/4] RDMA/rtrs-srv: fix memory leak by missing kobject free Jack Wang
2021-02-11  6:55 ` [PATCH for-next 4/4] RDMA/rtrs-srv-sysfs: fix missing put_device Jack Wang
2021-02-11  8:48   ` Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.