From: Jack Wang <jinpu.wang@cloud.ionos.com>
To: linux-rdma@vger.kernel.org
Cc: bvanassche@acm.org, leon@kernel.org, dledford@redhat.com,
jgg@ziepe.ca, danil.kipnis@cloud.ionos.com,
jinpu.wang@cloud.ionos.com, Gioh Kim <gi-oh.kim@cloud.ionos.com>
Subject: [PATCHv2 for-next 03/12] RDMA/rtrs-clt: avoid run destroy_con_cq_qp/create_con_cq_qp in parallel
Date: Fri, 23 Oct 2020 09:43:44 +0200 [thread overview]
Message-ID: <20201023074353.21946-4-jinpu.wang@cloud.ionos.com> (raw)
In-Reply-To: <20201023074353.21946-1-jinpu.wang@cloud.ionos.com>
It could happen two kworkers race with each other:
addr_resolver kworker reconnect kworker
rtrs_clt_rdma_cm_handler
rtrs_rdma_addr_resolved
create_con_cq_qp: s.dev_ref++
"s.dev_ref is 1"
wait in create_cm fails with TIMEOUT
destroy_con_cq_qp: --s.dev_ref
"s.dev_ref is 0"
destroy_con_cq_qp: sess->s.dev = NULL
rtrs_cq_qp_create -> create_qp(con, sess->dev->ib_pd...)
sess->dev is NULL, panic.
To fix the problem using mutex to serialize create_con_cq_qp
and destroy_con_cq_qp.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 15 +++++++++++++--
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index fb840b152b37..4677e8ed29ae 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1499,6 +1499,7 @@ static int create_con(struct rtrs_clt_sess *sess, unsigned int cid)
con->c.cid = cid;
con->c.sess = &sess->s;
atomic_set(&con->io_cnt, 0);
+ mutex_init(&con->con_mutex);
sess->s.con[cid] = &con->c;
@@ -1510,6 +1511,7 @@ static void destroy_con(struct rtrs_clt_con *con)
struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
sess->s.con[con->c.cid] = NULL;
+ mutex_destroy(&con->con_mutex);
kfree(con);
}
@@ -1520,6 +1522,7 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
int err, cq_vector;
struct rtrs_msg_rkey_rsp *rsp;
+ lockdep_assert_held(&con->con_mutex);
if (con->c.cid == 0) {
/*
* One completion for each receive and two for each send
@@ -1593,7 +1596,7 @@ static void destroy_con_cq_qp(struct rtrs_clt_con *con)
* Be careful here: destroy_con_cq_qp() can be called even
* create_con_cq_qp() failed, see comments there.
*/
-
+ lockdep_assert_held(&con->con_mutex);
rtrs_cq_qp_destroy(&con->c);
if (con->rsp_ius) {
rtrs_iu_free(con->rsp_ius, DMA_FROM_DEVICE,
@@ -1625,7 +1628,9 @@ static int rtrs_rdma_addr_resolved(struct rtrs_clt_con *con)
struct rtrs_sess *s = con->c.sess;
int err;
+ mutex_lock(&con->con_mutex);
err = create_con_cq_qp(con);
+ mutex_unlock(&con->con_mutex);
if (err) {
rtrs_err(s, "create_con_cq_qp(), err: %d\n", err);
return err;
@@ -1938,8 +1943,9 @@ static int create_cm(struct rtrs_clt_con *con)
errr:
stop_cm(con);
- /* Is safe to call destroy if cq_qp is not inited */
+ mutex_lock(&con->con_mutex);
destroy_con_cq_qp(con);
+ mutex_unlock(&con->con_mutex);
destroy_cm:
destroy_cm(con);
@@ -2046,7 +2052,9 @@ static void rtrs_clt_stop_and_destroy_conns(struct rtrs_clt_sess *sess)
if (!sess->s.con[cid])
break;
con = to_clt_con(sess->s.con[cid]);
+ mutex_lock(&con->con_mutex);
destroy_con_cq_qp(con);
+ mutex_unlock(&con->con_mutex);
destroy_cm(con);
destroy_con(con);
}
@@ -2213,7 +2221,10 @@ static int init_conns(struct rtrs_clt_sess *sess)
struct rtrs_clt_con *con = to_clt_con(sess->s.con[cid]);
stop_cm(con);
+
+ mutex_lock(&con->con_mutex);
destroy_con_cq_qp(con);
+ mutex_unlock(&con->con_mutex);
destroy_cm(con);
destroy_con(con);
}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 167acd3c90fc..b8dbd701b3cb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -72,6 +72,7 @@ struct rtrs_clt_con {
struct rtrs_iu *rsp_ius;
u32 queue_size;
unsigned int cpu;
+ struct mutex con_mutex;
atomic_t io_cnt;
int cm_err;
};
--
2.25.1
next prev parent reply other threads:[~2020-10-23 7:44 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-23 7:43 [PATCHv2 for-next 00/12] rtrs: misc fix and cleanup Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 01/12] RDMA/rtrs-clt: remove destroy_con_cq_qp in case route resolving failed Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 02/12] RDMA/rtrs-clt: remove outdated comment in create_con_cq_qp Jack Wang
2020-10-23 7:43 ` Jack Wang [this message]
2020-10-23 7:43 ` [PATCHv2 for-next 04/12] RDMA/rtrs-clt: missing error from rtrs_rdma_conn_established Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 05/12] RDMA/rtrs-srv: don't guard the whole __alloc_srv with srv_mutex Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 06/12] RDMA/rtrs-srv: fix typo Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 07/12] RDMA/rtrs: remove unnecessary argument dir of rtrs_iu_free Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 08/12] RDMA/rtrs-clt: remove duplicated switch-case handling for CM error events Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 09/12] RDMA/rtrs-clt: remove duplicated code Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 10/12] RDMA/rtrs-srv: kill rtrs_srv_change_state_get_old Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 11/12] RDMA/rtrs: introduce rtrs_post_send Jack Wang
2020-10-23 7:43 ` [PATCHv2 for-next 12/12] RDMA/rtrs-clt: remove 'addr' from rtrs_clt_add_path_to_arr Jack Wang
2020-10-28 16:28 ` [PATCHv2 for-next 00/12] rtrs: misc fix and cleanup Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201023074353.21946-4-jinpu.wang@cloud.ionos.com \
--to=jinpu.wang@cloud.ionos.com \
--cc=bvanassche@acm.org \
--cc=danil.kipnis@cloud.ionos.com \
--cc=dledford@redhat.com \
--cc=gi-oh.kim@cloud.ionos.com \
--cc=jgg@ziepe.ca \
--cc=leon@kernel.org \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.