* [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes
@ 2020-08-24 18:14 Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds Selvin Xavier
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Selvin Xavier

Includes a few important bug fixes for the rc cycle.
Please apply.

Thanks,
Selvin

Naresh Kumar PBS (3):
  RDMA/bnxt_re: Static NQ depth allocation
  RDMA/bnxt_re: Restrict the max_gids to 256
  RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address

Selvin Xavier (3):
  RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  RDMA/bnxt_re: Do not report transparent vlan from QP1
  RDMA/bnxt_re: Fix the qp table indexing

 drivers/infiniband/hw/bnxt_re/ib_verbs.c   | 44 ++++++++++++++++++++----------
 drivers/infiniband/hw/bnxt_re/main.c       |  3 +-
 drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 26 +++++++++++-------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 10 ++++---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  5 ++++
 drivers/infiniband/hw/bnxt_re/qplib_sp.c   |  2 +-
 drivers/infiniband/hw/bnxt_re/qplib_sp.h   |  1 +
 7 files changed, 60 insertions(+), 31 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-24 19:01   ` Leon Romanovsky
  2020-08-24 18:14 ` [PATCH for-rc 2/6] RDMA/bnxt_re: Do not report transparent vlan from QP1 Selvin Xavier
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Selvin Xavier

The driver crashes when destroy_qp is retried after it returns an
error, because the qp entry was removed from the qp list during the
first call.

Remove the qp from the list only if destroy_qp returns success.

Fixes: 8dae419f9ec7 ("RDMA/bnxt_re: Refactor queue pair creation code")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 3f18efc..2f5aac0 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -752,12 +752,6 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
 	gsi_sqp = rdev->gsi_ctx.gsi_sqp;
 	gsi_sah = rdev->gsi_ctx.gsi_sah;
 
-	/* remove from active qp list */
-	mutex_lock(&rdev->qp_lock);
-	list_del(&gsi_sqp->list);
-	mutex_unlock(&rdev->qp_lock);
-	atomic_dec(&rdev->qp_count);
-
 	ibdev_dbg(&rdev->ibdev, "Destroy the shadow AH\n");
 	bnxt_qplib_destroy_ah(&rdev->qplib_res,
 			      &gsi_sah->qplib_ah,
@@ -772,6 +766,12 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
 	}
 	bnxt_qplib_free_qp_res(&rdev->qplib_res, &gsi_sqp->qplib_qp);
 
+	/* remove from active qp list */
+	mutex_lock(&rdev->qp_lock);
+	list_del(&gsi_sqp->list);
+	mutex_unlock(&rdev->qp_lock);
+	atomic_dec(&rdev->qp_count);
+
 	kfree(rdev->gsi_ctx.sqp_tbl);
 	kfree(gsi_sah);
 	kfree(gsi_sqp);
@@ -792,11 +792,6 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
 	unsigned int flags;
 	int rc;
 
-	mutex_lock(&rdev->qp_lock);
-	list_del(&qp->list);
-	mutex_unlock(&rdev->qp_lock);
-	atomic_dec(&rdev->qp_count);
-
 	bnxt_qplib_flush_cqn_wq(&qp->qplib_qp);
 
 	rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
@@ -819,6 +814,11 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
 			goto sh_fail;
 	}
 
+	mutex_lock(&rdev->qp_lock);
+	list_del(&qp->list);
+	mutex_unlock(&rdev->qp_lock);
+	atomic_dec(&rdev->qp_count);
+
 	ib_umem_release(qp->rumem);
 	ib_umem_release(qp->sumem);
 
-- 
2.5.5



* [PATCH for-rc 2/6] RDMA/bnxt_re: Do not report transparent vlan from QP1
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 3/6] RDMA/bnxt_re: Fix the qp table indexing Selvin Xavier
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Selvin Xavier

The QP1 Rx CQE reports a transparent VLAN ID in the completion, and
this is used while reporting the completion for a received MAD packet.
Check whether the vlan id is configured on the host before reporting
it in the work completion.

Fixes: 84511455ac5b ("RDMA/bnxt_re: report vlan_id and sl in qp1 recv completion")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 2f5aac0..0c5fb79 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3264,6 +3264,20 @@ static void bnxt_re_process_res_rawqp1_wc(struct ib_wc *wc,
 	wc->wc_flags |= IB_WC_GRH;
 }
 
+static bool bnxt_re_check_if_vlan_valid(struct bnxt_re_dev *rdev,
+					u16 vlan_id)
+{
+	/*
+	 * Check if the vlan is configured in the host.
+	 * If not configured, it  can be a transparent
+	 * VLAN. So dont report the vlan id.
+	 */
+	if (!__vlan_find_dev_deep_rcu(rdev->netdev,
+				      htons(ETH_P_8021Q), vlan_id))
+		return false;
+	return true;
+}
+
 static bool bnxt_re_is_vlan_pkt(struct bnxt_qplib_cqe *orig_cqe,
 				u16 *vid, u8 *sl)
 {
@@ -3332,9 +3346,11 @@ static void bnxt_re_process_res_shadow_qp_wc(struct bnxt_re_qp *gsi_sqp,
 	wc->src_qp = orig_cqe->src_qp;
 	memcpy(wc->smac, orig_cqe->smac, ETH_ALEN);
 	if (bnxt_re_is_vlan_pkt(orig_cqe, &vlan_id, &sl)) {
-		wc->vlan_id = vlan_id;
-		wc->sl = sl;
-		wc->wc_flags |= IB_WC_WITH_VLAN;
+		if (bnxt_re_check_if_vlan_valid(rdev, vlan_id)) {
+			wc->vlan_id = vlan_id;
+			wc->sl = sl;
+			wc->wc_flags |= IB_WC_WITH_VLAN;
+		}
 	}
 	wc->port_num = 1;
 	wc->vendor_err = orig_cqe->status;
-- 
2.5.5



* [PATCH for-rc 3/6] RDMA/bnxt_re: Fix the qp table indexing
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 2/6] RDMA/bnxt_re: Do not report transparent vlan from QP1 Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 4/6] RDMA/bnxt_re: Static NQ depth allocation Selvin Xavier
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Selvin Xavier

qp->id can be a value outside the max number of qps. Indexing the
qp table with this id can cause an out-of-bounds crash, so change
the qp table indexing to (qp->id % max_qp - 1).

Also allocate one extra entry for QP1. Some adapters create one more
qp than the max_qp requested, to accommodate QP1. If qp->id is 1,
store the information in the last entry of the qp table.

Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 22 ++++++++++++++--------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 10 ++++++----
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  5 +++++
 3 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 117b423..3535130 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -818,6 +818,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 	u16 cmd_flags = 0;
 	u32 qp_flags = 0;
 	u8 pg_sz_lvl;
+	u32 tbl_indx;
 	int rc;
 
 	RCFW_CMD_PREP(req, CREATE_QP1, cmd_flags);
@@ -907,8 +908,9 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 		rq->dbinfo.db = qp->dpi->dbr;
 		rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size);
 	}
-	rcfw->qp_tbl[qp->id].qp_id = qp->id;
-	rcfw->qp_tbl[qp->id].qp_handle = (void *)qp;
+	tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
+	rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
+	rcfw->qp_tbl[tbl_indx].qp_handle = (void *)qp;
 
 	return 0;
 
@@ -959,6 +961,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 	u16 cmd_flags = 0;
 	u32 qp_flags = 0;
 	u8 pg_sz_lvl;
+	u32 tbl_indx;
 	u16 nsge;
 
 	RCFW_CMD_PREP(req, CREATE_QP, cmd_flags);
@@ -1111,8 +1114,9 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 		rq->dbinfo.db = qp->dpi->dbr;
 		rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size);
 	}
-	rcfw->qp_tbl[qp->id].qp_id = qp->id;
-	rcfw->qp_tbl[qp->id].qp_handle = (void *)qp;
+	tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
+	rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
+	rcfw->qp_tbl[tbl_indx].qp_handle = (void *)qp;
 
 	return 0;
 fail:
@@ -1457,10 +1461,12 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
 	struct cmdq_destroy_qp req;
 	struct creq_destroy_qp_resp resp;
 	u16 cmd_flags = 0;
+	u32 tbl_indx;
 	int rc;
 
-	rcfw->qp_tbl[qp->id].qp_id = BNXT_QPLIB_QP_ID_INVALID;
-	rcfw->qp_tbl[qp->id].qp_handle = NULL;
+	tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
+	rcfw->qp_tbl[tbl_indx].qp_id = BNXT_QPLIB_QP_ID_INVALID;
+	rcfw->qp_tbl[tbl_indx].qp_handle = NULL;
 
 	RCFW_CMD_PREP(req, DESTROY_QP, cmd_flags);
 
@@ -1468,8 +1474,8 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
 	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
 					  (void *)&resp, NULL, 0);
 	if (rc) {
-		rcfw->qp_tbl[qp->id].qp_id = qp->id;
-		rcfw->qp_tbl[qp->id].qp_handle = qp;
+		rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
+		rcfw->qp_tbl[tbl_indx].qp_handle = qp;
 		return rc;
 	}
 
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 4e21116..f7736e3 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -307,14 +307,15 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
 	__le16  mcookie;
 	u16 cookie;
 	int rc = 0;
-	u32 qp_id;
+	u32 qp_id, tbl_indx;
 
 	pdev = rcfw->pdev;
 	switch (qp_event->event) {
 	case CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION:
 		err_event = (struct creq_qp_error_notification *)qp_event;
 		qp_id = le32_to_cpu(err_event->xid);
-		qp = rcfw->qp_tbl[qp_id].qp_handle;
+		tbl_indx = map_qp_id_to_tbl_indx(qp_id, rcfw);
+		qp = rcfw->qp_tbl[tbl_indx].qp_handle;
 		dev_dbg(&pdev->dev, "Received QP error notification\n");
 		dev_dbg(&pdev->dev,
 			"qpid 0x%x, req_err=0x%x, resp_err=0x%x\n",
@@ -615,8 +616,9 @@ int bnxt_qplib_alloc_rcfw_channel(struct bnxt_qplib_res *res,
 
 	cmdq->bmap_size = bmap_size;
 
-	rcfw->qp_tbl_size = qp_tbl_sz;
-	rcfw->qp_tbl = kcalloc(qp_tbl_sz, sizeof(struct bnxt_qplib_qp_node),
+	/* Allocate one extra to hold the QP1 entries */
+	rcfw->qp_tbl_size = qp_tbl_sz + 1;
+	rcfw->qp_tbl = kcalloc(rcfw->qp_tbl_size, sizeof(struct bnxt_qplib_qp_node),
 			       GFP_KERNEL);
 	if (!rcfw->qp_tbl)
 		goto fail;
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
index 1573876..5f2f0a5 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
@@ -216,4 +216,9 @@ int bnxt_qplib_deinit_rcfw(struct bnxt_qplib_rcfw *rcfw);
 int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
 			 struct bnxt_qplib_ctx *ctx, int is_virtfn);
 void bnxt_qplib_mark_qp_error(void *qp_handle);
+static inline u32 map_qp_id_to_tbl_indx(u32 qid, struct bnxt_qplib_rcfw *rcfw)
+{
+	/* Last index of the qp_tbl is for QP1 ie. qp_tbl_size - 1*/
+	return (qid == 1) ? rcfw->qp_tbl_size - 1 : qid % rcfw->qp_tbl_size - 2;
+}
 #endif /* __BNXT_QPLIB_RCFW_H__ */
-- 
2.5.5



* [PATCH for-rc 4/6] RDMA/bnxt_re: Static NQ depth allocation
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
                   ` (2 preceding siblings ...)
  2020-08-24 18:14 ` [PATCH for-rc 3/6] RDMA/bnxt_re: Fix the qp table indexing Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 5/6] RDMA/bnxt_re: Restrict the max_gids to 256 Selvin Xavier
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Naresh Kumar PBS, Selvin Xavier

From: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>

At first, the driver allocates memory for the NQ based on
qplib_ctx->cq_count and qplib_ctx->srqc_count. Later, when creating
the ring, it uses a static value of 128K - 1. Fix the mismatch by
using the static value for the allocation as well, for now.

Fixes: b08fe048a69d ("RDMA/bnxt_re: Refactor net ring allocation function")
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 17ac8b7..13bbeb4 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1037,8 +1037,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 		struct bnxt_qplib_nq *nq;
 
 		nq = &rdev->nq[i];
-		nq->hwq.max_elements = (qplib_ctx->cq_count +
-					qplib_ctx->srqc_count + 2);
+		nq->hwq.max_elements = BNXT_QPLIB_NQE_MAX_CNT;
 		rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, &rdev->nq[i]);
 		if (rc) {
 			ibdev_err(&rdev->ibdev, "Alloc Failed NQ%d rc:%#x",
-- 
2.5.5



* [PATCH for-rc 5/6] RDMA/bnxt_re: Restrict the max_gids to 256
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
                   ` (3 preceding siblings ...)
  2020-08-24 18:14 ` [PATCH for-rc 4/6] RDMA/bnxt_re: Static NQ depth allocation Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-24 18:14 ` [PATCH for-rc 6/6] RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address Selvin Xavier
  2020-08-27 12:31 ` [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Jason Gunthorpe
  6 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Naresh Kumar PBS, Selvin Xavier

From: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>

Some adapters report more than 256 gid entries.
Restrict it to 256 for now.

Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/qplib_sp.c | 2 +-
 drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 4cd475e..64d44f5 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -149,7 +149,7 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
 	attr->max_inline_data = le32_to_cpu(sb->max_inline_data);
 	attr->l2_db_size = (sb->l2_db_space_size + 1) *
 			    (0x01 << RCFW_DBR_BASE_PAGE_SHIFT);
-	attr->max_sgid = le32_to_cpu(sb->max_gid);
+	attr->max_sgid = BNXT_QPLIB_NUM_GIDS_SUPPORTED;
 
 	bnxt_qplib_query_version(rcfw, attr->fw_ver);
 
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index 6404f0d..967890c 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -47,6 +47,7 @@
 struct bnxt_qplib_dev_attr {
 #define FW_VER_ARR_LEN			4
 	u8				fw_ver[FW_VER_ARR_LEN];
+#define BNXT_QPLIB_NUM_GIDS_SUPPORTED	256
 	u16				max_sgid;
 	u16				max_mrw;
 	u32				max_qp;
-- 
2.5.5



* [PATCH for-rc 6/6] RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
                   ` (4 preceding siblings ...)
  2020-08-24 18:14 ` [PATCH for-rc 5/6] RDMA/bnxt_re: Restrict the max_gids to 256 Selvin Xavier
@ 2020-08-24 18:14 ` Selvin Xavier
  2020-08-27 12:31 ` [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Jason Gunthorpe
  6 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 18:14 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Naresh Kumar PBS, Selvin Xavier

From: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>

When computing the first psn entry, the driver checks for page
alignment. If this address is not page aligned, it attempts to
compute the offset within that page for later use, by using the
ALIGN macro. ALIGN does not return the offset in bytes but the
requested aligned address, and hence cannot be stored directly as
an offset. Since the driver was using the aligned address itself
instead of the offset, it produced an invalid address when filling
the psn buffer.

Fix the driver to use the PAGE_MASK macro to calculate the offset.

Fixes: fddcbbb02af4 ("RDMA/bnxt_re: Simplify obtaining queue entry from hw ring")
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/qplib_fp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 3535130..9f90cfe 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -937,10 +937,10 @@ static void bnxt_qplib_init_psn_ptr(struct bnxt_qplib_qp *qp, int size)
 
 	sq = &qp->sq;
 	hwq = &sq->hwq;
+	/* First psn entry */
 	fpsne = (u64)bnxt_qplib_get_qe(hwq, hwq->depth, &psn_pg);
 	if (!IS_ALIGNED(fpsne, PAGE_SIZE))
-		indx_pad = ALIGN(fpsne, PAGE_SIZE) / size;
-
+		indx_pad = (fpsne & ~PAGE_MASK) / size;
 	hwq->pad_pgofft = indx_pad;
 	hwq->pad_pg = (u64 *)psn_pg;
 	hwq->pad_stride = size;
-- 
2.5.5



* Re: [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  2020-08-24 18:14 ` [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds Selvin Xavier
@ 2020-08-24 19:01   ` Leon Romanovsky
  2020-08-24 19:36     ` Selvin Xavier
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2020-08-24 19:01 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: jgg, dledford, linux-rdma

On Mon, Aug 24, 2020 at 11:14:31AM -0700, Selvin Xavier wrote:
> The driver crashes when destroy_qp is retried after it returns an
> error, because the qp entry was removed from the qp list during the
> first call.

How is it possible that destroy_qp fail?

>
> Remove qp from the list only if destroy_qp returns success.
>
> Fixes: 8dae419f9ec7 ("RDMA/bnxt_re: Refactor queue pair creation code")
> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> ---
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> index 3f18efc..2f5aac0 100644
> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> @@ -752,12 +752,6 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
>  	gsi_sqp = rdev->gsi_ctx.gsi_sqp;
>  	gsi_sah = rdev->gsi_ctx.gsi_sah;
>
> -	/* remove from active qp list */
> -	mutex_lock(&rdev->qp_lock);
> -	list_del(&gsi_sqp->list);
> -	mutex_unlock(&rdev->qp_lock);
> -	atomic_dec(&rdev->qp_count);
> -
>  	ibdev_dbg(&rdev->ibdev, "Destroy the shadow AH\n");
>  	bnxt_qplib_destroy_ah(&rdev->qplib_res,
>  			      &gsi_sah->qplib_ah,
> @@ -772,6 +766,12 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
>  	}
>  	bnxt_qplib_free_qp_res(&rdev->qplib_res, &gsi_sqp->qplib_qp);
>
> +	/* remove from active qp list */
> +	mutex_lock(&rdev->qp_lock);
> +	list_del(&gsi_sqp->list);
> +	mutex_unlock(&rdev->qp_lock);
> +	atomic_dec(&rdev->qp_count);
> +
>  	kfree(rdev->gsi_ctx.sqp_tbl);
>  	kfree(gsi_sah);
>  	kfree(gsi_sqp);
> @@ -792,11 +792,6 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
>  	unsigned int flags;
>  	int rc;
>
> -	mutex_lock(&rdev->qp_lock);
> -	list_del(&qp->list);
> -	mutex_unlock(&rdev->qp_lock);
> -	atomic_dec(&rdev->qp_count);
> -
>  	bnxt_qplib_flush_cqn_wq(&qp->qplib_qp);
>
>  	rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
> @@ -819,6 +814,11 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
>  			goto sh_fail;
>  	}
>
> +	mutex_lock(&rdev->qp_lock);
> +	list_del(&qp->list);
> +	mutex_unlock(&rdev->qp_lock);
> +	atomic_dec(&rdev->qp_count);
> +
>  	ib_umem_release(qp->rumem);
>  	ib_umem_release(qp->sumem);
>
> --
> 2.5.5
>


* Re: [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  2020-08-24 19:01   ` Leon Romanovsky
@ 2020-08-24 19:36     ` Selvin Xavier
  2020-08-24 22:00       ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-08-24 19:36 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jason Gunthorpe, Doug Ledford, linux-rdma

On Tue, Aug 25, 2020 at 12:31 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Aug 24, 2020 at 11:14:31AM -0700, Selvin Xavier wrote:
> > The driver crashes when destroy_qp is retried after it returns an
> > error, because the qp entry was removed from the qp list during the
> > first call.
>
> How is it possible that destroy_qp fail?
>
One possibility is when the FW is in a crash state. Driver commands
to the FW fail, and it reports an error status for the destroy_qp verb.
Even though the chances of this failure are low, we wanted to avoid a
host crash seen in this scenario.
> >
> > Remove qp from the list only if destroy_qp returns success.
> >
> > Fixes: 8dae419f9ec7 ("RDMA/bnxt_re: Refactor queue pair creation code")
> > Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> > ---
> >  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 22 +++++++++++-----------
> >  1 file changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> > index 3f18efc..2f5aac0 100644
> > --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> > +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> > @@ -752,12 +752,6 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
> >       gsi_sqp = rdev->gsi_ctx.gsi_sqp;
> >       gsi_sah = rdev->gsi_ctx.gsi_sah;
> >
> > -     /* remove from active qp list */
> > -     mutex_lock(&rdev->qp_lock);
> > -     list_del(&gsi_sqp->list);
> > -     mutex_unlock(&rdev->qp_lock);
> > -     atomic_dec(&rdev->qp_count);
> > -
> >       ibdev_dbg(&rdev->ibdev, "Destroy the shadow AH\n");
> >       bnxt_qplib_destroy_ah(&rdev->qplib_res,
> >                             &gsi_sah->qplib_ah,
> > @@ -772,6 +766,12 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp)
> >       }
> >       bnxt_qplib_free_qp_res(&rdev->qplib_res, &gsi_sqp->qplib_qp);
> >
> > +     /* remove from active qp list */
> > +     mutex_lock(&rdev->qp_lock);
> > +     list_del(&gsi_sqp->list);
> > +     mutex_unlock(&rdev->qp_lock);
> > +     atomic_dec(&rdev->qp_count);
> > +
> >       kfree(rdev->gsi_ctx.sqp_tbl);
> >       kfree(gsi_sah);
> >       kfree(gsi_sqp);
> > @@ -792,11 +792,6 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
> >       unsigned int flags;
> >       int rc;
> >
> > -     mutex_lock(&rdev->qp_lock);
> > -     list_del(&qp->list);
> > -     mutex_unlock(&rdev->qp_lock);
> > -     atomic_dec(&rdev->qp_count);
> > -
> >       bnxt_qplib_flush_cqn_wq(&qp->qplib_qp);
> >
> >       rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
> > @@ -819,6 +814,11 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
> >                       goto sh_fail;
> >       }
> >
> > +     mutex_lock(&rdev->qp_lock);
> > +     list_del(&qp->list);
> > +     mutex_unlock(&rdev->qp_lock);
> > +     atomic_dec(&rdev->qp_count);
> > +
> >       ib_umem_release(qp->rumem);
> >       ib_umem_release(qp->sumem);
> >
> > --
> > 2.5.5
> >


* Re: [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  2020-08-24 19:36     ` Selvin Xavier
@ 2020-08-24 22:00       ` Jason Gunthorpe
  2020-08-25 11:44         ` Gal Pressman
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-08-24 22:00 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: Leon Romanovsky, Doug Ledford, linux-rdma

On Tue, Aug 25, 2020 at 01:06:23AM +0530, Selvin Xavier wrote:
> On Tue, Aug 25, 2020 at 12:31 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Mon, Aug 24, 2020 at 11:14:31AM -0700, Selvin Xavier wrote:
> > > The driver crashes when destroy_qp is retried after it returns an
> > > error, because the qp entry was removed from the qp list during the
> > > first call.
> >
> > How is it possible that destroy_qp fail?
> >
> One possibility is when the FW is in a crash state. Driver commands
> to the FW fail, and it reports an error status for the destroy_qp
> verb. Even though the chances of this failure are low, we wanted to
> avoid a host crash seen in this scenario.

Drivers are not allowed to fail destroy - the only exception is if a
future destroy would succeed for some reason.

This patch should ignore the return code from FW and clean up all the
host memory. If the FW is not responding then the device should be
killed and the DMA allowed bit turned off in the PCI config space.

Jason


* Re: [PATCH for-rc 1/6] RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  2020-08-24 22:00       ` Jason Gunthorpe
@ 2020-08-25 11:44         ` Gal Pressman
  0 siblings, 0 replies; 12+ messages in thread
From: Gal Pressman @ 2020-08-25 11:44 UTC (permalink / raw)
  To: Jason Gunthorpe, Selvin Xavier; +Cc: Leon Romanovsky, Doug Ledford, linux-rdma

On 25/08/2020 1:00, Jason Gunthorpe wrote:
> On Tue, Aug 25, 2020 at 01:06:23AM +0530, Selvin Xavier wrote:
>> On Tue, Aug 25, 2020 at 12:31 AM Leon Romanovsky <leon@kernel.org> wrote:
>>>
>>> On Mon, Aug 24, 2020 at 11:14:31AM -0700, Selvin Xavier wrote:
>>>> The driver crashes when destroy_qp is retried after it returns an
>>>> error, because the qp entry was removed from the qp list during the
>>>> first call.
>>>
>>> How is it possible that destroy_qp fail?
>>>
>> One possibility is when the FW is in a crash state. Driver commands
>> to the FW fail, and it reports an error status for the destroy_qp
>> verb. Even though the chances of this failure are low, we wanted to
>> avoid a host crash seen in this scenario.
> 
> Drivers are not allowed to fail destroy - the only exception is if a
> future destroy would succeed for some reason.

Why?
We already have the iterative cleanup in uverbs_destroy_ufile_hw, and combined
with Leon's changes it makes sense that the actual return value is returned
instead of ignored.

If the subsystem handles it for DEVX, why shouldn't it handle it for other drivers?


* Re: [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes
  2020-08-24 18:14 [PATCH for-rc 0/6] RDMA/bnxt_re: Bug fixes Selvin Xavier
                   ` (5 preceding siblings ...)
  2020-08-24 18:14 ` [PATCH for-rc 6/6] RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address Selvin Xavier
@ 2020-08-27 12:31 ` Jason Gunthorpe
  6 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2020-08-27 12:31 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: dledford, linux-rdma

On Mon, Aug 24, 2020 at 11:14:30AM -0700, Selvin Xavier wrote:
> Includes a few important bug fixes for the rc cycle.
> Please apply.
> 
> Thanks,
> Selvin
> 
> Naresh Kumar PBS (3):
>   RDMA/bnxt_re: Static NQ depth allocation
>   RDMA/bnxt_re: Restrict the max_gids to 256
>   RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
> 
> Selvin Xavier (3):
>   RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
>   RDMA/bnxt_re: Do not report transparent vlan from QP1
>   RDMA/bnxt_re: Fix the qp table indexing

I took them all to for-rc; even though the destroy patch isn't really
the right solution, it does avoid memory corruption.

The driver will still trigger WARN_ONs and still leak memory, however.

Jason


