* [PATCH for-next 00/14] RDMA/bnxt_re: Bug fixes
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Selvin Xavier

This patch series includes several bug fixes for the bnxt_re driver,
along with implementations of some of the workarounds required for the
HW and FW. The driver version is also updated in this series.

Doug,
 Please apply these changes to your repository.

Thanks,
Selvin Xavier

Devesh Sharma (3):
  RDMA/bnxt_re: Fixing the Control path command and response handling
  RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext
  RDMA/bnxt_re: Fix RQE posting logic

Eddie Wai (2):
  RDMA/bnxt_re: HW workarounds for handling specific conditions
  RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and
    destination QP

Kalesh AP (1):
  RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is
    configured

Selvin Xavier (4):
  RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
  RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
  RDMA/bnxt_re: Report supported value to IB stack in query_device
  RDMA/bnxt_re: Update the driver version

Somnath Kotur (4):
  RDMA/bnxt_re: Fix race between netdev register and unregister events
  RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
  RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing
    error
  RDMA/bnxt_re: Specify RDMA component when allocating stats context

 drivers/infiniband/hw/bnxt_re/bnxt_re.h    |  21 +-
 drivers/infiniband/hw/bnxt_re/ib_verbs.c   | 547 +++++++++++++++++++++++------
 drivers/infiniband/hw/bnxt_re/ib_verbs.h   |  18 +-
 drivers/infiniband/hw/bnxt_re/main.c       |  93 ++++-
 drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 393 +++++++++++----------
 drivers/infiniband/hw/bnxt_re/qplib_fp.h   |  18 +-
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 314 +++++++++--------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  61 ++--
 drivers/infiniband/hw/bnxt_re/qplib_res.c  |  10 +
 drivers/infiniband/hw/bnxt_re/qplib_res.h  |   6 +
 drivers/infiniband/hw/bnxt_re/qplib_sp.c   | 396 ++++++++-------------
 drivers/infiniband/hw/bnxt_re/qplib_sp.h   |   4 +
 drivers/infiniband/hw/bnxt_re/roce_hsi.h   |   4 +-
 13 files changed, 1128 insertions(+), 757 deletions(-)

-- 
2.5.5



* [PATCH for-next 01/14] RDMA/bnxt_re: Fixing the Control path command and response handling
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Devesh Sharma, Selvin Xavier

From: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Fixes a concurrency issue in creq handling. Each caller
was given a globally managed crsq element, which was
accessed outside a lock. This could result in corruption
if many applications issued Control Path commands
simultaneously. Now each caller provides its own response
buffer and the responses are copied under a lock.
Also fixes the queue-full condition check for the CMDQ.
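
The resulting calling convention, condensed from the patch below
(error handling and the retry loop elided), looks roughly like this:

	/* Caller owns the response buffer on its own stack */
	struct creq_create_qp_resp resp;

	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
					  (void *)&resp, NULL, 0);
	if (rc)
		goto fail;
	qp->id = le32_to_cpu(resp.xid);

On the completion side, the cookie-indexed crsqe table records where
the caller's buffer lives, and the CREQ handler copies the event into
it while holding cmdq->lock:

	crsqe = &rcfw->crsqe_tbl[cbit];	/* cbit is derived from the cookie */
	if (crsqe->resp && crsqe->resp->cookie == mcookie) {
		memcpy(crsqe->resp, qp_event, sizeof(*qp_event));
		crsqe->resp = NULL;	/* slot is free for reuse */
	}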

As part of these changes, the control path code is refactored
to remove the duplicated response-status checking.
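
Concretely, the send/timeout/status checks that every caller used to
open-code now live in one place. The consolidated wrapper reduces
roughly to the following (condensed from the patch below, with the
-EAGAIN/-EBUSY retry loop elided):

	rc = __send_message(rcfw, req, resp, sb, is_block);
	if (rc)
		return rc;			/* send failed */
	rc = is_block ? __block_for_resp(rcfw, cookie) :
			__wait_for_resp(rcfw, cookie);
	if (rc)
		return rc;			/* -ETIMEDOUT */
	if (evnt->status)
		return -EFAULT;			/* FW reported failure */
	return 0;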

Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 215 +++++--------------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 314 ++++++++++++++-------------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  61 ++----
 drivers/infiniband/hw/bnxt_re/qplib_res.h  |   5 +
 drivers/infiniband/hw/bnxt_re/qplib_sp.c   | 328 +++++++----------------------
 5 files changed, 318 insertions(+), 605 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 43d08b5..ea9ce4f 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -284,7 +284,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_create_qp1 req;
-	struct creq_create_qp1_resp *resp;
+	struct creq_create_qp1_resp resp;
 	struct bnxt_qplib_pbl *pbl;
 	struct bnxt_qplib_q *sq = &qp->sq;
 	struct bnxt_qplib_q *rq = &qp->rq;
@@ -394,31 +394,12 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 
 	req.pd_id = cpu_to_le32(qp->pd->id);
 
-	resp = (struct creq_create_qp1_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&res->pdev->dev, "QPLIB: FP: CREATE_QP1 send failed");
-		rc = -EINVAL;
-		goto fail;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP1 timed out");
-		rc = -ETIMEDOUT;
-		goto fail;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP1 failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		rc = -EINVAL;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
 		goto fail;
-	}
-	qp->id = le32_to_cpu(resp->xid);
+
+	qp->id = le32_to_cpu(resp.xid);
 	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
 	sq->flush_in_progress = false;
 	rq->flush_in_progress = false;
@@ -442,7 +423,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct sq_send *hw_sq_send_hdr, **hw_sq_send_ptr;
 	struct cmdq_create_qp req;
-	struct creq_create_qp_resp *resp;
+	struct creq_create_qp_resp resp;
 	struct bnxt_qplib_pbl *pbl;
 	struct sq_psn_search **psn_search_ptr;
 	unsigned long int psn_search, poff = 0;
@@ -627,31 +608,12 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 	}
 	req.pd_id = cpu_to_le32(qp->pd->id);
 
-	resp = (struct creq_create_qp_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP send failed");
-		rc = -EINVAL;
-		goto fail;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP timed out");
-		rc = -ETIMEDOUT;
-		goto fail;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_QP failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		rc = -EINVAL;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
 		goto fail;
-	}
-	qp->id = le32_to_cpu(resp->xid);
+
+	qp->id = le32_to_cpu(resp.xid);
 	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
 	sq->flush_in_progress = false;
 	rq->flush_in_progress = false;
@@ -769,10 +731,11 @@ int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_modify_qp req;
-	struct creq_modify_qp_resp *resp;
+	struct creq_modify_qp_resp resp;
 	u16 cmd_flags = 0, pkey;
 	u32 temp32[4];
 	u32 bmask;
+	int rc;
 
 	RCFW_CMD_PREP(req, MODIFY_QP, cmd_flags);
 
@@ -862,27 +825,10 @@ int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 
 	req.vlan_pcp_vlan_dei_vlan_id = cpu_to_le16(qp->vlan_id);
 
-	resp = (struct creq_modify_qp_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: MODIFY_QP failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
+		return rc;
 	qp->cur_qp_state = qp->state;
 	return 0;
 }
@@ -891,37 +837,26 @@ int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_query_qp req;
-	struct creq_query_qp_resp *resp;
+	struct creq_query_qp_resp resp;
+	struct bnxt_qplib_rcfw_sbuf *sbuf;
 	struct creq_query_qp_resp_sb *sb;
 	u16 cmd_flags = 0;
 	u32 temp32[4];
-	int i;
+	int i, rc = 0;
 
 	RCFW_CMD_PREP(req, QUERY_QP, cmd_flags);
 
+	sbuf = bnxt_qplib_rcfw_alloc_sbuf(rcfw, sizeof(*sb));
+	if (!sbuf)
+		return -ENOMEM;
+	sb = sbuf->sb;
+
 	req.qp_cid = cpu_to_le32(qp->id);
 	req.resp_size = sizeof(*sb) / BNXT_QPLIB_CMDQE_UNITS;
-	resp = (struct creq_query_qp_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     (void **)&sb, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: QUERY_QP failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  (void *)sbuf, 0);
+	if (rc)
+		goto bail;
 	/* Extract the context from the side buffer */
 	qp->state = sb->en_sqd_async_notify_state &
 			CREQ_QUERY_QP_RESP_SB_STATE_MASK;
@@ -976,7 +911,9 @@ int bnxt_qplib_query_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 	qp->dest_qpn = le32_to_cpu(sb->dest_qp_id);
 	memcpy(qp->smac, sb->src_mac, 6);
 	qp->vlan_id = le16_to_cpu(sb->vlan_pcp_vlan_dei_vlan_id);
-	return 0;
+bail:
+	bnxt_qplib_rcfw_free_sbuf(rcfw, sbuf);
+	return rc;
 }
 
 static void __clean_cq(struct bnxt_qplib_cq *cq, u64 qp)
@@ -1021,34 +958,18 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_destroy_qp req;
-	struct creq_destroy_qp_resp *resp;
+	struct creq_destroy_qp_resp resp;
 	unsigned long flags;
 	u16 cmd_flags = 0;
+	int rc;
 
 	RCFW_CMD_PREP(req, DESTROY_QP, cmd_flags);
 
 	req.qp_cid = cpu_to_le32(qp->id);
-	resp = (struct creq_destroy_qp_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_QP failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
+		return rc;
 
 	/* Must walk the associated CQs to nullified the QP ptr */
 	spin_lock_irqsave(&qp->scq->hwq.lock, flags);
@@ -1483,7 +1404,7 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_create_cq req;
-	struct creq_create_cq_resp *resp;
+	struct creq_create_cq_resp resp;
 	struct bnxt_qplib_pbl *pbl;
 	u16 cmd_flags = 0;
 	int rc;
@@ -1525,30 +1446,12 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
 			(cq->cnq_hw_ring_id & CMDQ_CREATE_CQ_CNQ_ID_MASK) <<
 			 CMDQ_CREATE_CQ_CNQ_ID_SFT);
 
-	resp = (struct creq_create_cq_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_CQ send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_CQ timed out");
-		rc = -ETIMEDOUT;
-		goto fail;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: CREATE_CQ failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		rc = -EINVAL;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
 		goto fail;
-	}
-	cq->id = le32_to_cpu(resp->xid);
+
+	cq->id = le32_to_cpu(resp.xid);
 	cq->dbr_base = res->dpi_tbl.dbr_bar_reg_iomem;
 	cq->period = BNXT_QPLIB_QUEUE_START_PERIOD;
 	init_waitqueue_head(&cq->waitq);
@@ -1566,33 +1469,17 @@ int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_destroy_cq req;
-	struct creq_destroy_cq_resp *resp;
+	struct creq_destroy_cq_resp resp;
 	u16 cmd_flags = 0;
+	int rc;
 
 	RCFW_CMD_PREP(req, DESTROY_CQ, cmd_flags);
 
 	req.cq_cid = cpu_to_le32(cq->id);
-	resp = (struct creq_destroy_cq_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_CQ send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_CQ timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: FP: DESTROY_CQ failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
+		return rc;
 	bnxt_qplib_free_hwq(res->pdev, &cq->hwq);
 	return 0;
 }
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 23fb726..16e4275 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -39,72 +39,55 @@
 #include <linux/spinlock.h>
 #include <linux/pci.h>
 #include <linux/prefetch.h>
+#include <linux/delay.h>
+
 #include "roce_hsi.h"
 #include "qplib_res.h"
 #include "qplib_rcfw.h"
 static void bnxt_qplib_service_creq(unsigned long data);
 
 /* Hardware communication channel */
-int bnxt_qplib_rcfw_wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
+static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
 {
 	u16 cbit;
 	int rc;
 
-	cookie &= RCFW_MAX_COOKIE_VALUE;
 	cbit = cookie % RCFW_MAX_OUTSTANDING_CMD;
-	if (!test_bit(cbit, rcfw->cmdq_bitmap))
-		dev_warn(&rcfw->pdev->dev,
-			 "QPLIB: CMD bit %d for cookie 0x%x is not set?",
-			 cbit, cookie);
-
 	rc = wait_event_timeout(rcfw->waitq,
 				!test_bit(cbit, rcfw->cmdq_bitmap),
 				msecs_to_jiffies(RCFW_CMD_WAIT_TIME_MS));
-	if (!rc) {
-		dev_warn(&rcfw->pdev->dev,
-			 "QPLIB: Bono Error: timeout %d msec, msg {0x%x}\n",
-			 RCFW_CMD_WAIT_TIME_MS, cookie);
-	}
-
-	return rc;
+	return rc ? 0 : -ETIMEDOUT;
 };
 
-int bnxt_qplib_rcfw_block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
+static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
 {
-	u32 count = -1;
+	u32 count = RCFW_BLOCKED_CMD_WAIT_COUNT;
 	u16 cbit;
 
-	cookie &= RCFW_MAX_COOKIE_VALUE;
 	cbit = cookie % RCFW_MAX_OUTSTANDING_CMD;
 	if (!test_bit(cbit, rcfw->cmdq_bitmap))
 		goto done;
 	do {
+		mdelay(1); /* 1 msec */
 		bnxt_qplib_service_creq((unsigned long)rcfw);
 	} while (test_bit(cbit, rcfw->cmdq_bitmap) && --count);
 done:
-	return count;
+	return count ? 0 : -ETIMEDOUT;
 };
 
-void *bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
-				   struct cmdq_base *req, void **crsbe,
-				   u8 is_block)
+static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
+			  struct creq_base *resp, void *sb, u8 is_block)
 {
-	struct bnxt_qplib_crsq *crsq = &rcfw->crsq;
 	struct bnxt_qplib_cmdqe *cmdqe, **cmdq_ptr;
 	struct bnxt_qplib_hwq *cmdq = &rcfw->cmdq;
-	struct bnxt_qplib_hwq *crsb = &rcfw->crsb;
-	struct bnxt_qplib_crsqe *crsqe = NULL;
-	struct bnxt_qplib_crsbe **crsb_ptr;
+	struct bnxt_qplib_crsq *crsqe;
 	u32 sw_prod, cmdq_prod;
-	u8 retry_cnt = 0xFF;
-	dma_addr_t dma_addr;
 	unsigned long flags;
 	u32 size, opcode;
 	u16 cookie, cbit;
 	int pg, idx;
 	u8 *preq;
 
-retry:
 	opcode = req->opcode;
 	if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &rcfw->flags) &&
 	    (opcode != CMDQ_BASE_OPCODE_QUERY_FUNC &&
@@ -112,63 +95,50 @@ void *bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
 		dev_err(&rcfw->pdev->dev,
 			"QPLIB: RCFW not initialized, reject opcode 0x%x",
 			opcode);
-		return NULL;
+		return -EINVAL;
 	}
 
 	if (test_bit(FIRMWARE_INITIALIZED_FLAG, &rcfw->flags) &&
 	    opcode == CMDQ_BASE_OPCODE_INITIALIZE_FW) {
 		dev_err(&rcfw->pdev->dev, "QPLIB: RCFW already initialized!");
-		return NULL;
+		return -EINVAL;
 	}
 
 	/* Cmdq are in 16-byte units, each request can consume 1 or more
 	 * cmdqe
 	 */
 	spin_lock_irqsave(&cmdq->lock, flags);
-	if (req->cmd_size > cmdq->max_elements -
-	    ((HWQ_CMP(cmdq->prod, cmdq) - HWQ_CMP(cmdq->cons, cmdq)) &
-	     (cmdq->max_elements - 1))) {
+	if (req->cmd_size >= HWQ_FREE_SLOTS(cmdq)) {
 		dev_err(&rcfw->pdev->dev, "QPLIB: RCFW: CMDQ is full!");
 		spin_unlock_irqrestore(&cmdq->lock, flags);
-
-		if (!retry_cnt--)
-			return NULL;
-		goto retry;
+		return -EAGAIN;
 	}
 
-	retry_cnt = 0xFF;
 
-	cookie = atomic_inc_return(&rcfw->seq_num) & RCFW_MAX_COOKIE_VALUE;
+	cookie = rcfw->seq_num & RCFW_MAX_COOKIE_VALUE;
 	cbit = cookie % RCFW_MAX_OUTSTANDING_CMD;
 	if (is_block)
 		cookie |= RCFW_CMD_IS_BLOCKING;
+
+	set_bit(cbit, rcfw->cmdq_bitmap);
 	req->cookie = cpu_to_le16(cookie);
-	if (test_and_set_bit(cbit, rcfw->cmdq_bitmap)) {
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: RCFW MAX outstanding cmd reached!");
-		atomic_dec(&rcfw->seq_num);
+	crsqe = &rcfw->crsqe_tbl[cbit];
+	if (crsqe->resp) {
 		spin_unlock_irqrestore(&cmdq->lock, flags);
-
-		if (!retry_cnt--)
-			return NULL;
-		goto retry;
+		return -EBUSY;
 	}
-	/* Reserve a resp buffer slot if requested */
-	if (req->resp_size && crsbe) {
-		spin_lock(&crsb->lock);
-		sw_prod = HWQ_CMP(crsb->prod, crsb);
-		crsb_ptr = (struct bnxt_qplib_crsbe **)crsb->pbl_ptr;
-		*crsbe = (void *)&crsb_ptr[get_crsb_pg(sw_prod)]
-					  [get_crsb_idx(sw_prod)];
-		bnxt_qplib_crsb_dma_next(crsb->pbl_dma_ptr, sw_prod, &dma_addr);
-		req->resp_addr = cpu_to_le64(dma_addr);
-		crsb->prod++;
-		spin_unlock(&crsb->lock);
-
-		req->resp_size = (sizeof(struct bnxt_qplib_crsbe) +
-				  BNXT_QPLIB_CMDQE_UNITS - 1) /
-				 BNXT_QPLIB_CMDQE_UNITS;
+	memset(resp, 0, sizeof(*resp));
+	crsqe->resp = (struct creq_qp_event *)resp;
+	crsqe->resp->cookie = req->cookie;
+	crsqe->req_size = req->cmd_size;
+	if (req->resp_size && sb) {
+		struct bnxt_qplib_rcfw_sbuf *sbuf = sb;
+
+		req->resp_addr = cpu_to_le64(sbuf->dma_addr);
+		req->resp_size = (sbuf->size + BNXT_QPLIB_CMDQE_UNITS - 1) /
+				  BNXT_QPLIB_CMDQE_UNITS;
 	}
+
 	cmdq_ptr = (struct bnxt_qplib_cmdqe **)cmdq->pbl_ptr;
 	preq = (u8 *)req;
 	size = req->cmd_size * BNXT_QPLIB_CMDQE_UNITS;
@@ -190,23 +160,24 @@ void *bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
 		preq += min_t(u32, size, sizeof(*cmdqe));
 		size -= min_t(u32, size, sizeof(*cmdqe));
 		cmdq->prod++;
+		rcfw->seq_num++;
 	} while (size > 0);
 
+	rcfw->seq_num++;
+
 	cmdq_prod = cmdq->prod;
 	if (rcfw->flags & FIRMWARE_FIRST_FLAG) {
-		/* The very first doorbell write is required to set this flag
-		 * which prompts the FW to reset its internal pointers
+		/* The very first doorbell write
+		 * is required to set this flag
+		 * which prompts the FW to reset
+		 * its internal pointers
 		 */
 		cmdq_prod |= FIRMWARE_FIRST_FLAG;
 		rcfw->flags &= ~FIRMWARE_FIRST_FLAG;
 	}
-	sw_prod = HWQ_CMP(crsq->prod, crsq);
-	crsqe = &crsq->crsq[sw_prod];
-	memset(crsqe, 0, sizeof(*crsqe));
-	crsq->prod++;
-	crsqe->req_size = req->cmd_size;
 
 	/* ring CMDQ DB */
+	wmb();
 	writel(cmdq_prod, rcfw->cmdq_bar_reg_iomem +
 	       rcfw->cmdq_bar_reg_prod_off);
 	writel(RCFW_CMDQ_TRIG_VAL, rcfw->cmdq_bar_reg_iomem +
@@ -214,9 +185,56 @@ void *bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
 done:
 	spin_unlock_irqrestore(&cmdq->lock, flags);
-	/* Return the CREQ response pointer */
-	return crsqe ? &crsqe->qp_event : NULL;
+	return 0;
 }
 
+int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
+				 struct cmdq_base *req,
+				 struct creq_base *resp,
+				 void *sb, u8 is_block)
+{
+	struct creq_qp_event *evnt = (struct creq_qp_event *)resp;
+	u16 cookie;
+	u8 opcode, retry_cnt = 0xFF;
+	int rc = 0;
+
+	do {
+		opcode = req->opcode;
+		rc = __send_message(rcfw, req, resp, sb, is_block);
+		cookie = le16_to_cpu(req->cookie) & RCFW_MAX_COOKIE_VALUE;
+		if (!rc)
+			break;
+
+		if (!retry_cnt || (rc != -EAGAIN && rc != -EBUSY)) {
+			/* send failed */
+			dev_err(&rcfw->pdev->dev, "QPLIB: cmdq[%#x]=%#x send failed",
+				cookie, opcode);
+			return rc;
+		}
+		is_block ? mdelay(1) : usleep_range(500, 1000);
+
+	} while (retry_cnt--);
+
+	if (is_block)
+		rc = __block_for_resp(rcfw, cookie);
+	else
+		rc = __wait_for_resp(rcfw, cookie);
+	if (rc) {
+		/* timed out */
+		dev_err(&rcfw->pdev->dev, "QPLIB: cmdq[%#x]=%#x timedout (%d)msec",
+			cookie, opcode, RCFW_CMD_WAIT_TIME_MS);
+		return rc;
+	}
+
+	if (evnt->status) {
+		/* failed with status */
+		dev_err(&rcfw->pdev->dev, "QPLIB: cmdq[%#x]=%#x status %#x",
+			cookie, opcode, evnt->status);
+		rc = -EFAULT;
+	}
+
+	return rc;
+}
 /* Completions */
 static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw,
 					 struct creq_func_event *func_event)
@@ -260,12 +278,12 @@ static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw,
 static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
 				       struct creq_qp_event *qp_event)
 {
-	struct bnxt_qplib_crsq *crsq = &rcfw->crsq;
 	struct bnxt_qplib_hwq *cmdq = &rcfw->cmdq;
-	struct bnxt_qplib_crsqe *crsqe;
-	u16 cbit, cookie, blocked = 0;
+	struct bnxt_qplib_crsq *crsqe;
 	unsigned long flags;
-	u32 sw_cons;
+	u16 cbit, blocked = 0;
+	u16 cookie;
+	__le16  mcookie;
 
 	switch (qp_event->event) {
 	case CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION:
@@ -275,24 +293,31 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
 	default:
 		/* Command Response */
 		spin_lock_irqsave(&cmdq->lock, flags);
-		sw_cons = HWQ_CMP(crsq->cons, crsq);
-		crsqe = &crsq->crsq[sw_cons];
-		crsq->cons++;
-		memcpy(&crsqe->qp_event, qp_event, sizeof(crsqe->qp_event));
-
-		cookie = le16_to_cpu(crsqe->qp_event.cookie);
+		cookie = le16_to_cpu(qp_event->cookie);
+		mcookie = qp_event->cookie;
 		blocked = cookie & RCFW_CMD_IS_BLOCKING;
 		cookie &= RCFW_MAX_COOKIE_VALUE;
 		cbit = cookie % RCFW_MAX_OUTSTANDING_CMD;
+		crsqe = &rcfw->crsqe_tbl[cbit];
+		if (crsqe->resp &&
+		    crsqe->resp->cookie  == mcookie) {
+			memcpy(crsqe->resp, qp_event, sizeof(*qp_event));
+			crsqe->resp = NULL;
+		} else {
+			dev_err(&rcfw->pdev->dev,
+				"QPLIB: CMD %s resp->cookie = %#x, evnt->cookie = %#x",
+				crsqe->resp ? "mismatch" : "collision",
+				crsqe->resp ? crsqe->resp->cookie : 0, mcookie);
+		}
 		if (!test_and_clear_bit(cbit, rcfw->cmdq_bitmap))
 			dev_warn(&rcfw->pdev->dev,
 				 "QPLIB: CMD bit %d was not requested", cbit);
-
 		cmdq->cons += crsqe->req_size;
-		spin_unlock_irqrestore(&cmdq->lock, flags);
+		crsqe->req_size = 0;
+
 		if (!blocked)
 			wake_up(&rcfw->waitq);
-		break;
+		spin_unlock_irqrestore(&cmdq->lock, flags);
 	}
 	return 0;
 }
@@ -305,12 +330,12 @@ static void bnxt_qplib_service_creq(unsigned long data)
 	struct creq_base *creqe, **creq_ptr;
 	u32 sw_cons, raw_cons;
 	unsigned long flags;
-	u32 type;
+	u32 type, budget = CREQ_ENTRY_POLL_BUDGET;
 
-	/* Service the CREQ until empty */
+	/* Service the CREQ until budget is over */
 	spin_lock_irqsave(&creq->lock, flags);
 	raw_cons = creq->cons;
-	while (1) {
+	while (budget > 0) {
 		sw_cons = HWQ_CMP(raw_cons, creq);
 		creq_ptr = (struct creq_base **)creq->pbl_ptr;
 		creqe = &creq_ptr[get_creq_pg(sw_cons)][get_creq_idx(sw_cons)];
@@ -320,15 +345,9 @@ static void bnxt_qplib_service_creq(unsigned long data)
 		type = creqe->type & CREQ_BASE_TYPE_MASK;
 		switch (type) {
 		case CREQ_BASE_TYPE_QP_EVENT:
-			if (!bnxt_qplib_process_qp_event
-			    (rcfw, (struct creq_qp_event *)creqe))
-				rcfw->creq_qp_event_processed++;
-			else {
-				dev_warn(&rcfw->pdev->dev, "QPLIB: crsqe with");
-				dev_warn(&rcfw->pdev->dev,
-					 "QPLIB: type = 0x%x not handled",
-					 type);
-			}
+			bnxt_qplib_process_qp_event
+				(rcfw, (struct creq_qp_event *)creqe);
+			rcfw->creq_qp_event_processed++;
 			break;
 		case CREQ_BASE_TYPE_FUNC_EVENT:
 			if (!bnxt_qplib_process_func_event
@@ -346,7 +365,9 @@ static void bnxt_qplib_service_creq(unsigned long data)
 			break;
 		}
 		raw_cons++;
+		budget--;
 	}
+
 	if (creq->cons != raw_cons) {
 		creq->cons = raw_cons;
 		CREQ_DB_REARM(rcfw->creq_bar_reg_iomem, raw_cons,
@@ -375,23 +396,16 @@ static irqreturn_t bnxt_qplib_creq_irq(int irq, void *dev_instance)
 /* RCFW */
 int bnxt_qplib_deinit_rcfw(struct bnxt_qplib_rcfw *rcfw)
 {
-	struct creq_deinitialize_fw_resp *resp;
 	struct cmdq_deinitialize_fw req;
+	struct creq_deinitialize_fw_resp resp;
 	u16 cmd_flags = 0;
+	int rc;
 
 	RCFW_CMD_PREP(req, DEINITIALIZE_FW, cmd_flags);
-	resp = (struct creq_deinitialize_fw_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp)
-		return -EINVAL;
-
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie)))
-		return -ETIMEDOUT;
-
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie))
-		return -EFAULT;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  NULL, 0);
+	if (rc)
+		return rc;
 
 	clear_bit(FIRMWARE_INITIALIZED_FLAG, &rcfw->flags);
 	return 0;
@@ -417,9 +431,10 @@ static int __get_pbl_pg_idx(struct bnxt_qplib_pbl *pbl)
 int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
 			 struct bnxt_qplib_ctx *ctx, int is_virtfn)
 {
-	struct creq_initialize_fw_resp *resp;
 	struct cmdq_initialize_fw req;
+	struct creq_initialize_fw_resp resp;
 	u16 cmd_flags = 0, level;
+	int rc;
 
 	RCFW_CMD_PREP(req, INITIALIZE_FW, cmd_flags);
 
@@ -482,37 +497,19 @@ int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
 
 skip_ctx_setup:
 	req.stat_ctx_id = cpu_to_le32(ctx->stats.fw_id);
-	resp = (struct creq_initialize_fw_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: RCFW: INITIALIZE_FW send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: RCFW: INITIALIZE_FW timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: RCFW: INITIALIZE_FW failed");
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  NULL, 0);
+	if (rc)
+		return rc;
 	set_bit(FIRMWARE_INITIALIZED_FLAG, &rcfw->flags);
 	return 0;
 }
 
 void bnxt_qplib_free_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
 {
-	bnxt_qplib_free_hwq(rcfw->pdev, &rcfw->crsb);
-	kfree(rcfw->crsq.crsq);
+	kfree(rcfw->crsqe_tbl);
 	bnxt_qplib_free_hwq(rcfw->pdev, &rcfw->cmdq);
 	bnxt_qplib_free_hwq(rcfw->pdev, &rcfw->creq);
-
 	rcfw->pdev = NULL;
 }
 
@@ -539,21 +536,11 @@ int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
 		goto fail;
 	}
 
-	rcfw->crsq.max_elements = rcfw->cmdq.max_elements;
-	rcfw->crsq.crsq = kcalloc(rcfw->crsq.max_elements,
-				  sizeof(*rcfw->crsq.crsq), GFP_KERNEL);
-	if (!rcfw->crsq.crsq)
+	rcfw->crsqe_tbl = kcalloc(rcfw->cmdq.max_elements,
+				  sizeof(*rcfw->crsqe_tbl), GFP_KERNEL);
+	if (!rcfw->crsqe_tbl)
 		goto fail;
 
-	rcfw->crsb.max_elements = BNXT_QPLIB_CRSBE_MAX_CNT;
-	if (bnxt_qplib_alloc_init_hwq(rcfw->pdev, &rcfw->crsb, NULL, 0,
-				      &rcfw->crsb.max_elements,
-				      BNXT_QPLIB_CRSBE_UNITS, 0, PAGE_SIZE,
-				      HWQ_TYPE_CTX)) {
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: HW channel CRSB allocation failed");
-		goto fail;
-	}
 	return 0;
 
 fail:
@@ -606,7 +593,7 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
 	int rc;
 
 	/* General */
-	atomic_set(&rcfw->seq_num, 0);
+	rcfw->seq_num = 0;
 	rcfw->flags = FIRMWARE_FIRST_FLAG;
 	bmap_size = BITS_TO_LONGS(RCFW_MAX_OUTSTANDING_CMD *
 				  sizeof(unsigned long));
@@ -636,10 +623,6 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
 
 	rcfw->cmdq_bar_reg_trig_off = RCFW_COMM_TRIG_OFFSET;
 
-	/* CRSQ */
-	rcfw->crsq.prod = 0;
-	rcfw->crsq.cons = 0;
-
 	/* CREQ */
 	rcfw->creq_bar_reg = RCFW_COMM_CONS_PCI_BAR_REGION;
 	res_base = pci_resource_start(pdev, rcfw->creq_bar_reg);
@@ -692,3 +675,34 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
 	__iowrite32_copy(rcfw->cmdq_bar_reg_iomem, &init, sizeof(init) / 4);
 	return 0;
 }
+
+struct bnxt_qplib_rcfw_sbuf *bnxt_qplib_rcfw_alloc_sbuf(
+		struct bnxt_qplib_rcfw *rcfw,
+		u32 size)
+{
+	struct bnxt_qplib_rcfw_sbuf *sbuf;
+
+	sbuf = kzalloc(sizeof(*sbuf), GFP_ATOMIC);
+	if (!sbuf)
+		return NULL;
+
+	sbuf->size = size;
+	sbuf->sb = dma_zalloc_coherent(&rcfw->pdev->dev, sbuf->size,
+				       &sbuf->dma_addr, GFP_ATOMIC);
+	if (!sbuf->sb)
+		goto bail;
+
+	return sbuf;
+bail:
+	kfree(sbuf);
+	return NULL;
+}
+
+void bnxt_qplib_rcfw_free_sbuf(struct bnxt_qplib_rcfw *rcfw,
+			       struct bnxt_qplib_rcfw_sbuf *sbuf)
+{
+	if (sbuf->sb)
+		dma_free_coherent(&rcfw->pdev->dev, sbuf->size,
+				  sbuf->sb, sbuf->dma_addr);
+	kfree(sbuf);
+}
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
index d3567d7..09ce121 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
@@ -73,6 +73,7 @@
 #define RCFW_MAX_OUTSTANDING_CMD	BNXT_QPLIB_CMDQE_MAX_CNT
 #define RCFW_MAX_COOKIE_VALUE		0x7FFF
 #define RCFW_CMD_IS_BLOCKING		0x8000
+#define RCFW_BLOCKED_CMD_WAIT_COUNT	0x4E20
 
 /* Cmdq contains a fix number of a 16-Byte slots */
 struct bnxt_qplib_cmdqe {
@@ -94,32 +95,6 @@ struct bnxt_qplib_crsbe {
 	u8			data[1024];
 };
 
-/* CRSQ SB */
-#define BNXT_QPLIB_CRSBE_MAX_CNT	4
-#define BNXT_QPLIB_CRSBE_UNITS		sizeof(struct bnxt_qplib_crsbe)
-#define BNXT_QPLIB_CRSBE_CNT_PER_PG	(PAGE_SIZE / BNXT_QPLIB_CRSBE_UNITS)
-
-#define MAX_CRSB_IDX			(BNXT_QPLIB_CRSBE_MAX_CNT - 1)
-#define MAX_CRSB_IDX_PER_PG		(BNXT_QPLIB_CRSBE_CNT_PER_PG - 1)
-
-static inline u32 get_crsb_pg(u32 val)
-{
-	return (val & ~MAX_CRSB_IDX_PER_PG) / BNXT_QPLIB_CRSBE_CNT_PER_PG;
-}
-
-static inline u32 get_crsb_idx(u32 val)
-{
-	return val & MAX_CRSB_IDX_PER_PG;
-}
-
-static inline void bnxt_qplib_crsb_dma_next(dma_addr_t *pg_map_arr,
-					    u32 prod, dma_addr_t *dma_addr)
-{
-		*dma_addr = pg_map_arr[(prod) / BNXT_QPLIB_CRSBE_CNT_PER_PG];
-		*dma_addr += ((prod) % BNXT_QPLIB_CRSBE_CNT_PER_PG) *
-			      BNXT_QPLIB_CRSBE_UNITS;
-}
-
 /* CREQ */
 /* Allocate 1 per QP for async error notification for now */
 #define BNXT_QPLIB_CREQE_MAX_CNT	(64 * 1024)
@@ -158,17 +133,19 @@ static inline u32 get_creq_idx(u32 val)
 #define CREQ_DB(db, raw_cons, cp_bit)				\
 	writel(CREQ_DB_CP_FLAGS | ((raw_cons) & ((cp_bit) - 1)), db)
 
+#define CREQ_ENTRY_POLL_BUDGET		0x100
+
 /* HWQ */
-struct bnxt_qplib_crsqe {
-	struct creq_qp_event	qp_event;
+
+struct bnxt_qplib_crsq {
+	struct creq_qp_event	*resp;
 	u32			req_size;
 };
 
-struct bnxt_qplib_crsq {
-	struct bnxt_qplib_crsqe	*crsq;
-	u32			prod;
-	u32			cons;
-	u32			max_elements;
+struct bnxt_qplib_rcfw_sbuf {
+	void *sb;
+	dma_addr_t dma_addr;
+	u32 size;
 };
 
 /* RCFW Communication Channels */
@@ -185,7 +162,7 @@ struct bnxt_qplib_rcfw {
 	wait_queue_head_t	waitq;
 	int			(*aeq_handler)(struct bnxt_qplib_rcfw *,
 					       struct creq_func_event *);
-	atomic_t		seq_num;
+	u32			seq_num;
 
 	/* Bar region info */
 	void __iomem		*cmdq_bar_reg_iomem;
@@ -203,8 +180,7 @@ struct bnxt_qplib_rcfw {
 
 	/* Actual Cmd and Resp Queues */
 	struct bnxt_qplib_hwq	cmdq;
-	struct bnxt_qplib_crsq	crsq;
-	struct bnxt_qplib_hwq	crsb;
+	struct bnxt_qplib_crsq	*crsqe_tbl;
 };
 
 void bnxt_qplib_free_rcfw_channel(struct bnxt_qplib_rcfw *rcfw);
@@ -219,11 +195,14 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
 					(struct bnxt_qplib_rcfw *,
 					 struct creq_func_event *));
 
-int bnxt_qplib_rcfw_block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie);
-int bnxt_qplib_rcfw_wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie);
-void *bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
-				   struct cmdq_base *req, void **crsbe,
-				   u8 is_block);
+struct bnxt_qplib_rcfw_sbuf *bnxt_qplib_rcfw_alloc_sbuf(
+				struct bnxt_qplib_rcfw *rcfw,
+				u32 size);
+void bnxt_qplib_rcfw_free_sbuf(struct bnxt_qplib_rcfw *rcfw,
+			       struct bnxt_qplib_rcfw_sbuf *sbuf);
+int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
+				 struct cmdq_base *req, struct creq_base *resp,
+				 void *sbuf, u8 is_block);
 
 int bnxt_qplib_deinit_rcfw(struct bnxt_qplib_rcfw *rcfw);
 int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index 6277d80..4103e60 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -48,6 +48,11 @@ extern const struct bnxt_qplib_gid bnxt_qplib_gid_zero;
 
 #define HWQ_CMP(idx, hwq)	((idx) & ((hwq)->max_elements - 1))
 
+#define HWQ_FREE_SLOTS(hwq)	(hwq->max_elements - \
+				((HWQ_CMP(hwq->prod, hwq)\
+				- HWQ_CMP(hwq->cons, hwq))\
+				& (hwq->max_elements - 1)))
+
 enum bnxt_qplib_hwq_type {
 	HWQ_TYPE_CTX,
 	HWQ_TYPE_QUEUE,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 7b31ecc..cf7c9cb 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -55,37 +55,30 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
 			    struct bnxt_qplib_dev_attr *attr)
 {
 	struct cmdq_query_func req;
-	struct creq_query_func_resp *resp;
+	struct creq_query_func_resp resp;
+	struct bnxt_qplib_rcfw_sbuf *sbuf;
 	struct creq_query_func_resp_sb *sb;
 	u16 cmd_flags = 0;
 	u32 temp;
 	u8 *tqm_alloc;
-	int i;
+	int i, rc = 0;
 
 	RCFW_CMD_PREP(req, QUERY_FUNC, cmd_flags);
 
-	req.resp_size = sizeof(*sb) / BNXT_QPLIB_CMDQE_UNITS;
-	resp = (struct creq_query_func_resp *)
-		bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void **)&sb,
-					     0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: QUERY_FUNC send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: QUERY_FUNC timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: QUERY_FUNC failed ");
+	sbuf = bnxt_qplib_rcfw_alloc_sbuf(rcfw, sizeof(*sb));
+	if (!sbuf) {
 		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
+			"QPLIB: SP: QUERY_FUNC alloc side buffer failed");
+		return -ENOMEM;
 	}
+
+	sb = sbuf->sb;
+	req.resp_size = sizeof(*sb) / BNXT_QPLIB_CMDQE_UNITS;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  (void *)sbuf, 0);
+	if (rc)
+		goto bail;
+
 	/* Extract the context from the side buffer */
 	attr->max_qp = le32_to_cpu(sb->max_qp);
 	attr->max_qp_rd_atom =
@@ -130,7 +123,10 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
 		attr->tqm_alloc_reqs[i * 4 + 2] = *(++tqm_alloc);
 		attr->tqm_alloc_reqs[i * 4 + 3] = *(++tqm_alloc);
 	}
-	return 0;
+
+bail:
+	bnxt_qplib_rcfw_free_sbuf(rcfw, sbuf);
+	return rc;
 }
 
 /* SGID */
@@ -178,8 +174,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	/* Remove GID from the SGID table */
 	if (update) {
 		struct cmdq_delete_gid req;
-		struct creq_delete_gid_resp *resp;
+		struct creq_delete_gid_resp resp;
 		u16 cmd_flags = 0;
+		int rc;
 
 		RCFW_CMD_PREP(req, DELETE_GID, cmd_flags);
 		if (sgid_tbl->hw_id[index] == 0xFFFF) {
@@ -188,31 +185,10 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 			return -EINVAL;
 		}
 		req.gid_index = cpu_to_le16(sgid_tbl->hw_id[index]);
-		resp = (struct creq_delete_gid_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, NULL,
-						     0);
-		if (!resp) {
-			dev_err(&res->pdev->dev,
-				"QPLIB: SP: DELETE_GID send failed");
-			return -EINVAL;
-		}
-		if (!bnxt_qplib_rcfw_wait_for_resp(rcfw,
-						   le16_to_cpu(req.cookie))) {
-			/* Cmd timed out */
-			dev_err(&res->pdev->dev,
-				"QPLIB: SP: DELETE_GID timed out");
-			return -ETIMEDOUT;
-		}
-		if (resp->status ||
-		    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-			dev_err(&res->pdev->dev,
-				"QPLIB: SP: DELETE_GID failed ");
-			dev_err(&res->pdev->dev,
-				"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-				resp->status, le16_to_cpu(req.cookie),
-				le16_to_cpu(resp->cookie));
-			return -EINVAL;
-		}
+		rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						  (void *)&resp, NULL, 0);
+		if (rc)
+			return rc;
 	}
 	memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
 	       sizeof(bnxt_qplib_gid_zero));
@@ -234,7 +210,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 						   struct bnxt_qplib_res,
 						   sgid_tbl);
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
-	int i, free_idx, rc = 0;
+	int i, free_idx;
 
 	if (!sgid_tbl) {
 		dev_err(&res->pdev->dev, "QPLIB: SGID table not allocated");
@@ -266,10 +242,11 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	}
 	if (update) {
 		struct cmdq_add_gid req;
-		struct creq_add_gid_resp *resp;
+		struct creq_add_gid_resp resp;
 		u16 cmd_flags = 0;
 		u32 temp32[4];
 		u16 temp16[3];
+		int rc;
 
 		RCFW_CMD_PREP(req, ADD_GID, cmd_flags);
 
@@ -290,31 +267,11 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 		req.src_mac[1] = cpu_to_be16(temp16[1]);
 		req.src_mac[2] = cpu_to_be16(temp16[2]);
 
-		resp = (struct creq_add_gid_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-		if (!resp) {
-			dev_err(&res->pdev->dev,
-				"QPLIB: SP: ADD_GID send failed");
-			return -EINVAL;
-		}
-		if (!bnxt_qplib_rcfw_wait_for_resp(rcfw,
-						   le16_to_cpu(req.cookie))) {
-			/* Cmd timed out */
-			dev_err(&res->pdev->dev,
-				"QPIB: SP: ADD_GID timed out");
-			return -ETIMEDOUT;
-		}
-		if (resp->status ||
-		    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-			dev_err(&res->pdev->dev, "QPLIB: SP: ADD_GID failed ");
-			dev_err(&res->pdev->dev,
-				"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-				resp->status, le16_to_cpu(req.cookie),
-				le16_to_cpu(resp->cookie));
-			return -EINVAL;
-		}
-		sgid_tbl->hw_id[free_idx] = le32_to_cpu(resp->xid);
+		rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+						  (void *)&resp, NULL, 0);
+		if (rc)
+			return rc;
+		sgid_tbl->hw_id[free_idx] = le32_to_cpu(resp.xid);
 	}
 	/* Add GID to the sgid_tbl */
 	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
@@ -325,7 +282,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 
 	*index = free_idx;
 	/* unlock */
-	return rc;
+	return 0;
 }
 
 /* pkeys */
@@ -422,10 +379,11 @@ int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_create_ah req;
-	struct creq_create_ah_resp *resp;
+	struct creq_create_ah_resp resp;
 	u16 cmd_flags = 0;
 	u32 temp32[4];
 	u16 temp16[3];
+	int rc;
 
 	RCFW_CMD_PREP(req, CREATE_AH, cmd_flags);
 
@@ -450,28 +408,12 @@ int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah)
 	req.dest_mac[1] = cpu_to_le16(temp16[1]);
 	req.dest_mac[2] = cpu_to_le16(temp16[2]);
 
-	resp = (struct creq_create_ah_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 1);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: CREATE_AH send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_block_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: CREATE_AH timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: CREATE_AH failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
-	ah->id = le32_to_cpu(resp->xid);
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  NULL, 1);
+	if (rc)
+		return rc;
+
+	ah->id = le32_to_cpu(resp.xid);
 	return 0;
 }
 
@@ -479,35 +421,19 @@ int bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_destroy_ah req;
-	struct creq_destroy_ah_resp *resp;
+	struct creq_destroy_ah_resp resp;
 	u16 cmd_flags = 0;
+	int rc;
 
 	/* Clean up the AH table in the device */
 	RCFW_CMD_PREP(req, DESTROY_AH, cmd_flags);
 
 	req.ah_cid = cpu_to_le32(ah->id);
 
-	resp = (struct creq_destroy_ah_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 1);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: DESTROY_AH send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_block_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: DESTROY_AH timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: DESTROY_AH failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  NULL, 1);
+	if (rc)
+		return rc;
 	return 0;
 }
 
@@ -516,8 +442,9 @@ int bnxt_qplib_free_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_deallocate_key req;
-	struct creq_deallocate_key_resp *resp;
+	struct creq_deallocate_key_resp resp;
 	u16 cmd_flags = 0;
+	int rc;
 
 	if (mrw->lkey == 0xFFFFFFFF) {
 		dev_info(&res->pdev->dev,
@@ -536,27 +463,11 @@ int bnxt_qplib_free_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw)
 	else
 		req.key = cpu_to_le32(mrw->lkey);
 
-	resp = (struct creq_deallocate_key_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: FREE_MR send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&res->pdev->dev, "QPLIB: SP: FREE_MR timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: FREE_MR failed ");
-		dev_err(&res->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
+					  NULL, 0);
+	if (rc)
+		return rc;
+
 	/* Free the qplib's MRW memory */
 	if (mrw->hwq.max_elements)
 		bnxt_qplib_free_hwq(res->pdev, &mrw->hwq);
@@ -568,9 +479,10 @@ int bnxt_qplib_alloc_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_allocate_mrw req;
-	struct creq_allocate_mrw_resp *resp;
+	struct creq_allocate_mrw_resp resp;
 	u16 cmd_flags = 0;
 	unsigned long tmp;
+	int rc;
 
 	RCFW_CMD_PREP(req, ALLOCATE_MRW, cmd_flags);
 
@@ -584,33 +496,17 @@ int bnxt_qplib_alloc_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw)
 	tmp = (unsigned long)mrw;
 	req.mrw_handle = cpu_to_le64(tmp);
 
-	resp = (struct creq_allocate_mrw_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, 0);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: ALLOC_MRW send failed");
-		return -EINVAL;
-	}
-	if (!bnxt_qplib_rcfw_wait_for_resp(rcfw, le16_to_cpu(req.cookie))) {
-		/* Cmd timed out */
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: ALLOC_MRW timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: ALLOC_MRW failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	if (rc)
+		return rc;
+
 	if ((mrw->type == CMDQ_ALLOCATE_MRW_MRW_FLAGS_MW_TYPE1)  ||
 	    (mrw->type == CMDQ_ALLOCATE_MRW_MRW_FLAGS_MW_TYPE2A) ||
 	    (mrw->type == CMDQ_ALLOCATE_MRW_MRW_FLAGS_MW_TYPE2B))
-		mrw->rkey = le32_to_cpu(resp->xid);
+		mrw->rkey = le32_to_cpu(resp.xid);
 	else
-		mrw->lkey = le32_to_cpu(resp->xid);
+		mrw->lkey = le32_to_cpu(resp.xid);
 	return 0;
 }
 
@@ -619,40 +515,17 @@ int bnxt_qplib_dereg_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw,
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_deregister_mr req;
-	struct creq_deregister_mr_resp *resp;
+	struct creq_deregister_mr_resp resp;
 	u16 cmd_flags = 0;
 	int rc;
 
 	RCFW_CMD_PREP(req, DEREGISTER_MR, cmd_flags);
 
 	req.lkey = cpu_to_le32(mrw->lkey);
-	resp = (struct creq_deregister_mr_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, block);
-	if (!resp) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: DEREG_MR send failed");
-		return -EINVAL;
-	}
-	if (block)
-		rc = bnxt_qplib_rcfw_block_for_resp(rcfw,
-						    le16_to_cpu(req.cookie));
-	else
-		rc = bnxt_qplib_rcfw_wait_for_resp(rcfw,
-						   le16_to_cpu(req.cookie));
-	if (!rc) {
-		/* Cmd timed out */
-		dev_err(&res->pdev->dev, "QPLIB: SP: DEREG_MR timed out");
-		return -ETIMEDOUT;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&rcfw->pdev->dev, "QPLIB: SP: DEREG_MR failed ");
-		dev_err(&rcfw->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, block);
+	if (rc)
+		return rc;
 
 	/* Free the qplib's MR memory */
 	if (mrw->hwq.max_elements) {
@@ -669,7 +542,7 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_register_mr req;
-	struct creq_register_mr_resp *resp;
+	struct creq_register_mr_resp resp;
 	u16 cmd_flags = 0, level;
 	int pg_ptrs, pages, i, rc;
 	dma_addr_t **pbl_ptr;
@@ -730,36 +603,11 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
 	req.key = cpu_to_le32(mr->lkey);
 	req.mr_size = cpu_to_le64(mr->total_size);
 
-	resp = (struct creq_register_mr_resp *)
-			bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
-						     NULL, block);
-	if (!resp) {
-		dev_err(&res->pdev->dev, "SP: REG_MR send failed");
-		rc = -EINVAL;
-		goto fail;
-	}
-	if (block)
-		rc = bnxt_qplib_rcfw_block_for_resp(rcfw,
-						    le16_to_cpu(req.cookie));
-	else
-		rc = bnxt_qplib_rcfw_wait_for_resp(rcfw,
-						   le16_to_cpu(req.cookie));
-	if (!rc) {
-		/* Cmd timed out */
-		dev_err(&res->pdev->dev, "SP: REG_MR timed out");
-		rc = -ETIMEDOUT;
-		goto fail;
-	}
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: REG_MR failed ");
-		dev_err(&res->pdev->dev,
-			"QPLIB: SP: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		rc = -EINVAL;
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, block);
+	if (rc)
 		goto fail;
-	}
+
 	return 0;
 
 fail:
@@ -804,35 +652,15 @@ int bnxt_qplib_map_tc2cos(struct bnxt_qplib_res *res, u16 *cids)
 {
 	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
 	struct cmdq_map_tc_to_cos req;
-	struct creq_map_tc_to_cos_resp *resp;
+	struct creq_map_tc_to_cos_resp resp;
 	u16 cmd_flags = 0;
-	int tleft;
+	int rc = 0;
 
 	RCFW_CMD_PREP(req, MAP_TC_TO_COS, cmd_flags);
 	req.cos0 = cpu_to_le16(cids[0]);
 	req.cos1 = cpu_to_le16(cids[1]);
 
-	resp = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, NULL, 0);
-	if (!resp) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: MAP_TC2COS send failed");
-		return -EINVAL;
-	}
-
-	tleft = bnxt_qplib_rcfw_block_for_resp(rcfw, le16_to_cpu(req.cookie));
-	if (!tleft) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: MAP_TC2COS timed out");
-		return -ETIMEDOUT;
-	}
-
-	if (resp->status ||
-	    le16_to_cpu(resp->cookie) != le16_to_cpu(req.cookie)) {
-		dev_err(&res->pdev->dev, "QPLIB: SP: MAP_TC2COS failed ");
-		dev_err(&res->pdev->dev,
-			"QPLIB: with status 0x%x cmdq 0x%x resp 0x%x",
-			resp->status, le16_to_cpu(req.cookie),
-			le16_to_cpu(resp->cookie));
-		return -EINVAL;
-	}
-
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
-	return 0;
+	return rc;
 }
-- 
2.5.5



* [PATCH for-next 02/14] RDMA/bnxt_re: HW workarounds for handling specific conditions
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Eddie Wai, Sriharsha Basavapatna, Selvin Xavier

From: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch implements the following HW workarounds:

1. The SQ depth needs to be augmented by 128 + 1 to avoid running
   into a CQE slip issue.
2. Workaround for the problem where the HW fast-path engine continues
   to access DMA memory in retransmission mode even after the WQE has
   already been completed. If the HW reports this condition, the driver
   detects it and posts a Fence WQE. The driver stops reporting
   completions to the stack until it receives the completion for the
   Fence WQE (a condensed sketch of the fence path follows this list).
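
The fence itself is a type-1 memory window that is bound through a
BIND_MW WQE pre-built per PD; a condensed sketch of that path, taken
from the patch below:

	/* Built once per PD in bnxt_re_create_fence_wqe() */
	wqe->type   = BNXT_QPLIB_SWQE_TYPE_BIND_MW;
	wqe->wr_id  = BNXT_QPLIB_FENCE_WRID;
	wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP |
		      BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;

	/* Posted by bnxt_re_bind_fence_mw() when the HW reports
	 * the condition; a fresh rkey is used for every bind.
	 */
	memcpy(&wqe, fence_wqe, sizeof(wqe));
	wqe.bind.r_key = fence->bind_rkey;
	fence->bind_rkey = ib_inc_rkey(fence->bind_rkey);
	rc = bnxt_qplib_post_send(&qp->qplib_qp, &wqe);
	if (!rc)
		bnxt_qplib_post_send_db(&qp->qplib_qp);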

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c  | 381 ++++++++++++++++++++++++++----
 drivers/infiniband/hw/bnxt_re/ib_verbs.h  |  14 ++
 drivers/infiniband/hw/bnxt_re/qplib_fp.c  | 165 +++++++++++--
 drivers/infiniband/hw/bnxt_re/qplib_fp.h  |  17 +-
 drivers/infiniband/hw/bnxt_re/qplib_res.h |   1 -
 drivers/infiniband/hw/bnxt_re/qplib_sp.c  |   5 +
 drivers/infiniband/hw/bnxt_re/qplib_sp.h  |   2 +
 7 files changed, 515 insertions(+), 70 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 33af2e3..6385213 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -61,6 +61,48 @@
 #include "ib_verbs.h"
 #include <rdma/bnxt_re-abi.h>
 
+static int __from_ib_access_flags(int iflags)
+{
+	int qflags = 0;
+
+	if (iflags & IB_ACCESS_LOCAL_WRITE)
+		qflags |= BNXT_QPLIB_ACCESS_LOCAL_WRITE;
+	if (iflags & IB_ACCESS_REMOTE_READ)
+		qflags |= BNXT_QPLIB_ACCESS_REMOTE_READ;
+	if (iflags & IB_ACCESS_REMOTE_WRITE)
+		qflags |= BNXT_QPLIB_ACCESS_REMOTE_WRITE;
+	if (iflags & IB_ACCESS_REMOTE_ATOMIC)
+		qflags |= BNXT_QPLIB_ACCESS_REMOTE_ATOMIC;
+	if (iflags & IB_ACCESS_MW_BIND)
+		qflags |= BNXT_QPLIB_ACCESS_MW_BIND;
+	if (iflags & IB_ZERO_BASED)
+		qflags |= BNXT_QPLIB_ACCESS_ZERO_BASED;
+	if (iflags & IB_ACCESS_ON_DEMAND)
+		qflags |= BNXT_QPLIB_ACCESS_ON_DEMAND;
+	return qflags;
+};
+
+static enum ib_access_flags __to_ib_access_flags(int qflags)
+{
+	enum ib_access_flags iflags = 0;
+
+	if (qflags & BNXT_QPLIB_ACCESS_LOCAL_WRITE)
+		iflags |= IB_ACCESS_LOCAL_WRITE;
+	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_WRITE)
+		iflags |= IB_ACCESS_REMOTE_WRITE;
+	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_READ)
+		iflags |= IB_ACCESS_REMOTE_READ;
+	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_ATOMIC)
+		iflags |= IB_ACCESS_REMOTE_ATOMIC;
+	if (qflags & BNXT_QPLIB_ACCESS_MW_BIND)
+		iflags |= IB_ACCESS_MW_BIND;
+	if (qflags & BNXT_QPLIB_ACCESS_ZERO_BASED)
+		iflags |= IB_ZERO_BASED;
+	if (qflags & BNXT_QPLIB_ACCESS_ON_DEMAND)
+		iflags |= IB_ACCESS_ON_DEMAND;
+	return iflags;
+};
+
 static int bnxt_re_build_sgl(struct ib_sge *ib_sg_list,
 			     struct bnxt_qplib_sge *sg_list, int num)
 {
@@ -410,6 +452,165 @@ enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev,
 	return IB_LINK_LAYER_ETHERNET;
 }
 
+#define BNXT_RE_FENCE_BYTES	64
+#define	BNXT_RE_FENCE_PBL_SIZE	DIV_ROUND_UP(BNXT_RE_FENCE_BYTES, PAGE_SIZE)
+
+static void bnxt_re_create_fence_wqe(struct bnxt_re_pd *pd)
+{
+	struct bnxt_re_fence_data *fence = &pd->fence;
+	struct ib_mr *ib_mr = &fence->mr->ib_mr;
+	struct bnxt_qplib_swqe *wqe = &fence->bind_wqe;
+
+	memset(wqe, 0, sizeof(*wqe));
+	wqe->type = BNXT_QPLIB_SWQE_TYPE_BIND_MW;
+	wqe->wr_id = BNXT_QPLIB_FENCE_WRID;
+	wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_SIGNAL_COMP;
+	wqe->flags |= BNXT_QPLIB_SWQE_FLAGS_UC_FENCE;
+	wqe->bind.zero_based = false;
+	wqe->bind.parent_l_key = ib_mr->lkey;
+	wqe->bind.va = (u64)fence->va;
+	wqe->bind.length = fence->size;
+	wqe->bind.access_cntl = __from_ib_access_flags(IB_ACCESS_REMOTE_READ);
+	wqe->bind.mw_type = SQ_BIND_MW_TYPE_TYPE1;
+
+	/* Save the initial rkey in fence structure for now;
+	 * wqe->bind.r_key will be set at (re)bind time.
+	 */
+	fence->bind_rkey = ib_inc_rkey(fence->mw->rkey);
+}
+
+static int bnxt_re_bind_fence_mw(struct bnxt_qplib_qp *qplib_qp)
+{
+	struct bnxt_re_qp *qp = container_of(qplib_qp, struct bnxt_re_qp,
+					     qplib_qp);
+	struct ib_pd *ib_pd = qp->ib_qp.pd;
+	struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
+	struct bnxt_re_fence_data *fence = &pd->fence;
+	struct bnxt_qplib_swqe *fence_wqe = &fence->bind_wqe;
+	struct bnxt_qplib_swqe wqe;
+	int rc;
+
+	/* TODO: Need SQ locking here when Fence WQE
+	 * posting moves up into bnxt_re from bnxt_qplib.
+	 */
+	memcpy(&wqe, fence_wqe, sizeof(wqe));
+	wqe.bind.r_key = fence->bind_rkey;
+	fence->bind_rkey = ib_inc_rkey(fence->bind_rkey);
+
+	dev_dbg(rdev_to_dev(qp->rdev),
+		"Posting bind fence-WQE: rkey: %#x QP: %d PD: %p\n",
+		wqe.bind.r_key, qp->qplib_qp.id, pd);
+	rc = bnxt_qplib_post_send(&qp->qplib_qp, &wqe);
+	if (rc) {
+		dev_err(rdev_to_dev(qp->rdev), "Failed to bind fence-WQE\n");
+		return rc;
+	}
+	bnxt_qplib_post_send_db(&qp->qplib_qp);
+
+	return rc;
+}
+
+static void bnxt_re_destroy_fence_mr(struct bnxt_re_pd *pd)
+{
+	struct bnxt_re_fence_data *fence = &pd->fence;
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct device *dev = &rdev->en_dev->pdev->dev;
+	struct bnxt_re_mr *mr = fence->mr;
+
+	if (fence->mw) {
+		bnxt_re_dealloc_mw(fence->mw);
+		fence->mw = NULL;
+	}
+	if (mr) {
+		if (mr->ib_mr.rkey)
+			bnxt_qplib_dereg_mrw(&rdev->qplib_res, &mr->qplib_mr,
+					     true);
+		if (mr->ib_mr.lkey)
+			bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
+		kfree(mr);
+		fence->mr = NULL;
+	}
+	if (fence->dma_addr) {
+		dma_unmap_single(dev, fence->dma_addr, BNXT_RE_FENCE_BYTES,
+				 DMA_BIDIRECTIONAL);
+		fence->dma_addr = 0;
+	}
+	kfree(fence->va);
+	fence->va = NULL;
+}
+
+static int bnxt_re_create_fence_mr(struct bnxt_re_pd *pd)
+{
+	int mr_access_flags = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_MW_BIND;
+	struct bnxt_re_fence_data *fence = &pd->fence;
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct device *dev = &rdev->en_dev->pdev->dev;
+	struct bnxt_re_mr *mr = NULL;
+	dma_addr_t dma_addr = 0;
+	struct ib_mw *mw;
+	void *va = NULL;
+	u64 pbl_tbl;
+	int rc;
+
+	/* Allocate a small chunk of memory and dma-map it */
+	va = kzalloc(BNXT_RE_FENCE_BYTES, GFP_KERNEL);
+	if (!va)
+		return -ENOMEM;
+	fence->va = va;
+	dma_addr = dma_map_single(dev, va, BNXT_RE_FENCE_BYTES,
+				  DMA_BIDIRECTIONAL);
+	rc = dma_mapping_error(dev, dma_addr);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to dma-map fence-MR-mem\n");
+		rc = -EIO;
+		fence->dma_addr = 0;
+		goto fail;
+	}
+	fence->dma_addr = dma_addr;
+
+	/* Allocate a MR */
+	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+	if (!mr)
+		return -ENOMEM;
+	fence->mr = mr;
+	mr->rdev = rdev;
+	mr->qplib_mr.pd = &pd->qplib_pd;
+	mr->qplib_mr.type = CMDQ_ALLOCATE_MRW_MRW_FLAGS_PMR;
+	mr->qplib_mr.flags = __from_ib_access_flags(mr_access_flags);
+	rc = bnxt_qplib_alloc_mrw(&rdev->qplib_res, &mr->qplib_mr);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to alloc fence-HW-MR\n");
+		goto fail;
+	}
+
+	/* Register MR */
+	mr->ib_mr.lkey = mr->qplib_mr.lkey;
+	mr->qplib_mr.va         = (u64)va;
+	mr->qplib_mr.total_size = BNXT_RE_FENCE_BYTES;
+	pbl_tbl = dma_addr;
+	rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, &pbl_tbl,
+			       BNXT_RE_FENCE_PBL_SIZE, false);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Failed to register fence-MR\n");
+		goto fail;
+	}
+	mr->ib_mr.rkey = mr->qplib_mr.rkey;
+
+	/* Create a fence MW only for kernel consumers */
+	mw = bnxt_re_alloc_mw(&pd->ib_pd, IB_MW_TYPE_1, NULL);
+	if (IS_ERR(mw)) {
+		dev_err(rdev_to_dev(rdev),
+			"Failed to create fence-MW for PD: %p\n", pd);
+		rc = PTR_ERR(mw);
+		goto fail;
+	}
+	fence->mw = mw;
+
+	bnxt_re_create_fence_wqe(pd);
+	return 0;
+
+fail:
+	bnxt_re_destroy_fence_mr(pd);
+	return rc;
+}
+
 /* Protection Domains */
 int bnxt_re_dealloc_pd(struct ib_pd *ib_pd)
 {
@@ -417,6 +618,7 @@ int bnxt_re_dealloc_pd(struct ib_pd *ib_pd)
 	struct bnxt_re_dev *rdev = pd->rdev;
 	int rc;
 
+	bnxt_re_destroy_fence_mr(pd);
 	if (ib_pd->uobject && pd->dpi.dbr) {
 		struct ib_ucontext *ib_uctx = ib_pd->uobject->context;
 		struct bnxt_re_ucontext *ucntx;
@@ -498,6 +700,10 @@ struct ib_pd *bnxt_re_alloc_pd(struct ib_device *ibdev,
 		}
 	}
 
+	if (!udata)
+		if (bnxt_re_create_fence_mr(pd))
+			dev_warn(rdev_to_dev(rdev),
+				 "Failed to create Fence-MR\n");
 	return &pd->ib_pd;
 dbfail:
 	(void)bnxt_qplib_dealloc_pd(&rdev->qplib_res, &rdev->qplib_res.pd_tbl,
@@ -848,12 +1054,16 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
 	/* Shadow QP SQ depth should be same as QP1 RQ depth */
 	qp->qplib_qp.sq.max_wqe = qp1_qp->rq.max_wqe;
 	qp->qplib_qp.sq.max_sge = 2;
+	/* Q full delta can be 1 since it is internal QP */
+	qp->qplib_qp.sq.q_full_delta = 1;
 
 	qp->qplib_qp.scq = qp1_qp->scq;
 	qp->qplib_qp.rcq = qp1_qp->rcq;
 
 	qp->qplib_qp.rq.max_wqe = qp1_qp->rq.max_wqe;
 	qp->qplib_qp.rq.max_sge = qp1_qp->rq.max_sge;
+	/* Q full delta can be 1 since it is internal QP */
+	qp->qplib_qp.rq.q_full_delta = 1;
 
 	qp->qplib_qp.mtu = qp1_qp->mtu;
 
@@ -916,10 +1126,6 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 	qp->qplib_qp.sig_type = ((qp_init_attr->sq_sig_type ==
 				  IB_SIGNAL_ALL_WR) ? true : false);
 
-	entries = roundup_pow_of_two(qp_init_attr->cap.max_send_wr + 1);
-	qp->qplib_qp.sq.max_wqe = min_t(u32, entries,
-					dev_attr->max_qp_wqes + 1);
-
 	qp->qplib_qp.sq.max_sge = qp_init_attr->cap.max_send_sge;
 	if (qp->qplib_qp.sq.max_sge > dev_attr->max_qp_sges)
 		qp->qplib_qp.sq.max_sge = dev_attr->max_qp_sges;
@@ -958,6 +1164,9 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 		qp->qplib_qp.rq.max_wqe = min_t(u32, entries,
 						dev_attr->max_qp_wqes + 1);
 
+		qp->qplib_qp.rq.q_full_delta = qp->qplib_qp.rq.max_wqe -
+						qp_init_attr->cap.max_recv_wr;
+
 		qp->qplib_qp.rq.max_sge = qp_init_attr->cap.max_recv_sge;
 		if (qp->qplib_qp.rq.max_sge > dev_attr->max_qp_sges)
 			qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
@@ -966,6 +1175,12 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 	qp->qplib_qp.mtu = ib_mtu_enum_to_int(iboe_get_mtu(rdev->netdev->mtu));
 
 	if (qp_init_attr->qp_type == IB_QPT_GSI) {
+		/* Allocate 1 more than what's provided */
+		entries = roundup_pow_of_two(qp_init_attr->cap.max_send_wr + 1);
+		qp->qplib_qp.sq.max_wqe = min_t(u32, entries,
+						dev_attr->max_qp_wqes + 1);
+		qp->qplib_qp.sq.q_full_delta = qp->qplib_qp.sq.max_wqe -
+						qp_init_attr->cap.max_send_wr;
 		qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
 		if (qp->qplib_qp.rq.max_sge > dev_attr->max_qp_sges)
 			qp->qplib_qp.rq.max_sge = dev_attr->max_qp_sges;
@@ -1005,6 +1220,23 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 		}
 
 	} else {
+		/* Allocate 128 + 1 more than what's provided */
+		entries = roundup_pow_of_two(qp_init_attr->cap.max_send_wr +
+					     BNXT_QPLIB_RESERVED_QP_WRS + 1);
+		qp->qplib_qp.sq.max_wqe = min_t(u32, entries,
+						dev_attr->max_qp_wqes +
+						BNXT_QPLIB_RESERVED_QP_WRS + 1);
+		qp->qplib_qp.sq.q_full_delta = qp->qplib_qp.sq.max_wqe -
+						qp_init_attr->cap.max_send_wr;
+
+		/*
+		 * Reserve one slot for the phantom WQE. The application
+		 * can post one extra entry in this case, but allow it to
+		 * avoid an unexpected queue-full condition.
+		 */
+
+		qp->qplib_qp.sq.q_full_delta -= 1;
+
 		qp->qplib_qp.max_rd_atomic = dev_attr->max_qp_rd_atom;
 		qp->qplib_qp.max_dest_rd_atomic = dev_attr->max_qp_init_rd_atom;
 		if (udata) {
@@ -1128,48 +1360,6 @@ static enum ib_mtu __to_ib_mtu(u32 mtu)
 	}
 }
 
-static int __from_ib_access_flags(int iflags)
-{
-	int qflags = 0;
-
-	if (iflags & IB_ACCESS_LOCAL_WRITE)
-		qflags |= BNXT_QPLIB_ACCESS_LOCAL_WRITE;
-	if (iflags & IB_ACCESS_REMOTE_READ)
-		qflags |= BNXT_QPLIB_ACCESS_REMOTE_READ;
-	if (iflags & IB_ACCESS_REMOTE_WRITE)
-		qflags |= BNXT_QPLIB_ACCESS_REMOTE_WRITE;
-	if (iflags & IB_ACCESS_REMOTE_ATOMIC)
-		qflags |= BNXT_QPLIB_ACCESS_REMOTE_ATOMIC;
-	if (iflags & IB_ACCESS_MW_BIND)
-		qflags |= BNXT_QPLIB_ACCESS_MW_BIND;
-	if (iflags & IB_ZERO_BASED)
-		qflags |= BNXT_QPLIB_ACCESS_ZERO_BASED;
-	if (iflags & IB_ACCESS_ON_DEMAND)
-		qflags |= BNXT_QPLIB_ACCESS_ON_DEMAND;
-	return qflags;
-};
-
-static enum ib_access_flags __to_ib_access_flags(int qflags)
-{
-	enum ib_access_flags iflags = 0;
-
-	if (qflags & BNXT_QPLIB_ACCESS_LOCAL_WRITE)
-		iflags |= IB_ACCESS_LOCAL_WRITE;
-	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_WRITE)
-		iflags |= IB_ACCESS_REMOTE_WRITE;
-	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_READ)
-		iflags |= IB_ACCESS_REMOTE_READ;
-	if (qflags & BNXT_QPLIB_ACCESS_REMOTE_ATOMIC)
-		iflags |= IB_ACCESS_REMOTE_ATOMIC;
-	if (qflags & BNXT_QPLIB_ACCESS_MW_BIND)
-		iflags |= IB_ACCESS_MW_BIND;
-	if (qflags & BNXT_QPLIB_ACCESS_ZERO_BASED)
-		iflags |= IB_ZERO_BASED;
-	if (qflags & BNXT_QPLIB_ACCESS_ON_DEMAND)
-		iflags |= IB_ACCESS_ON_DEMAND;
-	return iflags;
-};
-
 static int bnxt_re_modify_shadow_qp(struct bnxt_re_dev *rdev,
 				    struct bnxt_re_qp *qp1_qp,
 				    int qp_attr_mask)
@@ -1376,11 +1566,21 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
 		entries = roundup_pow_of_two(qp_attr->cap.max_send_wr);
 		qp->qplib_qp.sq.max_wqe = min_t(u32, entries,
 						dev_attr->max_qp_wqes + 1);
+		qp->qplib_qp.sq.q_full_delta = qp->qplib_qp.sq.max_wqe -
+						qp_attr->cap.max_send_wr;
+		/*
+		 * Reserve one slot for the phantom WQE. Some applications
+		 * can post one extra entry in this case; allow it to avoid
+		 * an unexpected queue-full condition.
+		 */
+		qp->qplib_qp.sq.q_full_delta -= 1;
 		qp->qplib_qp.sq.max_sge = qp_attr->cap.max_send_sge;
 		if (qp->qplib_qp.rq.max_wqe) {
 			entries = roundup_pow_of_two(qp_attr->cap.max_recv_wr);
 			qp->qplib_qp.rq.max_wqe =
 				min_t(u32, entries, dev_attr->max_qp_wqes + 1);
+			qp->qplib_qp.rq.q_full_delta = qp->qplib_qp.rq.max_wqe -
+						       qp_attr->cap.max_recv_wr;
 			qp->qplib_qp.rq.max_sge = qp_attr->cap.max_recv_sge;
 		} else {
 			/* SRQ was used prior, just ignore the RQ caps */
@@ -2641,12 +2841,36 @@ static void bnxt_re_process_res_ud_wc(struct ib_wc *wc,
 		wc->opcode = IB_WC_RECV_RDMA_WITH_IMM;
 }
 
+static int send_phantom_wqe(struct bnxt_re_qp *qp)
+{
+	struct bnxt_qplib_qp *lib_qp = &qp->qplib_qp;
+	unsigned long flags;
+	int rc = 0;
+
+	spin_lock_irqsave(&qp->sq_lock, flags);
+
+	rc = bnxt_re_bind_fence_mw(lib_qp);
+	if (!rc) {
+		lib_qp->sq.phantom_wqe_cnt++;
+		dev_dbg(&lib_qp->sq.hwq.pdev->dev,
+			"qp %#x sq->prod %#x sw_prod %#x phantom_wqe_cnt %d\n",
+			lib_qp->id, lib_qp->sq.hwq.prod,
+			HWQ_CMP(lib_qp->sq.hwq.prod, &lib_qp->sq.hwq),
+			lib_qp->sq.phantom_wqe_cnt);
+	}
+
+	spin_unlock_irqrestore(&qp->sq_lock, flags);
+	return rc;
+}
+
 int bnxt_re_poll_cq(struct ib_cq *ib_cq, int num_entries, struct ib_wc *wc)
 {
 	struct bnxt_re_cq *cq = container_of(ib_cq, struct bnxt_re_cq, ib_cq);
 	struct bnxt_re_qp *qp;
 	struct bnxt_qplib_cqe *cqe;
 	int i, ncqe, budget;
+	struct bnxt_qplib_q *sq;
+	struct bnxt_qplib_qp *lib_qp;
 	u32 tbl_idx;
 	struct bnxt_re_sqp_entries *sqp_entry = NULL;
 	unsigned long flags;
@@ -2659,7 +2883,21 @@ int bnxt_re_poll_cq(struct ib_cq *ib_cq, int num_entries, struct ib_wc *wc)
 	}
 	cqe = &cq->cql[0];
 	while (budget) {
-		ncqe = bnxt_qplib_poll_cq(&cq->qplib_cq, cqe, budget);
+		lib_qp = NULL;
+		ncqe = bnxt_qplib_poll_cq(&cq->qplib_cq, cqe, budget, &lib_qp);
+		if (lib_qp) {
+			sq = &lib_qp->sq;
+			if (sq->send_phantom) {
+				qp = container_of(lib_qp,
+						  struct bnxt_re_qp, qplib_qp);
+				if (send_phantom_wqe(qp) == -ENOMEM)
+					dev_err(rdev_to_dev(cq->rdev),
+						"Phantom failed! Scheduled to send again\n");
+				else
+					sq->send_phantom = false;
+			}
+		}
+
 		if (!ncqe)
 			break;
 
@@ -2912,6 +3150,55 @@ struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type type,
 	return ERR_PTR(rc);
 }
 
+struct ib_mw *bnxt_re_alloc_mw(struct ib_pd *ib_pd, enum ib_mw_type type,
+			       struct ib_udata *udata)
+{
+	struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
+	struct bnxt_re_dev *rdev = pd->rdev;
+	struct bnxt_re_mw *mw;
+	int rc;
+
+	mw = kzalloc(sizeof(*mw), GFP_KERNEL);
+	if (!mw)
+		return ERR_PTR(-ENOMEM);
+	mw->rdev = rdev;
+	mw->qplib_mw.pd = &pd->qplib_pd;
+
+	mw->qplib_mw.type = (type == IB_MW_TYPE_1 ?
+			       CMDQ_ALLOCATE_MRW_MRW_FLAGS_MW_TYPE1 :
+			       CMDQ_ALLOCATE_MRW_MRW_FLAGS_MW_TYPE2B);
+	rc = bnxt_qplib_alloc_mrw(&rdev->qplib_res, &mw->qplib_mw);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Allocate MW failed!");
+		goto fail;
+	}
+	mw->ib_mw.rkey = mw->qplib_mw.rkey;
+
+	atomic_inc(&rdev->mw_count);
+	return &mw->ib_mw;
+
+fail:
+	kfree(mw);
+	return ERR_PTR(rc);
+}
+
+int bnxt_re_dealloc_mw(struct ib_mw *ib_mw)
+{
+	struct bnxt_re_mw *mw = container_of(ib_mw, struct bnxt_re_mw, ib_mw);
+	struct bnxt_re_dev *rdev = mw->rdev;
+	int rc;
+
+	rc = bnxt_qplib_free_mrw(&rdev->qplib_res, &mw->qplib_mw);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Free MW failed: %#x\n", rc);
+		return rc;
+	}
+
+	kfree(mw);
+	atomic_dec(&rdev->mw_count);
+	return rc;
+}
+
 /* Fast Memory Regions */
 struct ib_fmr *bnxt_re_alloc_fmr(struct ib_pd *ib_pd, int mr_access_flags,
 				 struct ib_fmr_attr *fmr_attr)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index b4084c2..7135c78 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -44,11 +44,22 @@ struct bnxt_re_gid_ctx {
 	u32			refcnt;
 };
 
+struct bnxt_re_fence_data {
+	u32 size;
+	void *va;
+	dma_addr_t dma_addr;
+	struct bnxt_re_mr *mr;
+	struct ib_mw *mw;
+	struct bnxt_qplib_swqe bind_wqe;
+	u32 bind_rkey;
+};
+
 struct bnxt_re_pd {
 	struct bnxt_re_dev	*rdev;
 	struct ib_pd		ib_pd;
 	struct bnxt_qplib_pd	qplib_pd;
 	struct bnxt_qplib_dpi	dpi;
+	struct bnxt_re_fence_data fence;
 };
 
 struct bnxt_re_ah {
@@ -181,6 +192,9 @@ int bnxt_re_map_mr_sg(struct ib_mr *ib_mr, struct scatterlist *sg, int sg_nents,
 struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type mr_type,
 			       u32 max_num_sg);
 int bnxt_re_dereg_mr(struct ib_mr *mr);
+struct ib_mw *bnxt_re_alloc_mw(struct ib_pd *ib_pd, enum ib_mw_type type,
+			       struct ib_udata *udata);
+int bnxt_re_dealloc_mw(struct ib_mw *mw);
 struct ib_fmr *bnxt_re_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
 				 struct ib_fmr_attr *fmr_attr);
 int bnxt_re_map_phys_fmr(struct ib_fmr *fmr, u64 *page_list, int list_len,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index ea9ce4f..66abec0 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -1083,8 +1083,12 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 		rc = -EINVAL;
 		goto done;
 	}
-	if (HWQ_CMP((sq->hwq.prod + 1), &sq->hwq) ==
-	    HWQ_CMP(sq->hwq.cons, &sq->hwq)) {
+
+	if (bnxt_qplib_queue_full(sq)) {
+		dev_err(&sq->hwq.pdev->dev,
+			"QPLIB: prod = %#x cons = %#x qdepth = %#x delta = %#x",
+			sq->hwq.prod, sq->hwq.cons, sq->hwq.max_elements,
+			sq->q_full_delta);
 		rc = -ENOMEM;
 		goto done;
 	}
@@ -1332,8 +1336,7 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 		rc = -EINVAL;
 		goto done;
 	}
-	if (HWQ_CMP((rq->hwq.prod + 1), &rq->hwq) ==
-	    HWQ_CMP(rq->hwq.cons, &rq->hwq)) {
+	if (bnxt_qplib_queue_full(rq)) {
 		dev_err(&rq->hwq.pdev->dev,
 			"QPLIB: FP: QP (0x%x) RQ is full!", qp->id);
 		rc = -EINVAL;
@@ -1551,14 +1554,112 @@ static int __flush_rq(struct bnxt_qplib_q *rq, struct bnxt_qplib_qp *qp,
 	return rc;
 }
 
+/* Note: SQEs are valid from sw_sq_cons up to cqe_sq_cons (exclusive).
+ *       CQEs are tracked from sw_cq_cons to max_element but are valid
+ *       only if VALID=1.
+ */
+static int do_wa9060(struct bnxt_qplib_qp *qp, struct bnxt_qplib_cq *cq,
+		     u32 cq_cons, u32 sw_sq_cons, u32 cqe_sq_cons)
+{
+	struct bnxt_qplib_q *sq = &qp->sq;
+	struct bnxt_qplib_swq *swq;
+	u32 peek_sw_cq_cons, peek_raw_cq_cons, peek_sq_cons_idx;
+	struct cq_base *peek_hwcqe, **peek_hw_cqe_ptr;
+	struct cq_req *peek_req_hwcqe;
+	struct bnxt_qplib_qp *peek_qp;
+	struct bnxt_qplib_q *peek_sq;
+	int i, rc = 0;
+
+	/* Normal mode */
+	/* Check for the psn_search marking before completing */
+	swq = &sq->swq[sw_sq_cons];
+	if (swq->psn_search &&
+	    le32_to_cpu(swq->psn_search->flags_next_psn) & 0x80000000) {
+		/* Unmark */
+		swq->psn_search->flags_next_psn = cpu_to_le32
+			(le32_to_cpu(swq->psn_search->flags_next_psn)
+				     & ~0x80000000);
+		dev_dbg(&cq->hwq.pdev->dev,
+			"FP: Process Req cq_cons=0x%x qp=0x%x sq cons sw=0x%x cqe=0x%x marked!\n",
+			cq_cons, qp->id, sw_sq_cons, cqe_sq_cons);
+		sq->condition = true;
+		sq->send_phantom = true;
+
+		/* TODO: Only ARM if the previous SQE is ARMALL */
+		bnxt_qplib_arm_cq(cq, DBR_DBR_TYPE_CQ_ARMALL);
+
+		rc = -EAGAIN;
+		goto out;
+	}
+	if (sq->condition) {
+		/* Peek at the completions */
+		peek_raw_cq_cons = cq->hwq.cons;
+		peek_sw_cq_cons = cq_cons;
+		i = cq->hwq.max_elements;
+		while (i--) {
+			peek_sw_cq_cons = HWQ_CMP((peek_sw_cq_cons), &cq->hwq);
+			peek_hw_cqe_ptr = (struct cq_base **)cq->hwq.pbl_ptr;
+			peek_hwcqe = &peek_hw_cqe_ptr[CQE_PG(peek_sw_cq_cons)]
+						     [CQE_IDX(peek_sw_cq_cons)];
+			/* If the next hwcqe is VALID */
+			if (CQE_CMP_VALID(peek_hwcqe, peek_raw_cq_cons,
+					  cq->hwq.max_elements)) {
+				/* If the next hwcqe is a REQ */
+				if ((peek_hwcqe->cqe_type_toggle &
+				    CQ_BASE_CQE_TYPE_MASK) ==
+				    CQ_BASE_CQE_TYPE_REQ) {
+					peek_req_hwcqe = (struct cq_req *)
+							 peek_hwcqe;
+					peek_qp = (struct bnxt_qplib_qp *)
+						le64_to_cpu(
+						peek_req_hwcqe->qp_handle);
+					peek_sq = &peek_qp->sq;
+					peek_sq_cons_idx = HWQ_CMP(le16_to_cpu(
+						peek_req_hwcqe->sq_cons_idx) - 1
+						, &sq->hwq);
+					/* If the hwcqe's sq's wr_id matches */
+					if (peek_sq == sq &&
+					    sq->swq[peek_sq_cons_idx].wr_id ==
+					    BNXT_QPLIB_FENCE_WRID) {
+						/*
+						 *  Unbreak only if the phantom
+						 *  comes back
+						 */
+						dev_dbg(&cq->hwq.pdev->dev,
+							"FP:Got Phantom CQE");
+						sq->condition = false;
+						sq->single = true;
+						rc = 0;
+						goto out;
+					}
+				}
+				/* Valid but not the phantom, so keep looping */
+			} else {
+				/* Not valid yet, just exit and wait */
+				rc = -EINVAL;
+				goto out;
+			}
+			peek_sw_cq_cons++;
+			peek_raw_cq_cons++;
+		}
+		dev_err(&cq->hwq.pdev->dev,
+			"Should not have come here! cq_cons=0x%x qp=0x%x sq cons sw=0x%x hw=0x%x",
+			cq_cons, qp->id, sw_sq_cons, cqe_sq_cons);
+		rc = -EINVAL;
+	}
+out:
+	return rc;
+}
+
 static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 				     struct cq_req *hwcqe,
-				     struct bnxt_qplib_cqe **pcqe, int *budget)
+				     struct bnxt_qplib_cqe **pcqe, int *budget,
+				     u32 cq_cons, struct bnxt_qplib_qp **lib_qp)
 {
 	struct bnxt_qplib_qp *qp;
 	struct bnxt_qplib_q *sq;
 	struct bnxt_qplib_cqe *cqe;
-	u32 sw_cons, cqe_cons;
+	u32 sw_sq_cons, cqe_sq_cons;
+	struct bnxt_qplib_swq *swq;
 	int rc = 0;
 
 	qp = (struct bnxt_qplib_qp *)((unsigned long)
@@ -1570,13 +1671,13 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 	}
 	sq = &qp->sq;
 
-	cqe_cons = HWQ_CMP(le16_to_cpu(hwcqe->sq_cons_idx), &sq->hwq);
-	if (cqe_cons > sq->hwq.max_elements) {
+	cqe_sq_cons = HWQ_CMP(le16_to_cpu(hwcqe->sq_cons_idx), &sq->hwq);
+	if (cqe_sq_cons > sq->hwq.max_elements) {
 		dev_err(&cq->hwq.pdev->dev,
 			"QPLIB: FP: CQ Process req reported ");
 		dev_err(&cq->hwq.pdev->dev,
 			"QPLIB: sq_cons_idx 0x%x which exceeded max 0x%x",
-			cqe_cons, sq->hwq.max_elements);
+			cqe_sq_cons, sq->hwq.max_elements);
 		return -EINVAL;
 	}
 	/* If we were in the middle of flushing the SQ, continue */
@@ -1585,53 +1686,74 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 
 	/* Require to walk the sq's swq to fabricate CQEs for all previously
 	 * signaled SWQEs due to CQE aggregation from the current sq cons
-	 * to the cqe_cons
+	 * to the cqe_sq_cons
 	 */
 	cqe = *pcqe;
 	while (*budget) {
-		sw_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
-		if (sw_cons == cqe_cons)
+		sw_sq_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
+		if (sw_sq_cons == cqe_sq_cons)
+			/* Done */
 			break;
+
+		swq = &sq->swq[sw_sq_cons];
 		memset(cqe, 0, sizeof(*cqe));
 		cqe->opcode = CQ_BASE_CQE_TYPE_REQ;
 		cqe->qp_handle = (u64)(unsigned long)qp;
 		cqe->src_qp = qp->id;
-		cqe->wr_id = sq->swq[sw_cons].wr_id;
-		cqe->type = sq->swq[sw_cons].type;
+		cqe->wr_id = swq->wr_id;
+		if (cqe->wr_id == BNXT_QPLIB_FENCE_WRID)
+			goto skip;
+		cqe->type = swq->type;
 
 		/* For the last CQE, check for status.  For errors, regardless
 		 * of the request being signaled or not, it must complete with
 		 * the hwcqe error status
 		 */
-		if (HWQ_CMP((sw_cons + 1), &sq->hwq) == cqe_cons &&
+		if (HWQ_CMP((sw_sq_cons + 1), &sq->hwq) == cqe_sq_cons &&
 		    hwcqe->status != CQ_REQ_STATUS_OK) {
 			cqe->status = hwcqe->status;
 			dev_err(&cq->hwq.pdev->dev,
 				"QPLIB: FP: CQ Processed Req ");
 			dev_err(&cq->hwq.pdev->dev,
 				"QPLIB: wr_id[%d] = 0x%llx with status 0x%x",
-				sw_cons, cqe->wr_id, cqe->status);
+				sw_sq_cons, cqe->wr_id, cqe->status);
 			cqe++;
 			(*budget)--;
 			sq->flush_in_progress = true;
 			/* Must block new posting of SQ and RQ */
 			qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
+			sq->condition = false;
+			sq->single = false;
 		} else {
-			if (sq->swq[sw_cons].flags &
-			    SQ_SEND_FLAGS_SIGNAL_COMP) {
+			if (swq->flags & SQ_SEND_FLAGS_SIGNAL_COMP) {
+				/* Before we complete, do WA 9060 */
+				if (do_wa9060(qp, cq, cq_cons, sw_sq_cons,
+					      cqe_sq_cons)) {
+					*lib_qp = qp;
+					goto out;
+				}
 				cqe->status = CQ_REQ_STATUS_OK;
 				cqe++;
 				(*budget)--;
 			}
 		}
+skip:
 		sq->hwq.cons++;
+		if (sq->single)
+			break;
 	}
+out:
 	*pcqe = cqe;
-	if (!*budget && HWQ_CMP(sq->hwq.cons, &sq->hwq) != cqe_cons) {
+	if (HWQ_CMP(sq->hwq.cons, &sq->hwq) != cqe_sq_cons) {
 		/* Out of budget */
 		rc = -EAGAIN;
 		goto done;
 	}
+	/*
+	 * Back to normal completion mode only after it has completed all of
+	 * the WC for this CQE
+	 */
+	sq->single = false;
 	if (!sq->flush_in_progress)
 		goto done;
 flush:
@@ -1961,7 +2083,7 @@ static int bnxt_qplib_cq_process_cutoff(struct bnxt_qplib_cq *cq,
 }
 
 int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
-		       int num_cqes)
+		       int num_cqes, struct bnxt_qplib_qp **lib_qp)
 {
 	struct cq_base *hw_cqe, **hw_cqe_ptr;
 	unsigned long flags;
@@ -1986,7 +2108,8 @@ int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
 		case CQ_BASE_CQE_TYPE_REQ:
 			rc = bnxt_qplib_cq_process_req(cq,
 						       (struct cq_req *)hw_cqe,
-						       &cqe, &budget);
+						       &cqe, &budget,
+						       sw_cons, lib_qp);
 			break;
 		case CQ_BASE_CQE_TYPE_RES_RC:
 			rc = bnxt_qplib_cq_process_res_rc(cq,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index f0150f8..71539ea 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -88,6 +88,7 @@ struct bnxt_qplib_swq {
 
 struct bnxt_qplib_swqe {
 	/* General */
+#define	BNXT_QPLIB_FENCE_WRID	0x46454E43	/* "FENC" */
 	u64				wr_id;
 	u8				reqs_type;
 	u8				type;
@@ -216,9 +217,16 @@ struct bnxt_qplib_q {
 	struct scatterlist		*sglist;
 	u32				nmap;
 	u32				max_wqe;
+	u16				q_full_delta;
 	u16				max_sge;
 	u32				psn;
 	bool				flush_in_progress;
+	bool				condition;
+	bool				single;
+	bool				send_phantom;
+	u32				phantom_wqe_cnt;
+	u32				phantom_cqe_cnt;
+	u32				next_cq_cons;
 };
 
 struct bnxt_qplib_qp {
@@ -301,6 +309,13 @@ struct bnxt_qplib_qp {
 	(!!((hdr)->cqe_type_toggle & CQ_BASE_TOGGLE) ==		\
 	   !((raw_cons) & (cp_bit)))
 
+static inline bool bnxt_qplib_queue_full(struct bnxt_qplib_q *qplib_q)
+{
+	return HWQ_CMP((qplib_q->hwq.prod + qplib_q->q_full_delta),
+		       &qplib_q->hwq) == HWQ_CMP(qplib_q->hwq.cons,
+						 &qplib_q->hwq);
+}
+
 struct bnxt_qplib_cqe {
 	u8				status;
 	u8				type;
@@ -432,7 +447,7 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_destroy_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq);
 int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
-		       int num);
+		       int num, struct bnxt_qplib_qp **qp);
 void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type);
 void bnxt_qplib_free_nq(struct bnxt_qplib_nq *nq);
 int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq);
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index 4103e60..2e48555 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -52,7 +52,6 @@ extern const struct bnxt_qplib_gid bnxt_qplib_gid_zero;
 				((HWQ_CMP(hwq->prod, hwq)\
 				- HWQ_CMP(hwq->cons, hwq))\
 				& (hwq->max_elements - 1)))
-
 enum bnxt_qplib_hwq_type {
 	HWQ_TYPE_CTX,
 	HWQ_TYPE_QUEUE,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index cf7c9cb..fde18cf 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -88,6 +88,11 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
 		sb->max_qp_init_rd_atom > BNXT_QPLIB_MAX_OUT_RD_ATOM ?
 		BNXT_QPLIB_MAX_OUT_RD_ATOM : sb->max_qp_init_rd_atom;
 	attr->max_qp_wqes = le16_to_cpu(sb->max_qp_wr);
+	/*
+	 * 128 WQEs need to be reserved for the HW (8916). Prevent
+	 * reporting the full maximum to the stack.
+	 */
+	attr->max_qp_wqes -= BNXT_QPLIB_RESERVED_QP_WRS;
 	attr->max_qp_sges = sb->max_sge;
 	attr->max_cq = le32_to_cpu(sb->max_cq);
 	attr->max_cq_wqes = le32_to_cpu(sb->max_cqe);
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index 1442a61..a543f95 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -40,6 +40,8 @@
 #ifndef __BNXT_QPLIB_SP_H__
 #define __BNXT_QPLIB_SP_H__
 
+#define BNXT_QPLIB_RESERVED_QP_WRS	128
+
 struct bnxt_qplib_dev_attr {
 	char				fw_ver[32];
 	u16				max_sgid;
-- 
2.5.5


* [PATCH for-next 03/14] RDMA/bnxt_re: Fix race between netdev register and unregister events
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Somnath Kotur, Sriharsha Basavapatna, Selvin Xavier

From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Upon receipt of the NETDEV_REGISTER event from the netdev notifier
chain, the IB stack registration is spawned off to a workqueue
since that also requires an rtnl lock.
There could be 2 kinds of races between the NETDEV_REGISTER and the
NETDEV_UNREGISTER event handling.
a) The NETDEV_UNREGISTER event is received in rapid succession after
the NETDEV_REGISTER event, even before the workqueue gets a chance
to run.
b) The NETDEV_UNREGISTER event is received while the workqueue that
handles registration with the IB stack is still in progress.
Handle both the races with a bit flag that is set as part of the
NETDEV_REGISTER event just before the work item is queued and cleared
in the workqueue after the registration with the IB stack is complete.
Use a barrier to ensure the bit is cleared only after the IB stack
registration code is completed.
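
A minimal sketch of the flag-and-barrier pattern described above
(the workqueue symbol and exact call sites are illustrative; the flag
and helper names are the ones introduced by this patch):

	/* NETDEV_REGISTER (rtnl held): mark in-progress before deferring */
	set_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);
	queue_work(bnxt_re_wq, &re_work->work);

	/* worker: register with the IB stack, then publish completion */
	rc = bnxt_re_ib_reg(rdev);
	smp_mb(); /* order the registration stores before the flag clear */
	clear_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);

	/* NETDEV_UNREGISTER: bail out while registration is still pending;
	 * the notifier will deliver NETDEV_UNREGISTER again later
	 */
	if (test_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags))
		goto exit;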

Remove the link speed query from query_port, as acquiring rtnl_lock
there can deadlock. The speed is now queried from the netdev event
itself.

Also, add code to wait for resources (like CQs) to be freed by the
ULPs during driver unload.
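
Both follow-on changes are visible in the main.c hunks below; in
sketch form:

	/* cache the speed where rtnl is (or can be) held */
	if (rtnl_trylock()) {
		netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
		rdev->espeed = lksettings.base.speed;
		rtnl_unlock();
	}

	/* unload: wait for ULPs to drop their CQ references */
	while (atomic_read(&rdev->cq_count))
		usleep_range(500, 1000);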

Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  | 13 +++++++----
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 21 +++--------------
 drivers/infiniband/hw/bnxt_re/main.c     | 39 ++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index ebf7be8..fad04b2 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -80,11 +80,13 @@ struct bnxt_re_dev {
 	struct ib_device		ibdev;
 	struct list_head		list;
 	unsigned long			flags;
-#define BNXT_RE_FLAG_NETDEV_REGISTERED	0
-#define BNXT_RE_FLAG_IBDEV_REGISTERED	1
-#define BNXT_RE_FLAG_GOT_MSIX		2
-#define BNXT_RE_FLAG_RCFW_CHANNEL_EN	8
-#define BNXT_RE_FLAG_QOS_WORK_REG	16
+#define BNXT_RE_FLAG_NETDEV_REGISTERED         0
+#define BNXT_RE_FLAG_IBDEV_REGISTERED          1
+#define BNXT_RE_FLAG_GOT_MSIX                  2
+#define BNXT_RE_FLAG_RCFW_CHANNEL_EN           4
+#define BNXT_RE_FLAG_QOS_WORK_REG              5
+#define BNXT_RE_FLAG_NETDEV_REG_IN_PROG        6
+
 	struct net_device		*netdev;
 	unsigned int			version, major, minor;
 	struct bnxt_en_dev		*en_dev;
@@ -127,6 +129,7 @@ struct bnxt_re_dev {
 	struct bnxt_re_qp		*qp1_sqp;
 	struct bnxt_re_ah		*sqp_ah;
 	struct bnxt_re_sqp_entries sqp_tbl[1024];
+	u32 espeed;
 };
 
 #define to_bnxt_re_dev(ptr, member)	\
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 6385213..c717d5d 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -223,20 +223,8 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
 	return 0;
 }
 
-static void __to_ib_speed_width(struct net_device *netdev, u8 *speed, u8 *width)
+static void __to_ib_speed_width(u32 espeed, u8 *speed, u8 *width)
 {
-	struct ethtool_link_ksettings lksettings;
-	u32 espeed;
-
-	if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
-		memset(&lksettings, 0, sizeof(lksettings));
-		rtnl_lock();
-		netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
-		rtnl_unlock();
-		espeed = lksettings.base.speed;
-	} else {
-		espeed = SPEED_UNKNOWN;
-	}
 	switch (espeed) {
 	case SPEED_1000:
 		*speed = IB_SPEED_SDR;
@@ -303,12 +291,9 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
 	port_attr->sm_sl = 0;
 	port_attr->subnet_timeout = 0;
 	port_attr->init_type_reply = 0;
-	/* call the underlying netdev's ethtool hooks to query speed settings
-	 * for which we acquire rtnl_lock _only_ if it's registered with
-	 * IB stack to avoid race in the NETDEV_UNREG path
-	 */
+
 	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
-		__to_ib_speed_width(rdev->netdev, &port_attr->active_speed,
+		__to_ib_speed_width(rdev->espeed, &port_attr->active_speed,
 				    &port_attr->active_width);
 	return 0;
 }
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 5d35540..99a54fd 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -48,6 +48,8 @@
 #include <net/ipv6.h>
 #include <net/addrconf.h>
 #include <linux/if_ether.h>
+#include <linux/atomic.h>
+#include <asm/barrier.h>
 
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_user_verbs.h>
@@ -926,6 +928,10 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
 	if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
 		cancel_delayed_work(&rdev->worker);
 
+	/* Wait for ULPs to release references */
+	while (atomic_read(&rdev->cq_count))
+		usleep_range(500, 1000);
+
 	bnxt_re_cleanup_res(rdev);
 	bnxt_re_free_res(rdev, lock_wait);
 
@@ -1152,6 +1158,22 @@ static void bnxt_re_remove_one(struct bnxt_re_dev *rdev)
 	pci_dev_put(rdev->en_dev->pdev);
 }
 
+static void bnxt_re_get_link_speed(struct bnxt_re_dev *rdev)
+{
+	struct ethtool_link_ksettings lksettings;
+	struct net_device *netdev = rdev->netdev;
+
+	if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
+		memset(&lksettings, 0, sizeof(lksettings));
+		if (rtnl_trylock()) {
+			netdev->ethtool_ops->get_link_ksettings(netdev,
+								&lksettings);
+			rdev->espeed = lksettings.base.speed;
+			rtnl_unlock();
+		}
+	}
+}
+
 /* Handle all deferred netevents tasks */
 static void bnxt_re_task(struct work_struct *work)
 {
@@ -1169,6 +1191,10 @@ static void bnxt_re_task(struct work_struct *work)
 	switch (re_work->event) {
 	case NETDEV_REGISTER:
 		rc = bnxt_re_ib_reg(rdev);
+		/* Use memory barrier to sync the rdev->flags */
+		smp_mb();
+		clear_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG,
+			  &rdev->flags);
 		if (rc)
 			dev_err(rdev_to_dev(rdev),
 				"Failed to register with IB: %#x", rc);
@@ -1186,6 +1212,7 @@ static void bnxt_re_task(struct work_struct *work)
 		else if (netif_carrier_ok(rdev->netdev))
 			bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
 					       IB_EVENT_PORT_ACTIVE);
+		bnxt_re_get_link_speed(rdev);
 		break;
 	default:
 		break;
@@ -1244,10 +1271,22 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 			break;
 		}
 		bnxt_re_init_one(rdev);
+		/*
+		 *  Query the link speed at the time of registration.
+		 *  Already in rtnl_lock context
+		 */
+		bnxt_re_get_link_speed(rdev);
+
+		set_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);
 		sch_work = true;
 		break;
 
 	case NETDEV_UNREGISTER:
+		/* netdev notifier will call NETDEV_UNREGISTER again later since
+		 * we are still holding the reference to the netdev
+		 */
+		if (test_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags))
+			goto exit;
 		bnxt_re_ib_unreg(rdev, false);
 		bnxt_re_remove_one(rdev);
 		bnxt_re_dev_unreg(rdev);
-- 
2.5.5


* [PATCH for-next 04/14] RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Selvin Xavier

If the host buffers are freed before the MR is destroyed in HW,
the HW could still try to access these buffers, which could crash
the host. Fix the code to deregister the MR in FW before freeing
the fast_reg_page_list.
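
In sketch form, the required teardown order is (simplified from the
hunk below; the npages/pages guard is omitted here):

	/* 1. Deregister the MR in FW first... */
	rc = bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
	if (rc)
		return rc;
	/* 2. ...only then free the host-side fast_reg_page_list */
	bnxt_qplib_free_fast_reg_page_list(&rdev->qplib_res, &mr->qplib_frpl);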

Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index c717d5d..6db218c 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3043,6 +3043,12 @@ int bnxt_re_dereg_mr(struct ib_mr *ib_mr)
 	struct bnxt_re_dev *rdev = mr->rdev;
 	int rc = 0;
 
+	rc = bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
+	if (rc) {
+		dev_err(rdev_to_dev(rdev), "Dereg MR failed: %#x\n", rc);
+		return rc;
+	}
+
 	if (mr->npages && mr->pages) {
 		rc = bnxt_qplib_free_fast_reg_page_list(&rdev->qplib_res,
 							&mr->qplib_frpl);
@@ -3050,8 +3056,6 @@ int bnxt_re_dereg_mr(struct ib_mr *ib_mr)
 		mr->npages = 0;
 		mr->pages = NULL;
 	}
-	rc = bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
-
 	if (!IS_ERR(mr->ib_umem) && mr->ib_umem)
 		ib_umem_release(mr->ib_umem);
 
-- 
2.5.5


* [PATCH for-next 05/14] RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Devesh Sharma, Selvin Xavier

From: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

The driver must free the DPI during the dealloc_ucontext
instead of freeing it during dealloc_pd. However, the DPI
allocation scheme remains unchanged.
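
In sketch form, the DPI now lives with the ucontext instead of the PD
(names as in the hunks below):

	/* alloc_pd(): allocate the DPI on first use, owned by the ucontext */
	if (!ucntx->dpi.dbr)
		bnxt_qplib_alloc_dpi(&rdev->qplib_res.dpi_tbl,
				     &ucntx->dpi, ucntx);

	/* dealloc_ucontext(): the DPI is freed here, not in dealloc_pd() */
	if (uctx->dpi.dbr) {
		bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
				       &rdev->qplib_res.dpi_tbl, &uctx->dpi);
		uctx->dpi.dbr = NULL;
	}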

Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 58 ++++++++++++++++----------------
 drivers/infiniband/hw/bnxt_re/ib_verbs.h |  3 +-
 2 files changed, 30 insertions(+), 31 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 6db218c..eabc97e 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -604,30 +604,13 @@ int bnxt_re_dealloc_pd(struct ib_pd *ib_pd)
 	int rc;
 
 	bnxt_re_destroy_fence_mr(pd);
-	if (ib_pd->uobject && pd->dpi.dbr) {
-		struct ib_ucontext *ib_uctx = ib_pd->uobject->context;
-		struct bnxt_re_ucontext *ucntx;
 
-		/* Free DPI only if this is the first PD allocated by the
-		 * application and mark the context dpi as NULL
-		 */
-		ucntx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
-
-		rc = bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
-					    &rdev->qplib_res.dpi_tbl,
-					    &pd->dpi);
+	if (pd->qplib_pd.id) {
+		rc = bnxt_qplib_dealloc_pd(&rdev->qplib_res,
+					   &rdev->qplib_res.pd_tbl,
+					   &pd->qplib_pd);
 		if (rc)
-			dev_err(rdev_to_dev(rdev), "Failed to deallocate HW DPI");
-			/* Don't fail, continue*/
-		ucntx->dpi = NULL;
-	}
-
-	rc = bnxt_qplib_dealloc_pd(&rdev->qplib_res,
-				   &rdev->qplib_res.pd_tbl,
-				   &pd->qplib_pd);
-	if (rc) {
-		dev_err(rdev_to_dev(rdev), "Failed to deallocate HW PD");
-		return rc;
+			dev_err(rdev_to_dev(rdev), "Failed to deallocate HW PD");
 	}
 
 	kfree(pd);
@@ -659,23 +642,22 @@ struct ib_pd *bnxt_re_alloc_pd(struct ib_device *ibdev,
 	if (udata) {
 		struct bnxt_re_pd_resp resp;
 
-		if (!ucntx->dpi) {
+		if (!ucntx->dpi.dbr) {
 			/* Allocate DPI in alloc_pd to avoid failing of
 			 * ibv_devinfo and family of application when DPIs
 			 * are depleted.
 			 */
 			if (bnxt_qplib_alloc_dpi(&rdev->qplib_res.dpi_tbl,
-						 &pd->dpi, ucntx)) {
+						 &ucntx->dpi, ucntx)) {
 				rc = -ENOMEM;
 				goto dbfail;
 			}
-			ucntx->dpi = &pd->dpi;
 		}
 
 		resp.pdid = pd->qplib_pd.id;
 		/* Still allow mapping this DBR to the new user PD. */
-		resp.dpi = ucntx->dpi->dpi;
-		resp.dbr = (u64)ucntx->dpi->umdbr;
+		resp.dpi = ucntx->dpi.dpi;
+		resp.dbr = (u64)ucntx->dpi.umdbr;
 
 		rc = ib_copy_to_udata(udata, &resp, sizeof(resp));
 		if (rc) {
@@ -951,7 +933,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
 		qplib_qp->rq.nmap = umem->nmap;
 	}
 
-	qplib_qp->dpi = cntx->dpi;
+	qplib_qp->dpi = &cntx->dpi;
 	return 0;
 rqfail:
 	ib_umem_release(qp->sumem);
@@ -2360,7 +2342,7 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 		}
 		cq->qplib_cq.sghead = cq->umem->sg_head.sgl;
 		cq->qplib_cq.nmap = cq->umem->nmap;
-		cq->qplib_cq.dpi = uctx->dpi;
+		cq->qplib_cq.dpi = &uctx->dpi;
 	} else {
 		cq->max_cql = min_t(u32, entries, MAX_CQL_PER_POLL);
 		cq->cql = kcalloc(cq->max_cql, sizeof(struct bnxt_qplib_cqe),
@@ -3439,8 +3421,26 @@ int bnxt_re_dealloc_ucontext(struct ib_ucontext *ib_uctx)
 	struct bnxt_re_ucontext *uctx = container_of(ib_uctx,
 						   struct bnxt_re_ucontext,
 						   ib_uctx);
+
+	struct bnxt_re_dev *rdev = uctx->rdev;
+	int rc = 0;
+
 	if (uctx->shpg)
 		free_page((unsigned long)uctx->shpg);
+
+	if (uctx->dpi.dbr) {
+		/* Free the DPI owned by this ucontext and mark the
+		 * context dpi as NULL
+		 */
+		rc = bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
+					    &rdev->qplib_res.dpi_tbl,
+					    &uctx->dpi);
+		if (rc)
+			dev_err(rdev_to_dev(rdev), "Deallocate HW DPI failed!");
+			/* Don't fail, continue*/
+		uctx->dpi.dbr = NULL;
+	}
+
 	kfree(uctx);
 	return 0;
 }
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 7135c78..381e4e9 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -58,7 +58,6 @@ struct bnxt_re_pd {
 	struct bnxt_re_dev	*rdev;
 	struct ib_pd		ib_pd;
 	struct bnxt_qplib_pd	qplib_pd;
-	struct bnxt_qplib_dpi	dpi;
 	struct bnxt_re_fence_data fence;
 };
 
@@ -125,7 +124,7 @@ struct bnxt_re_mw {
 struct bnxt_re_ucontext {
 	struct bnxt_re_dev	*rdev;
 	struct ib_ucontext	ib_uctx;
-	struct bnxt_qplib_dpi	*dpi;
+	struct bnxt_qplib_dpi	dpi;
 	void			*shpg;
 	spinlock_t		sh_lock;	/* protect shpg */
 };
-- 
2.5.5


* [PATCH for-next 06/14] RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Somnath Kotur, Selvin Xavier

From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

HW stalls out after 0x800000 WQEs are posted for UD QPs.
To work around this problem, the driver sends a modify_qp command
to the HW at around the halfway mark (0x400000) so that BONO
can modify the QP context in the HW to prevent this stall.
This workaround is needed for UD, QP1 and raw Ethertype
packets. Add a counter to keep track of WQEs posted during post_send.
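
Condensed, the workaround amounts to the following (the is_ud_like()
helper is illustrative; the real check and constants are in the hunks
below):

	#define BNXT_RE_UD_QP_HW_STALL	0x400000  /* half of 0x800000 */

	if (is_ud_like(qp) &&
	    qp->qplib_qp.wqe_cnt == BNXT_RE_UD_QP_HW_STALL) {
		struct ib_qp_attr qp_attr = { .qp_state = IB_QPS_RTS };

		bnxt_re_modify_qp(&qp->ib_qp, &qp_attr, IB_QP_STATE, NULL);
		qp->qplib_qp.wqe_cnt = 0;
	}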

Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  |  2 ++
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 18 ++++++++++++++++++
 drivers/infiniband/hw/bnxt_re/qplib_fp.c |  3 +++
 drivers/infiniband/hw/bnxt_re/qplib_fp.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index fad04b2..277c2da 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -56,6 +56,8 @@
 #define BNXT_RE_MAX_SRQC_COUNT		(64 * 1024)
 #define BNXT_RE_MAX_CQ_COUNT		(64 * 1024)
 
+#define BNXT_RE_UD_QP_HW_STALL		0x400000
+
 struct bnxt_re_work {
 	struct work_struct	work;
 	unsigned long		event;
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index eabc97e..61703f3 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -2048,6 +2048,22 @@ static int bnxt_re_copy_wr_payload(struct bnxt_re_dev *rdev,
 	return payload_sz;
 }
 
+static void bnxt_ud_qp_hw_stall_workaround(struct bnxt_re_qp *qp)
+{
+	if ((qp->ib_qp.qp_type == IB_QPT_UD ||
+	     qp->ib_qp.qp_type == IB_QPT_GSI ||
+	     qp->ib_qp.qp_type == IB_QPT_RAW_ETHERTYPE) &&
+	     qp->qplib_qp.wqe_cnt == BNXT_RE_UD_QP_HW_STALL) {
+		int qp_attr_mask;
+		struct ib_qp_attr qp_attr;
+
+		qp_attr_mask = IB_QP_STATE;
+		qp_attr.qp_state = IB_QPS_RTS;
+		bnxt_re_modify_qp(&qp->ib_qp, &qp_attr, qp_attr_mask, NULL);
+		qp->qplib_qp.wqe_cnt = 0;
+	}
+}
+
 static int bnxt_re_post_send_shadow_qp(struct bnxt_re_dev *rdev,
 				       struct bnxt_re_qp *qp,
 				struct ib_send_wr *wr)
@@ -2093,6 +2109,7 @@ static int bnxt_re_post_send_shadow_qp(struct bnxt_re_dev *rdev,
 		wr = wr->next;
 	}
 	bnxt_qplib_post_send_db(&qp->qplib_qp);
+	bnxt_ud_qp_hw_stall_workaround(qp);
 	spin_unlock_irqrestore(&qp->sq_lock, flags);
 	return rc;
 }
@@ -2189,6 +2206,7 @@ int bnxt_re_post_send(struct ib_qp *ib_qp, struct ib_send_wr *wr,
 		wr = wr->next;
 	}
 	bnxt_qplib_post_send_db(&qp->qplib_qp);
+	bnxt_ud_qp_hw_stall_workaround(qp);
 	spin_unlock_irqrestore(&qp->sq_lock, flags);
 
 	return rc;
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 66abec0..0716298 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -1298,6 +1298,9 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 	}
 
 	sq->hwq.prod++;
+
+	qp->wqe_cnt++;
+
 done:
 	return rc;
 }
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index 71539ea..36b7b7d 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -250,6 +250,7 @@ struct bnxt_qplib_qp {
 	u8				timeout;
 	u8				retry_cnt;
 	u8				rnr_retry;
+	u64				wqe_cnt;
 	u32				min_rnr_timer;
 	u32				max_rd_atomic;
 	u32				max_dest_rd_atomic;
-- 
2.5.5


* [PATCH for-next 07/14] RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing error
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Somnath Kotur, Selvin Xavier

From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Posting a WQE size of 2 results in a WQE_FORMAT_ERROR
thrown by the HW, as it requires the host to supply a WQE size with
room for at least one SGE so that the resulting WQE size is at least 3.
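
For reference, WQE sizes are counted in 16-byte units: the fixed RQE
header rounds up to two units and each SGE adds one, so a receive WQE
with zero SGEs computes to size 2 and trips the error. A sketch of the
fix (mirroring the hunk below):

	rqe->wqe_size = wqe->num_sge +
			((offsetof(typeof(*rqe), data) + 15) >> 4);
	if (!wqe->num_sge)	/* size 2 would hit WQE_FORMAT_ERROR */
		rqe->wqe_size++;	/* keep room for at least one SGE */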

Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/qplib_fp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 0716298..6a07e4b 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -1128,6 +1128,11 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 		}
 		/* Each SGE entry = 1 WQE size16 */
 		wqe_size16 = wqe->num_sge;
+		/* HW requires the wqe size to have room for at least one SGE
+		 * even if none was supplied by the ULP
+		 */
+		if (!wqe->num_sge)
+			wqe_size16++;
 	}
 
 	/* Specifics */
@@ -1364,6 +1369,11 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 	rqe->flags = wqe->flags;
 	rqe->wqe_size = wqe->num_sge +
 			((offsetof(typeof(*rqe), data) + 15) >> 4);
+	/* HW requires the wqe size to have room for at least one SGE even
+	 * if none was supplied by the ULP
+	 */
+	if (!wqe->num_sge)
+		rqe->wqe_size++;
 
 	/* Supply the rqe->wr_id index to the wr_id_tbl for now */
 	rqe->wr_id[0] = cpu_to_le32(sw_prod);
-- 
2.5.5


* [PATCH for-next 08/14] RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is configured
@ 2017-05-10 10:45 Selvin Xavier
From: Selvin Xavier @ 2017-05-10 10:45 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Kalesh AP, Selvin Xavier

From: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

The current implementation does not program VLAN header insertion
in RoCE packets if no VLAN is configured. Firmware does not add
priority when there is no VLAN tag in the packet. Modify the code
to insert a VLAN header when PFC is enabled on the interface.
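
Condensed, the add-GID decision becomes (as in the qplib_sp.c hunk
below):

	/* Tag whenever a VLAN is configured OR PFC priority is active */
	if (vlan_id != 0xFFFF || res->prio) {
		if (vlan_id != 0xFFFF)
			req.vlan = cpu_to_le16(vlan_id &
					       CMDQ_ADD_GID_VLAN_VLAN_ID_MASK);
		req.vlan |= cpu_to_le16(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
					CMDQ_ADD_GID_VLAN_VLAN_EN);
	}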

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/main.c      | 53 +++++++++++++++++++++++---
 drivers/infiniband/hw/bnxt_re/qplib_res.c | 10 +++++
 drivers/infiniband/hw/bnxt_re/qplib_res.h |  2 +
 drivers/infiniband/hw/bnxt_re/qplib_sp.c  | 63 ++++++++++++++++++++++++++++---
 drivers/infiniband/hw/bnxt_re/qplib_sp.h  |  2 +
 drivers/infiniband/hw/bnxt_re/roce_hsi.h  |  4 +-
 6 files changed, 121 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 99a54fd..fd61949 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -840,6 +840,42 @@ static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev)
 	mutex_unlock(&rdev->qp_lock);
 }
 
+static int bnxt_re_update_gid(struct bnxt_re_dev *rdev)
+{
+	struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl;
+	struct bnxt_qplib_gid gid;
+	u16 gid_idx, index;
+	int rc = 0;
+
+	if (!test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
+		return 0;
+
+	if (!sgid_tbl) {
+		dev_err(rdev_to_dev(rdev), "QPLIB: SGID table not allocated");
+		return -EINVAL;
+	}
+
+	for (index = 0; index < sgid_tbl->active; index++) {
+		gid_idx = sgid_tbl->hw_id[index];
+
+		if (!memcmp(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
+			    sizeof(bnxt_qplib_gid_zero)))
+			continue;
+		/* Only the VLAN enable setting of non-VLAN GIDs needs to be
+		 * modified, since for VLAN GIDs it is done while adding the GID
+		 */
+		if (sgid_tbl->vlan[index])
+			continue;
+
+		memcpy(&gid, &sgid_tbl->tbl[index], sizeof(gid));
+
+		rc = bnxt_qplib_update_sgid(sgid_tbl, &gid, gid_idx,
+					    rdev->qplib_res.netdev->dev_addr);
+	}
+
+	return rc;
+}
+
 static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
 {
 	u32 prio_map = 0, tmp_map = 0;
@@ -859,8 +895,6 @@ static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
 	tmp_map = dcb_ieee_getapp_mask(netdev, &app);
 	prio_map |= tmp_map;
 
-	if (!prio_map)
-		prio_map = -EFAULT;
 	return prio_map;
 }
 
@@ -886,10 +920,7 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
 	int rc;
 
 	/* Get priority for roce */
-	rc = bnxt_re_get_priority_mask(rdev);
-	if (rc < 0)
-		return rc;
-	prio_map = (u8)rc;
+	prio_map = bnxt_re_get_priority_mask(rdev);
 
 	if (prio_map == rdev->cur_prio_map)
 		return 0;
@@ -911,6 +942,16 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
 		return rc;
 	}
 
+	/* Actual priorities are not programmed, as that is already
+	 * done by the L2 driver; just enable or disable priority VLAN tagging
+	 */
+	if ((prio_map == 0 && rdev->qplib_res.prio) ||
+	    (prio_map != 0 && !rdev->qplib_res.prio)) {
+		rdev->qplib_res.prio = prio_map ? true : false;
+
+		bnxt_re_update_gid(rdev);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
index 62447b3..4e10170 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
@@ -468,9 +468,11 @@ static void bnxt_qplib_free_sgid_tbl(struct bnxt_qplib_res *res,
 	kfree(sgid_tbl->tbl);
 	kfree(sgid_tbl->hw_id);
 	kfree(sgid_tbl->ctx);
+	kfree(sgid_tbl->vlan);
 	sgid_tbl->tbl = NULL;
 	sgid_tbl->hw_id = NULL;
 	sgid_tbl->ctx = NULL;
+	sgid_tbl->vlan = NULL;
 	sgid_tbl->max = 0;
 	sgid_tbl->active = 0;
 }
@@ -491,8 +493,15 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res,
 	if (!sgid_tbl->ctx)
 		goto out_free2;
 
+	sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL);
+	if (!sgid_tbl->vlan)
+		goto out_free3;
+
 	sgid_tbl->max = max;
 	return 0;
+out_free3:
+	kfree(sgid_tbl->ctx);
+	sgid_tbl->ctx = NULL;
 out_free2:
 	kfree(sgid_tbl->hw_id);
 	sgid_tbl->hw_id = NULL;
@@ -514,6 +523,7 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res,
 	}
 	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max);
 	memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max);
+	memset(sgid_tbl->vlan, 0, sizeof(u8) * sgid_tbl->max);
 	sgid_tbl->active = 0;
 }
 
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index 2e48555..e872075 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -116,6 +116,7 @@ struct bnxt_qplib_sgid_tbl {
 	u16				max;
 	u16				active;
 	void				*ctx;
+	u8				*vlan;
 };
 
 struct bnxt_qplib_pkey_tbl {
@@ -188,6 +189,7 @@ struct bnxt_qplib_res {
 	struct bnxt_qplib_sgid_tbl	sgid_tbl;
 	struct bnxt_qplib_pkey_tbl	pkey_tbl;
 	struct bnxt_qplib_dpi_tbl	dpi_tbl;
+	bool				prio;
 };
 
 #define to_bnxt_qplib(ptr, type, member)	\
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index fde18cf..3744d33 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -197,6 +197,7 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	}
 	memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
 	       sizeof(bnxt_qplib_gid_zero));
+	sgid_tbl->vlan[index] = 0;
 	sgid_tbl->active--;
 	dev_dbg(&res->pdev->dev,
 		"QPLIB: SGID deleted hw_id[0x%x] = 0x%x active = 0x%x",
@@ -260,11 +261,19 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 		req.gid[1] = cpu_to_be32(temp32[2]);
 		req.gid[2] = cpu_to_be32(temp32[1]);
 		req.gid[3] = cpu_to_be32(temp32[0]);
-		if (vlan_id != 0xFFFF)
-			req.vlan = cpu_to_le16((vlan_id &
-					CMDQ_ADD_GID_VLAN_VLAN_ID_MASK) |
-					CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
-					CMDQ_ADD_GID_VLAN_VLAN_EN);
+		/*
+		 * driver should ensure that all RoCE traffic is always VLAN
+		 * tagged if RoCE traffic is running on non-zero VLAN ID or
+		 * RoCE traffic is running on non-zero Priority.
+		 */
+		if ((vlan_id != 0xFFFF) || res->prio) {
+			if (vlan_id != 0xFFFF)
+				req.vlan = cpu_to_le16
+				(vlan_id & CMDQ_ADD_GID_VLAN_VLAN_ID_MASK);
+			req.vlan |= cpu_to_le16
+					(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
+					 CMDQ_ADD_GID_VLAN_VLAN_EN);
+		}
 
 		/* MAC in network format */
 		memcpy(temp16, smac, 6);
@@ -281,6 +290,9 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	/* Add GID to the sgid_tbl */
 	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
 	sgid_tbl->active++;
+	if (vlan_id != 0xFFFF)
+		sgid_tbl->vlan[free_idx] = 1;
+
 	dev_dbg(&res->pdev->dev,
 		"QPLIB: SGID added hw_id[0x%x] = 0x%x active = 0x%x",
 		 free_idx, sgid_tbl->hw_id[free_idx], sgid_tbl->active);
@@ -290,6 +302,47 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	return 0;
 }
 
+int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
+			   struct bnxt_qplib_gid *gid, u16 gid_idx,
+			   u8 *smac)
+{
+	struct bnxt_qplib_res *res = to_bnxt_qplib(sgid_tbl,
+						   struct bnxt_qplib_res,
+						   sgid_tbl);
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct creq_modify_gid_resp resp;
+	struct cmdq_modify_gid req;
+	int rc;
+	u16 cmd_flags = 0;
+	u32 temp32[4];
+	u16 temp16[3];
+
+	RCFW_CMD_PREP(req, MODIFY_GID, cmd_flags);
+
+	memcpy(temp32, gid->data, sizeof(struct bnxt_qplib_gid));
+	req.gid[0] = cpu_to_be32(temp32[3]);
+	req.gid[1] = cpu_to_be32(temp32[2]);
+	req.gid[2] = cpu_to_be32(temp32[1]);
+	req.gid[3] = cpu_to_be32(temp32[0]);
+	if (res->prio) {
+		req.vlan |= cpu_to_le16
+			(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
+			 CMDQ_ADD_GID_VLAN_VLAN_EN);
+	}
+
+	/* MAC in network format */
+	memcpy(temp16, smac, 6);
+	req.src_mac[0] = cpu_to_be16(temp16[0]);
+	req.src_mac[1] = cpu_to_be16(temp16[1]);
+	req.src_mac[2] = cpu_to_be16(temp16[2]);
+
+	req.gid_index = cpu_to_le16(gid_idx);
+
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	return rc;
+}
+
 /* pkeys */
 int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
 			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index a543f95..e3a3ed9 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -132,6 +132,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 			struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id,
 			bool update, u32 *index);
+int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
+			   struct bnxt_qplib_gid *gid, u16 gid_idx, u8 *smac);
 int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
 			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
 			u16 *pkey);
diff --git a/drivers/infiniband/hw/bnxt_re/roce_hsi.h b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
index fc23477..eeb55b2 100644
--- a/drivers/infiniband/hw/bnxt_re/roce_hsi.h
+++ b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
@@ -1473,8 +1473,8 @@ struct cmdq_modify_gid {
 	u8 resp_size;
 	u8 reserved8;
 	__le64 resp_addr;
-	__le32 gid[4];
-	__le16 src_mac[3];
+	__be32 gid[4];
+	__be16 src_mac[3];
 	__le16 vlan;
 	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_MASK		    0xfffUL
 	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_SFT		    0
-- 
2.5.5


* [PATCH for-next 09/14] RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Selvin Xavier, Kalesh AP

Do not free the context table entry if delete GID fails. The stack may
call add_gid again for the same context, and dereferencing the freed
entry would result in a panic.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 61703f3..525f4b0 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -375,15 +375,17 @@ int bnxt_re_del_gid(struct ib_device *ibdev, u8 port_num,
 			return -EINVAL;
 		ctx->refcnt--;
 		if (!ctx->refcnt) {
-			rc = bnxt_qplib_del_sgid
-					(sgid_tbl,
-					 &sgid_tbl->tbl[ctx->idx], true);
-			if (rc)
+			rc = bnxt_qplib_del_sgid(sgid_tbl,
+						 &sgid_tbl->tbl[ctx->idx],
+						 true);
+			if (rc) {
 				dev_err(rdev_to_dev(rdev),
 					"Failed to remove GID: %#x", rc);
-			ctx_tbl = sgid_tbl->ctx;
-			ctx_tbl[ctx->idx] = NULL;
-			kfree(ctx);
+			} else {
+				ctx_tbl = sgid_tbl->ctx;
+				ctx_tbl[ctx->idx] = NULL;
+				kfree(ctx);
+			}
 		}
 	} else {
 		return -EINVAL;
-- 
2.5.5


* [PATCH for-next 10/14] RDMA/bnxt_re: Fix RQE posting logic
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Devesh Sharma, Kalesh AP, Selvin Xavier

From: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

This patch adds code to ring the RQ doorbell aggressively so that the
adapter can DMA the RQ buffers sooner, instead of DMAing all the WQEs
in the post_recv WR list together at the end of the post_recv verb.
Also use a spinlock to serialize RQ posting.
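
For illustration, a consumer-side sketch (only ib_post_recv() and the
ib_recv_wr/ib_sge types are real; the buffers, DMA addresses and MR are
assumed to exist): posting a 64-WR chain now rings the RQ doorbell
inside the verb at WQE 32 and again at WQE 64, instead of once at the
very end.

	/* Sketch: post 64 chained receive WRs in a single verb call. */
	struct ib_recv_wr wr[64], *bad_wr;
	struct ib_sge sge[64];
	int i, rc;

	for (i = 0; i < 64; i++) {
		sge[i].addr   = buf_dma_addr[i];	/* assumed pre-mapped */
		sge[i].length = buf_len;		/* assumed constant */
		sge[i].lkey   = mr->lkey;		/* assumed valid MR */
		wr[i].wr_id   = i;
		wr[i].sg_list = &sge[i];
		wr[i].num_sge = 1;
		wr[i].next    = (i == 63) ? NULL : &wr[i + 1];
	}
	rc = ib_post_recv(qp, wr, &bad_wr);	/* DB rung at WQE 32 and 64 */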

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Devesh Sharma <devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  |  2 ++
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 18 +++++++++++++++++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h |  1 +
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 277c2da..12950ec 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -58,6 +58,8 @@
 
 #define BNXT_RE_UD_QP_HW_STALL		0x400000
 
+#define BNXT_RE_RQ_WQE_THRESHOLD	32
+
 struct bnxt_re_work {
 	struct work_struct	work;
 	unsigned long		event;
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 525f4b0..5a8b17e 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -1225,6 +1225,7 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
 
 	qp->ib_qp.qp_num = qp->qplib_qp.id;
 	spin_lock_init(&qp->sq_lock);
+	spin_lock_init(&qp->rq_lock);
 
 	if (udata) {
 		struct bnxt_re_qp_resp resp;
@@ -2256,7 +2257,10 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr,
 	struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
 	struct bnxt_qplib_swqe wqe;
 	int rc = 0, payload_sz = 0;
+	unsigned long flags;
+	u32 count = 0;
 
+	spin_lock_irqsave(&qp->rq_lock, flags);
 	while (wr) {
 		/* House keeping */
 		memset(&wqe, 0, sizeof(wqe));
@@ -2285,9 +2289,21 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr,
 			*bad_wr = wr;
 			break;
 		}
+
+		/* Ring the DB once the number of posted RQEs reaches the threshold */
+		if (++count >= BNXT_RE_RQ_WQE_THRESHOLD) {
+			bnxt_qplib_post_recv_db(&qp->qplib_qp);
+			count = 0;
+		}
+
 		wr = wr->next;
 	}
-	bnxt_qplib_post_recv_db(&qp->qplib_qp);
+
+	if (count)
+		bnxt_qplib_post_recv_db(&qp->qplib_qp);
+
+	spin_unlock_irqrestore(&qp->rq_lock, flags);
+
 	return rc;
 }
 
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 381e4e9..93539e3 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -72,6 +72,7 @@ struct bnxt_re_qp {
 	struct bnxt_re_dev	*rdev;
 	struct ib_qp		ib_qp;
 	spinlock_t		sq_lock;	/* protect sq */
+	spinlock_t		rq_lock;	/* protect rq */
 	struct bnxt_qplib_qp	qplib_qp;
 	struct ib_umem		*sumem;
 	struct ib_umem		*rumem;
-- 
2.5.5


* [PATCH for-next 11/14] RDMA/bnxt_re: Report supported value to IB stack in query_device
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Selvin Xavier

 - Report the supported value for max_mr_size to the IB stack in
   query_device. Also, check and log an error if the MR size requested
   by an application in reg_user_mr() is greater than the value
   currently supported by the driver (see the sketch after this list).
 - Report only 4K page size support for now.
 - Fix the max_qp value returned by ibv_devinfo -vv.
   In the case of the PF, FW reserves 129 QPs for creating QP1s for the
   VFs and the PF. So the max_qp value reported by FW for the PF doesn't
   include the QP1. Fix this issue by adding 1 to the value reported
   by Bono.
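
For illustration, a hedged userspace probe of the new max_mr_size cap
(assumes <infiniband/verbs.h> plus stdio/errno headers, a valid
struct ibv_pd *pd and an already allocated 2 GB buffer buf;
ibv_reg_mr() is the real verbs call):

	/* Sketch: with this patch the kernel rejects the registration up
	 * front with ENOMEM instead of accepting it and failing later.
	 */
	size_t len = 2ULL << 30;	/* 2 GB, above the 1 GB cap */
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);

	if (!mr)
		fprintf(stderr, "reg_user_mr rejected: %s\n", strerror(errno));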

Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  |  2 ++
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 ++++++++----
 drivers/infiniband/hw/bnxt_re/qplib_sp.c |  2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 12950ec..30bff05 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -51,6 +51,8 @@
 #define BNXT_RE_PAGE_SIZE_8M		BIT(23)
 #define BNXT_RE_PAGE_SIZE_1G		BIT(30)
 
+#define BNXT_RE_MAX_MR_SIZE		BIT(30)
+
 #define BNXT_RE_MAX_QPC_COUNT		(64 * 1024)
 #define BNXT_RE_MAX_MRW_COUNT		(64 * 1024)
 #define BNXT_RE_MAX_SRQC_COUNT		(64 * 1024)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 5a8b17e..34aa09e 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -145,10 +145,8 @@ int bnxt_re_query_device(struct ib_device *ibdev,
 	ib_attr->fw_ver = (u64)(unsigned long)(dev_attr->fw_ver);
 	bnxt_qplib_get_guid(rdev->netdev->dev_addr,
 			    (u8 *)&ib_attr->sys_image_guid);
-	ib_attr->max_mr_size = ~0ull;
-	ib_attr->page_size_cap = BNXT_RE_PAGE_SIZE_4K | BNXT_RE_PAGE_SIZE_8K |
-				 BNXT_RE_PAGE_SIZE_64K | BNXT_RE_PAGE_SIZE_2M |
-				 BNXT_RE_PAGE_SIZE_8M | BNXT_RE_PAGE_SIZE_1G;
+	ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE;
+	ib_attr->page_size_cap = BNXT_RE_PAGE_SIZE_4K;
 
 	ib_attr->vendor_id = rdev->en_dev->pdev->vendor;
 	ib_attr->vendor_part_id = rdev->en_dev->pdev->device;
@@ -3314,6 +3312,12 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
 	struct scatterlist *sg;
 	int entry;
 
+	if (length > BNXT_RE_MAX_MR_SIZE) {
+		dev_err(rdev_to_dev(rdev), "MR Size: %lld > Max supported:%ld\n",
+			length, BNXT_RE_MAX_MR_SIZE);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
 	if (!mr)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 3744d33..168e0cc 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -81,6 +81,8 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
 
 	/* Extract the context from the side buffer */
 	attr->max_qp = le32_to_cpu(sb->max_qp);
+	/* max_qp value reported by FW for PF doesn't include the QP1 for PF */
+	attr->max_qp += 1;
 	attr->max_qp_rd_atom =
 		sb->max_qp_rd_atom > BNXT_QPLIB_MAX_OUT_RD_ATOM ?
 		BNXT_QPLIB_MAX_OUT_RD_ATOM : sb->max_qp_rd_atom;
-- 
2.5.5


* [PATCH for-next 12/14] RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and destination QP
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Eddie Wai, Selvin Xavier

From: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

There are a couple of bugs in the support of max_rd_atomic and
max_dest_rd_atomic. In modify_qp, if the requested max_rd_atomic, which
is the ORRQ size, is greater than what the chip can support, we have to
cap the request to the chip maximum, as the HW cannot be allowed to
overflow the ORRQ. Capping the max_rd_atomic internally is okay to do,
as the remaining read/atomic WRs will still be sitting in the SQ.
However, for max_dest_rd_atomic, the driver has to error out, as this
dictates the IRRQ size and we cannot control what the remote side
sends.
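
The resulting semantics, as a hedged consumer-side sketch (the values
are illustrative; the attributes and mask bits are the standard verbs
ones):

	/* Sketch: the initiator depth is silently capped, the responder
	 * depth is policed, because IRRQ sizing cannot throttle what the
	 * remote side sends. Assumes an RC QP in a suitable state.
	 */
	struct ib_qp_attr qp_attr = {
		.max_rd_atomic      = 255,	/* capped to device max, no error */
		.max_dest_rd_atomic = 255,	/* -EINVAL if above device max */
	};
	int rc = ib_modify_qp(qp, &qp_attr,
			      IB_QP_MAX_QP_RD_ATOMIC | IB_QP_MAX_DEST_RD_ATOMIC);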

Signed-off-by: Eddie Wai <eddie.wai-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 34aa09e..f770737 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -172,7 +172,7 @@ int bnxt_re_query_device(struct ib_device *ibdev,
 	ib_attr->max_mr = dev_attr->max_mr;
 	ib_attr->max_pd = dev_attr->max_pd;
 	ib_attr->max_qp_rd_atom = dev_attr->max_qp_rd_atom;
-	ib_attr->max_qp_init_rd_atom = dev_attr->max_qp_rd_atom;
+	ib_attr->max_qp_init_rd_atom = dev_attr->max_qp_init_rd_atom;
 	ib_attr->atomic_cap = IB_ATOMIC_HCA;
 	ib_attr->masked_atomic_cap = IB_ATOMIC_HCA;
 
@@ -1503,13 +1503,24 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
 	if (qp_attr_mask & IB_QP_MAX_QP_RD_ATOMIC) {
 		qp->qplib_qp.modify_flags |=
 				CMDQ_MODIFY_QP_MODIFY_MASK_MAX_RD_ATOMIC;
-		qp->qplib_qp.max_rd_atomic = qp_attr->max_rd_atomic;
+		/* Cap the max_rd_atomic to device max */
+		qp->qplib_qp.max_rd_atomic = min_t(u32, qp_attr->max_rd_atomic,
+						   dev_attr->max_qp_rd_atom);
 	}
 	if (qp_attr_mask & IB_QP_SQ_PSN) {
 		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_SQ_PSN;
 		qp->qplib_qp.sq.psn = qp_attr->sq_psn;
 	}
 	if (qp_attr_mask & IB_QP_MAX_DEST_RD_ATOMIC) {
+		if (qp_attr->max_dest_rd_atomic >
+		    dev_attr->max_qp_init_rd_atom) {
+			dev_err(rdev_to_dev(rdev),
+				"max_dest_rd_atomic requested%d is > dev_max%d",
+				qp_attr->max_dest_rd_atomic,
+				dev_attr->max_qp_init_rd_atom);
+			return -EINVAL;
+		}
+
 		qp->qplib_qp.modify_flags |=
 				CMDQ_MODIFY_QP_MODIFY_MASK_MAX_DEST_RD_ATOMIC;
 		qp->qplib_qp.max_dest_rd_atomic = qp_attr->max_dest_rd_atomic;
-- 
2.5.5


* [PATCH for-next 13/14] RDMA/bnxt_re: Specify RDMA component when allocating stats context
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Somnath Kotur, Selvin Xavier

From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Starting with FW version 20.6.47, firmware keeps separate statistics
for L2 and RDMA. The driver therefore needs to specify whether the
stats context is for RDMA when allocating it.

Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index fd61949..00dab7c 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -335,6 +335,7 @@ static int bnxt_re_net_stats_ctx_alloc(struct bnxt_re_dev *rdev,
 	bnxt_re_init_hwrm_hdr(rdev, (void *)&req, HWRM_STAT_CTX_ALLOC, -1, -1);
 	req.update_period_ms = cpu_to_le32(1000);
 	req.stats_dma_addr = cpu_to_le64(dma_map);
+	req.stat_ctx_flags = STAT_CTX_ALLOC_REQ_STAT_CTX_FLAGS_ROCE;
 	bnxt_re_fill_fw_msg(&fw_msg, (void *)&req, sizeof(req), (void *)&resp,
 			    sizeof(resp), DFLT_HWRM_CMD_TIMEOUT);
 	rc = en_dev->en_ops->bnxt_send_fw_msg(en_dev, BNXT_ROCE_ULP, &fw_msg);
-- 
2.5.5


* [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Selvin Xavier @ 2017-05-10 10:45 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Selvin Xavier

Update the driver version to the scheme used in the internal
development branches.

Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 30bff05..9041c23b 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -40,7 +40,7 @@
 #ifndef __BNXT_RE_H__
 #define __BNXT_RE_H__
 #define ROCE_DRV_MODULE_NAME		"bnxt_re"
-#define ROCE_DRV_MODULE_VERSION		"1.0.0"
+#define ROCE_DRV_MODULE_VERSION		"20.6.1.0u"
 
 #define BNXT_RE_DESC	"Broadcom NetXtreme-C/E RoCE Driver"
 
-- 
2.5.5


* Re: [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Dennis Dalessandro @ 2017-05-10 15:06 UTC
  To: Selvin Xavier, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 5/10/2017 6:45 AM, Selvin Xavier wrote:
> Update the driver version to the scheme used in the internal
> development branches.

See the feedback [1] we got from Greg KH on a similar patch.

[1] https://patchwork.kernel.org/patch/7490321/

> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> index 30bff05..9041c23b 100644
> --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> @@ -40,7 +40,7 @@
>  #ifndef __BNXT_RE_H__
>  #define __BNXT_RE_H__
>  #define ROCE_DRV_MODULE_NAME		"bnxt_re"
> -#define ROCE_DRV_MODULE_VERSION		"1.0.0"
> +#define ROCE_DRV_MODULE_VERSION		"20.6.1.0u"
>
>  #define BNXT_RE_DESC	"Broadcom NetXtreme-C/E RoCE Driver"
>
>

* Re: [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Leon Romanovsky @ 2017-05-10 15:54 UTC
  To: Dennis Dalessandro
  Cc: Selvin Xavier, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Saeed Mahameed

On Wed, May 10, 2017 at 11:06:31AM -0400, Dennis Dalessandro wrote:
> On 5/10/2017 6:45 AM, Selvin Xavier wrote:
> > Update the driver version to the scheme used in the internal
> > development branches.
>
> See the feedback [1] we got from Greg KH on a similar patch.
>
> [1] https://patchwork.kernel.org/patch/7490321/

Yeah, things get worse when this driver is ported to an old kernel.
These numbers are entirely useless.

>
> > Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> > ---
> >  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > index 30bff05..9041c23b 100644
> > --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > @@ -40,7 +40,7 @@
> >  #ifndef __BNXT_RE_H__
> >  #define __BNXT_RE_H__
> >  #define ROCE_DRV_MODULE_NAME		"bnxt_re"
> > -#define ROCE_DRV_MODULE_VERSION		"1.0.0"
> > +#define ROCE_DRV_MODULE_VERSION		"20.6.1.0u"
> >
> >  #define BNXT_RE_DESC	"Broadcom NetXtreme-C/E RoCE Driver"
> >
> >

* Re: [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Selvin Xavier @ 2017-05-11  5:22 UTC
  To: Leon Romanovsky
  Cc: Dennis Dalessandro, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Saeed Mahameed

Thanks Leon and Dennis for your comments.

The main intention of this patch is to make it easy to map the upstream
driver to the level of supported features in the internal code base. I
understand the point you and others are making about porting to older
kernels and taking bug fixes/feature additions from other contributors.
But can we consider it just a number that enables vendors to track
their base branch?

If you still feel that this patch doesn't make sense, please let me
know and I will post a v2 after dropping the change.

Thanks,
Selvin

On Wed, May 10, 2017 at 9:24 PM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, May 10, 2017 at 11:06:31AM -0400, Dennis Dalessandro wrote:
>> On 5/10/2017 6:45 AM, Selvin Xavier wrote:
>> > Update the driver version to the scheme used in the internal
>> > development branches.
>>
>> See the feedback [1] we got from Greg KH on a similar patch.
>>
>> [1] https://patchwork.kernel.org/patch/7490321/
>
> Yeah, things get worse when this driver is ported to an old kernel.
> These numbers are entirely useless.
>
>>
>> > Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> > ---
>> >  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> > index 30bff05..9041c23b 100644
>> > --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> > +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> > @@ -40,7 +40,7 @@
>> >  #ifndef __BNXT_RE_H__
>> >  #define __BNXT_RE_H__
>> >  #define ROCE_DRV_MODULE_NAME               "bnxt_re"
>> > -#define ROCE_DRV_MODULE_VERSION            "1.0.0"
>> > +#define ROCE_DRV_MODULE_VERSION            "20.6.1.0u"
>> >
>> >  #define BNXT_RE_DESC       "Broadcom NetXtreme-C/E RoCE Driver"
>> >
>> >

* Re: [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Leon Romanovsky @ 2017-05-11  8:27 UTC
  To: Selvin Xavier
  Cc: Dennis Dalessandro, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Saeed Mahameed

On Thu, May 11, 2017 at 10:52:31AM +0530, Selvin Xavier wrote:
> Thanks Leon and Dennis for your comments.
>
> The main intention of this patch is to make it easy to map the upstream
> driver to the level of supported features in the internal code base. I
> understand the point you and others are making about porting to older
> kernels and taking bug fixes/feature additions from other contributors.
> But can we consider it just a number that enables vendors to track
> their base branch?

It is up to you; I just wanted to warn you that in the long run it won't
give you any information. We are in a similar situation and decided to
change versions only in really rare situations, like the addition of a
new driver. In other cases, it is useless.

>
> If you still feel that this patch doesn't make sense, please let me
> know. I will post a v2 after dropping the change.
>
> Thanks,
> Selvin
>
> On Wed, May 10, 2017 at 9:24 PM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > On Wed, May 10, 2017 at 11:06:31AM -0400, Dennis Dalessandro wrote:
> >> On 5/10/2017 6:45 AM, Selvin Xavier wrote:
> >> > Update the driver version to the scheme used in the internal
> >> > development branches.
> >>
> >> See the feedback [1] we got from Greg KH on a similar patch.
> >>
> >> [1] https://patchwork.kernel.org/patch/7490321/
> >
> > Yeah, things get worse when this driver is ported to an old kernel.
> > These numbers are entirely useless.
> >
> >>
> >> > Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> > ---
> >> >  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> > index 30bff05..9041c23b 100644
> >> > --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> > +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> > @@ -40,7 +40,7 @@
> >> >  #ifndef __BNXT_RE_H__
> >> >  #define __BNXT_RE_H__
> >> >  #define ROCE_DRV_MODULE_NAME               "bnxt_re"
> >> > -#define ROCE_DRV_MODULE_VERSION            "1.0.0"
> >> > +#define ROCE_DRV_MODULE_VERSION            "20.6.1.0u"
> >> >
> >> >  #define BNXT_RE_DESC       "Broadcom NetXtreme-C/E RoCE Driver"
> >> >
> >> >

* Re: [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version
From: Dennis Dalessandro @ 2017-05-11 13:55 UTC
  To: Leon Romanovsky, Selvin Xavier
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Saeed Mahameed

On 5/11/2017 4:27 AM, Leon Romanovsky wrote:
> On Thu, May 11, 2017 at 10:52:31AM +0530, Selvin Xavier wrote:
>> Thanks Leon and Dennis for your comments.
>>
>> The main intention of this patch is to make it easy to map the upstream
>> driver to the level of supported features in the internal code base. I
>> understand the point you and others are making about porting to older
>> kernels and taking bug fixes/feature additions from other contributors.
>> But can we consider it just a number that enables vendors to track
>> their base branch?
>
> It is up to you; I just wanted to warn you that in the long run it won't
> give you any information. We are in a similar situation and decided to
> change versions only in really rare situations, like the addition of a
> new driver. In other cases, it is useless.

Personally it doesn't bother me one bit. I mean, hey, we did it in our 
driver. But it was NAKed by the maintainer of the staging tree when our 
driver was there, so I just wanted to point out that some in the 
community are pretty strongly opposed to it.

-Denny

* Re: [PATCH for-next 09/14] RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
From: Leon Romanovsky @ 2017-05-14  6:10 UTC
  To: Selvin Xavier
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Kalesh AP

On Wed, May 10, 2017 at 03:45:34AM -0700, Selvin Xavier wrote:
> Do not free the context table entry if delete GID fails. The stack may
> call add_gid again for the same context, and dereferencing the freed
> entry would result in a panic.

"may call" ??? Will you leave memory leak if this "may call" won't happen?


>
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> index 61703f3..525f4b0 100644
> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> @@ -375,15 +375,17 @@ int bnxt_re_del_gid(struct ib_device *ibdev, u8 port_num,
>  			return -EINVAL;
>  		ctx->refcnt--;
>  		if (!ctx->refcnt) {
> -			rc = bnxt_qplib_del_sgid
> -					(sgid_tbl,
> -					 &sgid_tbl->tbl[ctx->idx], true);
> -			if (rc)
> +			rc = bnxt_qplib_del_sgid(sgid_tbl,
> +						 &sgid_tbl->tbl[ctx->idx],
> +						 true);
> +			if (rc) {
>  				dev_err(rdev_to_dev(rdev),
>  					"Failed to remove GID: %#x", rc);
> -			ctx_tbl = sgid_tbl->ctx;
> -			ctx_tbl[ctx->idx] = NULL;
> -			kfree(ctx);
> +			} else {
> +				ctx_tbl = sgid_tbl->ctx;
> +				ctx_tbl[ctx->idx] = NULL;
> +				kfree(ctx);
> +			}
>  		}
>  	} else {
>  		return -EINVAL;
> --
> 2.5.5
>

* Re: [PATCH for-next 08/14] RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is configured
From: Leon Romanovsky @ 2017-05-14  6:23 UTC
  To: Selvin Xavier
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Kalesh AP

On Wed, May 10, 2017 at 03:45:33AM -0700, Selvin Xavier wrote:
> From: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>
> The current implementation does not program VLAN header insertion
> in the RoCE packet if no VLAN is configured. Firmware does not add
> the priority when there is no VLAN tag in the packet. Modify the code
> to insert a VLAN header when PFC is enabled on the interface.
>
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/bnxt_re/main.c      | 53 +++++++++++++++++++++++---
>  drivers/infiniband/hw/bnxt_re/qplib_res.c | 10 +++++
>  drivers/infiniband/hw/bnxt_re/qplib_res.h |  2 +
>  drivers/infiniband/hw/bnxt_re/qplib_sp.c  | 63 ++++++++++++++++++++++++++++---
>  drivers/infiniband/hw/bnxt_re/qplib_sp.h  |  2 +
>  drivers/infiniband/hw/bnxt_re/roce_hsi.h  |  4 +-
>  6 files changed, 121 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
> index 99a54fd..fd61949 100644
> --- a/drivers/infiniband/hw/bnxt_re/main.c
> +++ b/drivers/infiniband/hw/bnxt_re/main.c
> @@ -840,6 +840,42 @@ static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev)
>  	mutex_unlock(&rdev->qp_lock);
>  }
>
> +static int bnxt_re_update_gid(struct bnxt_re_dev *rdev)
> +{
> +	struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl;
> +	struct bnxt_qplib_gid gid;
> +	u16 gid_idx, index;
> +	int rc = 0;
> +
> +	if (!test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
> +		return 0;
> +
> +	if (!sgid_tbl) {
> +		dev_err(rdev_to_dev(rdev), "QPLIB: SGID table not allocated");
> +		return -EINVAL;
> +	}
> +
> +	for (index = 0; index < sgid_tbl->active; index++) {
> +		gid_idx = sgid_tbl->hw_id[index];
> +
> +		if (!memcmp(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
> +			    sizeof(bnxt_qplib_gid_zero)))
> +			continue;
> +		/* need to modify the VLAN enable setting of non VLAN GID only
> +		 * as setting is done for VLAN GID while adding GID
> +		 */
> +		if (sgid_tbl->vlan[index])
> +			continue;
> +
> +		memcpy(&gid, &sgid_tbl->tbl[index], sizeof(gid));
> +
> +		rc = bnxt_qplib_update_sgid(sgid_tbl, &gid, gid_idx,
> +					    rdev->qplib_res.netdev->dev_addr);
> +	}
> +
> +	return rc;
> +}
> +
>  static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
>  {
>  	u32 prio_map = 0, tmp_map = 0;
> @@ -859,8 +895,6 @@ static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
>  	tmp_map = dcb_ieee_getapp_mask(netdev, &app);
>  	prio_map |= tmp_map;
>
> -	if (!prio_map)
> -		prio_map = -EFAULT;
>  	return prio_map;
>  }
>
> @@ -886,10 +920,7 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
>  	int rc;
>
>  	/* Get priority for roce */
> -	rc = bnxt_re_get_priority_mask(rdev);
> -	if (rc < 0)
> -		return rc;
> -	prio_map = (u8)rc;
> +	prio_map = bnxt_re_get_priority_mask(rdev);
>
>  	if (prio_map == rdev->cur_prio_map)
>  		return 0;
> @@ -911,6 +942,16 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
>  		return rc;
>  	}
>
> +	/* Actual priorities are not programmed as they are already
> +	 * done by L2 driver; just enable or disable priority vlan tagging
> +	 */
> +	if ((prio_map == 0 && rdev->qplib_res.prio) ||
> +	    (prio_map != 0 && !rdev->qplib_res.prio)) {
> +		rdev->qplib_res.prio = prio_map ? true : false;
> +
> +		bnxt_re_update_gid(rdev);
> +	}
> +
>  	return 0;
>  }
>
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> index 62447b3..4e10170 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> @@ -468,9 +468,11 @@ static void bnxt_qplib_free_sgid_tbl(struct bnxt_qplib_res *res,
>  	kfree(sgid_tbl->tbl);
>  	kfree(sgid_tbl->hw_id);
>  	kfree(sgid_tbl->ctx);
> +	kfree(sgid_tbl->vlan);
>  	sgid_tbl->tbl = NULL;
>  	sgid_tbl->hw_id = NULL;
>  	sgid_tbl->ctx = NULL;
> +	sgid_tbl->vlan = NULL;
>  	sgid_tbl->max = 0;
>  	sgid_tbl->active = 0;
>  }
> @@ -491,8 +493,15 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res,
>  	if (!sgid_tbl->ctx)
>  		goto out_free2;
>
> +	sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL);
> +	if (!sgid_tbl->vlan)
> +		goto out_free3;
> +
>  	sgid_tbl->max = max;
>  	return 0;
> +out_free3:
> +	kfree(sgid_tbl->ctx);
> +	sgid_tbl->ctx = NULL;
>  out_free2:
>  	kfree(sgid_tbl->hw_id);
>  	sgid_tbl->hw_id = NULL;
> @@ -514,6 +523,7 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res,
>  	}
>  	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max);
>  	memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max);
> +	memset(sgid_tbl->vlan, 0, sizeof(u8) * sgid_tbl->max);
>  	sgid_tbl->active = 0;
>  }
>
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
> index 2e48555..e872075 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
> @@ -116,6 +116,7 @@ struct bnxt_qplib_sgid_tbl {
>  	u16				max;
>  	u16				active;
>  	void				*ctx;
> +	u8				*vlan;
>  };
>
>  struct bnxt_qplib_pkey_tbl {
> @@ -188,6 +189,7 @@ struct bnxt_qplib_res {
>  	struct bnxt_qplib_sgid_tbl	sgid_tbl;
>  	struct bnxt_qplib_pkey_tbl	pkey_tbl;
>  	struct bnxt_qplib_dpi_tbl	dpi_tbl;
> +	bool				prio;
>  };
>
>  #define to_bnxt_qplib(ptr, type, member)	\
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> index fde18cf..3744d33 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> @@ -197,6 +197,7 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  	}
>  	memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
>  	       sizeof(bnxt_qplib_gid_zero));
> +	sgid_tbl->vlan[index] = 0;
>  	sgid_tbl->active--;
>  	dev_dbg(&res->pdev->dev,
>  		"QPLIB: SGID deleted hw_id[0x%x] = 0x%x active = 0x%x",
> @@ -260,11 +261,19 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  		req.gid[1] = cpu_to_be32(temp32[2]);
>  		req.gid[2] = cpu_to_be32(temp32[1]);
>  		req.gid[3] = cpu_to_be32(temp32[0]);
> -		if (vlan_id != 0xFFFF)
> -			req.vlan = cpu_to_le16((vlan_id &
> -					CMDQ_ADD_GID_VLAN_VLAN_ID_MASK) |
> -					CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
> -					CMDQ_ADD_GID_VLAN_VLAN_EN);
> +		/*
> +		 * driver should ensure that all RoCE traffic is always VLAN
> +		 * tagged if RoCE traffic is running on non-zero VLAN ID or
> +		 * RoCE traffic is running on non-zero Priority.
> +		 */
> +		if ((vlan_id != 0xFFFF) || res->prio) {
> +			if (vlan_id != 0xFFFF)
> +				req.vlan = cpu_to_le16
> +				(vlan_id & CMDQ_ADD_GID_VLAN_VLAN_ID_MASK);
> +			req.vlan |= cpu_to_le16
> +					(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
> +					 CMDQ_ADD_GID_VLAN_VLAN_EN);
> +		}
>
>  		/* MAC in network format */
>  		memcpy(temp16, smac, 6);
> @@ -281,6 +290,9 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  	/* Add GID to the sgid_tbl */
>  	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
>  	sgid_tbl->active++;
> +	if (vlan_id != 0xFFFF)
> +		sgid_tbl->vlan[free_idx] = 1;
> +
>  	dev_dbg(&res->pdev->dev,
>  		"QPLIB: SGID added hw_id[0x%x] = 0x%x active = 0x%x",
>  		 free_idx, sgid_tbl->hw_id[free_idx], sgid_tbl->active);
> @@ -290,6 +302,47 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  	return 0;
>  }
>
> +int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
> +			   struct bnxt_qplib_gid *gid, u16 gid_idx,
> +			   u8 *smac)
> +{
> +	struct bnxt_qplib_res *res = to_bnxt_qplib(sgid_tbl,
> +						   struct bnxt_qplib_res,
> +						   sgid_tbl);
> +	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
> +	struct creq_modify_gid_resp resp;
> +	struct cmdq_modify_gid req;
> +	int rc;
> +	u16 cmd_flags = 0;
> +	u32 temp32[4];
> +	u16 temp16[3];
> +
> +	RCFW_CMD_PREP(req, MODIFY_GID, cmd_flags);
> +
> +	memcpy(temp32, gid->data, sizeof(struct bnxt_qplib_gid));
> +	req.gid[0] = cpu_to_be32(temp32[3]);
> +	req.gid[1] = cpu_to_be32(temp32[2]);
> +	req.gid[2] = cpu_to_be32(temp32[1]);
> +	req.gid[3] = cpu_to_be32(temp32[0]);
> +	if (res->prio) {
> +		req.vlan |= cpu_to_le16
> +			(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
> +			 CMDQ_ADD_GID_VLAN_VLAN_EN);
> +	}
> +
> +	/* MAC in network format */
> +	memcpy(temp16, smac, 6);
> +	req.src_mac[0] = cpu_to_be16(temp16[0]);
> +	req.src_mac[1] = cpu_to_be16(temp16[1]);
> +	req.src_mac[2] = cpu_to_be16(temp16[2]);
> +

All these memcpy calls to temp* are redundant; temp16[0] can be replaced
by smac[0]. The same goes for temp32.
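
As a sketch of that simplification (untested; it relies on smac already
being in network byte order, which makes the load + cpu_to_be16() pairs
a byte-order no-op overall):

	/* Copy the wire-order MAC straight into the __be16 destination. */
	memcpy(req.src_mac, smac, ETH_ALEN);

	/* The GID words would still need the per-dword reversal into
	 * __be32, but could read gid->data directly instead of staging
	 * through temp32[].
	 */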

> +	req.gid_index = cpu_to_le16(gid_idx);
> +
> +	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
> +					  (void *)&resp, NULL, 0);
> +	return rc;
> +}
> +
>  /* pkeys */
>  int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
>  			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> index a543f95..e3a3ed9 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> @@ -132,6 +132,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  			struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id,
>  			bool update, u32 *index);
> +int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
> +			   struct bnxt_qplib_gid *gid, u16 gid_idx, u8 *smac);
>  int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
>  			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
>  			u16 *pkey);
> diff --git a/drivers/infiniband/hw/bnxt_re/roce_hsi.h b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
> index fc23477..eeb55b2 100644
> --- a/drivers/infiniband/hw/bnxt_re/roce_hsi.h
> +++ b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
> @@ -1473,8 +1473,8 @@ struct cmdq_modify_gid {
>  	u8 resp_size;
>  	u8 reserved8;
>  	__le64 resp_addr;
> -	__le32 gid[4];
> -	__le16 src_mac[3];
> +	__be32 gid[4];
> +	__be16 src_mac[3];
>  	__le16 vlan;
>  	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_MASK		    0xfffUL
>  	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_SFT		    0
> --
> 2.5.5
>

* Re: [PATCH for-next 03/14] RDMA/bnxt_re: Fix race between netdev register and unregister events
From: Leon Romanovsky @ 2017-05-14  6:49 UTC
  To: Selvin Xavier
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Somnath Kotur,
	Sriharsha Basavapatna

On Wed, May 10, 2017 at 03:45:28AM -0700, Selvin Xavier wrote:
> From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>
> Upon receipt of the NETDEV_REGISTER event from the netdev notifier
> chain, the IB stack registration is spawned off to a workqueue
> since that also requires an rtnl lock.
> There could be two kinds of races between the NETDEV_REGISTER and the
> NETDEV_UNREGISTER event handling.
> a) The NETDEV_UNREGISTER event is received in rapid succession after
> the NETDEV_REGISTER event, even before the work queue gets a chance
> to run.
> b) The NETDEV_UNREGISTER event is received while the workqueue that
> handles registration with the IB stack is still in progress.
> Handle both races with a bit flag that is set as part of the
> NETDEV_REGISTER event just before the work item is queued and cleared
> in the workqueue after the registration with the IB stack is complete.
> Use a barrier to ensure the bit is cleared only after the IB stack
> registration code is completed.

I have a very strong feeling that this patch doesn't solve the race, but
merely makes it hard to reproduce. Why don't you use locks to protect
these flows?
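
(For reference, the lock-based shape being suggested might look like the
sketch below; bnxt_re_reg_lock is invented here and is not part of the
driver:)

	/* Hypothetical: one mutex serializing IB registration against
	 * NETDEV_UNREGISTER, instead of an IN_PROG bit plus barrier.
	 */
	static DEFINE_MUTEX(bnxt_re_reg_lock);

	/* in bnxt_re_task(), NETDEV_REGISTER case */
	mutex_lock(&bnxt_re_reg_lock);
	rc = bnxt_re_ib_reg(rdev);
	mutex_unlock(&bnxt_re_reg_lock);

	/* in bnxt_re_netdev_event(), NETDEV_UNREGISTER case */
	mutex_lock(&bnxt_re_reg_lock);	/* waits out an in-flight register */
	bnxt_re_ib_unreg(rdev, false);
	bnxt_re_remove_one(rdev);
	mutex_unlock(&bnxt_re_reg_lock);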

>
> Removes the link speed query from query_port as it causes a
> deadlock while trying to acquire rtnl_lock. Now querying the
> speed from the netdev event itself.
>
> Also, added code to wait for resources (like CQs) to be freed by the
> ULPs during driver unload.
>
> Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/bnxt_re/bnxt_re.h  | 13 +++++++----
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 21 +++--------------
>  drivers/infiniband/hw/bnxt_re/main.c     | 39 ++++++++++++++++++++++++++++++++
>  3 files changed, 50 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> index ebf7be8..fad04b2 100644
> --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> @@ -80,11 +80,13 @@ struct bnxt_re_dev {
>  	struct ib_device		ibdev;
>  	struct list_head		list;
>  	unsigned long			flags;
> -#define BNXT_RE_FLAG_NETDEV_REGISTERED	0
> -#define BNXT_RE_FLAG_IBDEV_REGISTERED	1
> -#define BNXT_RE_FLAG_GOT_MSIX		2
> -#define BNXT_RE_FLAG_RCFW_CHANNEL_EN	8
> -#define BNXT_RE_FLAG_QOS_WORK_REG	16
> +#define BNXT_RE_FLAG_NETDEV_REGISTERED         0
> +#define BNXT_RE_FLAG_IBDEV_REGISTERED          1
> +#define BNXT_RE_FLAG_GOT_MSIX                  2
> +#define BNXT_RE_FLAG_RCFW_CHANNEL_EN           4
> +#define BNXT_RE_FLAG_QOS_WORK_REG              5
> +#define BNXT_RE_FLAG_NETDEV_REG_IN_PROG        6
> +
>  	struct net_device		*netdev;
>  	unsigned int			version, major, minor;
>  	struct bnxt_en_dev		*en_dev;
> @@ -127,6 +129,7 @@ struct bnxt_re_dev {
>  	struct bnxt_re_qp		*qp1_sqp;
>  	struct bnxt_re_ah		*sqp_ah;
>  	struct bnxt_re_sqp_entries sqp_tbl[1024];
> +	u32 espeed;
>  };
>
>  #define to_bnxt_re_dev(ptr, member)	\
> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> index 6385213..c717d5d 100644
> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> @@ -223,20 +223,8 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
>  	return 0;
>  }
>
> -static void __to_ib_speed_width(struct net_device *netdev, u8 *speed, u8 *width)
> +static void __to_ib_speed_width(u32 espeed, u8 *speed, u8 *width)
>  {
> -	struct ethtool_link_ksettings lksettings;
> -	u32 espeed;
> -
> -	if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
> -		memset(&lksettings, 0, sizeof(lksettings));
> -		rtnl_lock();
> -		netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
> -		rtnl_unlock();
> -		espeed = lksettings.base.speed;
> -	} else {
> -		espeed = SPEED_UNKNOWN;
> -	}
>  	switch (espeed) {
>  	case SPEED_1000:
>  		*speed = IB_SPEED_SDR;
> @@ -303,12 +291,9 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
>  	port_attr->sm_sl = 0;
>  	port_attr->subnet_timeout = 0;
>  	port_attr->init_type_reply = 0;
> -	/* call the underlying netdev's ethtool hooks to query speed settings
> -	 * for which we acquire rtnl_lock _only_ if it's registered with
> -	 * IB stack to avoid race in the NETDEV_UNREG path
> -	 */
> +
>  	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
> -		__to_ib_speed_width(rdev->netdev, &port_attr->active_speed,
> +		__to_ib_speed_width(rdev->espeed, &port_attr->active_speed,
>  				    &port_attr->active_width);
>  	return 0;
>  }
> diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
> index 5d35540..99a54fd 100644
> --- a/drivers/infiniband/hw/bnxt_re/main.c
> +++ b/drivers/infiniband/hw/bnxt_re/main.c
> @@ -48,6 +48,8 @@
>  #include <net/ipv6.h>
>  #include <net/addrconf.h>
>  #include <linux/if_ether.h>
> +#include <linux/atomic.h>
> +#include <asm/barrier.h>
>
>  #include <rdma/ib_verbs.h>
>  #include <rdma/ib_user_verbs.h>
> @@ -926,6 +928,10 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
>  	if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
>  		cancel_delayed_work(&rdev->worker);
>
> +	/* Wait for ULPs to release references */
> +	while (atomic_read(&rdev->cq_count))
> +		usleep_range(500, 1000);
> +

It can change immediately after your check.
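
(A sketch of a non-polling variant; the cq_waitq field is hypothetical
and this assumes new CQ creation is already fenced off at this point:)

	/* In destroy_cq(): the last CQ reference wakes the unload path. */
	if (atomic_dec_and_test(&rdev->cq_count))
		wake_up(&rdev->cq_waitq);	/* assumed field */

	/* In bnxt_re_ib_unreg(): wait_event() re-checks the condition on
	 * every wakeup, so the final decrement cannot be missed the way a
	 * sleep/poll loop can miss it.
	 */
	wait_event(rdev->cq_waitq, !atomic_read(&rdev->cq_count));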

>  	bnxt_re_cleanup_res(rdev);
>  	bnxt_re_free_res(rdev, lock_wait);
>
> @@ -1152,6 +1158,22 @@ static void bnxt_re_remove_one(struct bnxt_re_dev *rdev)
>  	pci_dev_put(rdev->en_dev->pdev);
>  }
>
> +static void bnxt_re_get_link_speed(struct bnxt_re_dev *rdev)
> +{
> +	struct ethtool_link_ksettings lksettings;
> +	struct net_device *netdev = rdev->netdev;
> +
> +	if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
> +		memset(&lksettings, 0, sizeof(lksettings));
> +		if (rtnl_trylock()) {
> +			netdev->ethtool_ops->get_link_ksettings(netdev,
> +								&lksettings);
> +			rdev->espeed = lksettings.base.speed;
> +			rtnl_unlock();
> +		}
> +	}
> +}
> +
>  /* Handle all deferred netevents tasks */
>  static void bnxt_re_task(struct work_struct *work)
>  {
> @@ -1169,6 +1191,10 @@ static void bnxt_re_task(struct work_struct *work)
>  	switch (re_work->event) {
>  	case NETDEV_REGISTER:
>  		rc = bnxt_re_ib_reg(rdev);
> +		/* Use memory barrier to sync the rdev->flags */
> +		smp_mb();
> +		clear_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG,
> +			  &rdev->flags);
>  		if (rc)
>  			dev_err(rdev_to_dev(rdev),
>  				"Failed to register with IB: %#x", rc);
> @@ -1186,6 +1212,7 @@ static void bnxt_re_task(struct work_struct *work)
>  		else if (netif_carrier_ok(rdev->netdev))
>  			bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
>  					       IB_EVENT_PORT_ACTIVE);
> +		bnxt_re_get_link_speed(rdev);
>  		break;
>  	default:
>  		break;
> @@ -1244,10 +1271,22 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
>  			break;
>  		}
>  		bnxt_re_init_one(rdev);
> +		/*
> +		 *  Query the link speed at the time of registration.
> +		 *  Already in rtnl_lock context
> +		 */
> +		bnxt_re_get_link_speed(rdev);
> +
> +		set_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);
>  		sch_work = true;
>  		break;
>
>  	case NETDEV_UNREGISTER:
> +		/* netdev notifier will call NETDEV_UNREGISTER again later since
> +		 * we are still holding the reference to the netdev
> +		 */
> +		if (test_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags))
> +			goto exit;
>  		bnxt_re_ib_unreg(rdev, false);
>  		bnxt_re_remove_one(rdev);
>  		bnxt_re_dev_unreg(rdev);
> --
> 2.5.5
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH for-next 08/14] RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is configured
       [not found]         ` <20170514062349.GQ3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-05-15 10:13           ` Selvin Xavier
  0 siblings, 0 replies; 28+ messages in thread
From: Selvin Xavier @ 2017-05-15 10:13 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Kalesh AP

On Sun, May 14, 2017 at 11:53 AM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>
> All these memcpy calls to temp* are redundant: temp16[0] can be replaced by smac[0].
> The same goes for temp32.

Thanks, we will handle it in v2.
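
For illustration, this is the shape of the cleanup being asked for; the
buffer and field names below are invented for the sketch, not taken
from the actual patch:

	/* before: bounce the source MAC through a scratch array */
	u16 temp16[3];

	memcpy(temp16, smac, sizeof(temp16));
	memcpy(hdr->smac, temp16, sizeof(temp16));

	/* after: one direct copy, no temp16/temp32 needed */
	memcpy(hdr->smac, smac, ETH_ALEN);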

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH for-next 09/14] RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
       [not found]         ` <20170514061015.GP3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-05-15 10:54           ` Selvin Xavier
       [not found]             ` <CA+sbYW2LtbCf=DLXfyTKBUZJh5t7ZEBNYGHwJp1tFX49p5QL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Selvin Xavier @ 2017-05-15 10:54 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Kalesh AP

On Sun, May 14, 2017 at 11:40 AM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, May 10, 2017 at 03:45:34AM -0700, Selvin Xavier wrote:
>> Do not free context table entry if delete GID fails. Stack may call
>> back the add_gid for the same context again which could result in
>> panic
>
> "may call" ??? Will you leave memory leak if this "may call" won't happen?
>
This fix was added only to avoid a system crash in a specific scenario.
When the bnxt_re driver is loaded and the user tries to change the
interface MAC address, our delete GID fails because QP1 is still
associated with the existing MAC (default GID).
When that command fails, the GID tables are not modified in the h/w or
the driver, but the GID context memory is freed. Now, if the user
changes the MAC back to the original value, another add_gid comes to
the driver, where the driver reports that the GID is already present
in its table and tries to access the context that was already freed.

So, in this case, in order to avoid the NULL pointer de-reference, this
patch skips freeing the context memory if delete_gid fails, and the
same context memory is re-used by the new add_gid. Memory cleanup is
taken care of during driver unload, while deleting the GID table.
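
To make the re-use concrete, here is a minimal sketch of the add_gid
side, assuming only the ctx_tbl/refcnt layout visible in the diff
below; the helper name and its exact placement are illustrative, not
the actual bnxt_re source:

	/* Sketch: take another reference on the context preserved by a
	 * failed delete_gid instead of allocating a new one.
	 */
	static int bnxt_re_reuse_gid_ctx(struct bnxt_qplib_sgid_tbl *sgid_tbl,
					 u32 idx, void **context)
	{
		struct bnxt_re_gid_ctx **ctx_tbl = sgid_tbl->ctx;

		if (!ctx_tbl[idx])
			return -EINVAL;	/* nothing was preserved */

		ctx_tbl[idx]->refcnt++;
		*context = ctx_tbl[idx];
		return 0;
	}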


I understand that the "may call" wording in the commit message does not
capture this problem. I will update the commit log and send a v2.

Thanks for your comments.
Selvin Xavier

>
>>
>> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> index 61703f3..525f4b0 100644
>> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> @@ -375,15 +375,17 @@ int bnxt_re_del_gid(struct ib_device *ibdev, u8 port_num,
>>                       return -EINVAL;
>>               ctx->refcnt--;
>>               if (!ctx->refcnt) {
>> -                     rc = bnxt_qplib_del_sgid
>> -                                     (sgid_tbl,
>> -                                      &sgid_tbl->tbl[ctx->idx], true);
>> -                     if (rc)
>> +                     rc = bnxt_qplib_del_sgid(sgid_tbl,
>> +                                              &sgid_tbl->tbl[ctx->idx],
>> +                                              true);
>> +                     if (rc) {
>>                               dev_err(rdev_to_dev(rdev),
>>                                       "Failed to remove GID: %#x", rc);
>> -                     ctx_tbl = sgid_tbl->ctx;
>> -                     ctx_tbl[ctx->idx] = NULL;
>> -                     kfree(ctx);
>> +                     } else {
>> +                             ctx_tbl = sgid_tbl->ctx;
>> +                             ctx_tbl[ctx->idx] = NULL;
>> +                             kfree(ctx);
>> +                     }
>>               }
>>       } else {
>>               return -EINVAL;
>> --
>> 2.5.5
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH for-next 03/14] RDMA/bnxt_re: Fix race between netdev register and unregister events
       [not found]         ` <20170514064904.GR3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-05-15 10:54           ` Selvin Xavier
       [not found]             ` <CA+sbYW0LLrE7EAsa1C-Bza1e6ug2MYntk8YVdOgm=iBLqDKBHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Selvin Xavier @ 2017-05-15 10:54 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Somnath Kotur,
	Sriharsha Basavapatna

On Sun, May 14, 2017 at 12:19 PM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, May 10, 2017 at 03:45:28AM -0700, Selvin Xavier wrote:
>> From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>>
>> Upon receipt of the NETDEV_REGISTER event from the netdev notifier
>> chain, the IB stack registration is spawned off to a workqueue
>> since that also requires an rtnl lock.
>> There could be 2 kinds of races between the NETDEV_REGISTER and the
>> NETDEV_UNREGISTER event handling.
>> a)The NETDEV_UNREGISTER event is received in rapid succession after
>> the NETDEV_REGISTER event even before the work queue got a chance
>> to run
>> b)The NETDEV_UNREGISTER event is received while the workqueue that
>> handles registration with the IB stack is still in progress.
>> Handle both the races with a bit flag that is set as part of the
>> NETDEV_REGISTER event just before the work item is queued and cleared
>> in the workqueue after the registration with the IB stack is complete.
>> Use a barrier to ensure the bit is cleared only after the IB stack
>> registration code is completed.
>
> I have a very strong feeling that this patch doesn't solve the race, but
> makes it hard to reproduce. Why don't you use locks to protect these flows?
>

IB registration cannot be invoked from the netdev event handler because
the IB stack tries to acquire the rtnl_lock, and the netdev event is
already delivered in rtnl_lock context. So we schedule a worker for
NETDEV_REGISTER. From the worker task, the bnxt_re driver acquires
rtnl_lock to access the L2 driver while creating the initial resources.

Having a driver lock to synchronize does not solve the possible
deadlock. Say the driver has acquired the driver_lock in the
registration task when a NETDEV_UNREGISTER is received, which is
delivered with rtnl_lock held. The unregister path then tries to take
the driver_lock; however, the registration task holds the driver_lock
while waiting for rtnl_lock to access the L2 driver, so the netdev
unregister event can never get the driver_lock.

To avoid this deadlock, we use BNXT_RE_FLAG_NETDEV_REG_IN_PROG, and the
driver does not drop the netdev reference while the registration task
is running. Once registration is done, we clear the
BNXT_RE_FLAG_NETDEV_REG_IN_PROG flag and proceed with un-registration.
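
Laying the two contexts side by side makes the inversion explicit
(pseudo-flow for illustration only, not actual driver code):

	/*
	 * worker (bnxt_re_task)           notifier (NETDEV_UNREGISTER)
	 * ---------------------           ----------------------------
	 * mutex_lock(&driver_lock);       rtnl_lock held by netdev core
	 * rtnl_lock();    <-- blocks      mutex_lock(&driver_lock); <-- blocks
	 */

Each context waits on the lock the other already holds (an AB-BA
inversion), which is why we signal progress through the
BNXT_RE_FLAG_NETDEV_REG_IN_PROG bit instead of taking a driver mutex.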


>>
>> Removes the link speed query from the query_port as it causes
>> deadlock while trying to acquire rtnl_lock. Now querying the
>> speed from the netdev event itself.
>>
>> Also, added a code to wait for resources(like CQ) to be freed by the
>> ULPs, during driver unload
>>
>> Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/hw/bnxt_re/bnxt_re.h  | 13 +++++++----
>>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 21 +++--------------
>>  drivers/infiniband/hw/bnxt_re/main.c     | 39 ++++++++++++++++++++++++++++++++
>>  3 files changed, 50 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> index ebf7be8..fad04b2 100644
>> --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
>> @@ -80,11 +80,13 @@ struct bnxt_re_dev {
>>       struct ib_device                ibdev;
>>       struct list_head                list;
>>       unsigned long                   flags;
>> -#define BNXT_RE_FLAG_NETDEV_REGISTERED       0
>> -#define BNXT_RE_FLAG_IBDEV_REGISTERED        1
>> -#define BNXT_RE_FLAG_GOT_MSIX                2
>> -#define BNXT_RE_FLAG_RCFW_CHANNEL_EN 8
>> -#define BNXT_RE_FLAG_QOS_WORK_REG    16
>> +#define BNXT_RE_FLAG_NETDEV_REGISTERED         0
>> +#define BNXT_RE_FLAG_IBDEV_REGISTERED          1
>> +#define BNXT_RE_FLAG_GOT_MSIX                  2
>> +#define BNXT_RE_FLAG_RCFW_CHANNEL_EN           4
>> +#define BNXT_RE_FLAG_QOS_WORK_REG              5
>> +#define BNXT_RE_FLAG_NETDEV_REG_IN_PROG        6
>> +
>>       struct net_device               *netdev;
>>       unsigned int                    version, major, minor;
>>       struct bnxt_en_dev              *en_dev;
>> @@ -127,6 +129,7 @@ struct bnxt_re_dev {
>>       struct bnxt_re_qp               *qp1_sqp;
>>       struct bnxt_re_ah               *sqp_ah;
>>       struct bnxt_re_sqp_entries sqp_tbl[1024];
>> +     u32 espeed;
>>  };
>>
>>  #define to_bnxt_re_dev(ptr, member)  \
>> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> index 6385213..c717d5d 100644
>> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
>> @@ -223,20 +223,8 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
>>       return 0;
>>  }
>>
>> -static void __to_ib_speed_width(struct net_device *netdev, u8 *speed, u8 *width)
>> +static void __to_ib_speed_width(u32 espeed, u8 *speed, u8 *width)
>>  {
>> -     struct ethtool_link_ksettings lksettings;
>> -     u32 espeed;
>> -
>> -     if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
>> -             memset(&lksettings, 0, sizeof(lksettings));
>> -             rtnl_lock();
>> -             netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
>> -             rtnl_unlock();
>> -             espeed = lksettings.base.speed;
>> -     } else {
>> -             espeed = SPEED_UNKNOWN;
>> -     }
>>       switch (espeed) {
>>       case SPEED_1000:
>>               *speed = IB_SPEED_SDR;
>> @@ -303,12 +291,9 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
>>       port_attr->sm_sl = 0;
>>       port_attr->subnet_timeout = 0;
>>       port_attr->init_type_reply = 0;
>> -     /* call the underlying netdev's ethtool hooks to query speed settings
>> -      * for which we acquire rtnl_lock _only_ if it's registered with
>> -      * IB stack to avoid race in the NETDEV_UNREG path
>> -      */
>> +
>>       if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
>> -             __to_ib_speed_width(rdev->netdev, &port_attr->active_speed,
>> +             __to_ib_speed_width(rdev->espeed, &port_attr->active_speed,
>>                                   &port_attr->active_width);
>>       return 0;
>>  }
>> diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
>> index 5d35540..99a54fd 100644
>> --- a/drivers/infiniband/hw/bnxt_re/main.c
>> +++ b/drivers/infiniband/hw/bnxt_re/main.c
>> @@ -48,6 +48,8 @@
>>  #include <net/ipv6.h>
>>  #include <net/addrconf.h>
>>  #include <linux/if_ether.h>
>> +#include <linux/atomic.h>
>> +#include <asm/barrier.h>
>>
>>  #include <rdma/ib_verbs.h>
>>  #include <rdma/ib_user_verbs.h>
>> @@ -926,6 +928,10 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
>>       if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
>>               cancel_delayed_work(&rdev->worker);
>>
>> +     /* Wait for ULPs to release references */
>> +     while (atomic_read(&rdev->cq_count))
>> +             usleep_range(500, 1000);
>> +
>
> It can change immediately after your check.
[Selvin] Yeah. We might end up waiting an extra 1 ms in the unload path.
We have kept this change since it is a control-path operation.
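
For reference, a sleeping wait would at least remove the busy poll; a
minimal sketch, assuming a wait queue head (say rdev->cq_wq,
initialized with init_waitqueue_head() at probe time), which is not a
field in the current driver:

	/* CQ destroy path: drop the reference, wake the unload path */
	if (atomic_dec_and_test(&rdev->cq_count))
		wake_up(&rdev->cq_wq);

	/* unload path: sleep until the last CQ reference is gone */
	wait_event(rdev->cq_wq, !atomic_read(&rdev->cq_count));

New CQ creation would still have to be fenced separately (e.g. by
failing create_cq once unload has started) to fully close the window
being pointed at here.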
>
>>       bnxt_re_cleanup_res(rdev);
>>       bnxt_re_free_res(rdev, lock_wait);
>>
>> @@ -1152,6 +1158,22 @@ static void bnxt_re_remove_one(struct bnxt_re_dev *rdev)
>>       pci_dev_put(rdev->en_dev->pdev);
>>  }
>>
>> +static void bnxt_re_get_link_speed(struct bnxt_re_dev *rdev)
>> +{
>> +     struct ethtool_link_ksettings lksettings;
>> +     struct net_device *netdev = rdev->netdev;
>> +
>> +     if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
>> +             memset(&lksettings, 0, sizeof(lksettings));
>> +             if (rtnl_trylock()) {
>> +                     netdev->ethtool_ops->get_link_ksettings(netdev,
>> +                                                             &lksettings);
>> +                     rdev->espeed = lksettings.base.speed;
>> +                     rtnl_unlock();
>> +             }
>> +     }
>> +}
>> +
>>  /* Handle all deferred netevents tasks */
>>  static void bnxt_re_task(struct work_struct *work)
>>  {
>> @@ -1169,6 +1191,10 @@ static void bnxt_re_task(struct work_struct *work)
>>       switch (re_work->event) {
>>       case NETDEV_REGISTER:
>>               rc = bnxt_re_ib_reg(rdev);
>> +             /* Use memory barrier to sync the rdev->flags */
>> +             smp_mb();
>> +             clear_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG,
>> +                       &rdev->flags);
>>               if (rc)
>>                       dev_err(rdev_to_dev(rdev),
>>                               "Failed to register with IB: %#x", rc);
>> @@ -1186,6 +1212,7 @@ static void bnxt_re_task(struct work_struct *work)
>>               else if (netif_carrier_ok(rdev->netdev))
>>                       bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
>>                                              IB_EVENT_PORT_ACTIVE);
>> +             bnxt_re_get_link_speed(rdev);
>>               break;
>>       default:
>>               break;
>> @@ -1244,10 +1271,22 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
>>                       break;
>>               }
>>               bnxt_re_init_one(rdev);
>> +             /*
>> +              *  Query the link speed at the time of registration.
>> +              *  Already in rtnl_lock context
>> +              */
>> +             bnxt_re_get_link_speed(rdev);
>> +
>> +             set_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);
>>               sch_work = true;
>>               break;
>>
>>       case NETDEV_UNREGISTER:
>> +             /* netdev notifier will call NETDEV_UNREGISTER again later since
>> +              * we are still holding the reference to the netdev
>> +              */
>> +             if (test_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags))
>> +                     goto exit;
>>               bnxt_re_ib_unreg(rdev, false);
>>               bnxt_re_remove_one(rdev);
>>               bnxt_re_dev_unreg(rdev);
>> --
>> 2.5.5
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH for-next 03/14] RDMA/bnxt_re: Fix race between netdev register and unregister events
       [not found]             ` <CA+sbYW0LLrE7EAsa1C-Bza1e6ug2MYntk8YVdOgm=iBLqDKBHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-05-18  4:31               ` Leon Romanovsky
  0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2017-05-18  4:31 UTC (permalink / raw)
  To: Selvin Xavier
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Somnath Kotur,
	Sriharsha Basavapatna


On Mon, May 15, 2017 at 04:24:54PM +0530, Selvin Xavier wrote:
> On Sun, May 14, 2017 at 12:19 PM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > On Wed, May 10, 2017 at 03:45:28AM -0700, Selvin Xavier wrote:
> >> From: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >>
> >> Upon receipt of the NETDEV_REGISTER event from the netdev notifier
> >> chain, the IB stack registration is spawned off to a workqueue
> >> since that also requires an rtnl lock.
> >> There could be 2 kinds of races between the NETDEV_REGISTER and the
> >> NETDEV_UNREGISTER event handling.
> >> a)The NETDEV_UNREGISTER event is received in rapid succession after
> >> the NETDEV_REGISTER event even before the work queue got a chance
> >> to run
> >> b)The NETDEV_UNREGISTER event is received while the workqueue that
> >> handles registration with the IB stack is still in progress.
> >> Handle both the races with a bit flag that is set as part of the
> >> NETDEV_REGISTER event just before the work item is queued and cleared
> >> in the workqueue after the registration with the IB stack is complete.
> >> Use a barrier to ensure the bit is cleared only after the IB stack
> >> registration code is completed.
> >
> > I have a very strong feeling that this patch doesn't solve the race, but
> > makes it hard to reproduce. Why don't you use locks to protect these flows?
> >
>
> IB registration cannot be invoked from the netdev event handler because
> the IB stack tries to acquire the rtnl_lock, and the netdev event is
> already delivered in rtnl_lock context. So we schedule a worker for
> NETDEV_REGISTER. From the worker task, the bnxt_re driver acquires
> rtnl_lock to access the L2 driver while creating the initial resources.
>
> Having a driver lock to synchronize does not solve the possible
> deadlock. Say the driver has acquired the driver_lock in the
> registration task when a NETDEV_UNREGISTER is received, which is
> delivered with rtnl_lock held. The unregister path then tries to take
> the driver_lock; however, the registration task holds the driver_lock
> while waiting for rtnl_lock to access the L2 driver, so the netdev
> unregister event can never get the driver_lock.
>
> To avoid this deadlock, we use BNXT_RE_FLAG_NETDEV_REG_IN_PROG, and the
> driver does not drop the netdev reference while the registration task
> is running. Once registration is done, we clear the
> BNXT_RE_FLAG_NETDEV_REG_IN_PROG flag and proceed with un-registration.

It doesn't answer my question, but whatever, it is not critical.

Thanks

>
>
> >>
> >> Removes the link speed query from the query_port as it causes
> >> deadlock while trying to acquire rtnl_lock. Now querying the
> >> speed from the netdev event itself.
> >>
> >> Also, added a code to wait for resources(like CQ) to be freed by the
> >> ULPs, during driver unload
> >>
> >> Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> ---
> >>  drivers/infiniband/hw/bnxt_re/bnxt_re.h  | 13 +++++++----
> >>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 21 +++--------------
> >>  drivers/infiniband/hw/bnxt_re/main.c     | 39 ++++++++++++++++++++++++++++++++
> >>  3 files changed, 50 insertions(+), 23 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> index ebf7be8..fad04b2 100644
> >> --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> >> @@ -80,11 +80,13 @@ struct bnxt_re_dev {
> >>       struct ib_device                ibdev;
> >>       struct list_head                list;
> >>       unsigned long                   flags;
> >> -#define BNXT_RE_FLAG_NETDEV_REGISTERED       0
> >> -#define BNXT_RE_FLAG_IBDEV_REGISTERED        1
> >> -#define BNXT_RE_FLAG_GOT_MSIX                2
> >> -#define BNXT_RE_FLAG_RCFW_CHANNEL_EN 8
> >> -#define BNXT_RE_FLAG_QOS_WORK_REG    16
> >> +#define BNXT_RE_FLAG_NETDEV_REGISTERED         0
> >> +#define BNXT_RE_FLAG_IBDEV_REGISTERED          1
> >> +#define BNXT_RE_FLAG_GOT_MSIX                  2
> >> +#define BNXT_RE_FLAG_RCFW_CHANNEL_EN           4
> >> +#define BNXT_RE_FLAG_QOS_WORK_REG              5
> >> +#define BNXT_RE_FLAG_NETDEV_REG_IN_PROG        6
> >> +
> >>       struct net_device               *netdev;
> >>       unsigned int                    version, major, minor;
> >>       struct bnxt_en_dev              *en_dev;
> >> @@ -127,6 +129,7 @@ struct bnxt_re_dev {
> >>       struct bnxt_re_qp               *qp1_sqp;
> >>       struct bnxt_re_ah               *sqp_ah;
> >>       struct bnxt_re_sqp_entries sqp_tbl[1024];
> >> +     u32 espeed;
> >>  };
> >>
> >>  #define to_bnxt_re_dev(ptr, member)  \
> >> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> index 6385213..c717d5d 100644
> >> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> @@ -223,20 +223,8 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
> >>       return 0;
> >>  }
> >>
> >> -static void __to_ib_speed_width(struct net_device *netdev, u8 *speed, u8 *width)
> >> +static void __to_ib_speed_width(u32 espeed, u8 *speed, u8 *width)
> >>  {
> >> -     struct ethtool_link_ksettings lksettings;
> >> -     u32 espeed;
> >> -
> >> -     if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
> >> -             memset(&lksettings, 0, sizeof(lksettings));
> >> -             rtnl_lock();
> >> -             netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
> >> -             rtnl_unlock();
> >> -             espeed = lksettings.base.speed;
> >> -     } else {
> >> -             espeed = SPEED_UNKNOWN;
> >> -     }
> >>       switch (espeed) {
> >>       case SPEED_1000:
> >>               *speed = IB_SPEED_SDR;
> >> @@ -303,12 +291,9 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
> >>       port_attr->sm_sl = 0;
> >>       port_attr->subnet_timeout = 0;
> >>       port_attr->init_type_reply = 0;
> >> -     /* call the underlying netdev's ethtool hooks to query speed settings
> >> -      * for which we acquire rtnl_lock _only_ if it's registered with
> >> -      * IB stack to avoid race in the NETDEV_UNREG path
> >> -      */
> >> +
> >>       if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
> >> -             __to_ib_speed_width(rdev->netdev, &port_attr->active_speed,
> >> +             __to_ib_speed_width(rdev->espeed, &port_attr->active_speed,
> >>                                   &port_attr->active_width);
> >>       return 0;
> >>  }
> >> diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
> >> index 5d35540..99a54fd 100644
> >> --- a/drivers/infiniband/hw/bnxt_re/main.c
> >> +++ b/drivers/infiniband/hw/bnxt_re/main.c
> >> @@ -48,6 +48,8 @@
> >>  #include <net/ipv6.h>
> >>  #include <net/addrconf.h>
> >>  #include <linux/if_ether.h>
> >> +#include <linux/atomic.h>
> >> +#include <asm/barrier.h>
> >>
> >>  #include <rdma/ib_verbs.h>
> >>  #include <rdma/ib_user_verbs.h>
> >> @@ -926,6 +928,10 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
> >>       if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
> >>               cancel_delayed_work(&rdev->worker);
> >>
> >> +     /* Wait for ULPs to release references */
> >> +     while (atomic_read(&rdev->cq_count))
> >> +             usleep_range(500, 1000);
> >> +
> >
> > It can change immediately after your check.
> [Selvin] Yeah. We might end up waiting an extra 1 ms in the unload path.
> We have kept this change since it is a control-path operation.
> >
> >>       bnxt_re_cleanup_res(rdev);
> >>       bnxt_re_free_res(rdev, lock_wait);
> >>
> >> @@ -1152,6 +1158,22 @@ static void bnxt_re_remove_one(struct bnxt_re_dev *rdev)
> >>       pci_dev_put(rdev->en_dev->pdev);
> >>  }
> >>
> >> +static void bnxt_re_get_link_speed(struct bnxt_re_dev *rdev)
> >> +{
> >> +     struct ethtool_link_ksettings lksettings;
> >> +     struct net_device *netdev = rdev->netdev;
> >> +
> >> +     if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
> >> +             memset(&lksettings, 0, sizeof(lksettings));
> >> +             if (rtnl_trylock()) {
> >> +                     netdev->ethtool_ops->get_link_ksettings(netdev,
> >> +                                                             &lksettings);
> >> +                     rdev->espeed = lksettings.base.speed;
> >> +                     rtnl_unlock();
> >> +             }
> >> +     }
> >> +}
> >> +
> >>  /* Handle all deferred netevents tasks */
> >>  static void bnxt_re_task(struct work_struct *work)
> >>  {
> >> @@ -1169,6 +1191,10 @@ static void bnxt_re_task(struct work_struct *work)
> >>       switch (re_work->event) {
> >>       case NETDEV_REGISTER:
> >>               rc = bnxt_re_ib_reg(rdev);
> >> +             /* Use memory barrier to sync the rdev->flags */
> >> +             smp_mb();
> >> +             clear_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG,
> >> +                       &rdev->flags);
> >>               if (rc)
> >>                       dev_err(rdev_to_dev(rdev),
> >>                               "Failed to register with IB: %#x", rc);
> >> @@ -1186,6 +1212,7 @@ static void bnxt_re_task(struct work_struct *work)
> >>               else if (netif_carrier_ok(rdev->netdev))
> >>                       bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1,
> >>                                              IB_EVENT_PORT_ACTIVE);
> >> +             bnxt_re_get_link_speed(rdev);
> >>               break;
> >>       default:
> >>               break;
> >> @@ -1244,10 +1271,22 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
> >>                       break;
> >>               }
> >>               bnxt_re_init_one(rdev);
> >> +             /*
> >> +              *  Query the link speed at the time of registration.
> >> +              *  Already in rtnl_lock context
> >> +              */
> >> +             bnxt_re_get_link_speed(rdev);
> >> +
> >> +             set_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags);
> >>               sch_work = true;
> >>               break;
> >>
> >>       case NETDEV_UNREGISTER:
> >> +             /* netdev notifier will call NETDEV_UNREGISTER again later since
> >> +              * we are still holding the reference to the netdev
> >> +              */
> >> +             if (test_bit(BNXT_RE_FLAG_NETDEV_REG_IN_PROG, &rdev->flags))
> >> +                     goto exit;
> >>               bnxt_re_ib_unreg(rdev, false);
> >>               bnxt_re_remove_one(rdev);
> >>               bnxt_re_dev_unreg(rdev);
> >> --
> >> 2.5.5
> >>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH for-next 09/14] RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
       [not found]             ` <CA+sbYW2LtbCf=DLXfyTKBUZJh5t7ZEBNYGHwJp1tFX49p5QL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-05-18  4:32               ` Leon Romanovsky
  0 siblings, 0 replies; 28+ messages in thread
From: Leon Romanovsky @ 2017-05-18  4:32 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Kalesh AP


On Mon, May 15, 2017 at 04:24:04PM +0530, Selvin Xavier wrote:
> On Sun, May 14, 2017 at 11:40 AM, Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > On Wed, May 10, 2017 at 03:45:34AM -0700, Selvin Xavier wrote:
> >> Do not free context table entry if delete GID fails. Stack may call
> >> back the add_gid for the same context again which could result in
> >> panic
> >
> > "may call" ??? Will you leave memory leak if this "may call" won't happen?
> >
> This fix was added only to avoid a system crash in a specific scenario.
> When the bnxt_re driver is loaded and the user tries to change the
> interface MAC address, our delete GID fails because QP1 is still
> associated with the existing MAC (default GID).
> When that command fails, the GID tables are not modified in the h/w or
> the driver, but the GID context memory is freed. Now, if the user
> changes the MAC back to the original value, another add_gid comes to
> the driver, where the driver reports that the GID is already present
> in its table and tries to access the context that was already freed.
>
> So, in this case, in order to avoid the NULL pointer de-reference, this
> patch skips freeing the context memory if delete_gid fails, and the
> same context memory is re-used by the new add_gid. Memory cleanup is
> taken care of during driver unload, while deleting the GID table.
>
>
> I understand that the "may call" wording in the commit message does not
> capture this problem. I will update the commit log and send a v2.

Thanks

>
> Thanks for your comments.
> Selvin Xavier
>
> >
> >>
> >> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> Signed-off-by: Selvin Xavier <selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
> >> ---
> >>  drivers/infiniband/hw/bnxt_re/ib_verbs.c | 16 +++++++++-------
> >>  1 file changed, 9 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> index 61703f3..525f4b0 100644
> >> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> >> @@ -375,15 +375,17 @@ int bnxt_re_del_gid(struct ib_device *ibdev, u8 port_num,
> >>                       return -EINVAL;
> >>               ctx->refcnt--;
> >>               if (!ctx->refcnt) {
> >> -                     rc = bnxt_qplib_del_sgid
> >> -                                     (sgid_tbl,
> >> -                                      &sgid_tbl->tbl[ctx->idx], true);
> >> -                     if (rc)
> >> +                     rc = bnxt_qplib_del_sgid(sgid_tbl,
> >> +                                              &sgid_tbl->tbl[ctx->idx],
> >> +                                              true);
> >> +                     if (rc) {
> >>                               dev_err(rdev_to_dev(rdev),
> >>                                       "Failed to remove GID: %#x", rc);
> >> -                     ctx_tbl = sgid_tbl->ctx;
> >> -                     ctx_tbl[ctx->idx] = NULL;
> >> -                     kfree(ctx);
> >> +                     } else {
> >> +                             ctx_tbl = sgid_tbl->ctx;
> >> +                             ctx_tbl[ctx->idx] = NULL;
> >> +                             kfree(ctx);
> >> +                     }
> >>               }
> >>       } else {
> >>               return -EINVAL;
> >> --
> >> 2.5.5
> >>


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-05-18  4:32 UTC | newest]

Thread overview: 28+ messages
2017-05-10 10:45 [PATCH for-next 00/14] RDMA/bnxt_re: Bug fixes Selvin Xavier
     [not found] ` <1494413139-11883-1-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-05-10 10:45   ` [PATCH for-next 01/14] RDMA/bnxt_re: Fixing the Control path command and response handling Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 02/14] RDMA/bnxt_re: HW workarounds for handling specific conditions Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 03/14] RDMA/bnxt_re: Fix race between netdev register and unregister events Selvin Xavier
     [not found]     ` <1494413139-11883-4-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-05-14  6:49       ` Leon Romanovsky
     [not found]         ` <20170514064904.GR3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-05-15 10:54           ` Selvin Xavier
     [not found]             ` <CA+sbYW0LLrE7EAsa1C-Bza1e6ug2MYntk8YVdOgm=iBLqDKBHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-05-18  4:31               ` Leon Romanovsky
2017-05-10 10:45   ` [PATCH for-next 04/14] RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 05/14] RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 06/14] RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 07/14] RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing error Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 08/14] RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is configured Selvin Xavier
     [not found]     ` <1494413139-11883-9-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-05-14  6:23       ` Leon Romanovsky
     [not found]         ` <20170514062349.GQ3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-05-15 10:13           ` Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 09/14] RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails Selvin Xavier
     [not found]     ` <1494413139-11883-10-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-05-14  6:10       ` Leon Romanovsky
     [not found]         ` <20170514061015.GP3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-05-15 10:54           ` Selvin Xavier
     [not found]             ` <CA+sbYW2LtbCf=DLXfyTKBUZJh5t7ZEBNYGHwJp1tFX49p5QL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-05-18  4:32               ` Leon Romanovsky
2017-05-10 10:45   ` [PATCH for-next 10/14] RDMA/bnxt_re: Fix RQE posting logic Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 11/14] RDMA/bnxt_re: Report supported value to IB stack in query_device Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 12/14] RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and destination QP Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 13/14] RDMA/bnxt_re: Specify RDMA component when allocating stats context Selvin Xavier
2017-05-10 10:45   ` [PATCH for-next 14/14] RDMA/bnxt_re: Update the driver version Selvin Xavier
     [not found]     ` <1494413139-11883-15-git-send-email-selvin.xavier-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-05-10 15:06       ` Dennis Dalessandro
     [not found]         ` <84d9a0ce-0e5c-fc50-237f-b7ad9d541324-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-05-10 15:54           ` Leon Romanovsky
     [not found]             ` <20170510155430.GC1839-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-05-11  5:22               ` Selvin Xavier
     [not found]                 ` <CA+sbYW2ObMdDOdhPL9GMe5YNk7P2MsG5q=f+Ke5e4MU6A=WV6w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-05-11  8:27                   ` Leon Romanovsky
     [not found]                     ` <20170511082700.GA3616-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-05-11 13:55                       ` Dennis Dalessandro
