linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-rc 0/4] irdma fixes
@ 2021-09-16 19:12 Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 1/4] RDMA/irdma: Skip CQP ring during a reset Shiraz Saleem
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Shiraz Saleem @ 2021-09-16 19:12 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Shiraz

From: Shiraz <shiraz.saleem@intel.com>

This series contains a small set of irdma bug fixes for the 5.15 cycle.

Sindhu Devale (4):
  RDMA/irdma: Skip CQP ring during a reset
  RDMA/irdma: Validate number of CQ entries on create CQ
  RDMA/irdma: Report correct WC error when transport retry counter is
    exceeded
  RDMA/irdma: Report correct WC error when there are MW bind errors

 drivers/infiniband/hw/irdma/cm.c       |  4 ++--
 drivers/infiniband/hw/irdma/hw.c       | 14 +++++++++++---
 drivers/infiniband/hw/irdma/i40iw_if.c |  2 +-
 drivers/infiniband/hw/irdma/main.h     |  1 -
 drivers/infiniband/hw/irdma/user.h     |  2 ++
 drivers/infiniband/hw/irdma/utils.c    |  2 +-
 drivers/infiniband/hw/irdma/verbs.c    |  9 ++++++---
 7 files changed, 23 insertions(+), 11 deletions(-)

-- 
2.27.0



* [PATCH rdma-rc 1/4] RDMA/irdma: Skip CQP ring during a reset
  2021-09-16 19:12 [PATCH rdma-rc 0/4] irdma fixes Shiraz Saleem
@ 2021-09-16 19:12 ` Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 2/4] RDMA/irdma: Validate number of CQ entries on create CQ Shiraz Saleem
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Shiraz Saleem @ 2021-09-16 19:12 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Sindhu Devale, LiLiang, Shiraz Saleem

From: Sindhu Devale <sindhu.devale@intel.com>

The reset state is tracked by two separate flags (iwdev->reset and
rf->reset), so CQP commands are still processed during a reset.

This leads to CQP failures such as the one below:
irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0

Remove the redundant flag and set the correct reset flag so CQP is
paused during reset.

Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs")
Reported-by: LiLiang <liali@redhat.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/cm.c       | 4 ++--
 drivers/infiniband/hw/irdma/hw.c       | 6 +++---
 drivers/infiniband/hw/irdma/i40iw_if.c | 2 +-
 drivers/infiniband/hw/irdma/main.h     | 1 -
 drivers/infiniband/hw/irdma/utils.c    | 2 +-
 drivers/infiniband/hw/irdma/verbs.c    | 3 +--
 6 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
index 6b62299abfbb..6dea0a49d171 100644
--- a/drivers/infiniband/hw/irdma/cm.c
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -3496,7 +3496,7 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp)
 	     original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT ||
 	     last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE ||
 	     last_ae == IRDMA_AE_BAD_CLOSE ||
-	     last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->reset)) {
+	     last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset)) {
 		issue_close = 1;
 		iwqp->cm_id = NULL;
 		qp->term_flags = 0;
@@ -4250,7 +4250,7 @@ void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr,
 				       teardown_entry);
 		attr.qp_state = IB_QPS_ERR;
 		irdma_modify_qp(&cm_node->iwqp->ibqp, &attr, IB_QP_STATE, NULL);
-		if (iwdev->reset)
+		if (iwdev->rf->reset)
 			irdma_cm_disconn(cm_node->iwqp);
 		irdma_rem_ref_cm_node(cm_node);
 	}
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
index 00de5ee9a260..33c06a3a4f63 100644
--- a/drivers/infiniband/hw/irdma/hw.c
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -1489,7 +1489,7 @@ void irdma_reinitialize_ieq(struct irdma_sc_vsi *vsi)
 
 	irdma_puda_dele_rsrc(vsi, IRDMA_PUDA_RSRC_TYPE_IEQ, false);
 	if (irdma_initialize_ieq(iwdev)) {
-		iwdev->reset = true;
+		iwdev->rf->reset = true;
 		rf->gen_ops.request_reset(rf);
 	}
 }
@@ -1632,13 +1632,13 @@ void irdma_rt_deinit_hw(struct irdma_device *iwdev)
 	case IEQ_CREATED:
 		if (!iwdev->roce_mode)
 			irdma_puda_dele_rsrc(&iwdev->vsi, IRDMA_PUDA_RSRC_TYPE_IEQ,
-					     iwdev->reset);
+					     iwdev->rf->reset);
 		fallthrough;
 	case ILQ_CREATED:
 		if (!iwdev->roce_mode)
 			irdma_puda_dele_rsrc(&iwdev->vsi,
 					     IRDMA_PUDA_RSRC_TYPE_ILQ,
-					     iwdev->reset);
+					     iwdev->rf->reset);
 		break;
 	default:
 		ibdev_warn(&iwdev->ibdev, "bad init_state = %d\n", iwdev->init_state);
diff --git a/drivers/infiniband/hw/irdma/i40iw_if.c b/drivers/infiniband/hw/irdma/i40iw_if.c
index bddf88194d09..d219f64b2c3d 100644
--- a/drivers/infiniband/hw/irdma/i40iw_if.c
+++ b/drivers/infiniband/hw/irdma/i40iw_if.c
@@ -55,7 +55,7 @@ static void i40iw_close(struct i40e_info *cdev_info, struct i40e_client *client,
 
 	iwdev = to_iwdev(ibdev);
 	if (reset)
-		iwdev->reset = true;
+		iwdev->rf->reset = true;
 
 	iwdev->iw_status = 0;
 	irdma_port_ibevent(iwdev);
diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
index 743d9e143a99..b678fe712447 100644
--- a/drivers/infiniband/hw/irdma/main.h
+++ b/drivers/infiniband/hw/irdma/main.h
@@ -346,7 +346,6 @@ struct irdma_device {
 	bool roce_mode:1;
 	bool roce_dcqcn_en:1;
 	bool dcb:1;
-	bool reset:1;
 	bool iw_ooo:1;
 	enum init_completion_state init_state;
 
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
index e94470991fe0..ac91ea5296db 100644
--- a/drivers/infiniband/hw/irdma/utils.c
+++ b/drivers/infiniband/hw/irdma/utils.c
@@ -2507,7 +2507,7 @@ void irdma_modify_qp_to_err(struct irdma_sc_qp *sc_qp)
 	struct irdma_qp *qp = sc_qp->qp_uk.back_qp;
 	struct ib_qp_attr attr;
 
-	if (qp->iwdev->reset)
+	if (qp->iwdev->rf->reset)
 		return;
 	attr.qp_state = IB_QPS_ERR;
 
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 4fc323402073..829ddfa7e144 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -535,8 +535,7 @@ static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 	irdma_qp_rem_ref(&iwqp->ibqp);
 	wait_for_completion(&iwqp->free_qp);
 	irdma_free_lsmm_rsrc(iwqp);
-	if (!iwdev->reset)
-		irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp);
+	irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp);
 
 	if (!iwqp->user_mode) {
 		if (iwqp->iwscq) {
-- 
2.27.0
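
The idea behind this patch can be sketched in isolation: keep one reset
flag in the shared function-level structure and have every path consult
it, rather than a second per-device copy that can go stale. This is only
an illustration. All demo_* names are hypothetical stand-ins and are not
the real irdma structures, and the place where the driver actually gates
CQP commands is not part of the diff above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared per-function state. */
struct demo_rf {
	bool reset;               /* the only "reset in progress" flag */
	unsigned int cmds_posted; /* counts commands that were issued */
};

/* Hypothetical stand-in for the per-device state: no reset flag here. */
struct demo_dev {
	struct demo_rf *rf;
};

/* Close path: record the reset in the shared structure. */
static void demo_close(struct demo_dev *dev, bool reset)
{
	if (reset)
		dev->rf->reset = true;
}

/* Command path: returns 0 if posted, -1 if skipped because of a reset. */
static int demo_post_cmd(struct demo_dev *dev)
{
	if (dev->rf->reset)
		return -1;
	dev->rf->cmds_posted++;
	return 0;
}
```

Because both the writer (demo_close) and the reader (demo_post_cmd) go
through dev->rf, the two cannot disagree about whether a reset is in
progress.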



* [PATCH rdma-rc 2/4] RDMA/irdma: Validate number of CQ entries on create CQ
  2021-09-16 19:12 [PATCH rdma-rc 0/4] irdma fixes Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 1/4] RDMA/irdma: Skip CQP ring during a reset Shiraz Saleem
@ 2021-09-16 19:12 ` Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 3/4] RDMA/irdma: Report correct WC error when transport retry counter is exceeded Shiraz Saleem
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Shiraz Saleem @ 2021-09-16 19:12 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Sindhu Devale, Shiraz Saleem

From: Sindhu Devale <sindhu.devale@intel.com>

Add a lower-bound check on the number of CQ entries at creation time, so
that a kernel-mode request for fewer than one entry is rejected with
-EINVAL.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 829ddfa7e144..23c47482c749 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -2034,7 +2034,7 @@ static int irdma_create_cq(struct ib_cq *ibcq,
 		/* Kmode allocations */
 		int rsize;
 
-		if (entries > rf->max_cqe) {
+		if (entries < 1 || entries > rf->max_cqe) {
 			err_code = -EINVAL;
 			goto cq_free_rsrc;
 		}
-- 
2.27.0
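
As a standalone illustration of the range check, a minimal sketch is
shown below. The function name and the plain int parameters are
assumptions made for the example; in the driver the check is inline in
irdma_create_cq() and the upper limit comes from rf->max_cqe.

```c
#include <errno.h>

/*
 * Hypothetical helper: accept only 1..max_cqe entries.
 * Returns 0 when the request is valid, -EINVAL otherwise.
 */
static int demo_validate_cq_entries(int entries, int max_cqe)
{
	if (entries < 1 || entries > max_cqe)
		return -EINVAL;
	return 0;
}
```

With only the upper-bound test, a request for zero or a negative number
of entries would pass validation; the added lower bound closes that gap.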



* [PATCH rdma-rc 3/4] RDMA/irdma: Report correct WC error when transport retry counter is exceeded
  2021-09-16 19:12 [PATCH rdma-rc 0/4] irdma fixes Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 1/4] RDMA/irdma: Skip CQP ring during a reset Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 2/4] RDMA/irdma: Validate number of CQ entries on create CQ Shiraz Saleem
@ 2021-09-16 19:12 ` Shiraz Saleem
  2021-09-16 19:12 ` [PATCH rdma-rc 4/4] RDMA/irdma: Report correct WC error when there are MW bind errors Shiraz Saleem
  2021-09-20 17:24 ` [PATCH rdma-rc 0/4] irdma fixes Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Shiraz Saleem @ 2021-09-16 19:12 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Sindhu Devale, Shiraz Saleem

From: Sindhu Devale <sindhu.devale@intel.com>

When the remote QP sends neither an Ack nor a Nack and the transport
retry counter is exceeded, the HW generates a "too many retries"
asynchronous event (AE). Add code to handle this AE and set the correct
IB WC error code, IB_WC_RETRY_EXC_ERR.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/hw.c    | 3 +++
 drivers/infiniband/hw/irdma/user.h  | 1 +
 drivers/infiniband/hw/irdma/verbs.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
index 33c06a3a4f63..cb9a8e24e3b7 100644
--- a/drivers/infiniband/hw/irdma/hw.c
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -176,6 +176,9 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp,
 	case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR:
 		qp->flush_code = FLUSH_GENERAL_ERR;
 		break;
+	case IRDMA_AE_LLP_TOO_MANY_RETRIES:
+		qp->flush_code = FLUSH_RETRY_EXC_ERR;
+		break;
 	default:
 		qp->flush_code = FLUSH_FATAL_ERR;
 		break;
diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h
index ff705f323233..267102d1049d 100644
--- a/drivers/infiniband/hw/irdma/user.h
+++ b/drivers/infiniband/hw/irdma/user.h
@@ -102,6 +102,7 @@ enum irdma_flush_opcode {
 	FLUSH_REM_OP_ERR,
 	FLUSH_LOC_LEN_ERR,
 	FLUSH_FATAL_ERR,
+	FLUSH_RETRY_EXC_ERR,
 };
 
 enum irdma_cmpl_status {
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 23c47482c749..c7e129ee74d0 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -3352,6 +3352,8 @@ static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode
 		return IB_WC_LOC_LEN_ERR;
 	case FLUSH_GENERAL_ERR:
 		return IB_WC_WR_FLUSH_ERR;
+	case FLUSH_RETRY_EXC_ERR:
+		return IB_WC_RETRY_EXC_ERR;
 	case FLUSH_FATAL_ERR:
 	default:
 		return IB_WC_FATAL_ERR;
-- 
2.27.0
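
The patch touches two translation steps: the AE is first mapped to a
driver flush code, and the flush code is later mapped to a work
completion status. A minimal sketch of that chain follows. The demo_*
enums are hypothetical and deliberately do not reuse the real
IRDMA_AE_*, FLUSH_* or IB_WC_* values; only the shape of the two switch
statements mirrors the diff.

```c
/* Hypothetical asynchronous event codes. */
enum demo_ae {
	DEMO_AE_MPA_CRC_ERROR,
	DEMO_AE_TOO_MANY_RETRIES,
	DEMO_AE_SOMETHING_ELSE,
};

/* Hypothetical driver-internal flush codes. */
enum demo_flush {
	DEMO_FLUSH_GENERAL_ERR,
	DEMO_FLUSH_FATAL_ERR,
	DEMO_FLUSH_RETRY_EXC_ERR,
};

/* Hypothetical work completion statuses. */
enum demo_wc {
	DEMO_WC_WR_FLUSH_ERR,
	DEMO_WC_RETRY_EXC_ERR,
	DEMO_WC_FATAL_ERR,
};

/* Step 1: AE to flush code; unknown events fall back to fatal. */
static enum demo_flush demo_ae_to_flush(enum demo_ae ae)
{
	switch (ae) {
	case DEMO_AE_MPA_CRC_ERROR:
		return DEMO_FLUSH_GENERAL_ERR;
	case DEMO_AE_TOO_MANY_RETRIES:
		return DEMO_FLUSH_RETRY_EXC_ERR;
	default:
		return DEMO_FLUSH_FATAL_ERR;
	}
}

/* Step 2: flush code to WC status; unknown codes fall back to fatal. */
static enum demo_wc demo_flush_to_wc(enum demo_flush code)
{
	switch (code) {
	case DEMO_FLUSH_GENERAL_ERR:
		return DEMO_WC_WR_FLUSH_ERR;
	case DEMO_FLUSH_RETRY_EXC_ERR:
		return DEMO_WC_RETRY_EXC_ERR;
	case DEMO_FLUSH_FATAL_ERR:
	default:
		return DEMO_WC_FATAL_ERR;
	}
}
```

Without a case for the retry event in step 1, the event would take the
default branch and surface to the consumer as a fatal error, which is
the misreporting the patch addresses.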



* [PATCH rdma-rc 4/4] RDMA/irdma: Report correct WC error when there are MW bind errors
  2021-09-16 19:12 [PATCH rdma-rc 0/4] irdma fixes Shiraz Saleem
                   ` (2 preceding siblings ...)
  2021-09-16 19:12 ` [PATCH rdma-rc 3/4] RDMA/irdma: Report correct WC error when transport retry counter is exceeded Shiraz Saleem
@ 2021-09-16 19:12 ` Shiraz Saleem
  2021-09-20 17:24 ` [PATCH rdma-rc 0/4] irdma fixes Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Shiraz Saleem @ 2021-09-16 19:12 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Sindhu Devale, Shiraz Saleem

From: Sindhu Devale <sindhu.devale@intel.com>

Report the correct WC error, IB_WC_MW_BIND_ERR, when the HW generates
asynchronous events related to MW bind errors.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/hw.c    | 5 +++++
 drivers/infiniband/hw/irdma/user.h  | 1 +
 drivers/infiniband/hw/irdma/verbs.c | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
index cb9a8e24e3b7..7de525a5ccf8 100644
--- a/drivers/infiniband/hw/irdma/hw.c
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -179,6 +179,11 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp,
 	case IRDMA_AE_LLP_TOO_MANY_RETRIES:
 		qp->flush_code = FLUSH_RETRY_EXC_ERR;
 		break;
+	case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS:
+	case IRDMA_AE_AMP_MWBIND_BIND_DISABLED:
+	case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS:
+		qp->flush_code = FLUSH_MW_BIND_ERR;
+		break;
 	default:
 		qp->flush_code = FLUSH_FATAL_ERR;
 		break;
diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h
index 267102d1049d..3dcbb1fbf2c6 100644
--- a/drivers/infiniband/hw/irdma/user.h
+++ b/drivers/infiniband/hw/irdma/user.h
@@ -103,6 +103,7 @@ enum irdma_flush_opcode {
 	FLUSH_LOC_LEN_ERR,
 	FLUSH_FATAL_ERR,
 	FLUSH_RETRY_EXC_ERR,
+	FLUSH_MW_BIND_ERR,
 };
 
 enum irdma_cmpl_status {
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index c7e129ee74d0..7110ebf834f9 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -3354,6 +3354,8 @@ static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode
 		return IB_WC_WR_FLUSH_ERR;
 	case FLUSH_RETRY_EXC_ERR:
 		return IB_WC_RETRY_EXC_ERR;
+	case FLUSH_MW_BIND_ERR:
+		return IB_WC_MW_BIND_ERR;
 	case FLUSH_FATAL_ERR:
 	default:
 		return IB_WC_FATAL_ERR;
-- 
2.27.0
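
Here three distinct events collapse into a single flush code by stacking
case labels. A minimal sketch of that pattern is given below; the
demo_* identifiers are hypothetical and carry no real hardware event
values.

```c
/* Hypothetical MW bind related events plus one unrelated event. */
enum demo_mw_ae {
	DEMO_AE_MWBIND_INVALID_RIGHTS,
	DEMO_AE_MWBIND_BIND_DISABLED,
	DEMO_AE_MWBIND_INVALID_BOUNDS,
	DEMO_AE_UNRELATED,
};

/* Hypothetical flush codes. */
enum demo_mw_flush {
	DEMO_FLUSH_MW_BIND_ERR,
	DEMO_FLUSH_MW_FATAL_ERR,
};

/* Stacked case labels share one body, so all three map to one code. */
static enum demo_mw_flush demo_mw_ae_to_flush(enum demo_mw_ae ae)
{
	switch (ae) {
	case DEMO_AE_MWBIND_INVALID_RIGHTS:
	case DEMO_AE_MWBIND_BIND_DISABLED:
	case DEMO_AE_MWBIND_INVALID_BOUNDS:
		return DEMO_FLUSH_MW_BIND_ERR;
	default:
		return DEMO_FLUSH_MW_FATAL_ERR;
	}
}
```

Grouping the events this way keeps the consumer-visible status coarse
(one bind error) while the specific cause remains available in the AE
code itself.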



* Re: [PATCH rdma-rc 0/4] irdma fixes
  2021-09-16 19:12 [PATCH rdma-rc 0/4] irdma fixes Shiraz Saleem
                   ` (3 preceding siblings ...)
  2021-09-16 19:12 ` [PATCH rdma-rc 4/4] RDMA/irdma: Report correct WC error when there are MW bind errors Shiraz Saleem
@ 2021-09-20 17:24 ` Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Jason Gunthorpe @ 2021-09-20 17:24 UTC (permalink / raw)
  To: Shiraz Saleem; +Cc: dledford, linux-rdma

On Thu, Sep 16, 2021 at 02:12:18PM -0500, Shiraz Saleem wrote:
> From: Shiraz <shiraz.saleem@intel.com>
> 
> This series contains a small set of irdma bug fixes for the 5.15 cycle.
> 
> Sindhu Devale (4):
>   RDMA/irdma: Skip CQP ring during a reset
>   RDMA/irdma: Validate number of CQ entries on create CQ
>   RDMA/irdma: Report correct WC error when transport retry counter is
>     exceeded
>   RDMA/irdma: Report correct WC error when there are MW bind errors

Applied to for-rc, thanks

Jason
