linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-next 0/4] Provide more error details when a QP moves to
@ 2022-09-07 11:37 Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome Patrisious Haddad
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-07 11:37 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Patrisious Haddad, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Israel Rukshin, Maor Gottlieb, Max Gurtovoy

The following series adds debug prints for fatal QP events that are
helpful for finding the root cause of the errors.
ib_get_qp_err_syndrome() is called from a work queue since the QP event
callback runs in interrupt context and can't sleep.

The function is especially useful for debugging for a few reasons:
First, it provides the information in a human-readable form, which makes
it easier to identify the root cause of a bug.
Second, it allows reporting vendor-specific error codes or information
that can be very useful to users who know them.
Last and most important, the function reports the reason the QP moved to
the error state even in cases where no CQE is generated; without this
feature it would be much harder to find the root cause of the error.

An example of such a case is a remote write with an RKEY violation: no
CQE is generated on the remote side, yet this print still reveals the
reason behind the failure.
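
For reference, below is a minimal sketch of how a ULP might consume the
new API (the my_queue/my_qp_error_work names are hypothetical; the
actual nvme-rdma wiring is in patch 4/4):

  /* work item: ib_get_qp_err_syndrome() may sleep, so run it here */
  static void my_qp_error_work(struct work_struct *work)
  {
  	struct my_queue *queue = container_of(work, struct my_queue,
  					      qp_err_work);
  	char err[IB_ERR_SYNDROME_LENGTH];

  	if (!ib_get_qp_err_syndrome(queue->qp, err))
  		pr_err("QP %u error syndrome: %s\n",
  		       queue->qp->qp_num, err);
  }

  /* QP event callback: atomic context, so defer the FW query to a work */
  static void my_qp_event(struct ib_event *event, void *context)
  {
  	struct my_queue *queue = context;

  	if (event->event == IB_EVENT_QP_FATAL ||
  	    event->event == IB_EVENT_QP_ACCESS_ERR)
  		queue_work(system_wq, &queue->qp_err_work);
  }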

Thanks.

Israel Rukshin (1):
  nvme-rdma: add more error details when a QP moves to an error state

Patrisious Haddad (3):
  net/mlx5: Introduce CQE error syndrome
  RDMA/core: Introduce ib_get_qp_err_syndrome function
  RDMA/mlx5: Implement ib_get_qp_err_syndrome

 drivers/infiniband/core/device.c     |  1 +
 drivers/infiniband/core/verbs.c      |  8 +++++
 drivers/infiniband/hw/mlx5/main.c    |  1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/qp.c      | 42 ++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx5/qp.h      |  2 +-
 drivers/infiniband/hw/mlx5/qpc.c     |  4 ++-
 drivers/nvme/host/rdma.c             | 24 ++++++++++++++
 include/linux/mlx5/mlx5_ifc.h        | 47 +++++++++++++++++++++++++---
 include/rdma/ib_verbs.h              | 13 ++++++++
 10 files changed, 135 insertions(+), 8 deletions(-)

-- 
2.18.1



* [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome
  2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
@ 2022-09-07 11:37 ` Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function Patrisious Haddad
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-07 11:37 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Patrisious Haddad, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Israel Rukshin, Maor Gottlieb, Max Gurtovoy

Introduce the CQE error syndrome bits, which reside inside
qp_context_extension and are used to report the reason the QP was moved
to the error state. This is useful for cases in which no CQE is
generated, such as a remote write RKEY violation.
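
Patch 3/4 is the consumer of these bits; condensed from that patch, the
expected usage is to check the two new capability bits and then read the
syndrome fields out of the QPC extension:

  if (!MLX5_CAP_GEN(mdev, qpc_extension) ||
      !MLX5_CAP_GEN(mdev, qp_error_syndrome))
  	return -EOPNOTSUPP;

  /* err_syn points at cqe_error_syndrome inside qp_context_extension */
  syndrome = MLX5_GET(cqe_error_syndrome, err_syn, syndrome);
  vendor   = MLX5_GET(cqe_error_syndrome, err_syn, vendor_error_syndrome);
  hw_type  = MLX5_GET(cqe_error_syndrome, err_syn, hw_syndrome_type);
  hw_err   = MLX5_GET(cqe_error_syndrome, err_syn, hw_error_syndrome);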

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/linux/mlx5/mlx5_ifc.h | 47 +++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 4acd5610e96b..1c3b258baaa7 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1441,7 +1441,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         null_mkey[0x1];
 	u8         log_max_klm_list_size[0x6];
 
-	u8         reserved_at_120[0xa];
+	u8         reserved_at_120[0x2];
+	u8	   qpc_extension[0x1];
+	u8	   reserved_at_123[0x7];
 	u8         log_max_ra_req_dc[0x6];
 	u8         reserved_at_130[0x9];
 	u8         vnic_env_cq_overrun[0x1];
@@ -1605,7 +1607,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         log_bf_reg_size[0x5];
 
-	u8         reserved_at_270[0x6];
+	u8         reserved_at_270[0x3];
+	u8	   qp_error_syndrome[0x1];
+	u8	   reserved_at_274[0x2];
 	u8         lag_dct[0x2];
 	u8         lag_tx_port_affinity[0x1];
 	u8         lag_native_fdb_selection[0x1];
@@ -5257,6 +5261,37 @@ struct mlx5_ifc_query_rmp_in_bits {
 	u8         reserved_at_60[0x20];
 };
 
+struct mlx5_ifc_cqe_error_syndrome_bits {
+	u8         hw_error_syndrome[0x8];
+	u8         hw_syndrome_type[0x4];
+	u8         reserved_at_c[0x4];
+	u8         vendor_error_syndrome[0x8];
+	u8         syndrome[0x8];
+};
+
+struct mlx5_ifc_qp_context_extension_bits {
+	u8         reserved_at_0[0x60];
+
+	struct mlx5_ifc_cqe_error_syndrome_bits error_syndrome;
+
+	u8         reserved_at_80[0x580];
+};
+
+struct mlx5_ifc_qpc_extension_and_pas_list_in_bits {
+	struct mlx5_ifc_qp_context_extension_bits qpc_data_extension;
+
+	u8         pas[0][0x40];
+};
+
+struct mlx5_ifc_qp_pas_list_in_bits {
+	struct mlx5_ifc_cmd_pas_bits pas[0];
+};
+
+union mlx5_ifc_qp_pas_or_qpc_ext_and_pas_bits {
+	struct mlx5_ifc_qp_pas_list_in_bits qp_pas_list;
+	struct mlx5_ifc_qpc_extension_and_pas_list_in_bits qpc_ext_and_pas_list;
+};
+
 struct mlx5_ifc_query_qp_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
@@ -5273,7 +5308,7 @@ struct mlx5_ifc_query_qp_out_bits {
 
 	u8         reserved_at_800[0x80];
 
-	u8         pas[][0x40];
+	union mlx5_ifc_qp_pas_or_qpc_ext_and_pas_bits qp_pas_or_qpc_ext_and_pas;
 };
 
 struct mlx5_ifc_query_qp_in_bits {
@@ -5283,7 +5318,8 @@ struct mlx5_ifc_query_qp_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         reserved_at_40[0x8];
+	u8         qpc_ext[0x1];
+	u8         reserved_at_41[0x7];
 	u8         qpn[0x18];
 
 	u8         reserved_at_60[0x20];
@@ -8417,7 +8453,8 @@ struct mlx5_ifc_create_qp_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         reserved_at_40[0x8];
+	u8         qpc_ext[0x1];
+	u8         reserved_at_41[0x7];
 	u8         input_qpn[0x18];
 
 	u8         reserved_at_60[0x20];
-- 
2.18.1



* [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function
  2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome Patrisious Haddad
@ 2022-09-07 11:37 ` Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome Patrisious Haddad
  2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
  3 siblings, 0 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-07 11:37 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Patrisious Haddad, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Israel Rukshin, Maor Gottlieb, Max Gurtovoy

Introduce the ib_get_qp_err_syndrome() function, which enables kernel
applications to query the reason a QP moved to the error state, even in
cases where no CQE was generated.
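
A minimal caller sketch: the call may sleep, the buffer must be at least
IB_ERR_SYNDROME_LENGTH bytes, and drivers that don't implement the new
op return -EOPNOTSUPP:

  char err[IB_ERR_SYNDROME_LENGTH];
  int ret;

  ret = ib_get_qp_err_syndrome(qp, err);
  if (!ret)
  	pr_err("QP %u error syndrome: %s\n", qp->qp_num, err);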

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/device.c |  1 +
 drivers/infiniband/core/verbs.c  |  8 ++++++++
 include/rdma/ib_verbs.h          | 13 +++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index ae60c73babcc..8235b8fa1100 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2657,6 +2657,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, get_netdev);
 	SET_DEVICE_OP(dev_ops, get_numa_node);
 	SET_DEVICE_OP(dev_ops, get_port_immutable);
+	SET_DEVICE_OP(dev_ops, get_qp_err_syndrome);
 	SET_DEVICE_OP(dev_ops, get_vector_affinity);
 	SET_DEVICE_OP(dev_ops, get_vf_config);
 	SET_DEVICE_OP(dev_ops, get_vf_guid);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e54b3f1b730e..ac20af8be33a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1952,6 +1952,14 @@ int ib_query_qp(struct ib_qp *qp,
 }
 EXPORT_SYMBOL(ib_query_qp);
 
+int ib_get_qp_err_syndrome(struct ib_qp *qp, char *str)
+{
+	return qp->device->ops.get_qp_err_syndrome ?
+		qp->device->ops.get_qp_err_syndrome(qp->real_qp,
+						    str) : -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(ib_get_qp_err_syndrome);
+
 int ib_close_qp(struct ib_qp *qp)
 {
 	struct ib_qp *real_qp;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 975d6e9efbcb..9a94f2ef993c 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2465,6 +2465,7 @@ struct ib_device_ops {
 			 int qp_attr_mask, struct ib_udata *udata);
 	int (*query_qp)(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 			int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+	int (*get_qp_err_syndrome)(struct ib_qp *qp, char *str);
 	int (*destroy_qp)(struct ib_qp *qp, struct ib_udata *udata);
 	int (*create_cq)(struct ib_cq *cq, const struct ib_cq_init_attr *attr,
 			 struct ib_udata *udata);
@@ -3777,6 +3778,18 @@ int ib_query_qp(struct ib_qp *qp,
 		int qp_attr_mask,
 		struct ib_qp_init_attr *qp_init_attr);
 
+#define IB_ERR_SYNDROME_LENGTH 256
+
+/**
+ * ib_get_qp_err_syndrome - Returns a string that describes the reason
+ * the specified QP moved to error state.
+ * @qp : The QP to query.
+ * @str: The reason the qp moved to error state.
+ *
+ * NOTE: the user must pass a str with size of at least IB_ERR_SYNDROME_LENGTH
+ */
+int ib_get_qp_err_syndrome(struct ib_qp *qp, char *str);
+
 /**
  * ib_destroy_qp - Destroys the specified QP.
  * @qp: The QP to destroy.
-- 
2.18.1



* [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome
  2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome Patrisious Haddad
  2022-09-07 11:37 ` [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function Patrisious Haddad
@ 2022-09-07 11:37 ` Patrisious Haddad
  2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
  3 siblings, 0 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-07 11:37 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Patrisious Haddad, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Israel Rukshin, Maor Gottlieb, Max Gurtovoy

Implement ib_get_qp_err_syndrome() using a QUERY_QP firmware call and
return the result as a human-readable string.
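
Per the scnprintf() format in the hunk below, the returned string has
the form "<wc status string> (0x<vendor syndrome> 0x<hw syndrome type>
0x<hw error syndrome>)", for example (syndrome values purely
illustrative):

  remote access error (0x88 0x2 0x13)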

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/main.c    |  1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/qp.c      | 42 +++++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx5/qp.h      |  2 +-
 drivers/infiniband/hw/mlx5/qpc.c     |  4 ++-
 5 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 7c40efae96a3..c18d3e36542b 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3716,6 +3716,7 @@ static const struct ib_device_ops mlx5_ib_dev_ops = {
 	.get_dev_fw_str = get_dev_fw_str,
 	.get_dma_mr = mlx5_ib_get_dma_mr,
 	.get_link_layer = mlx5_ib_port_link_layer,
+	.get_qp_err_syndrome = mlx5_ib_get_qp_err_syndrome,
 	.map_mr_sg = mlx5_ib_map_mr_sg,
 	.map_mr_sg_pi = mlx5_ib_map_mr_sg_pi,
 	.mmap = mlx5_ib_mmap,
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 2e2ad3918385..bbd414cbd695 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1232,6 +1232,7 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		      int attr_mask, struct ib_udata *udata);
 int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr_mask,
 		     struct ib_qp_init_attr *qp_init_attr);
+int mlx5_ib_get_qp_err_syndrome(struct ib_qp *ibqp, char *str);
 int mlx5_ib_destroy_qp(struct ib_qp *qp, struct ib_udata *udata);
 void mlx5_ib_drain_sq(struct ib_qp *qp);
 void mlx5_ib_drain_rq(struct ib_qp *qp);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 40d9410ec303..7cf2fe549b9a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4806,7 +4806,8 @@ static int query_qp_attr(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	if (!outb)
 		return -ENOMEM;
 
-	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen);
+	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen,
+				 false);
 	if (err)
 		goto out;
 
@@ -4992,6 +4993,45 @@ int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 	return err;
 }
 
+int mlx5_ib_get_qp_err_syndrome(struct ib_qp *ibqp, char *str)
+{
+	struct mlx5_ib_dev *dev = to_mdev(ibqp->device);
+	int outlen = MLX5_ST_SZ_BYTES(query_qp_out);
+	struct mlx5_ib_qp *qp = to_mqp(ibqp);
+	void *pas_ext_union, *err_syn;
+	u32 *outb;
+	int err;
+
+	if (!MLX5_CAP_GEN(dev->mdev, qpc_extension) ||
+	    !MLX5_CAP_GEN(dev->mdev, qp_error_syndrome))
+		return -EOPNOTSUPP;
+
+	outb = kzalloc(outlen, GFP_KERNEL);
+	if (!outb)
+		return -ENOMEM;
+
+	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen,
+				 true);
+	if (err)
+		goto out;
+
+	pas_ext_union =
+		MLX5_ADDR_OF(query_qp_out, outb, qp_pas_or_qpc_ext_and_pas);
+	err_syn = MLX5_ADDR_OF(qpc_extension_and_pas_list_in, pas_ext_union,
+			       qpc_data_extension.error_syndrome);
+
+	scnprintf(str, IB_ERR_SYNDROME_LENGTH, "%s (0x%x 0x%x 0x%x)\n",
+		  ib_wc_status_msg(
+			  MLX5_GET(cqe_error_syndrome, err_syn, syndrome)),
+		  MLX5_GET(cqe_error_syndrome, err_syn, vendor_error_syndrome),
+		  MLX5_GET(cqe_error_syndrome, err_syn, hw_syndrome_type),
+		  MLX5_GET(cqe_error_syndrome, err_syn, hw_error_syndrome));
+
+out:
+	kfree(outb);
+	return err;
+}
+
 int mlx5_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(ibxrcd->device);
diff --git a/drivers/infiniband/hw/mlx5/qp.h b/drivers/infiniband/hw/mlx5/qp.h
index 5d4e140db99c..8d792ca00b32 100644
--- a/drivers/infiniband/hw/mlx5/qp.h
+++ b/drivers/infiniband/hw/mlx5/qp.h
@@ -20,7 +20,7 @@ int mlx5_core_qp_modify(struct mlx5_ib_dev *dev, u16 opcode, u32 opt_param_mask,
 int mlx5_core_destroy_qp(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp);
 int mlx5_core_destroy_dct(struct mlx5_ib_dev *dev, struct mlx5_core_dct *dct);
 int mlx5_core_qp_query(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp,
-		       u32 *out, int outlen);
+		       u32 *out, int outlen, bool qpc_ext);
 int mlx5_core_dct_query(struct mlx5_ib_dev *dev, struct mlx5_core_dct *dct,
 			u32 *out, int outlen);
 
diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c
index 542e4c63a8de..7a1854aab441 100644
--- a/drivers/infiniband/hw/mlx5/qpc.c
+++ b/drivers/infiniband/hw/mlx5/qpc.c
@@ -504,12 +504,14 @@ void mlx5_cleanup_qp_table(struct mlx5_ib_dev *dev)
 }
 
 int mlx5_core_qp_query(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp,
-		       u32 *out, int outlen)
+		       u32 *out, int outlen, bool qpc_ext)
 {
 	u32 in[MLX5_ST_SZ_DW(query_qp_in)] = {};
 
 	MLX5_SET(query_qp_in, in, opcode, MLX5_CMD_OP_QUERY_QP);
 	MLX5_SET(query_qp_in, in, qpn, qp->qpn);
+	MLX5_SET(query_qp_in, in, qpc_ext, qpc_ext);
+
 	return mlx5_cmd_exec(dev->mdev, in, sizeof(in), out, outlen);
 }
 
-- 
2.18.1



* [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 11:37 [PATCH rdma-next 0/4] Provide more error details when a QP moves to Patrisious Haddad
                   ` (2 preceding siblings ...)
  2022-09-07 11:37 ` [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome Patrisious Haddad
@ 2022-09-07 11:38 ` Patrisious Haddad
  2022-09-07 12:02   ` Christoph Hellwig
  2022-09-07 12:34   ` Sagi Grimberg
  3 siblings, 2 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-07 11:38 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: Israel Rukshin, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Maor Gottlieb, Max Gurtovoy

From: Israel Rukshin <israelr@nvidia.com>

Add debug prints for fatal QP events that are helpful for finding the
root cause of the errors. The ib_get_qp_err_syndrome is called at
a work queue since the QP event callback is running on an
interrupt context that can't sleep.

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/nvme/host/rdma.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3100643be299..7e56c0dbe8ea 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -99,6 +99,7 @@ struct nvme_rdma_queue {
 	bool			pi_support;
 	int			cq_size;
 	struct mutex		queue_lock;
+	struct work_struct	qp_err_work;
 };
 
 struct nvme_rdma_ctrl {
@@ -237,11 +238,31 @@ static struct nvme_rdma_qe *nvme_rdma_alloc_ring(struct ib_device *ibdev,
 	return NULL;
 }
 
+static void nvme_rdma_qp_error_work(struct work_struct *work)
+{
+	struct nvme_rdma_queue *queue = container_of(work,
+			struct nvme_rdma_queue, qp_err_work);
+	int ret;
+	char err[IB_ERR_SYNDROME_LENGTH];
+
+	ret = ib_get_qp_err_syndrome(queue->qp, err);
+	if (ret)
+		return;
+
+	pr_err("Queue %d got QP error syndrome %s\n",
+	       nvme_rdma_queue_idx(queue), err);
+}
+
 static void nvme_rdma_qp_event(struct ib_event *event, void *context)
 {
+	struct nvme_rdma_queue *queue = context;
+
 	pr_debug("QP event %s (%d)\n",
 		 ib_event_msg(event->event), event->event);
 
+	if (event->event == IB_EVENT_QP_FATAL ||
+	    event->event == IB_EVENT_QP_ACCESS_ERR)
+		queue_work(nvme_wq, &queue->qp_err_work);
 }
 
 static int nvme_rdma_wait_for_cm(struct nvme_rdma_queue *queue)
@@ -261,7 +282,9 @@ static int nvme_rdma_create_qp(struct nvme_rdma_queue *queue, const int factor)
 	struct ib_qp_init_attr init_attr;
 	int ret;
 
+	INIT_WORK(&queue->qp_err_work, nvme_rdma_qp_error_work);
 	memset(&init_attr, 0, sizeof(init_attr));
+	init_attr.qp_context = queue;
 	init_attr.event_handler = nvme_rdma_qp_event;
 	/* +1 for drain */
 	init_attr.cap.max_send_wr = factor * queue->queue_size + 1;
@@ -434,6 +457,7 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
 		ib_mr_pool_destroy(queue->qp, &queue->qp->sig_mrs);
 	ib_mr_pool_destroy(queue->qp, &queue->qp->rdma_mrs);
 
+	flush_work(&queue->qp_err_work);
 	/*
 	 * The cm_id object might have been destroyed during RDMA connection
 	 * establishment error flow to avoid getting other cma events, thus
-- 
2.18.1



* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
@ 2022-09-07 12:02   ` Christoph Hellwig
  2022-09-07 12:11     ` Leon Romanovsky
  2022-09-07 12:34   ` Sagi Grimberg
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2022-09-07 12:02 UTC (permalink / raw)
  To: Patrisious Haddad
  Cc: Sagi Grimberg, Christoph Hellwig, Israel Rukshin,
	Leon Romanovsky, Linux-nvme, linux-rdma, Michael Guralnik,
	Maor Gottlieb, Max Gurtovoy

On Wed, Sep 07, 2022 at 02:38:00PM +0300, Patrisious Haddad wrote:
> From: Israel Rukshin <israelr@nvidia.com>
> 
> Add debug prints for fatal QP events that are helpful for finding the
> root cause of the errors. The ib_get_qp_err_syndrome is called at
> a work queue since the QP event callback is running on an
> interrupt context that can't sleep.

What an awkward interface.  What prevents us from allowing 
ib_get_qp_err_syndrome to be called from arbitrary calling contexts,
or even better just delivering the error directly as part of the
event?


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 12:02   ` Christoph Hellwig
@ 2022-09-07 12:11     ` Leon Romanovsky
  0 siblings, 0 replies; 16+ messages in thread
From: Leon Romanovsky @ 2022-09-07 12:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Patrisious Haddad, Sagi Grimberg, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy

On Wed, Sep 07, 2022 at 02:02:00PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 02:38:00PM +0300, Patrisious Haddad wrote:
> > From: Israel Rukshin <israelr@nvidia.com>
> > 
> > Add debug prints for fatal QP events that are helpful for finding the
> > root cause of the errors. The ib_get_qp_err_syndrome is called at
> > a work queue since the QP event callback is running on an
> > interrupt context that can't sleep.
> 
> What an awkward interface.  What prevents us from allowing 
> ib_get_qp_err_syndrome to be called from arbitrary calling contexts,
> or even better just delivering the error directly as part of the
> event?

We need to call our FW through the command interface, and unfortunately
that is not possible in atomic context.

Thanks


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 11:38 ` [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Patrisious Haddad
  2022-09-07 12:02   ` Christoph Hellwig
@ 2022-09-07 12:34   ` Sagi Grimberg
  2022-09-07 12:51     ` Leon Romanovsky
  1 sibling, 1 reply; 16+ messages in thread
From: Sagi Grimberg @ 2022-09-07 12:34 UTC (permalink / raw)
  To: Patrisious Haddad, Christoph Hellwig
  Cc: Israel Rukshin, Leon Romanovsky, Linux-nvme, linux-rdma,
	Michael Guralnik, Maor Gottlieb, Max Gurtovoy


> From: Israel Rukshin <israelr@nvidia.com>
> 
> Add debug prints for fatal QP events that are helpful for finding the
> root cause of the errors. The ib_get_qp_err_syndrome is called at
> a work queue since the QP event callback is running on an
> interrupt context that can't sleep.
> 
> Signed-off-by: Israel Rukshin <israelr@nvidia.com>
> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

What makes nvme-rdma special here? Why do you get this in
nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?

This entire code needs to move to the rdma core instead
of being leaked to ulps.


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 12:34   ` Sagi Grimberg
@ 2022-09-07 12:51     ` Leon Romanovsky
  2022-09-07 15:16       ` Sagi Grimberg
  0 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2022-09-07 12:51 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Patrisious Haddad, Christoph Hellwig, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy

On Wed, Sep 07, 2022 at 03:34:21PM +0300, Sagi Grimberg wrote:
> 
> > From: Israel Rukshin <israelr@nvidia.com>
> > 
> > Add debug prints for fatal QP events that are helpful for finding the
> > root cause of the errors. The ib_get_qp_err_syndrome is called at
> > a work queue since the QP event callback is running on an
> > interrupt context that can't sleep.
> > 
> > Signed-off-by: Israel Rukshin <israelr@nvidia.com>
> > Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> 
> What makes nvme-rdma special here? Why do you get this in
> nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?
> 
> This entire code needs to move to the rdma core instead
> of being leaked to ulps.

We can move, but you will lose connection between queue number,
caller and error itself.

As I answered Christoph, we will need to execute the query QP command
in a workqueue, outside of the event handler.

So you will get a print about the queue being in error state, and later
you will see the parsed error print somewhere in dmesg.

Thanks


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 12:51     ` Leon Romanovsky
@ 2022-09-07 15:16       ` Sagi Grimberg
  2022-09-07 15:18         ` Christoph Hellwig
  2022-09-07 17:29         ` Leon Romanovsky
  0 siblings, 2 replies; 16+ messages in thread
From: Sagi Grimberg @ 2022-09-07 15:16 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Patrisious Haddad, Christoph Hellwig, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy


>>> From: Israel Rukshin <israelr@nvidia.com>
>>>
>>> Add debug prints for fatal QP events that are helpful for finding the
>>> root cause of the errors. The ib_get_qp_err_syndrome is called at
>>> a work queue since the QP event callback is running on an
>>> interrupt context that can't sleep.
>>>
>>> Signed-off-by: Israel Rukshin <israelr@nvidia.com>
>>> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
>>
>> What makes nvme-rdma special here? Why do you get this in
>> nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?
>>
>> This entire code needs to move to the rdma core instead
>> of being leaked to ulps.
> 
> We can move, but you will lose connection between queue number,
> caller and error itself.

That still doesn't explain why nvme-rdma is special.

In any event, the ulp can log the qpn so the context can be interrogated
if that is important.


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 15:16       ` Sagi Grimberg
@ 2022-09-07 15:18         ` Christoph Hellwig
  2022-09-07 17:39           ` Leon Romanovsky
  2022-09-08  7:55           ` Patrisious Haddad
  2022-09-07 17:29         ` Leon Romanovsky
  1 sibling, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2022-09-07 15:18 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Leon Romanovsky, Patrisious Haddad, Christoph Hellwig,
	Israel Rukshin, Linux-nvme, linux-rdma, Michael Guralnik,
	Maor Gottlieb, Max Gurtovoy

On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
>>>
>>> This entire code needs to move to the rdma core instead
>>> of being leaked to ulps.
>>
>> We can move, but you will lose connection between queue number,
>> caller and error itself.
>
> That still doesn't explain why nvme-rdma is special.
>
> In any event, the ulp can log the qpn so the context can be interrogated
> if that is important.

I also don't see why the QP event handler can't be called
from user context to start with.  I see absolutely no reason to
add boilerplate code to drivers for reporting slightly more verbose
errors on one specific piece of hardware.  I'd say clean up the mess
that is the QP event handler first, and then once error reporting
becomes trivial we can just do it.


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 15:16       ` Sagi Grimberg
  2022-09-07 15:18         ` Christoph Hellwig
@ 2022-09-07 17:29         ` Leon Romanovsky
  1 sibling, 0 replies; 16+ messages in thread
From: Leon Romanovsky @ 2022-09-07 17:29 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Patrisious Haddad, Christoph Hellwig, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy

On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
> 
> > > > From: Israel Rukshin <israelr@nvidia.com>
> > > > 
> > > > Add debug prints for fatal QP events that are helpful for finding the
> > > > root cause of the errors. The ib_get_qp_err_syndrome is called at
> > > > a work queue since the QP event callback is running on an
> > > > interrupt context that can't sleep.
> > > > 
> > > > Signed-off-by: Israel Rukshin <israelr@nvidia.com>
> > > > Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> > > 
> > > What makes nvme-rdma special here? Why do you get this in
> > > nvme-rdma and not srp/iser/nfs-rdma/rds/smc/ipoib etc?
> > > 
> > > This entire code needs to move to the rdma core instead
> > > of being leaked to ulps.
> > 
> > We can move, but you will lose connection between queue number,
> > caller and error itself.
> 
> That still doesn't explain why nvme-rdma is special.

It was important for us to get a proper review from at least one ULP;
nvme-rdma is not special at all.

> 
> In any event, the ulp can log the qpn so the context can be interrogated
> if that is important.

ok


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 15:18         ` Christoph Hellwig
@ 2022-09-07 17:39           ` Leon Romanovsky
  2022-11-01  9:12             ` Mark Zhang
  2022-09-08  7:55           ` Patrisious Haddad
  1 sibling, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2022-09-07 17:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Patrisious Haddad, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy,
	Chuck Lever

On Wed, Sep 07, 2022 at 05:18:18PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
> >>>
> >>> This entire code needs to move to the rdma core instead
> >>> of being leaked to ulps.
> >>
> >> We can move, but you will lose connection between queue number,
> >> caller and error itself.
> >
> > That still doesn't explain why nvme-rdma is special.
> >
> > In any event, the ulp can log the qpn so the context can be interrogated
> > if that is important.
> 
> I also don't see why the QP event handler can't be called
> from user context to start with.  I see absolutely no reason to
> add boilerplate code to drivers for reporting slightly more verbose
> errors on one specific piece of hardware.  I'd say clean up the mess
> that is the QP event handler first, and then once error reporting
> becomes trivial we can just do it.

I don't know, Chuck documented it in 2018:
eb93c82ed8c7 ("RDMA/core: Document QP @event_handler function")

  1164 struct ib_qp_init_attr {
  1165         /* Consumer's event_handler callback must not block */
  1166         void                  (*event_handler)(struct ib_event *, void *);

Thanks


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 15:18         ` Christoph Hellwig
  2022-09-07 17:39           ` Leon Romanovsky
@ 2022-09-08  7:55           ` Patrisious Haddad
  1 sibling, 0 replies; 16+ messages in thread
From: Patrisious Haddad @ 2022-09-08  7:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Leon Romanovsky, Israel Rukshin, Linux-nvme, linux-rdma,
	Michael Guralnik, Maor Gottlieb, Max Gurtovoy, Sagi Grimberg


On 07/09/2022 18:18, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
>>>>
>>>> This entire code needs to move to the rdma core instead
>>>> of being leaked to ulps.
>>>
>>> We can move, but you will lose connection between queue number,
>>> caller and error itself.
>>
>> That still doesn't explain why nvme-rdma is special.
>>
>> In any event, the ulp can log the qpn so the context can be interrogated
>> if that is important.
> 
> I also don't see why the QP event handler can't be called
> from user context to start with.  I see absolutely no reason to
> add boilerplate code to drivers for reporting slightly more verbose
> errors on one specific piece of hardware.  I'd say clean up the mess
> that is the QP event handler first, and then once error reporting
> becomes trivial we can just do it.

I would like to emphasize that this is not just about a slightly more
verbose error; it is mainly about an error that wouldn't have been
reported at all without this feature. As I previously mentioned, in
error cases where the remote side doesn't generate a CQE, the remote
side wouldn't even know why the QP was moved to the error state.


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-09-07 17:39           ` Leon Romanovsky
@ 2022-11-01  9:12             ` Mark Zhang
  2022-11-02  1:56               ` Mark Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Mark Zhang @ 2022-11-01  9:12 UTC (permalink / raw)
  To: Leon Romanovsky, Christoph Hellwig
  Cc: Sagi Grimberg, Patrisious Haddad, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy,
	Chuck Lever

On 9/8/2022 1:39 AM, Leon Romanovsky wrote:
> On Wed, Sep 07, 2022 at 05:18:18PM +0200, Christoph Hellwig wrote:
>> On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
>>>>>
>>>>> This entire code needs to move to the rdma core instead
>>>>> of being leaked to ulps.
>>>>
>>>> We can move, but you will lose connection between queue number,
>>>> caller and error itself.
>>>
>>> That still doesn't explain why nvme-rdma is special.
>>>
>>> In any event, the ulp can log the qpn so the context can be interrogated
>>> if that is important.
>>
>> I also don't see why the QP event handler can't be called
>> from user context to start with.  I see absolutely no reason to
>> add boilerplate code to drivers for reporting slightly more verbose
>> errors on one specific piece of hardware.  I'd say clean up the mess
>> that is the QP event handler first, and then once error reporting
>> becomes trivial we can just do it.
> 
> I don't know, Chuck documented it in 2018:
> eb93c82ed8c7 ("RDMA/core: Document QP @event_handler function")
> 
>    1164 struct ib_qp_init_attr {
>    1165         /* Consumer's event_handler callback must not block */
>    1166         void                  (*event_handler)(struct ib_event *, void *);

Looks like driver calls it in an atomic way, e.g.:
   mlx5_ib_qp_event() -> ibqp->event_handler(&event, ibqp->qp_context);

Could driver also report it as an IB async event, as WQ_CATAS_ERROR is 
defined as an async event (IB spec C11-39), and QP_FATAL is also an 
event of enum ib_event_type? Is it a problem that one event is reported 
twice?

If it is acceptable then ulp could register this event handler with 
ib_register_event_handler(), which is non-atomic.
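
If the driver did also deliver these as device-level async events, the
ULP-side registration would look roughly like the sketch below (whether
QP_FATAL/QP_ACCESS_ERR are actually dispatched that way is exactly the
open question above; 'handler' is a struct ib_event_handler owned by
the ULP):

  static void my_dev_event(struct ib_event_handler *handler,
  			   struct ib_event *event)
  {
  	if (event->event == IB_EVENT_QP_FATAL ||
  	    event->event == IB_EVENT_QP_ACCESS_ERR)
  		pr_err("QP %u: %s\n", event->element.qp->qp_num,
  		       ib_event_msg(event->event));
  }

  /* e.g. at device/queue setup time */
  INIT_IB_EVENT_HANDLER(&handler, device, my_dev_event);
  ib_register_event_handler(&handler);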


* Re: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state
  2022-11-01  9:12             ` Mark Zhang
@ 2022-11-02  1:56               ` Mark Zhang
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Zhang @ 2022-11-02  1:56 UTC (permalink / raw)
  To: Leon Romanovsky, Christoph Hellwig
  Cc: Sagi Grimberg, Patrisious Haddad, Israel Rukshin, Linux-nvme,
	linux-rdma, Michael Guralnik, Maor Gottlieb, Max Gurtovoy,
	Chuck Lever

On 11/1/2022 5:12 PM, Mark Zhang wrote:
> On 9/8/2022 1:39 AM, Leon Romanovsky wrote:
>> On Wed, Sep 07, 2022 at 05:18:18PM +0200, Christoph Hellwig wrote:
>>> On Wed, Sep 07, 2022 at 06:16:05PM +0300, Sagi Grimberg wrote:
>>>>>>
>>>>>> This entire code needs to move to the rdma core instead
>>>>>> of being leaked to ulps.
>>>>>
>>>>> We can move, but you will lose connection between queue number,
>>>>> caller and error itself.
>>>>
>>>> That still doesn't explain why nvme-rdma is special.
>>>>
>>>> In any event, the ulp can log the qpn so the context can be interrogated
>>>> if that is important.
>>>
>>> I also don't see why the QP event handler can't be called
>>> from user context to start with.  I see absolutely no reason to
>>> add boilerplate code to drivers for reporting slightly more verbose
>>> errors on one specific piece of hardware.  I'd say clean up the mess
>>> that is the QP event handler first, and then once error reporting
>>> becomes trivial we can just do it.
>>
>> I don't know, Chuck documented it in 2018:
>> eb93c82ed8c7 ("RDMA/core: Document QP @event_handler function")
>>
>>    1164 struct ib_qp_init_attr {
>>    1165         /* Consumer's event_handler callback must not block */
>>    1166         void                  (*event_handler)(struct ib_event *, void *);
> 
> Looks like driver calls it in an atomic way, e.g.:
>    mlx5_ib_qp_event() -> ibqp->event_handler(&event, ibqp->qp_context);
> 
> Could driver also report it as an IB async event, as WQ_CATAS_ERROR is
> defined as an async event (IB spec C11-39), and QP_FATAL is also an
> event of enum ib_event_type? Is it a problem that one event is reported
> twice?
> 
> If it is acceptable then ulp could register this event handler with
> ib_register_event_handler(), which is non-atomic.

Or move the QP event handler to a non-atomic context, as Christoph
suggested? That would mean fixing the mlx4/mlx5 drivers to call it from
a work queue.
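
A rough sketch of that driver-side direction (not actual mlx4/mlx5 code,
just the shape of the change): the low-level driver would capture the
event in atomic context and invoke the ULP's event_handler from a work
item, where sleeping is allowed:

  struct deferred_qp_event {
  	struct work_struct work;
  	struct ib_qp *ibqp;
  	struct ib_event event;
  };

  static void qp_event_work(struct work_struct *work)
  {
  	struct deferred_qp_event *d =
  		container_of(work, struct deferred_qp_event, work);

  	/* the ULP callback may now sleep, e.g. to query the FW */
  	d->ibqp->event_handler(&d->event, d->ibqp->qp_context);
  	kfree(d);
  }

The atomic notification path would then just allocate this context with
GFP_ATOMIC, fill it in and queue_work() it.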

