linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code
@ 2020-06-02 12:55 Leon Romanovsky
  2020-06-02 12:55 ` [PATCH rdma-next 1/3] RDMA/mlx5: Return an error if copy_to_user fails Leon Romanovsky
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Leon Romanovsky @ 2020-06-02 12:55 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

This are three patches which I would ask to consider for current PR
to Linus.

First two patches are fixing corner cases that are not possible
in the real life and the third patch is needed to ensure that API
toward user space is feature complete and won't need any extra
ucontext flags.

The code of third patch was already presented in v0 and v1,
but was dropped later.

Leon Romanovsky (3):
  RDMA/mlx5: Return an error if copy_to_user fails
  RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  RDMA/mlx5: Return ECE DC support

 drivers/infiniband/hw/mlx5/qp.c | 59 +++++++++++++++++++++------------
 include/linux/mlx5/mlx5_ifc.h   |  5 +--
 2 files changed, 40 insertions(+), 24 deletions(-)

--
2.26.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH rdma-next 1/3] RDMA/mlx5: Return an error if copy_to_user fails
  2020-06-02 12:55 [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Leon Romanovsky
@ 2020-06-02 12:55 ` Leon Romanovsky
  2020-06-02 12:55 ` [PATCH rdma-next 2/3] RDMA/mlx5: Don't rely on FW to set zeros in ECE response Leon Romanovsky
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2020-06-02 12:55 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang

From: Leon Romanovsky <leonro@mellanox.com>

In theoretical event, the ib_copy_to_udata() can fail, so
return -EFAULT error to the user, so he will destroy the QP.

Fixes: 50aec2c3135e ("RDMA/mlx5: Return ECE data after modify QP")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/qp.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index bc776542efd3..e8427231cf15 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4305,12 +4305,8 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	/* resp.response_length is set in ECE supported flows only */
 	if (!err && resp.response_length &&
 	    udata->outlen >= resp.response_length)
-		/*
-		 * We don't check return value of the function below
-		 * on purpose, because it is unclear how to unwind the
-		 * error flow after QP was modified to the new state.
-		 */
-		ib_copy_to_udata(udata, &resp, resp.response_length);
+		/* Return -EFAULT to the user and expect him to destroy QP. */
+		err = ib_copy_to_udata(udata, &resp, resp.response_length);
 
 out:
 	mutex_unlock(&qp->mutex);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH rdma-next 2/3] RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  2020-06-02 12:55 [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Leon Romanovsky
  2020-06-02 12:55 ` [PATCH rdma-next 1/3] RDMA/mlx5: Return an error if copy_to_user fails Leon Romanovsky
@ 2020-06-02 12:55 ` Leon Romanovsky
  2020-06-02 12:55 ` [PATCH mlx5-next 3/3] RDMA/mlx5: Return ECE DC support Leon Romanovsky
  2020-06-03 18:48 ` [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2020-06-02 12:55 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang

From: Leon Romanovsky <leonro@mellanox.com>

The FW returns zeros in case feature is not enabled, but it is better
to have the capability check and ensure that returned result is cleared.

Fixes: 3e09a427ae7a ("RDMA/mlx5: Get ECE options from FW during create QP")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/qp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index e8427231cf15..494e305cf761 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -1906,7 +1906,8 @@ static int create_xrc_tgt_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 
 	base->container_mibqp = qp;
 	base->mqp.event = mlx5_ib_qp_event;
-	params->resp.ece_options = MLX5_GET(create_qp_out, out, ece);
+	if (MLX5_CAP_GEN(mdev, ece_support))
+		params->resp.ece_options = MLX5_GET(create_qp_out, out, ece);
 
 	spin_lock_irqsave(&dev->reset_flow_resource_lock, flags);
 	list_add_tail(&qp->qps_list, &dev->qp_list);
@@ -2082,7 +2083,8 @@ static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 
 	base->container_mibqp = qp;
 	base->mqp.event = mlx5_ib_qp_event;
-	params->resp.ece_options = MLX5_GET(create_qp_out, out, ece);
+	if (MLX5_CAP_GEN(mdev, ece_support))
+		params->resp.ece_options = MLX5_GET(create_qp_out, out, ece);
 
 	get_cqs(qp->type, init_attr->send_cq, init_attr->recv_cq,
 		&send_cq, &recv_cq);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH mlx5-next 3/3] RDMA/mlx5: Return ECE DC support
  2020-06-02 12:55 [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Leon Romanovsky
  2020-06-02 12:55 ` [PATCH rdma-next 1/3] RDMA/mlx5: Return an error if copy_to_user fails Leon Romanovsky
  2020-06-02 12:55 ` [PATCH rdma-next 2/3] RDMA/mlx5: Don't rely on FW to set zeros in ECE response Leon Romanovsky
@ 2020-06-02 12:55 ` Leon Romanovsky
  2020-06-03 18:48 ` [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2020-06-02 12:55 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma

From: Leon Romanovsky <leonro@mellanox.com>

The DC QPs are many-to-one QP types that means that first connection
will establish ECE options that coming connections should follow.
Due to this property, the ECE code was removed between first [1] and
second [2] ECE submissions.

This patch returns the dropped code, because ECE is a property of a
connection and like any other connection users are needed to manage
this data. Allow them to set ECE parameter for DC too and avoid need
of having compatibility flag for the DC ECE.

[1]
https://lore.kernel.org/linux-rdma/20200523132243.817936-1-leon@kernel.org/
[2]
https://lore.kernel.org/linux-rdma/20200525174401.71152-1-leon@kernel.org/

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/qp.c | 45 +++++++++++++++++++++++----------
 include/linux/mlx5/mlx5_ifc.h   |  5 ++--
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 494e305cf761..d74c664bedc9 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -2404,7 +2404,8 @@ static void destroy_qp_common(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	destroy_qp(dev, qp, base, udata);
 }

-static int create_dct(struct ib_pd *pd, struct mlx5_ib_qp *qp,
+static int create_dct(struct mlx5_ib_dev *dev, struct ib_pd *pd,
+		      struct mlx5_ib_qp *qp,
 		      struct mlx5_create_qp_params *params)
 {
 	struct ib_qp_init_attr *attr = params->attr;
@@ -2423,6 +2424,8 @@ static int create_dct(struct ib_pd *pd, struct mlx5_ib_qp *qp,
 	MLX5_SET(dctc, dctc, cqn, to_mcq(attr->recv_cq)->mcq.cqn);
 	MLX5_SET64(dctc, dctc, dc_access_key, ucmd->access_key);
 	MLX5_SET(dctc, dctc, user_index, uidx);
+	if (MLX5_CAP_GEN(dev->mdev, ece_support))
+		MLX5_SET(dctc, dctc, ece, ucmd->ece_options);

 	if (qp->flags_en & MLX5_QP_FLAG_SCATTER_CQE) {
 		int rcqe_sz = mlx5_ib_get_cqe_size(attr->recv_cq);
@@ -2768,7 +2771,7 @@ static int create_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 	}

 	if (qp->type == MLX5_IB_QPT_DCT) {
-		err = create_dct(pd, qp, params);
+		err = create_dct(dev, pd, qp, params);
 		goto out;
 	}

@@ -2882,9 +2885,8 @@ static int check_ucmd_data(struct mlx5_ib_dev *dev,
 		 */
 		last = sizeof(struct mlx5_ib_create_qp_rss);
 	else
-		/* IB_QPT_RAW_PACKET and IB_QPT_DRIVER don't have ECE data */
+		/* IB_QPT_RAW_PACKET doesn't have ECE data */
 		switch (attr->qp_type) {
-		case IB_QPT_DRIVER:
 		case IB_QPT_RAW_PACKET:
 			last = offsetof(struct mlx5_ib_create_qp, ece_options);
 			break;
@@ -4095,7 +4097,8 @@ static bool modify_dci_qp_is_ok(enum ib_qp_state cur_state, enum ib_qp_state new
  * Other transitions and attributes are illegal
  */
 static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr,
-			      int attr_mask, struct ib_udata *udata)
+			      int attr_mask, struct mlx5_ib_modify_qp *ucmd,
+			      struct ib_udata *udata)
 {
 	struct mlx5_ib_qp *qp = to_mqp(ibqp);
 	struct mlx5_ib_dev *dev = to_mdev(ibqp->device);
@@ -4111,6 +4114,15 @@ static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	new_state = attr->qp_state;

 	dctc = MLX5_ADDR_OF(create_dct_in, qp->dct.in, dct_context_entry);
+	if (MLX5_CAP_GEN(dev->mdev, ece_support) && ucmd->ece_options)
+		/*
+		 * DCT doesn't initialize QP till modify command is executed,
+		 * so we need to overwrite previously set ECE field if user
+		 * provided any value except zero, which means not set/not
+		 * valid.
+		 */
+		MLX5_SET(dctc, dctc, ece, ucmd->ece_options);
+
 	if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) {
 		u16 set_id;

@@ -4145,14 +4157,21 @@ static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		MLX5_SET(dctc, dctc, counter_set_id, set_id);
 	} else if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) {
 		struct mlx5_ib_modify_qp_resp resp = {};
-		u32 out[MLX5_ST_SZ_DW(create_dct_out)] = {0};
-		u32 min_resp_len = offsetof(typeof(resp), dctn) +
-				   sizeof(resp.dctn);
+		u32 out[MLX5_ST_SZ_DW(create_dct_out)] = {};
+		u32 min_resp_len = offsetofend(typeof(resp), dctn);

 		if (udata->outlen < min_resp_len)
 			return -EINVAL;
 		resp.response_length = min_resp_len;

+		/*
+		 * If we don't have enough space for the ECE options,
+		 * simply indicate it with resp.response_length.
+		 */
+		resp.response_length = (udata->outlen < sizeof(resp)) ?
+					       min_resp_len :
+					       sizeof(resp);
+
 		required |= IB_QP_MIN_RNR_TIMER | IB_QP_AV | IB_QP_PATH_MTU;
 		if (!is_valid_mask(attr_mask, required, 0))
 			return -EINVAL;
@@ -4169,6 +4188,8 @@ static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		if (err)
 			return err;
 		resp.dctn = qp->dct.mdct.mqp.qpn;
+		if (MLX5_CAP_GEN(dev->mdev, ece_support))
+			resp.ece_options = MLX5_GET(create_dct_out, out, ece);
 		err = ib_copy_to_udata(udata, &resp, resp.response_length);
 		if (err) {
 			mlx5_core_destroy_dct(dev, &qp->dct.mdct);
@@ -4226,12 +4247,8 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	qp_type = (unlikely(ibqp->qp_type == MLX5_IB_QPT_HW_GSI)) ? IB_QPT_GSI :
 								    qp->type;

-	if (qp_type == MLX5_IB_QPT_DCT) {
-		if (memchr_inv(&ucmd.ece_options, 0, sizeof(ucmd.ece_options)))
-			return -EOPNOTSUPP;
-
-		return mlx5_ib_modify_dct(ibqp, attr, attr_mask, udata);
-	}
+	if (qp_type == MLX5_IB_QPT_DCT)
+		return mlx5_ib_modify_dct(ibqp, attr, attr_mask, &ucmd, udata);

 	mutex_lock(&qp->mutex);

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 87e2aba5e504..66aeaf113995 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -3693,7 +3693,8 @@ struct mlx5_ifc_dctc_bits {
 	u8         ecn[0x2];
 	u8         dscp[0x6];

-	u8         reserved_at_1c0[0x40];
+	u8         reserved_at_1c0[0x20];
+	u8         ece[0x20];
 };

 enum {
@@ -7945,7 +7946,7 @@ struct mlx5_ifc_create_dct_out_bits {
 	u8         reserved_at_40[0x8];
 	u8         dctn[0x18];

-	u8         reserved_at_60[0x20];
+	u8         ece[0x20];
 };

 struct mlx5_ifc_create_dct_in_bits {
--
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code
  2020-06-02 12:55 [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Leon Romanovsky
                   ` (2 preceding siblings ...)
  2020-06-02 12:55 ` [PATCH mlx5-next 3/3] RDMA/mlx5: Return ECE DC support Leon Romanovsky
@ 2020-06-03 18:48 ` Jason Gunthorpe
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Gunthorpe @ 2020-06-03 18:48 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Leon Romanovsky, linux-rdma, Mark Zhang

On Tue, Jun 02, 2020 at 03:55:45PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> This are three patches which I would ask to consider for current PR
> to Linus.
> 
> First two patches are fixing corner cases that are not possible
> in the real life and the third patch is needed to ensure that API
> toward user space is feature complete and won't need any extra
> ucontext flags.
> 
> The code of third patch was already presented in v0 and v1,
> but was dropped later.
> 
> Leon Romanovsky (3):
>   RDMA/mlx5: Return an error if copy_to_user fails
>   RDMA/mlx5: Don't rely on FW to set zeros in ECE response
>   RDMA/mlx5: Return ECE DC support

Applied to for-next, thanks

Jason

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-06-03 18:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02 12:55 [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Leon Romanovsky
2020-06-02 12:55 ` [PATCH rdma-next 1/3] RDMA/mlx5: Return an error if copy_to_user fails Leon Romanovsky
2020-06-02 12:55 ` [PATCH rdma-next 2/3] RDMA/mlx5: Don't rely on FW to set zeros in ECE response Leon Romanovsky
2020-06-02 12:55 ` [PATCH mlx5-next 3/3] RDMA/mlx5: Return ECE DC support Leon Romanovsky
2020-06-03 18:48 ` [PATCH rdma-next 0/3] mlx5 ECE fixes DC ECE code Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).