All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rdma-next V1 0/4] Add packet pacing support for IB verbs
@ 2016-12-01 11:43 Leon Romanovsky
       [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-01 11:43 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Packet pacing is a feature to control packet injection rate in
arbitrary rates. One typical use case is for streaming vendors to
control the bandwidth of different customers based on service coverage.

Packet pacing is a rate limiting and shaping for a QP (SQ for RAW QP),
set and change the rate is done by modifying QP. This series of patch
made the following high level changes:

1. Report rate limit capabilities through user data. Reported
   capabilities include: The maximum and minimum rate limit in kbps
   supported by packet pacing; Bitmap showing which QP types are
   supported by packet pacing operation.
2. Extend modify QP interface for growing attributes. Add rate limit
   support to the extended interface.
3. Enable mlx5-based hardware to be able to update the rate limit for
   RAW QP packet.

Rate limit could also be achieved by using path record data(static_rate),
however, compare to packet pacing, the drawbacks are:

- The rate field is only u8, and uses IB standard rate enumerations.
  Users have very limited options. Packet pacing supports arbitrary
  rates with a resolution of 1kbps.
- Path record data doesn't apply to Raw Ethernet QPs, because these are
  not 1:1 connected as RC, nor use address handles as UD.

The rate_limit configuration came from the application like required BW
for the streaming and not from the fabric (SM). Both of these features
are limiting the rate but coming from a different entities and required
separate fields. The actual limit should be the minimum of both of them
(if both features apply).

This patch series depend on the following patches:

Add port counter support for raw packet QP
[1] http://www.spinics.net/lists/linux-rdma/msg39607.html

Available in the "topic/packet_pacing" topic branch of this git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git

Or for browsing:
https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=topic/packet_pacing

Thanks.

Changes from V0:

1. New bit masks IB_USER_LEGACY_LAST_QP_ATTR_MASK and
   IB_USER_LAST_QP_ATTR_MASK are added to detect non-supported masks for
   legacy/ex commands.
2. The ib_uverbs_ex_modify_qp should call uhw rather than ucore at the end.

Bodong Wang (4):
  IB/mlx5: Report mlx5 packet pacing capabilities when querying device
  IB/core: Support rate limit for packet pacing
  IB/uverbs: Extend modify_qp and support packet pacing
  IB/mlx5: Update the rate limit according to user setting for RAW QP

 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_cmd.c  | 192 ++++++++++++++++++++++------------
 drivers/infiniband/core/uverbs_main.c |   1 +
 drivers/infiniband/core/verbs.c       |   2 +
 drivers/infiniband/hw/mlx5/main.c     |  16 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h  |   1 +
 drivers/infiniband/hw/mlx5/qp.c       |  74 +++++++++++--
 include/rdma/ib_verbs.h               |   2 +
 include/uapi/rdma/ib_user_verbs.h     |  21 ++++
 include/uapi/rdma/mlx5-abi.h          |  13 +++
 10 files changed, 245 insertions(+), 78 deletions(-)

--
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH rdma-next V1 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device
       [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-12-01 11:43   ` Leon Romanovsky
  2016-12-01 11:43   ` [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-01 11:43 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Enable mlx5 based hardware to report packet pacing capabilities
from kernel to user space. Packet pacing allows to limit the rate to any
number between the maximum and minimum, based on user settings.

The capabilities are exposed to user space through query_device by uhw.
The following capabilities are reported:

1. The maximum and minimum rate limit in kbps supported by packet pacing.
2. Bitmap showing which QP types are supported by packet pacing operation.

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 13 +++++++++++++
 include/uapi/rdma/mlx5-abi.h      | 13 +++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 1e47999..fba8cab 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -669,6 +669,19 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 			1 << MLX5_CAP_GEN(dev->mdev, log_max_rq);
 	}
 
+	if (field_avail(typeof(resp), packet_pacing_caps, uhw->outlen)) {
+		if (MLX5_CAP_QOS(mdev, packet_pacing) &&
+		    MLX5_CAP_GEN(mdev, qos)) {
+			resp.packet_pacing_caps.qp_rate_limit_max =
+				MLX5_CAP_QOS(mdev, packet_pacing_max_rate);
+			resp.packet_pacing_caps.qp_rate_limit_min =
+				MLX5_CAP_QOS(mdev, packet_pacing_min_rate);
+			resp.packet_pacing_caps.supported_qpts |=
+				1 << IB_QPT_RAW_PACKET;
+		}
+		resp.response_length += sizeof(resp.packet_pacing_caps);
+	}
+
 	if (uhw->outlen) {
 		err = ib_copy_to_udata(uhw, &resp, resp.response_length);
 
diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
index f5d0f4e..4e9338d 100644
--- a/include/uapi/rdma/mlx5-abi.h
+++ b/include/uapi/rdma/mlx5-abi.h
@@ -124,11 +124,24 @@ struct mlx5_ib_rss_caps {
 	__u8 reserved[7];
 };
 
+struct mlx5_packet_pacing_caps {
+	__u32 qp_rate_limit_min;
+	__u32 qp_rate_limit_max; /* In kpbs */
+
+	/* Corresponding bit will be set if qp type from
+	 * 'enum ib_qp_type' is supported, e.g.
+	 * supported_qpts |= 1 << IB_QPT_RAW_PACKET
+	 */
+	__u32 supported_qpts;
+	__u32 reserved;
+};
+
 struct mlx5_ib_query_device_resp {
 	__u32	comp_mask;
 	__u32	response_length;
 	struct	mlx5_ib_tso_caps tso_caps;
 	struct	mlx5_ib_rss_caps rss_caps;
+	struct	mlx5_packet_pacing_caps packet_pacing_caps;
 };
 
 struct mlx5_ib_create_cq {
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-12-01 11:43   ` [PATCH rdma-next V1 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
@ 2016-12-01 11:43   ` Leon Romanovsky
       [not found]     ` <1480592596-20126-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-12-01 11:43   ` [PATCH rdma-next V1 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
  2016-12-01 11:43   ` [PATCH rdma-next V1 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky
  3 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-01 11:43 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add new member rate_limit to ib_qp_attr which holds the packet pacing rate
in kbps, 0 means unlimited.

IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW
QPs when changing QP state from RTR to RTS, RTS to RTS.

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 8368764..3e688b3 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1014,6 +1014,7 @@ static const struct {
 						 IB_QP_QKEY),
 				 [IB_QPT_GSI] = (IB_QP_CUR_STATE		|
 						 IB_QP_QKEY),
+				 [IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
 			 }
 		}
 	},
@@ -1047,6 +1048,7 @@ static const struct {
 						IB_QP_QKEY),
 				[IB_QPT_GSI] = (IB_QP_CUR_STATE			|
 						IB_QP_QKEY),
+				[IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
 			}
 		},
 		[IB_QPS_SQD]   = {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..a065361 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1102,6 +1102,7 @@ enum ib_qp_attr_mask {
 	IB_QP_RESERVED2			= (1<<22),
 	IB_QP_RESERVED3			= (1<<23),
 	IB_QP_RESERVED4			= (1<<24),
+	IB_QP_RATE_LIMIT		= (1<<25),
 };
 
 enum ib_qp_state {
@@ -1151,6 +1152,7 @@ struct ib_qp_attr {
 	u8			rnr_retry;
 	u8			alt_port_num;
 	u8			alt_timeout;
+	u32			rate_limit;
 };
 
 enum ib_wr_opcode {
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rdma-next V1 3/4] IB/uverbs: Extend modify_qp and support packet pacing
       [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-12-01 11:43   ` [PATCH rdma-next V1 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
  2016-12-01 11:43   ` [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
@ 2016-12-01 11:43   ` Leon Romanovsky
  2016-12-01 11:43   ` [PATCH rdma-next V1 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky
  3 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-01 11:43 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

An new uverbs command ib_uverbs_ex_modify_qp is added to support more QP
attributes. User driver should choose to call the legacy/extended API
based on input mask.

IB_USER_LAST_QP_ATTR_MASK is added to indicated the maximum bit position
which supports legacy ib_uverbs_modify_qp.
IB_USER_LEGACY_LAST_QP_ATTR_MASK indicates the maximum bit position
which supports ib_uverbs_ex_modify_qp, the value of this mask should be
updated if new mask is added later.

Along with this change, rate_limit is supported by the extended command,
user driver could use it to control packet packing.

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_cmd.c  | 192 ++++++++++++++++++++++------------
 drivers/infiniband/core/uverbs_main.c |   1 +
 include/uapi/rdma/ib_user_verbs.h     |  21 ++++
 4 files changed, 146 insertions(+), 69 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index df26a74..455034a 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -289,5 +289,6 @@ IB_UVERBS_DECLARE_EX_CMD(modify_wq);
 IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
 IB_UVERBS_DECLARE_EX_CMD(create_rwq_ind_table);
 IB_UVERBS_DECLARE_EX_CMD(destroy_rwq_ind_table);
+IB_UVERBS_DECLARE_EX_CMD(modify_qp);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index cb3f515a..b8fff8b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2328,94 +2328,86 @@ static int modify_qp_mask(enum ib_qp_type qp_type, int mask)
 	}
 }
 
-ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
-			    struct ib_device *ib_dev,
-			    const char __user *buf, int in_len,
-			    int out_len)
+static int modify_qp(struct ib_uverbs_file *file,
+		     struct ib_uverbs_ex_modify_qp *cmd, struct ib_udata *udata)
 {
-	struct ib_uverbs_modify_qp cmd;
-	struct ib_udata            udata;
-	struct ib_qp              *qp;
-	struct ib_qp_attr         *attr;
-	int                        ret;
-
-	if (copy_from_user(&cmd, buf, sizeof cmd))
-		return -EFAULT;
-
-	INIT_UDATA(&udata, buf + sizeof cmd, NULL, in_len - sizeof cmd,
-		   out_len);
+	struct ib_qp_attr *attr;
+	struct ib_qp *qp;
+	int ret;
 
 	attr = kmalloc(sizeof *attr, GFP_KERNEL);
 	if (!attr)
 		return -ENOMEM;
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = idr_read_qp(cmd->base.qp_handle, file->ucontext);
 	if (!qp) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	attr->qp_state 		  = cmd.qp_state;
-	attr->cur_qp_state 	  = cmd.cur_qp_state;
-	attr->path_mtu 		  = cmd.path_mtu;
-	attr->path_mig_state 	  = cmd.path_mig_state;
-	attr->qkey 		  = cmd.qkey;
-	attr->rq_psn 		  = cmd.rq_psn;
-	attr->sq_psn 		  = cmd.sq_psn;
-	attr->dest_qp_num 	  = cmd.dest_qp_num;
-	attr->qp_access_flags 	  = cmd.qp_access_flags;
-	attr->pkey_index 	  = cmd.pkey_index;
-	attr->alt_pkey_index 	  = cmd.alt_pkey_index;
-	attr->en_sqd_async_notify = cmd.en_sqd_async_notify;
-	attr->max_rd_atomic 	  = cmd.max_rd_atomic;
-	attr->max_dest_rd_atomic  = cmd.max_dest_rd_atomic;
-	attr->min_rnr_timer 	  = cmd.min_rnr_timer;
-	attr->port_num 		  = cmd.port_num;
-	attr->timeout 		  = cmd.timeout;
-	attr->retry_cnt 	  = cmd.retry_cnt;
-	attr->rnr_retry 	  = cmd.rnr_retry;
-	attr->alt_port_num 	  = cmd.alt_port_num;
-	attr->alt_timeout 	  = cmd.alt_timeout;
-
-	memcpy(attr->ah_attr.grh.dgid.raw, cmd.dest.dgid, 16);
-	attr->ah_attr.grh.flow_label        = cmd.dest.flow_label;
-	attr->ah_attr.grh.sgid_index        = cmd.dest.sgid_index;
-	attr->ah_attr.grh.hop_limit         = cmd.dest.hop_limit;
-	attr->ah_attr.grh.traffic_class     = cmd.dest.traffic_class;
-	attr->ah_attr.dlid 	    	    = cmd.dest.dlid;
-	attr->ah_attr.sl   	    	    = cmd.dest.sl;
-	attr->ah_attr.src_path_bits 	    = cmd.dest.src_path_bits;
-	attr->ah_attr.static_rate   	    = cmd.dest.static_rate;
-	attr->ah_attr.ah_flags 	    	    = cmd.dest.is_global ? IB_AH_GRH : 0;
-	attr->ah_attr.port_num 	    	    = cmd.dest.port_num;
-
-	memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd.alt_dest.dgid, 16);
-	attr->alt_ah_attr.grh.flow_label    = cmd.alt_dest.flow_label;
-	attr->alt_ah_attr.grh.sgid_index    = cmd.alt_dest.sgid_index;
-	attr->alt_ah_attr.grh.hop_limit     = cmd.alt_dest.hop_limit;
-	attr->alt_ah_attr.grh.traffic_class = cmd.alt_dest.traffic_class;
-	attr->alt_ah_attr.dlid 	    	    = cmd.alt_dest.dlid;
-	attr->alt_ah_attr.sl   	    	    = cmd.alt_dest.sl;
-	attr->alt_ah_attr.src_path_bits     = cmd.alt_dest.src_path_bits;
-	attr->alt_ah_attr.static_rate       = cmd.alt_dest.static_rate;
-	attr->alt_ah_attr.ah_flags 	    = cmd.alt_dest.is_global ? IB_AH_GRH : 0;
-	attr->alt_ah_attr.port_num 	    = cmd.alt_dest.port_num;
+	attr->qp_state		  = cmd->base.qp_state;
+	attr->cur_qp_state	  = cmd->base.cur_qp_state;
+	attr->path_mtu		  = cmd->base.path_mtu;
+	attr->path_mig_state	  = cmd->base.path_mig_state;
+	attr->qkey		  = cmd->base.qkey;
+	attr->rq_psn		  = cmd->base.rq_psn;
+	attr->sq_psn		  = cmd->base.sq_psn;
+	attr->dest_qp_num	  = cmd->base.dest_qp_num;
+	attr->qp_access_flags	  = cmd->base.qp_access_flags;
+	attr->pkey_index	  = cmd->base.pkey_index;
+	attr->alt_pkey_index	  = cmd->base.alt_pkey_index;
+	attr->en_sqd_async_notify = cmd->base.en_sqd_async_notify;
+	attr->max_rd_atomic	  = cmd->base.max_rd_atomic;
+	attr->max_dest_rd_atomic  = cmd->base.max_dest_rd_atomic;
+	attr->min_rnr_timer	  = cmd->base.min_rnr_timer;
+	attr->port_num		  = cmd->base.port_num;
+	attr->timeout		  = cmd->base.timeout;
+	attr->retry_cnt		  = cmd->base.retry_cnt;
+	attr->rnr_retry		  = cmd->base.rnr_retry;
+	attr->alt_port_num	  = cmd->base.alt_port_num;
+	attr->alt_timeout	  = cmd->base.alt_timeout;
+	attr->rate_limit	  = cmd->rate_limit;
+
+	memcpy(attr->ah_attr.grh.dgid.raw, cmd->base.dest.dgid, 16);
+	attr->ah_attr.grh.flow_label	= cmd->base.dest.flow_label;
+	attr->ah_attr.grh.sgid_index	= cmd->base.dest.sgid_index;
+	attr->ah_attr.grh.hop_limit	= cmd->base.dest.hop_limit;
+	attr->ah_attr.grh.traffic_class	= cmd->base.dest.traffic_class;
+	attr->ah_attr.dlid		= cmd->base.dest.dlid;
+	attr->ah_attr.sl		= cmd->base.dest.sl;
+	attr->ah_attr.src_path_bits	= cmd->base.dest.src_path_bits;
+	attr->ah_attr.static_rate	= cmd->base.dest.static_rate;
+	attr->ah_attr.ah_flags		= cmd->base.dest.is_global ?
+					  IB_AH_GRH : 0;
+	attr->ah_attr.port_num		= cmd->base.dest.port_num;
+
+	memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd->base.alt_dest.dgid, 16);
+	attr->alt_ah_attr.grh.flow_label    = cmd->base.alt_dest.flow_label;
+	attr->alt_ah_attr.grh.sgid_index    = cmd->base.alt_dest.sgid_index;
+	attr->alt_ah_attr.grh.hop_limit     = cmd->base.alt_dest.hop_limit;
+	attr->alt_ah_attr.grh.traffic_class = cmd->base.alt_dest.traffic_class;
+	attr->alt_ah_attr.dlid		    = cmd->base.alt_dest.dlid;
+	attr->alt_ah_attr.sl		    = cmd->base.alt_dest.sl;
+	attr->alt_ah_attr.src_path_bits	    = cmd->base.alt_dest.src_path_bits;
+	attr->alt_ah_attr.static_rate	    = cmd->base.alt_dest.static_rate;
+	attr->alt_ah_attr.ah_flags	    = cmd->base.alt_dest.is_global ?
+					      IB_AH_GRH : 0;
+	attr->alt_ah_attr.port_num	    = cmd->base.alt_dest.port_num;
 
 	if (qp->real_qp == qp) {
-		ret = ib_resolve_eth_dmac(qp, attr, &cmd.attr_mask);
+		ret = ib_resolve_eth_dmac(qp, attr, &cmd->base.attr_mask);
 		if (ret)
 			goto release_qp;
 		ret = qp->device->modify_qp(qp, attr,
-			modify_qp_mask(qp->qp_type, cmd.attr_mask), &udata);
+					    modify_qp_mask(qp->qp_type,
+							   cmd->base.attr_mask),
+					    udata);
 	} else {
-		ret = ib_modify_qp(qp, attr, modify_qp_mask(qp->qp_type, cmd.attr_mask));
+		ret = ib_modify_qp(qp, attr,
+				   modify_qp_mask(qp->qp_type,
+						  cmd->base.attr_mask));
 	}
 
-	if (ret)
-		goto release_qp;
-
-	ret = in_len;
-
 release_qp:
 	put_qp_read(qp);
 
@@ -2425,6 +2417,68 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
 	return ret;
 }
 
+ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
+			    struct ib_device *ib_dev,
+			    const char __user *buf, int in_len,
+			    int out_len)
+{
+	struct ib_uverbs_ex_modify_qp cmd = {};
+	struct ib_udata udata;
+	int ret;
+
+	if (copy_from_user(&cmd.base, buf, sizeof(cmd.base)))
+		return -EFAULT;
+
+	if (cmd.base.attr_mask &
+	    ~((IB_USER_LEGACY_LAST_QP_ATTR_MASK << 1) - 1))
+		return -EOPNOTSUPP;
+
+	INIT_UDATA(&udata, buf + sizeof(cmd.base), NULL,
+		   in_len - sizeof(cmd.base), out_len);
+
+	ret = modify_qp(file, &cmd, &udata);
+	if (ret)
+		return ret;
+
+	return in_len;
+}
+
+int ib_uverbs_ex_modify_qp(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_modify_qp cmd = {};
+	int ret;
+
+	/*
+	 * Last bit is reserved for extending the attr_mask by
+	 * using another field.
+	 */
+	BUILD_BUG_ON(IB_USER_LAST_QP_ATTR_MASK == (1 << 31));
+
+	if (ucore->inlen < sizeof(cmd.base))
+		return -EINVAL;
+
+	ret = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (ret)
+		return ret;
+
+	if (cmd.base.attr_mask &
+	    ~((IB_USER_LAST_QP_ATTR_MASK << 1) - 1))
+		return -EOPNOTSUPP;
+
+	if (ucore->inlen > sizeof(cmd)) {
+		if (ib_is_udata_cleared(ucore, sizeof(cmd),
+					ucore->inlen - sizeof(cmd)))
+			return -EOPNOTSUPP;
+	}
+
+	ret = modify_qp(file, &cmd, uhw);
+
+	return ret;
+}
+
 ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     const char __user *buf, int in_len,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 44b1104..cde7a56 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -137,6 +137,7 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]       = ib_uverbs_ex_destroy_wq,
 	[IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL] = ib_uverbs_ex_create_rwq_ind_table,
 	[IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL] = ib_uverbs_ex_destroy_rwq_ind_table,
+	[IB_USER_VERBS_EX_CMD_MODIFY_QP]        = ib_uverbs_ex_modify_qp,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 25225eb..ea59bd5 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -37,6 +37,7 @@
 #define IB_USER_VERBS_H
 
 #include <linux/types.h>
+#include <rdma/ib_verbs.h>
 
 /*
  * Increment this value if any changes that break userspace ABI
@@ -93,6 +94,7 @@ enum {
 	IB_USER_VERBS_EX_CMD_QUERY_DEVICE = IB_USER_VERBS_CMD_QUERY_DEVICE,
 	IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ,
 	IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP,
+	IB_USER_VERBS_EX_CMD_MODIFY_QP = IB_USER_VERBS_CMD_MODIFY_QP,
 	IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
 	IB_USER_VERBS_EX_CMD_DESTROY_FLOW,
 	IB_USER_VERBS_EX_CMD_CREATE_WQ,
@@ -545,6 +547,14 @@ enum {
 	IB_UVERBS_CREATE_QP_SUP_COMP_MASK = IB_UVERBS_CREATE_QP_MASK_IND_TABLE,
 };
 
+enum {
+	IB_USER_LEGACY_LAST_QP_ATTR_MASK = IB_QP_DEST_QPN
+};
+
+enum {
+	IB_USER_LAST_QP_ATTR_MASK = IB_QP_RATE_LIMIT
+};
+
 struct ib_uverbs_ex_create_qp {
 	__u64 user_handle;
 	__u32 pd_handle;
@@ -684,9 +694,20 @@ struct ib_uverbs_modify_qp {
 	__u64 driver_data[0];
 };
 
+struct ib_uverbs_ex_modify_qp {
+	struct ib_uverbs_modify_qp base;
+	__u32	rate_limit;
+	__u32	reserved;
+};
+
 struct ib_uverbs_modify_qp_resp {
 };
 
+struct ib_uverbs_ex_modify_qp_resp {
+	__u32  comp_mask;
+	__u32  response_length;
+};
+
 struct ib_uverbs_destroy_qp {
 	__u64 response;
 	__u32 qp_handle;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rdma-next V1 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP
       [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-12-01 11:43   ` [PATCH rdma-next V1 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
@ 2016-12-01 11:43   ` Leon Romanovsky
  3 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-01 11:43 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

- Add MODIFY_QP_EX CMD to extend modify_qp.
- Rate limit will be updated in the following state transactions: RTR2RTS,
  RTS2RTS. The limit will be removed when SQ is in RST and ERR state.

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |  3 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/qp.c      | 74 ++++++++++++++++++++++++++++++++----
 3 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index fba8cab..4d22cce 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3030,7 +3030,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	dev->ib_dev.uverbs_ex_cmd_mask =
 		(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE)	|
 		(1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ)	|
-		(1ull << IB_USER_VERBS_EX_CMD_CREATE_QP);
+		(1ull << IB_USER_VERBS_EX_CMD_CREATE_QP)	|
+		(1ull << IB_USER_VERBS_EX_CMD_MODIFY_QP);
 
 	dev->ib_dev.query_device	= mlx5_ib_query_device;
 	dev->ib_dev.query_port		= mlx5_ib_query_port;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 854748b..907d749 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -387,6 +387,7 @@ struct mlx5_ib_qp {
 	struct list_head	qps_list;
 	struct list_head	cq_recv_list;
 	struct list_head	cq_send_list;
+	u32			rate_limit;
 };
 
 struct mlx5_ib_cq_buf {
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index d1e9218..5e89fb25 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -77,12 +77,14 @@ struct mlx5_wqe_eth_pad {
 
 enum raw_qp_set_mask_map {
 	MLX5_RAW_QP_MOD_SET_RQ_Q_CTR_ID		= 1UL << 0,
+	MLX5_RAW_QP_RATE_LIMIT			= 1UL << 1,
 };
 
 struct mlx5_modify_raw_qp_param {
 	u16 operation;
 
 	u32 set_mask; /* raw_qp_set_mask_map */
+	u32 rate_limit;
 	u8 rq_q_ctr_id;
 };
 
@@ -2442,8 +2444,14 @@ static int modify_raw_packet_qp_rq(struct mlx5_ib_dev *dev,
 }
 
 static int modify_raw_packet_qp_sq(struct mlx5_core_dev *dev,
-				   struct mlx5_ib_sq *sq, int new_state)
+				   struct mlx5_ib_sq *sq,
+				   int new_state,
+				   const struct mlx5_modify_raw_qp_param *raw_qp_param)
 {
+	struct mlx5_ib_qp *ibqp = sq->base.container_mibqp;
+	u32 old_rate = ibqp->rate_limit;
+	u32 new_rate = old_rate;
+	u16 rl_index = 0;
 	void *in;
 	void *sqc;
 	int inlen;
@@ -2459,10 +2467,44 @@ static int modify_raw_packet_qp_sq(struct mlx5_core_dev *dev,
 	sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx);
 	MLX5_SET(sqc, sqc, state, new_state);
 
+	if (raw_qp_param->set_mask & MLX5_RAW_QP_RATE_LIMIT) {
+		if (new_state != MLX5_SQC_STATE_RDY)
+			pr_warn("%s: Rate limit can only be changed when SQ is moving to RDY\n",
+				__func__);
+		else
+			new_rate = raw_qp_param->rate_limit;
+	}
+
+	if (old_rate != new_rate) {
+		if (new_rate) {
+			err = mlx5_rl_add_rate(dev, new_rate, &rl_index);
+			if (err) {
+				pr_err("Failed configuring rate %u: %d\n",
+				       new_rate, err);
+				goto out;
+			}
+		}
+
+		MLX5_SET64(modify_sq_in, in, modify_bitmask, 1);
+		MLX5_SET(sqc, sqc, packet_pacing_rate_limit_index, rl_index);
+	}
+
 	err = mlx5_core_modify_sq(dev, sq->base.mqp.qpn, in, inlen);
-	if (err)
+	if (err) {
+		/* Remove new rate from table if failed */
+		if (new_rate &&
+		    old_rate != new_rate)
+			mlx5_rl_remove_rate(dev, new_rate);
 		goto out;
+	}
 
+	/* Only remove the old rate after new rate was set */
+	if ((old_rate &&
+	    (old_rate != new_rate)) ||
+	    (new_state != MLX5_SQC_STATE_RDY))
+		mlx5_rl_remove_rate(dev, old_rate);
+
+	ibqp->rate_limit = new_rate;
 	sq->state = new_state;
 
 out:
@@ -2477,6 +2519,8 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	struct mlx5_ib_raw_packet_qp *raw_packet_qp = &qp->raw_packet_qp;
 	struct mlx5_ib_rq *rq = &raw_packet_qp->rq;
 	struct mlx5_ib_sq *sq = &raw_packet_qp->sq;
+	int modify_rq = !!qp->rq.wqe_cnt;
+	int modify_sq = !!qp->sq.wqe_cnt;
 	int rq_state;
 	int sq_state;
 	int err;
@@ -2494,10 +2538,18 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 		rq_state = MLX5_RQC_STATE_RST;
 		sq_state = MLX5_SQC_STATE_RST;
 		break;
-	case MLX5_CMD_OP_INIT2INIT_QP:
-	case MLX5_CMD_OP_INIT2RTR_QP:
 	case MLX5_CMD_OP_RTR2RTS_QP:
 	case MLX5_CMD_OP_RTS2RTS_QP:
+		if (raw_qp_param->set_mask ==
+		    MLX5_RAW_QP_RATE_LIMIT) {
+			modify_rq = 0;
+			sq_state = sq->state;
+		} else {
+			return raw_qp_param->set_mask ? -EINVAL : 0;
+		}
+		break;
+	case MLX5_CMD_OP_INIT2INIT_QP:
+	case MLX5_CMD_OP_INIT2RTR_QP:
 		if (raw_qp_param->set_mask)
 			return -EINVAL;
 		else
@@ -2507,13 +2559,13 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 		return -EINVAL;
 	}
 
-	if (qp->rq.wqe_cnt) {
-		err = modify_raw_packet_qp_rq(dev, rq, rq_state, raw_qp_param);
+	if (modify_rq) {
+		err =  modify_raw_packet_qp_rq(dev, rq, rq_state, raw_qp_param);
 		if (err)
 			return err;
 	}
 
-	if (qp->sq.wqe_cnt) {
+	if (modify_sq) {
 		if (tx_affinity) {
 			err = modify_raw_packet_tx_affinity(dev->mdev, sq,
 							    tx_affinity);
@@ -2521,7 +2573,7 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 				return err;
 		}
 
-		return modify_raw_packet_qp_sq(dev->mdev, sq, sq_state);
+		return modify_raw_packet_qp_sq(dev->mdev, sq, sq_state, raw_qp_param);
 	}
 
 	return 0;
@@ -2776,6 +2828,12 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp,
 			raw_qp_param.rq_q_ctr_id = mibport->q_cnt_id;
 			raw_qp_param.set_mask |= MLX5_RAW_QP_MOD_SET_RQ_Q_CTR_ID;
 		}
+
+		if (attr_mask & IB_QP_RATE_LIMIT) {
+			raw_qp_param.rate_limit = attr->rate_limit;
+			raw_qp_param.set_mask |= MLX5_RAW_QP_RATE_LIMIT;
+		}
+
 		err = modify_raw_packet_qp(dev, qp, &raw_qp_param, tx_affinity);
 	} else {
 		err = mlx5_core_qp_modify(dev->mdev, op, optpar, context,
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]     ` <1480592596-20126-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-12-01 19:40       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0BA789-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Hefty, Sean @ 2016-12-01 19:40 UTC (permalink / raw)
  To: Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

>  enum ib_qp_state {
> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>  	u8			rnr_retry;
>  	u8			alt_port_num;
>  	u8			alt_timeout;
> +	u32			rate_limit;
>  };

And I still disagree with this approach, as there is already an existing field in the API that limits rate.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0BA789-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-12-02 14:39           ` Leon Romanovsky
       [not found]             ` <20161202143957.GF4497-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2016-12-02 14:39 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang, Rony Efraim

[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

On Thu, Dec 01, 2016 at 07:40:44PM +0000, Hefty, Sean wrote:
> >  enum ib_qp_state {
> > @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
> >  	u8			rnr_retry;
> >  	u8			alt_port_num;
> >  	u8			alt_timeout;
> > +	u32			rate_limit;
> >  };
>
> And I still disagree with this approach, as there is already an existing field in the API that limits rate.

Hi Sean,

We would like to elaborate more on the subject, and we need your help
here. Our cover letter [1] has description why the existing field is
not enough. Can you write a little bit more why you didn't like
the proposed approach to an existing and real problem?

Thanks

[1] https://www.spinics.net/lists/linux-rdma/msg43585.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]             ` <20161202143957.GF4497-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-12-04 11:53               ` Liran Liss
       [not found]                 ` <HE1PR0501MB28122735F9F3DFD700B43BFBB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Liran Liss @ 2016-12-04 11:53 UTC (permalink / raw)
  To: Leon Romanovsky, Hefty, Sean
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang, Rony Efraim

> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Leon Romanovsky

> 
> On Thu, Dec 01, 2016 at 07:40:44PM +0000, Hefty, Sean wrote:
> > >  enum ib_qp_state {
> > > @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
> > >  	u8			rnr_retry;
> > >  	u8			alt_port_num;
> > >  	u8			alt_timeout;
> > > +	u32			rate_limit;
> > >  };
> >
> > And I still disagree with this approach, as there is already an existing field in the
> API that limits rate.
> 
> Hi Sean,
> 
> We would like to elaborate more on the subject, and we need your help here.
> Our cover letter [1] has description why the existing field is not enough. Can you
> write a little bit more why you didn't like the proposed approach to an existing
> and real problem?
> 

Clearly, we need a new field that is greater than 8 bits that uses different encoding (actual rate rather than a predefined enumeration of values).

However, this is not only a new field - this rate is also conceptually different than AH static rate:
1) It is not related to the fabric path.
2) It is determined by the application rather than the fabric.
3) It needs to apply to QPs that don't use AH (raw Ethernet QPs)

Of course, we might require raw Ethernet QPs to use an AH just for the sake of rate-limiting.
But 1) + 2) indicate that this should be a concept orthogonal to network paths. In fact, it could apply to RC and UD QPs as well.
In addition, if we implement full connection establishment in the kernel (i.e., the static_rate field is filled in the kernel), we would still like to allow applications to further restrict their rate using a different component mask.
--Liran

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]                 ` <HE1PR0501MB28122735F9F3DFD700B43BFBB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-12-14 19:03                   ` Doug Ledford
       [not found]                     ` <67d4e948-0ada-bd75-8855-79eed2ae9a7b-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Doug Ledford @ 2016-12-14 19:03 UTC (permalink / raw)
  To: Liran Liss, Leon Romanovsky, Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang, Rony Efraim


[-- Attachment #1.1: Type: text/plain, Size: 2188 bytes --]

On 12/4/2016 6:53 AM, Liran Liss wrote:
>> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Leon Romanovsky
> 
>>
>> On Thu, Dec 01, 2016 at 07:40:44PM +0000, Hefty, Sean wrote:
>>>>  enum ib_qp_state {
>>>> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>>>>  	u8			rnr_retry;
>>>>  	u8			alt_port_num;
>>>>  	u8			alt_timeout;
>>>> +	u32			rate_limit;
>>>>  };
>>>
>>> And I still disagree with this approach, as there is already an existing field in the
>> API that limits rate.
>>
>> Hi Sean,
>>
>> We would like to elaborate more on the subject, and we need your help here.
>> Our cover letter [1] has description why the existing field is not enough. Can you
>> write a little bit more why you didn't like the proposed approach to an existing
>> and real problem?
>>
> 
> Clearly, we need a new field that is greater than 8 bits that uses different encoding (actual rate rather than a predefined enumeration of values).
> 
> However, this is not only a new field - this rate is also conceptually different than AH static rate:
> 1) It is not related to the fabric path.
> 2) It is determined by the application rather than the fabric.
> 3) It needs to apply to QPs that don't use AH (raw Ethernet QPs)
> 
> Of course, we might require raw Ethernet QPs to use an AH just for the sake of rate-limiting.
> But 1) + 2) indicate that this should be a concept orthogonal to network paths. In fact, it could apply to RC and UD QPs as well.
> In addition, if we implement full connection establishment in the kernel (i.e., the static_rate field is filled in the kernel), we would still like to allow applications to further restrict their rate using a different component mask.
> --Liran
> 

I applied this series.  I agree with Liran that the need for something
other than an enum to denote rates is needed.  We will have to work out
exactly how to present this to user space (keep both, replace
static_rate and bumb the verbs API number, other options?)

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]                     ` <67d4e948-0ada-bd75-8855-79eed2ae9a7b-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-12-16  2:48                       ` ira.weiny
       [not found]                         ` <20161216024834.GA27188-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: ira.weiny @ 2016-12-16  2:48 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Liran Liss, Leon Romanovsky, Hefty, Sean,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang, Rony Efraim

On Wed, Dec 14, 2016 at 02:03:14PM -0500, Doug Ledford wrote:
> On 12/4/2016 6:53 AM, Liran Liss wrote:
> >> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> >> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Leon Romanovsky
> > 
> >>
> >> On Thu, Dec 01, 2016 at 07:40:44PM +0000, Hefty, Sean wrote:
> >>>>  enum ib_qp_state {
> >>>> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
> >>>>  	u8			rnr_retry;
> >>>>  	u8			alt_port_num;
> >>>>  	u8			alt_timeout;
> >>>> +	u32			rate_limit;
> >>>>  };
> >>>
> >>> And I still disagree with this approach, as there is already an existing field in the
> >> API that limits rate.
> >>
> >> Hi Sean,
> >>
> >> We would like to elaborate more on the subject, and we need your help here.
> >> Our cover letter [1] has description why the existing field is not enough. Can you
> >> write a little bit more why you didn't like the proposed approach to an existing
> >> and real problem?
> >>
> > 
> > Clearly, we need a new field that is greater than 8 bits that uses different encoding (actual rate rather than a predefined enumeration of values).
> > 
> > However, this is not only a new field - this rate is also conceptually different than AH static rate:
> > 1) It is not related to the fabric path.
> > 2) It is determined by the application rather than the fabric.
> > 3) It needs to apply to QPs that don't use AH (raw Ethernet QPs)
> > 
> > Of course, we might require raw Ethernet QPs to use an AH just for the sake of rate-limiting.
> > But 1) + 2) indicate that this should be a concept orthogonal to network paths. In fact, it could apply to RC and UD QPs as well.
> > In addition, if we implement full connection establishment in the kernel (i.e., the static_rate field is filled in the kernel), we would still like to allow applications to further restrict their rate using a different component mask.
> > --Liran
> > 
> 
> I applied this series.  I agree with Liran that the need for something
> other than an enum to denote rates is needed.  We will have to work out
> exactly how to present this to user space (keep both, replace
> static_rate and bumb the verbs API number, other options?)

I think in the new ioctl interface values like this should be more generic.
MTU is another example.

FWIW I agree with Sean that static rate should have been changed to accommodate
this expanded meaning.

With this new structure definition which rate is in effect?  Are devices
supposed to take the min of the 2?

Ira

> 
> -- 
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>     GPG Key ID: 0E572FDD
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing
       [not found]                         ` <20161216024834.GA27188-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-12-16  3:53                           ` Doug Ledford
  0 siblings, 0 replies; 11+ messages in thread
From: Doug Ledford @ 2016-12-16  3:53 UTC (permalink / raw)
  To: ira.weiny
  Cc: Liran Liss, Leon Romanovsky, Hefty, Sean,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang, Rony Efraim


[-- Attachment #1.1: Type: text/plain, Size: 3623 bytes --]

On 12/15/2016 9:48 PM, ira.weiny wrote:
> On Wed, Dec 14, 2016 at 02:03:14PM -0500, Doug Ledford wrote:
>> On 12/4/2016 6:53 AM, Liran Liss wrote:
>>>> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>>>> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Leon Romanovsky
>>>
>>>>
>>>> On Thu, Dec 01, 2016 at 07:40:44PM +0000, Hefty, Sean wrote:
>>>>>>  enum ib_qp_state {
>>>>>> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>>>>>>  	u8			rnr_retry;
>>>>>>  	u8			alt_port_num;
>>>>>>  	u8			alt_timeout;
>>>>>> +	u32			rate_limit;
>>>>>>  };
>>>>>
>>>>> And I still disagree with this approach, as there is already an existing field in the
>>>> API that limits rate.
>>>>
>>>> Hi Sean,
>>>>
>>>> We would like to elaborate more on the subject, and we need your help here.
>>>> Our cover letter [1] has description why the existing field is not enough. Can you
>>>> write a little bit more why you didn't like the proposed approach to an existing
>>>> and real problem?
>>>>
>>>
>>> Clearly, we need a new field that is greater than 8 bits that uses different encoding (actual rate rather than a predefined enumeration of values).
>>>
>>> However, this is not only a new field - this rate is also conceptually different than AH static rate:
>>> 1) It is not related to the fabric path.
>>> 2) It is determined by the application rather than the fabric.
>>> 3) It needs to apply to QPs that don't use AH (raw Ethernet QPs)
>>>
>>> Of course, we might require raw Ethernet QPs to use an AH just for the sake of rate-limiting.
>>> But 1) + 2) indicate that this should be a concept orthogonal to network paths. In fact, it could apply to RC and UD QPs as well.
>>> In addition, if we implement full connection establishment in the kernel (i.e., the static_rate field is filled in the kernel), we would still like to allow applications to further restrict their rate using a different component mask.
>>> --Liran
>>>
>>
>> I applied this series.  I agree with Liran that the need for something
>> other than an enum to denote rates is needed.  We will have to work out
>> exactly how to present this to user space (keep both, replace
>> static_rate and bumb the verbs API number, other options?)
> 
> I think in the new ioctl interface values like this should be more generic.
> MTU is another example.

Agreed.

> FWIW I agree with Sean that static rate should have been changed to accommodate
> this expanded meaning.

Changing static rate only matters at the user application level.  At the
driver to kernel level, it's easy to say this is driver specific and the
user space driver can communicate with the kernel driver any way it
chooses.  It's a non issue at this level.  It's later that we have to
figure things out, which was the point of my comments and why I took this.

> With this new structure definition which rate is in effect?  Are devices
> supposed to take the min of the 2?

How did you write your kernel driver and how did you write your user
space library?  Because until the rdma-core to application API gets some
modification, your user space driver will only ever get static_rate with
its existing meaning from the user app and what your driver does with it
internally is moot.  What's more, it doesn't matter if mlx5 and hfi1 do
the same thing, only that what they do is consistent with themselves.
It's when we add that user bit that we have to make things right and set
them in concrete.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-12-16  3:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-01 11:43 [PATCH rdma-next V1 0/4] Add packet pacing support for IB verbs Leon Romanovsky
     [not found] ` <1480592596-20126-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-12-01 11:43   ` [PATCH rdma-next V1 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
2016-12-01 11:43   ` [PATCH rdma-next V1 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
     [not found]     ` <1480592596-20126-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-12-01 19:40       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0BA789-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-12-02 14:39           ` Leon Romanovsky
     [not found]             ` <20161202143957.GF4497-2ukJVAZIZ/Y@public.gmane.org>
2016-12-04 11:53               ` Liran Liss
     [not found]                 ` <HE1PR0501MB28122735F9F3DFD700B43BFBB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-12-14 19:03                   ` Doug Ledford
     [not found]                     ` <67d4e948-0ada-bd75-8855-79eed2ae9a7b-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-12-16  2:48                       ` ira.weiny
     [not found]                         ` <20161216024834.GA27188-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-12-16  3:53                           ` Doug Ledford
2016-12-01 11:43   ` [PATCH rdma-next V1 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
2016-12-01 11:43   ` [PATCH rdma-next V1 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.