* [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
@ 2016-10-31 10:21 Leon Romanovsky
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2016-10-31 10:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

When sending from a 10G host to a 1G host, it is easy to overrun the receiver,
leading to packet loss and traffic backing off. Similar problems occur when
a 10G host sends data to a sub-10G virtual circuit, or when a 40G host sends
to a 10G host. Packet pacing controls the packet injection rate and reduces
network congestion, which helps maximize throughput and minimize latency.

Packet pacing provides rate limiting and shaping for a QP (the SQ for a RAW
QP). The rate is set and changed by modifying the QP. This patch series makes
the following high-level changes:
 1. Report rate limit capabilities through user data. Reported capabilities
    include: The maximum and minimum rate limit in kbps supported by packet
    pacing; Bitmap showing which QP types are supported by packet pacing
    operation.
 2. Extend modify QP interface for growing attributes. Add rate limit support
    to the extended interface.
 3. Enable mlx5-based hardware to update the rate limit for RAW packet
    QPs.
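
To make the effect of a rate limit concrete, here is a minimal sketch of the
arithmetic behind pacing. It is not part of the series: the function name is
hypothetical, and it assumes that "kbps" means 1000 bits per second and that a
rate of 0 means "unlimited" (the convention stated in patch 2/4).

```c
#include <stdint.h>

/*
 * Illustrative only: time in nanoseconds that a paced sender would leave
 * between the starts of two consecutive packets so that the average rate
 * does not exceed rate_kbps. A rate of 0 is treated as "unlimited".
 */
static uint64_t pacing_gap_ns(uint32_t packet_bytes, uint32_t rate_kbps)
{
	if (rate_kbps == 0)
		return 0; /* unlimited: no enforced gap */

	/* bits / (kbit/s) gives milliseconds; scale by 1e6 to get ns */
	return ((uint64_t)packet_bytes * 8u * 1000000u) / rate_kbps;
}
```

For example, a 1500-byte packet paced at 1000000 kbps (1 Gbps) occupies a
12000 ns slot, which is the gap a 10G sender would have to respect to avoid
overrunning a 1G receiver.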

Available in the "topic/packet_pacing" topic branch of this git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git

Or for browsing:
https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=topic/packet_pacing

Thanks,
  Bodong & Leon

Bodong Wang (4):
  IB/mlx5: Report mlx5 packet pacing capabilities when querying device
  IB/core: Support rate limit for packet pacing
  IB/uverbs: Extend modify_qp and support packet pacing
  IB/mlx5: Update the rate limit according to user setting for RAW QP

 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_cmd.c  | 178 +++++++++++++++++++++-------------
 drivers/infiniband/core/uverbs_main.c |   1 +
 drivers/infiniband/core/verbs.c       |   2 +
 drivers/infiniband/hw/mlx5/main.c     |  16 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h  |   1 +
 drivers/infiniband/hw/mlx5/qp.c       |  71 ++++++++++++--
 include/rdma/ib_verbs.h               |   2 +
 include/uapi/rdma/ib_user_verbs.h     |  12 +++
 include/uapi/rdma/mlx5-abi.h          |  13 +++
 10 files changed, 219 insertions(+), 78 deletions(-)

--
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH rdma-next 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-10-31 10:21   ` Leon Romanovsky
  2016-10-31 10:21   ` [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2016-10-31 10:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Enable mlx5-based hardware to report packet pacing capabilities
from the kernel to user space. Packet pacing allows limiting the rate to any
value between the reported minimum and maximum, based on user settings.

The capabilities are exposed to user space through query_device via uhw.
The following capabilities are reported:

1. The maximum and minimum rate limit in kbps supported by packet pacing.
2. Bitmap showing which QP types are supported by packet pacing operation.
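
As a rough illustration of how a consumer might use these capabilities, the
sketch below validates a requested rate against them. The struct is a
hypothetical userspace mirror of the layout added by this patch, the numeric
value used for IB_QPT_RAW_PACKET is an assumption to be checked against the
headers in use, and accepting 0 as "unlimited" is borrowed from patch 2/4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the capability layout in this patch. */
struct pacing_caps {
	uint32_t qp_rate_limit_min; /* kbps */
	uint32_t qp_rate_limit_max; /* kbps */
	uint32_t supported_qpts;    /* bit N set => QP type N is supported */
	uint32_t reserved;
};

/* Assumed value of IB_QPT_RAW_PACKET in enum ib_qp_type; verify locally. */
#define EXAMPLE_QPT_RAW_PACKET 8

/* Returns true if rate_kbps may be requested for a QP of type qp_type. */
static bool pacing_rate_ok(const struct pacing_caps *caps,
			   unsigned int qp_type, uint32_t rate_kbps)
{
	/* The QP type must have its bit set in the supported bitmap. */
	if (qp_type >= 32 || !(caps->supported_qpts & (1u << qp_type)))
		return false;

	/* Assumption: 0 means "unlimited" and is always acceptable. */
	if (rate_kbps == 0)
		return true;

	return rate_kbps >= caps->qp_rate_limit_min &&
	       rate_kbps <= caps->qp_rate_limit_max;
}
```

A caller would fill the struct from the query_device response and call
pacing_rate_ok(&caps, EXAMPLE_QPT_RAW_PACKET, rate) before modifying the QP.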

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 13 +++++++++++++
 include/uapi/rdma/mlx5-abi.h      | 13 +++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 2217477..ed9d327 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -669,6 +669,19 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 			1 << MLX5_CAP_GEN(dev->mdev, log_max_rq);
 	}
 
+	if (field_avail(typeof(resp), packet_pacing_caps, uhw->outlen)) {
+		if (MLX5_CAP_QOS(mdev, packet_pacing) &&
+		    MLX5_CAP_GEN(mdev, qos)) {
+			resp.packet_pacing_caps.qp_rate_limit_max =
+				MLX5_CAP_QOS(mdev, packet_pacing_max_rate);
+			resp.packet_pacing_caps.qp_rate_limit_min =
+				MLX5_CAP_QOS(mdev, packet_pacing_min_rate);
+			resp.packet_pacing_caps.supported_qpts |=
+				1 << IB_QPT_RAW_PACKET;
+		}
+		resp.response_length += sizeof(resp.packet_pacing_caps);
+	}
+
 	if (uhw->outlen) {
 		err = ib_copy_to_udata(uhw, &resp, resp.response_length);
 
diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
index f5d0f4e..4e9338d 100644
--- a/include/uapi/rdma/mlx5-abi.h
+++ b/include/uapi/rdma/mlx5-abi.h
@@ -124,11 +124,24 @@ struct mlx5_ib_rss_caps {
 	__u8 reserved[7];
 };
 
+struct mlx5_packet_pacing_caps {
+	__u32 qp_rate_limit_min;
+	__u32 qp_rate_limit_max; /* In kbps */
+
+	/* Corresponding bit will be set if qp type from
+	 * 'enum ib_qp_type' is supported, e.g.
+	 * supported_qpts |= 1 << IB_QPT_RAW_PACKET
+	 */
+	__u32 supported_qpts;
+	__u32 reserved;
+};
+
 struct mlx5_ib_query_device_resp {
 	__u32	comp_mask;
 	__u32	response_length;
 	struct	mlx5_ib_tso_caps tso_caps;
 	struct	mlx5_ib_rss_caps rss_caps;
+	struct	mlx5_packet_pacing_caps packet_pacing_caps;
 };
 
 struct mlx5_ib_create_cq {
-- 
2.7.4


* [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-10-31 10:21   ` [PATCH rdma-next 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
@ 2016-10-31 10:21   ` Leon Romanovsky
       [not found]     ` <1477909297-14491-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-10-31 10:21   ` [PATCH rdma-next 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2016-10-31 10:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add new member rate_limit to ib_qp_attr, it shows the packet pacing rate
in Kbps, 0 means unlimited.

IB_QP_RATE_LIMIT is added to ib_attr_mask, and it could be used by RAW
QPs when changing QP state from RTR to RTS, RTS to RTS.
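
The rule described above can be sketched as a small predicate. This is only
an illustration of the two transitions named in the commit message; the enum
and function names are hypothetical and are not the kernel's own state table.

```c
#include <stdbool.h>

/* Simplified QP states; illustrative names, not the kernel's enum. */
enum ex_qp_state {
	EX_RESET, EX_INIT, EX_RTR, EX_RTS, EX_SQD, EX_SQE, EX_ERR
};

/*
 * Returns true if a rate-limit attribute may accompany the transition
 * cur -> next for a RAW packet QP. Per the commit message, only
 * RTR -> RTS and RTS -> RTS qualify.
 */
static bool rate_limit_allowed(enum ex_qp_state cur, enum ex_qp_state next)
{
	return next == EX_RTS && (cur == EX_RTR || cur == EX_RTS);
}
```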

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 8368764..3e688b3 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1014,6 +1014,7 @@ static const struct {
 						 IB_QP_QKEY),
 				 [IB_QPT_GSI] = (IB_QP_CUR_STATE		|
 						 IB_QP_QKEY),
+				 [IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
 			 }
 		}
 	},
@@ -1047,6 +1048,7 @@ static const struct {
 						IB_QP_QKEY),
 				[IB_QPT_GSI] = (IB_QP_CUR_STATE			|
 						IB_QP_QKEY),
+				[IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
 			}
 		},
 		[IB_QPS_SQD]   = {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..a065361 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1102,6 +1102,7 @@ enum ib_qp_attr_mask {
 	IB_QP_RESERVED2			= (1<<22),
 	IB_QP_RESERVED3			= (1<<23),
 	IB_QP_RESERVED4			= (1<<24),
+	IB_QP_RATE_LIMIT		= (1<<25),
 };
 
 enum ib_qp_state {
@@ -1151,6 +1152,7 @@ struct ib_qp_attr {
 	u8			rnr_retry;
 	u8			alt_port_num;
 	u8			alt_timeout;
+	u32			rate_limit;
 };
 
 enum ib_wr_opcode {
-- 
2.7.4


* [PATCH rdma-next 3/4] IB/uverbs: Extend modify_qp and support packet pacing
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-10-31 10:21   ` [PATCH rdma-next 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
  2016-10-31 10:21   ` [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
@ 2016-10-31 10:21   ` Leon Romanovsky
  2016-10-31 10:21   ` [PATCH rdma-next 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2016-10-31 10:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

- Modify_qp is extended to support more attributes. Existing
  applications are not affected when calling modify_qp. New
  applications could call modify_qp_ex to use the extended fields.
- New member rate_limit is added to the modify_qp extended structure;
  users can modify it through modify_qp to control packet pacing.
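
The general pattern for an extensible command of this kind can be sketched in
plain userspace C as follows. The struct layouts and names are hypothetical
stand-ins, not the real uverbs ABI. The sketch follows the usual convention
for growing structures: accept short input from older callers, and reject
longer input only if the bytes the parser does not understand are non-zero.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the base and extended command layouts. */
struct ex_base_cmd {
	uint32_t qp_handle;
	uint32_t attr_mask;
};

struct ex_cmd {
	struct ex_base_cmd base;
	uint32_t rate_limit;
	uint32_t reserved;
};

static int parse_ex_cmd(const uint8_t *in, size_t inlen, struct ex_cmd *cmd)
{
	size_t n = inlen < sizeof(*cmd) ? inlen : sizeof(*cmd);
	size_t i;

	/* The base part is mandatory. */
	if (inlen < sizeof(cmd->base))
		return -EINVAL;

	/* Fields the caller did not supply stay zero. */
	memset(cmd, 0, sizeof(*cmd));
	memcpy(cmd, in, n);

	/* Trailing bytes beyond the known layout must all be zero. */
	for (i = sizeof(*cmd); i < inlen; i++)
		if (in[i] != 0)
			return -EOPNOTSUPP;

	return 0;
}
```

With this shape, an old caller that passes only the base structure gets
rate_limit == 0, and a newer caller that passes extra zeroed fields is still
accepted.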

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_cmd.c  | 178 +++++++++++++++++++++-------------
 drivers/infiniband/core/uverbs_main.c |   1 +
 include/uapi/rdma/ib_user_verbs.h     |  12 +++
 4 files changed, 123 insertions(+), 69 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index df26a74..455034a 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -289,5 +289,6 @@ IB_UVERBS_DECLARE_EX_CMD(modify_wq);
 IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
 IB_UVERBS_DECLARE_EX_CMD(create_rwq_ind_table);
 IB_UVERBS_DECLARE_EX_CMD(destroy_rwq_ind_table);
+IB_UVERBS_DECLARE_EX_CMD(modify_qp);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index cb3f515a..14ceba3 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2328,94 +2328,86 @@ static int modify_qp_mask(enum ib_qp_type qp_type, int mask)
 	}
 }
 
-ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
-			    struct ib_device *ib_dev,
-			    const char __user *buf, int in_len,
-			    int out_len)
+static int modify_qp(struct ib_uverbs_file *file,
+		     struct ib_uverbs_ex_modify_qp *cmd, struct ib_udata *udata)
 {
-	struct ib_uverbs_modify_qp cmd;
-	struct ib_udata            udata;
-	struct ib_qp              *qp;
-	struct ib_qp_attr         *attr;
-	int                        ret;
-
-	if (copy_from_user(&cmd, buf, sizeof cmd))
-		return -EFAULT;
-
-	INIT_UDATA(&udata, buf + sizeof cmd, NULL, in_len - sizeof cmd,
-		   out_len);
+	struct ib_qp_attr *attr;
+	struct ib_qp *qp;
+	int ret;
 
 	attr = kmalloc(sizeof *attr, GFP_KERNEL);
 	if (!attr)
 		return -ENOMEM;
 
-	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	qp = idr_read_qp(cmd->base.qp_handle, file->ucontext);
 	if (!qp) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	attr->qp_state 		  = cmd.qp_state;
-	attr->cur_qp_state 	  = cmd.cur_qp_state;
-	attr->path_mtu 		  = cmd.path_mtu;
-	attr->path_mig_state 	  = cmd.path_mig_state;
-	attr->qkey 		  = cmd.qkey;
-	attr->rq_psn 		  = cmd.rq_psn;
-	attr->sq_psn 		  = cmd.sq_psn;
-	attr->dest_qp_num 	  = cmd.dest_qp_num;
-	attr->qp_access_flags 	  = cmd.qp_access_flags;
-	attr->pkey_index 	  = cmd.pkey_index;
-	attr->alt_pkey_index 	  = cmd.alt_pkey_index;
-	attr->en_sqd_async_notify = cmd.en_sqd_async_notify;
-	attr->max_rd_atomic 	  = cmd.max_rd_atomic;
-	attr->max_dest_rd_atomic  = cmd.max_dest_rd_atomic;
-	attr->min_rnr_timer 	  = cmd.min_rnr_timer;
-	attr->port_num 		  = cmd.port_num;
-	attr->timeout 		  = cmd.timeout;
-	attr->retry_cnt 	  = cmd.retry_cnt;
-	attr->rnr_retry 	  = cmd.rnr_retry;
-	attr->alt_port_num 	  = cmd.alt_port_num;
-	attr->alt_timeout 	  = cmd.alt_timeout;
-
-	memcpy(attr->ah_attr.grh.dgid.raw, cmd.dest.dgid, 16);
-	attr->ah_attr.grh.flow_label        = cmd.dest.flow_label;
-	attr->ah_attr.grh.sgid_index        = cmd.dest.sgid_index;
-	attr->ah_attr.grh.hop_limit         = cmd.dest.hop_limit;
-	attr->ah_attr.grh.traffic_class     = cmd.dest.traffic_class;
-	attr->ah_attr.dlid 	    	    = cmd.dest.dlid;
-	attr->ah_attr.sl   	    	    = cmd.dest.sl;
-	attr->ah_attr.src_path_bits 	    = cmd.dest.src_path_bits;
-	attr->ah_attr.static_rate   	    = cmd.dest.static_rate;
-	attr->ah_attr.ah_flags 	    	    = cmd.dest.is_global ? IB_AH_GRH : 0;
-	attr->ah_attr.port_num 	    	    = cmd.dest.port_num;
-
-	memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd.alt_dest.dgid, 16);
-	attr->alt_ah_attr.grh.flow_label    = cmd.alt_dest.flow_label;
-	attr->alt_ah_attr.grh.sgid_index    = cmd.alt_dest.sgid_index;
-	attr->alt_ah_attr.grh.hop_limit     = cmd.alt_dest.hop_limit;
-	attr->alt_ah_attr.grh.traffic_class = cmd.alt_dest.traffic_class;
-	attr->alt_ah_attr.dlid 	    	    = cmd.alt_dest.dlid;
-	attr->alt_ah_attr.sl   	    	    = cmd.alt_dest.sl;
-	attr->alt_ah_attr.src_path_bits     = cmd.alt_dest.src_path_bits;
-	attr->alt_ah_attr.static_rate       = cmd.alt_dest.static_rate;
-	attr->alt_ah_attr.ah_flags 	    = cmd.alt_dest.is_global ? IB_AH_GRH : 0;
-	attr->alt_ah_attr.port_num 	    = cmd.alt_dest.port_num;
+	attr->qp_state		  = cmd->base.qp_state;
+	attr->cur_qp_state	  = cmd->base.cur_qp_state;
+	attr->path_mtu		  = cmd->base.path_mtu;
+	attr->path_mig_state	  = cmd->base.path_mig_state;
+	attr->qkey		  = cmd->base.qkey;
+	attr->rq_psn		  = cmd->base.rq_psn;
+	attr->sq_psn		  = cmd->base.sq_psn;
+	attr->dest_qp_num	  = cmd->base.dest_qp_num;
+	attr->qp_access_flags	  = cmd->base.qp_access_flags;
+	attr->pkey_index	  = cmd->base.pkey_index;
+	attr->alt_pkey_index	  = cmd->base.alt_pkey_index;
+	attr->en_sqd_async_notify = cmd->base.en_sqd_async_notify;
+	attr->max_rd_atomic	  = cmd->base.max_rd_atomic;
+	attr->max_dest_rd_atomic  = cmd->base.max_dest_rd_atomic;
+	attr->min_rnr_timer	  = cmd->base.min_rnr_timer;
+	attr->port_num		  = cmd->base.port_num;
+	attr->timeout		  = cmd->base.timeout;
+	attr->retry_cnt		  = cmd->base.retry_cnt;
+	attr->rnr_retry		  = cmd->base.rnr_retry;
+	attr->alt_port_num	  = cmd->base.alt_port_num;
+	attr->alt_timeout	  = cmd->base.alt_timeout;
+	attr->rate_limit	  = cmd->rate_limit;
+
+	memcpy(attr->ah_attr.grh.dgid.raw, cmd->base.dest.dgid, 16);
+	attr->ah_attr.grh.flow_label	= cmd->base.dest.flow_label;
+	attr->ah_attr.grh.sgid_index	= cmd->base.dest.sgid_index;
+	attr->ah_attr.grh.hop_limit	= cmd->base.dest.hop_limit;
+	attr->ah_attr.grh.traffic_class	= cmd->base.dest.traffic_class;
+	attr->ah_attr.dlid		= cmd->base.dest.dlid;
+	attr->ah_attr.sl		= cmd->base.dest.sl;
+	attr->ah_attr.src_path_bits	= cmd->base.dest.src_path_bits;
+	attr->ah_attr.static_rate	= cmd->base.dest.static_rate;
+	attr->ah_attr.ah_flags		= cmd->base.dest.is_global ?
+					  IB_AH_GRH : 0;
+	attr->ah_attr.port_num		= cmd->base.dest.port_num;
+
+	memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd->base.alt_dest.dgid, 16);
+	attr->alt_ah_attr.grh.flow_label    = cmd->base.alt_dest.flow_label;
+	attr->alt_ah_attr.grh.sgid_index    = cmd->base.alt_dest.sgid_index;
+	attr->alt_ah_attr.grh.hop_limit     = cmd->base.alt_dest.hop_limit;
+	attr->alt_ah_attr.grh.traffic_class = cmd->base.alt_dest.traffic_class;
+	attr->alt_ah_attr.dlid		    = cmd->base.alt_dest.dlid;
+	attr->alt_ah_attr.sl		    = cmd->base.alt_dest.sl;
+	attr->alt_ah_attr.src_path_bits	    = cmd->base.alt_dest.src_path_bits;
+	attr->alt_ah_attr.static_rate	    = cmd->base.alt_dest.static_rate;
+	attr->alt_ah_attr.ah_flags	    = cmd->base.alt_dest.is_global ?
+					      IB_AH_GRH : 0;
+	attr->alt_ah_attr.port_num	    = cmd->base.alt_dest.port_num;
 
 	if (qp->real_qp == qp) {
-		ret = ib_resolve_eth_dmac(qp, attr, &cmd.attr_mask);
+		ret = ib_resolve_eth_dmac(qp, attr, &cmd->base.attr_mask);
 		if (ret)
 			goto release_qp;
 		ret = qp->device->modify_qp(qp, attr,
-			modify_qp_mask(qp->qp_type, cmd.attr_mask), &udata);
+					    modify_qp_mask(qp->qp_type,
+							   cmd->base.attr_mask),
+					    udata);
 	} else {
-		ret = ib_modify_qp(qp, attr, modify_qp_mask(qp->qp_type, cmd.attr_mask));
+		ret = ib_modify_qp(qp, attr,
+				   modify_qp_mask(qp->qp_type,
+						  cmd->base.attr_mask));
 	}
 
-	if (ret)
-		goto release_qp;
-
-	ret = in_len;
-
 release_qp:
 	put_qp_read(qp);
 
@@ -2425,6 +2417,54 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
 	return ret;
 }
 
+ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
+			    struct ib_device *ib_dev,
+			    const char __user *buf, int in_len,
+			    int out_len)
+{
+	struct ib_uverbs_ex_modify_qp cmd = {};
+	struct ib_udata udata;
+	int ret;
+
+	if (copy_from_user(&cmd.base, buf, sizeof(cmd.base)))
+		return -EFAULT;
+
+	INIT_UDATA(&udata, buf + sizeof(cmd.base), NULL,
+		   in_len - sizeof(cmd.base), out_len);
+
+	ret = modify_qp(file, &cmd, &udata);
+	if (ret)
+		return ret;
+
+	return in_len;
+}
+
+int ib_uverbs_ex_modify_qp(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_modify_qp cmd = {};
+	int ret;
+
+	if (ucore->inlen < sizeof(cmd.base))
+		return -EINVAL;
+
+	ret = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (ret)
+		return ret;
+
+	if (ucore->inlen > sizeof(cmd)) {
+		if (!ib_is_udata_cleared(ucore, sizeof(cmd),
+					 ucore->inlen - sizeof(cmd)))
+			return -EOPNOTSUPP;
+	}
+
+	ret = modify_qp(file, &cmd, ucore);
+
+	return ret;
+}
+
 ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     const char __user *buf, int in_len,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 0012fa5..80839c3 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -137,6 +137,7 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]       = ib_uverbs_ex_destroy_wq,
 	[IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL] = ib_uverbs_ex_create_rwq_ind_table,
 	[IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL] = ib_uverbs_ex_destroy_rwq_ind_table,
+	[IB_USER_VERBS_EX_CMD_MODIFY_QP]        = ib_uverbs_ex_modify_qp,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 25225eb..351f260 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -93,6 +93,7 @@ enum {
 	IB_USER_VERBS_EX_CMD_QUERY_DEVICE = IB_USER_VERBS_CMD_QUERY_DEVICE,
 	IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ,
 	IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP,
+	IB_USER_VERBS_EX_CMD_MODIFY_QP = IB_USER_VERBS_CMD_MODIFY_QP,
 	IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
 	IB_USER_VERBS_EX_CMD_DESTROY_FLOW,
 	IB_USER_VERBS_EX_CMD_CREATE_WQ,
@@ -684,9 +685,20 @@ struct ib_uverbs_modify_qp {
 	__u64 driver_data[0];
 };
 
+struct ib_uverbs_ex_modify_qp {
+	struct ib_uverbs_modify_qp base;
+	__u32	rate_limit;
+	__u32	reserved;
+};
+
 struct ib_uverbs_modify_qp_resp {
 };
 
+struct ib_uverbs_ex_modify_qp_resp {
+	__u32  comp_mask;
+	__u32  response_length;
+};
+
 struct ib_uverbs_destroy_qp {
 	__u64 response;
 	__u32 qp_handle;
-- 
2.7.4


* [PATCH rdma-next 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-10-31 10:21   ` [PATCH rdma-next 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
@ 2016-10-31 10:21   ` Leon Romanovsky
  2016-11-08 17:49   ` [PATCH rdma-next 0/4] Add packet pacing support for IB verbs Hefty, Sean
  2016-11-17 18:15   ` Leon Romanovsky
  5 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2016-10-31 10:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

- Add MODIFY_QP_EX CMD to extend modify_qp.
- Rate limit will be updated in the following state transitions: RTR2RTS,
  RTS2RTS. The limit will be removed when SQ is in RST or ERR state.
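
The ordering used when switching rates (reserve the new rate, apply it, and
release the old rate only after success) can be sketched in userspace as
below. The small reference-counted table is a hypothetical stand-in for the
device rate table, and the hw_ok flag merely simulates whether the hardware
command succeeded; none of these names exist in the driver.

```c
#include <stdint.h>

#define EX_RL_SLOTS 4

/* Hypothetical refcounted rate table standing in for the device's table. */
struct ex_rl_entry {
	uint32_t rate;
	unsigned int refs;
};

struct ex_rl_table {
	struct ex_rl_entry e[EX_RL_SLOTS];
};

/* Take a reference on rate, allocating a slot if needed. */
static int ex_rl_add(struct ex_rl_table *t, uint32_t rate)
{
	int i, free_slot = -1;

	for (i = 0; i < EX_RL_SLOTS; i++) {
		if (t->e[i].refs && t->e[i].rate == rate) {
			t->e[i].refs++;
			return 0;
		}
		if (!t->e[i].refs && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1; /* table full */

	t->e[free_slot].rate = rate;
	t->e[free_slot].refs = 1;
	return 0;
}

/* Drop one reference on rate, if present. */
static void ex_rl_remove(struct ex_rl_table *t, uint32_t rate)
{
	int i;

	for (i = 0; i < EX_RL_SLOTS; i++) {
		if (t->e[i].refs && t->e[i].rate == rate) {
			t->e[i].refs--;
			return;
		}
	}
}

/* Switch a queue from *cur_rate to new_rate; 0 means "no limit". */
static int ex_set_rate(struct ex_rl_table *t, uint32_t *cur_rate,
		       uint32_t new_rate, int hw_ok)
{
	uint32_t old_rate = *cur_rate;

	if (old_rate == new_rate)
		return 0;

	/* Reserve the new rate first so failure leaves the old one intact. */
	if (new_rate && ex_rl_add(t, new_rate))
		return -1;

	if (!hw_ok) {
		if (new_rate)
			ex_rl_remove(t, new_rate); /* roll back */
		return -1;
	}

	/* Release the old rate only after the new one is in effect. */
	if (old_rate)
		ex_rl_remove(t, old_rate);

	*cur_rate = new_rate;
	return 0;
}
```

Keeping the old entry until the modify succeeds means a failed update never
leaves the queue pointing at a rate that has already been released.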

Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |  3 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/qp.c      | 71 ++++++++++++++++++++++++++++++++----
 3 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index ed9d327..fd13786 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3025,7 +3025,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	dev->ib_dev.uverbs_ex_cmd_mask =
 		(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE)	|
 		(1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ)	|
-		(1ull << IB_USER_VERBS_EX_CMD_CREATE_QP);
+		(1ull << IB_USER_VERBS_EX_CMD_CREATE_QP)	|
+		(1ull << IB_USER_VERBS_EX_CMD_MODIFY_QP);
 
 	dev->ib_dev.query_device	= mlx5_ib_query_device;
 	dev->ib_dev.query_port		= mlx5_ib_query_port;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dcdcd19..0e43254 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -387,6 +387,7 @@ struct mlx5_ib_qp {
 	struct list_head	qps_list;
 	struct list_head	cq_recv_list;
 	struct list_head	cq_send_list;
+	u32			rate_limit;
 };
 
 struct mlx5_ib_cq_buf {
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 41f4c2a..148a26f 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -78,6 +78,7 @@ struct mlx5_wqe_eth_pad {
 
 enum raw_qp_set_mask_map {
 	MLX5_RAW_QP_MOD_SET_RQ_Q_CTR_ID		= 1UL << 0,
+	MLX5_RAW_QP_RATE_LIMIT			= 1UL << 1,
 };
 
 struct mlx5_modify_raw_qp_param {
@@ -85,6 +86,7 @@ struct mlx5_modify_raw_qp_param {
 
 	u32 set_mask; /* raw_qp_set_mask_map */
 	u8 rq_q_ctr_id;
+	u32 rate_limit;
 };
 
 static void get_cqs(enum ib_qp_type qp_type,
@@ -2443,8 +2445,14 @@ static int modify_raw_packet_qp_rq(struct mlx5_ib_dev *dev,
 }
 
 static int modify_raw_packet_qp_sq(struct mlx5_core_dev *dev,
-				   struct mlx5_ib_sq *sq, int new_state)
+				   struct mlx5_ib_sq *sq,
+				   int new_state,
+				   const struct mlx5_modify_raw_qp_param *raw_qp_param)
 {
+	struct mlx5_ib_qp *ibqp = sq->base.container_mibqp;
+	u32 old_rate = ibqp->rate_limit;
+	u32 new_rate = old_rate;
+	u16 rl_index = 0;
 	void *in;
 	void *sqc;
 	int inlen;
@@ -2460,10 +2468,42 @@ static int modify_raw_packet_qp_sq(struct mlx5_core_dev *dev,
 	sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx);
 	MLX5_SET(sqc, sqc, state, new_state);
 
+	if (raw_qp_param->set_mask & MLX5_RAW_QP_RATE_LIMIT)
+		new_rate = raw_qp_param->rate_limit;
+
+	if (old_rate != new_rate) {
+		if (new_rate) {
+			err = mlx5_rl_add_rate(dev, new_rate, &rl_index);
+			if (err) {
+				pr_err("Failed configuring rate %u: %d\n",
+				       new_rate, err);
+				goto out;
+			}
+		}
+
+		MLX5_SET64(modify_sq_in, in, modify_bitmask, 1);
+		MLX5_SET(sqc, sqc, packet_pacing_rate_limit_index, rl_index);
+	}
+
 	err = mlx5_core_modify_sq(dev, sq->base.mqp.qpn, in, inlen);
-	if (err)
+	if (err) {
+		/* Remove new rate from table if failed */
+		if (new_rate &&
+		    old_rate != new_rate)
+			mlx5_rl_remove_rate(dev, new_rate);
 		goto out;
+	}
+
+	if ((new_state == MLX5_SQC_STATE_ERR) ||
+	    (new_state == MLX5_SQC_STATE_RST))
+		new_rate = 0;
 
+	/* Only remove the old rate after new rate was set */
+	if (old_rate &&
+	    old_rate != new_rate)
+		mlx5_rl_remove_rate(dev, old_rate);
+
+	ibqp->rate_limit = new_rate;
 	sq->state = new_state;
 
 out:
@@ -2478,6 +2518,8 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	struct mlx5_ib_raw_packet_qp *raw_packet_qp = &qp->raw_packet_qp;
 	struct mlx5_ib_rq *rq = &raw_packet_qp->rq;
 	struct mlx5_ib_sq *sq = &raw_packet_qp->sq;
+	int modify_rq = !!qp->rq.wqe_cnt;
+	int modify_sq = !!qp->sq.wqe_cnt;
 	int rq_state;
 	int sq_state;
 	int err;
@@ -2495,10 +2537,18 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 		rq_state = MLX5_RQC_STATE_RST;
 		sq_state = MLX5_SQC_STATE_RST;
 		break;
-	case MLX5_CMD_OP_INIT2INIT_QP:
-	case MLX5_CMD_OP_INIT2RTR_QP:
 	case MLX5_CMD_OP_RTR2RTS_QP:
 	case MLX5_CMD_OP_RTS2RTS_QP:
+		if (raw_qp_param->set_mask ==
+		    MLX5_RAW_QP_RATE_LIMIT) {
+			modify_rq = 0;
+			sq_state = sq->state;
+		} else {
+			return raw_qp_param->set_mask ? -EINVAL : 0;
+		}
+		break;
+	case MLX5_CMD_OP_INIT2INIT_QP:
+	case MLX5_CMD_OP_INIT2RTR_QP:
 		if (raw_qp_param->set_mask)
 			return -EINVAL;
 		else
@@ -2508,13 +2558,13 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 		return -EINVAL;
 	}
 
-	if (qp->rq.wqe_cnt) {
-		err = modify_raw_packet_qp_rq(dev, rq, rq_state, raw_qp_param);
+	if (modify_rq) {
+		err = modify_raw_packet_qp_rq(dev, rq, rq_state, raw_qp_param);
 		if (err)
 			return err;
 	}
 
-	if (qp->sq.wqe_cnt) {
+	if (modify_sq) {
 		if (tx_affinity) {
 			err = modify_raw_packet_tx_affinity(dev->mdev, sq,
 							    tx_affinity);
@@ -2522,7 +2572,7 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 				return err;
 		}
 
-		return modify_raw_packet_qp_sq(dev->mdev, sq, sq_state);
+		return modify_raw_packet_qp_sq(dev->mdev, sq, sq_state, raw_qp_param);
 	}
 
 	return 0;
@@ -2777,6 +2827,11 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp,
 			raw_qp_param.rq_q_ctr_id = mibport->q_cnt_id;
 			raw_qp_param.set_mask |= MLX5_RAW_QP_MOD_SET_RQ_Q_CTR_ID;
 		}
+
+		if (attr_mask & IB_QP_RATE_LIMIT) {
+			raw_qp_param.rate_limit = attr->rate_limit;
+			raw_qp_param.set_mask |= MLX5_RAW_QP_RATE_LIMIT;
+		}
 		err = modify_raw_packet_qp(dev, qp, &raw_qp_param, tx_affinity);
 	} else {
 		err = mlx5_core_qp_modify(dev->mdev, op, optpar, context,
-- 
2.7.4


* Re: [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing
       [not found]     ` <1477909297-14491-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-11-01 10:06       ` Yuval Shaia
       [not found]         ` <20161101100607.GB3727-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
  2016-11-09 17:27       ` Hefty, Sean
  1 sibling, 1 reply; 17+ messages in thread
From: Yuval Shaia @ 2016-11-01 10:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

Two (extremely) minor suggestions inline.

Yuval

On Mon, Oct 31, 2016 at 12:21:35PM +0200, Leon Romanovsky wrote:
> From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> Add new member rate_limit to ib_qp_attr, it shows the packet pacing rate

Suggesting to replace with:
Add new member rate_limit to ib_qp_attr which holds the packet pacing rate

> in Kbps, 0 means unlimited.
> 
> IB_QP_RATE_LIMIT is added to ib_attr_mask, and it could be used by RAW

Suggesting to replace with:
IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW

> QPs when changing QP state from RTR to RTS, RTS to RTS.
> 
> Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  drivers/infiniband/core/verbs.c | 2 ++
>  include/rdma/ib_verbs.h         | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 8368764..3e688b3 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1014,6 +1014,7 @@ static const struct {
>  						 IB_QP_QKEY),
>  				 [IB_QPT_GSI] = (IB_QP_CUR_STATE		|
>  						 IB_QP_QKEY),
> +				 [IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
>  			 }
>  		}
>  	},
> @@ -1047,6 +1048,7 @@ static const struct {
>  						IB_QP_QKEY),
>  				[IB_QPT_GSI] = (IB_QP_CUR_STATE			|
>  						IB_QP_QKEY),
> +				[IB_QPT_RAW_PACKET] = IB_QP_RATE_LIMIT,
>  			}
>  		},
>  		[IB_QPS_SQD]   = {
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 5ad43a4..a065361 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1102,6 +1102,7 @@ enum ib_qp_attr_mask {
>  	IB_QP_RESERVED2			= (1<<22),
>  	IB_QP_RESERVED3			= (1<<23),
>  	IB_QP_RESERVED4			= (1<<24),
> +	IB_QP_RATE_LIMIT		= (1<<25),
>  };
>  
>  enum ib_qp_state {
> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>  	u8			rnr_retry;
>  	u8			alt_port_num;
>  	u8			alt_timeout;
> +	u32			rate_limit;
>  };
>  
>  enum ib_wr_opcode {
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing
       [not found]         ` <20161101100607.GB3727-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
@ 2016-11-02 15:35           ` Leon Romanovsky
  0 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2016-11-02 15:35 UTC (permalink / raw)
  To: Yuval Shaia
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

[-- Attachment #1: Type: text/plain, Size: 1045 bytes --]

On Tue, Nov 01, 2016 at 12:06:08PM +0200, Yuval Shaia wrote:
> Two (extremely) minor suggestions inline.
>
> Yuval
>
> On Mon, Oct 31, 2016 at 12:21:35PM +0200, Leon Romanovsky wrote:
> > From: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >
> > Add new member rate_limit to ib_qp_attr, it shows the packet pacing rate
>
> Suggesting to replace with:
> Add new member rate_limit to ib_qp_attr which holds the packet pacing rate
>
> > in Kbps, 0 means unlimited.
> >
> > IB_QP_RATE_LIMIT is added to ib_attr_mask, and it could be used by RAW
>
> Suggesting to replace with:
> IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW
>
> > QPs when changing QP state from RTR to RTS, RTS to RTS.
> >
> > Signed-off-by: Bodong Wang <bodong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Thanks Yuval,

Doug,
What do you expect from us? A respin of this patch?

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-10-31 10:21   ` [PATCH rdma-next 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky
@ 2016-11-08 17:49   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB0A7B31-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-11-17 18:15   ` Leon Romanovsky
  5 siblings, 1 reply; 17+ messages in thread
From: Hefty, Sean @ 2016-11-08 17:49 UTC (permalink / raw)
  To: Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

> When sending from a 10G host to a 1G host, it is easy to overrun the
> receiver,
> leading to packet loss and traffic backing off. Similar problems occur
> when
> a 10G host sends data to a sub-10G virtual circuit, or a 40G host
> sending
> to a 10G host. Packet pacing could control packet injection rate and
> reduces
> network congestion to maximize throughput & minimize network latency.

Why aren't the path record data and existing mechanisms sufficient to handle this?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB0A7B31-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-09  6:40       ` Leon Romanovsky
       [not found]         ` <20161109064009.GE27883-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2016-11-09  6:40 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

On Tue, Nov 08, 2016 at 05:49:26PM +0000, Hefty, Sean wrote:
> > When sending from a 10G host to a 1G host, it is easy to overrun the
> > receiver,
> > leading to packet loss and traffic backing off. Similar problems occur
> > when
> > a 10G host sends data to a sub-10G virtual circuit, or a 40G host
> > sending
> > to a 10G host. Packet pacing could control packet injection rate and
> > reduces
> > network congestion to maximize throughput & minimize network latency.
>
> Why isn't the path record data and existing mechanisms sufficient to handle this?
>

Packet pacing allows different kinds of traffic shaping: per-CPU,
per-flow, and combinations of the two, with better and steadier QoS,
without involving subnet management.

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]         ` <20161109064009.GE27883-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-11-09 17:06           ` Hefty, Sean
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F0A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Hefty, Sean @ 2016-11-09 17:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> On Tue, Nov 08, 2016 at 05:49:26PM +0000, Hefty, Sean wrote:
> > > When sending from a 10G host to a 1G host, it is easy to overrun
> the
> > > receiver,
> > > leading to packet loss and traffic backing off. Similar problems
> occur
> > > when
> > > a 10G host sends data to a sub-10G virtual circuit, or a 40G host
> > > sending
> > > to a 10G host. Packet pacing could control packet injection rate
> and
> > > reduces
> > > network congestion to maximize throughput & minimize network
> latency.
> >
> > Why isn't the path record data and existing mechanisms sufficient to
> handle this?
> >
> 
> Packet pacing allows different combinations of traffic shaping: per-
> CPU,
> per-flow and their combinations with better and steady QoS requirements
> without involving subnet management.

The patch adds this as a QP attribute, and we already have a rate for that.  I still don't see why the standard mechanisms are insufficient or couldn't be adapted.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing
       [not found]     ` <1477909297-14491-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-11-01 10:06       ` Yuval Shaia
@ 2016-11-09 17:27       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F70-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Hefty, Sean @ 2016-11-09 17:27 UTC (permalink / raw)
  To: Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bodong Wang

>  enum ib_qp_state {
> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>  	u8			rnr_retry;
>  	u8			alt_port_num;
>  	u8			alt_timeout;
> +	u32			rate_limit;
>  };

We already have ib_qp_attr::ib_ah_attr::static_rate, and that accounts for both the primary and alternate paths.  We should not add a conflicting rate_limit field.  Either use static_rate as defined by the spec, or replace/update it, with corresponding changes to how it is used in conjunction with SM data.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing
       [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F70-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-09 21:00           ` Bodong Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Bodong Wang @ 2016-11-09 21:00 UTC (permalink / raw)
  To: Hefty, Sean, Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/9/2016 11:27 AM, Hefty, Sean wrote:
>>   enum ib_qp_state {
>> @@ -1151,6 +1152,7 @@ struct ib_qp_attr {
>>   	u8			rnr_retry;
>>   	u8			alt_port_num;
>>   	u8			alt_timeout;
>> +	u32			rate_limit;
>>   };
> We already have ib_qp_attr::ib_ah_attr::static_rate, and that accounts for both the primary and alternate paths.  We should not add a conflicting rate_limit field.  Either use static_rate as defined by the spec, or replace/update it, with corresponding changes to how it is used in conjunction with SM data.

They are different features. Static rate offers only a limited set of
speeds, while packet pacing (rate limit) allows the customer to set any
value between the minimum and maximum of the supported range.
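
The range check this implies could look roughly like the sketch below. It
assumes the device reports its minimum and maximum pacing rate in kbps (as
the cover letter describes) and that 0 means unlimited (as the patch 2/4
commit message states). The struct and function names are hypothetical and
are not taken from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability record: the pacing range a device reports, in kbps. */
struct pacing_caps {
	uint32_t rate_min_kbps;
	uint32_t rate_max_kbps;
};

/*
 * Return true if the requested rate is acceptable: 0 means "unlimited",
 * any other value must lie inside the reported [min, max] range.
 */
static bool pacing_rate_is_valid(const struct pacing_caps *caps,
				 uint32_t rate_kbps)
{
	if (rate_kbps == 0)
		return true;
	return rate_kbps >= caps->rate_min_kbps &&
	       rate_kbps <= caps->rate_max_kbps;
}
```

Unlike an enumerated static rate, any integer inside the range passes, which
is the distinction being drawn here.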

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F0A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-10  7:22               ` Leon Romanovsky
       [not found]                 ` <20161110072242.GC28957-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Leon Romanovsky @ 2016-11-10  7:22 UTC (permalink / raw)
  To: Hefty, Sean, Bodong Wang
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1241 bytes --]

On Wed, Nov 09, 2016 at 05:06:52PM +0000, Hefty, Sean wrote:
> > On Tue, Nov 08, 2016 at 05:49:26PM +0000, Hefty, Sean wrote:
> > > > When sending from a 10G host to a 1G host, it is easy to overrun
> > the
> > > > receiver,
> > > > leading to packet loss and traffic backing off. Similar problems
> > occur
> > > > when
> > > > a 10G host sends data to a sub-10G virtual circuit, or a 40G host
> > > > sending
> > > > to a 10G host. Packet pacing could control packet injection rate
> > and
> > > > reduces
> > > > network congestion to maximize throughput & minimize network
> > latency.
> > >
> > > Why isn't the path record data and existing mechanisms sufficient to
> > handle this?
> > >
> >
> > Packet pacing allows different combinations of traffic shaping: per-
> > CPU,
> > per-flow and their combinations with better and steady QoS requirements
> > without involving subnet management.
>
> The patch adds this as a QP attribute, and we already have a rate for that.  I still don't see why the standard mechanisms are insufficient or couldn't be adapted.

I'll let Bodong elaborate on it, but as far as I can see, the AH
attribute is relevant to UD QPs only, while packet pacing is intended
for all types of QPs.

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]                 ` <20161110072242.GC28957-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-11-10 16:07                   ` Bodong Wang
  2016-11-10 16:47                   ` Jason Gunthorpe
  1 sibling, 0 replies; 17+ messages in thread
From: Bodong Wang @ 2016-11-10 16:07 UTC (permalink / raw)
  To: Leon Romanovsky, Hefty, Sean
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/10/2016 1:22 AM, Leon Romanovsky wrote:
> On Wed, Nov 09, 2016 at 05:06:52PM +0000, Hefty, Sean wrote:
>>> On Tue, Nov 08, 2016 at 05:49:26PM +0000, Hefty, Sean wrote:
>>>>> When sending from a 10G host to a 1G host, it is easy to overrun
>>> the
>>>>> receiver,
>>>>> leading to packet loss and traffic backing off. Similar problems
>>> occur
>>>>> when
>>>>> a 10G host sends data to a sub-10G virtual circuit, or a 40G host
>>>>> sending
>>>>> to a 10G host. Packet pacing could control packet injection rate
>>> and
>>>>> reduces
>>>>> network congestion to maximize throughput & minimize network
>>> latency.
>>>> Why isn't the path record data and existing mechanisms sufficient to
>>> handle this?
>>> Packet pacing allows different combinations of traffic shaping: per-
>>> CPU,
>>> per-flow and their combinations with better and steady QoS requirements
>>> without involving subnet management.
>> The patch adds this as a QP attribute, and we already have a rate for that.  I still don't see why the standard mechanisms are insufficient or couldn't be adapted.
> I'll let to Bodong to elaborate on it more, but as far as I see, the AH
> attribute is relevant to UD QP only, while the packet pacing is intended
> for all types of QPs.
>
> Thanks
The path record data can prevent the overrun, but it cannot easily
control the rate within the range the hardware supports. One main use
case for packet pacing is for streaming vendors to control the speed for
different customers based on their service plan. For example, if a user's
NIC supports 10G but the user only pays for 1G service, packet pacing can
enforce this with modify_qp alone.

Another advantage is that packet pacing doesn't need to involve the subnet
administrator, which path record data does. Moreover, as indicated in the
other thread, the main flaw of the static rate is its limited set of
speed options.
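
The 10G/1G example can be sketched as follows. This is only an illustration
under stated assumptions: the bit value (1<<25) and the u32 rate_limit field
in kbps come from patch 2/4, but the struct, macro and helper below are local
stand-ins with hypothetical names, not the kernel's ib_qp_attr or any real
verbs call.

```c
#include <stdint.h>

/* Bit value taken from the posted patch; the name is a local stand-in. */
#define SKETCH_QP_RATE_LIMIT (1u << 25)

/* Stand-in for the single field the patch adds to ib_qp_attr. */
struct sketch_qp_attr {
	uint32_t rate_limit;	/* kbps, 0 = unlimited */
};

/*
 * Prepare attr and mask for a modify-QP request that paces the QP at
 * rate_kbps. Returns the mask with the rate-limit bit added.
 */
static uint32_t sketch_set_rate_limit(struct sketch_qp_attr *attr,
				      uint32_t attr_mask, uint32_t rate_kbps)
{
	attr->rate_limit = rate_kbps;
	return attr_mask | SKETCH_QP_RATE_LIMIT;
}
```

For a customer paying for 1G service, the caller would pass 1000000 (kbps)
and hand the resulting attr and mask to the modify-QP path.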

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]                 ` <20161110072242.GC28957-2ukJVAZIZ/Y@public.gmane.org>
  2016-11-10 16:07                   ` Bodong Wang
@ 2016-11-10 16:47                   ` Jason Gunthorpe
       [not found]                     ` <CAFo2czDeGUrA8yYAJ0r5-8q5T=Y=gZojjfHrLqxribZaexmbOA@mail.gmail.com>
  1 sibling, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2016-11-10 16:47 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Hefty, Sean, Bodong Wang, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Nov 10, 2016 at 09:22:42AM +0200, Leon Romanovsky wrote:
> I'll let to Bodong to elaborate on it more, but as far as I see, the AH
> attribute is relevant to UD QP only, while the packet pacing is intended
> for all types of QPs.

In IB static rate is supposed to work for all QP types.

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found]                           ` <DB5PR0501MB19281A8064B283BE35EDE821B0BF0-1FH/Iesddo5/SeJcUcAJq8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-11-15 14:43                             ` Rony Efraim
  0 siblings, 0 replies; 17+ messages in thread
From: Rony Efraim @ 2016-11-15 14:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Hefty, Sean, Bodong Wang, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1237 bytes --]


On Thu, Nov 10, 2016 at 6:47 PM +0200, Jason Gunthorpe wrote:
>On Thu, Nov 10, 2016 at 09:22:42AM +0200, Leon Romanovsky wrote:
>> I'll let to Bodong to elaborate on it more, but as far as I see, the 
>> AH attribute is relevant to UD QP only, while the packet pacing is 
>> intended for all types of QPs.

>In IB static rate is supposed to work for all QP types.
Indeed, we already have a static_rate field, which applies to all IB traffic (including UD).
The problems are:
- This field is only a u8 and uses the IB standard rate enumerations. For pacing, we need arbitrary rates.
- The field doesn't apply to Raw Ethernet QPs: it is part of the AH/AV, which is not applicable to Raw Ethernet. It would be incorrect to support AV/AH for Raw Ethernet QPs.
 
The rate_limit configuration comes from the application (for example, the bandwidth required for streaming), not from the fabric (SM).
The rate_limit resolution is also much finer: 1 Kb/s.

Both fields limit the rate, but they come from different entities and therefore require separate fields.
The actual limit should be the minimum of the two.
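
A minimal sketch of that "minimum of the two" rule is below. It assumes both
limits have already been converted to kbps and that 0 means "no limit" for
either input. The 0 convention for rate_limit is from the patch description;
applying it to the static rate, and the function name, are assumptions made
only for this illustration.

```c
#include <stdint.h>

/*
 * Combine the fabric-provided static rate and the application-provided
 * rate limit, both in kbps. A value of 0 is treated as "no limit", so the
 * result is the smaller of the limits that are actually set.
 */
static uint32_t effective_rate_kbps(uint32_t static_rate_kbps,
				    uint32_t rate_limit_kbps)
{
	if (static_rate_kbps == 0)
		return rate_limit_kbps;
	if (rate_limit_kbps == 0)
		return static_rate_kbps;
	return static_rate_kbps < rate_limit_kbps ?
	       static_rate_kbps : rate_limit_kbps;
}
```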

Rony

>Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH rdma-next 0/4] Add packet pacing support for IB verbs
       [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-11-08 17:49   ` [PATCH rdma-next 0/4] Add packet pacing support for IB verbs Hefty, Sean
@ 2016-11-17 18:15   ` Leon Romanovsky
  5 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2016-11-17 18:15 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 2661 bytes --]

On Mon, Oct 31, 2016 at 12:21:33PM +0200, Leon Romanovsky wrote:
> When sending from a 10G host to a 1G host, it is easy to overrun the receiver,
> leading to packet loss and traffic backing off. Similar problems occur when
> a 10G host sends data to a sub-10G virtual circuit, or a 40G host sending
> to a 10G host. Packet pacing could control packet injection rate and reduces
> network congestion to maximize throughput & minimize network latency.
>
> Packet pacing is a rate limiting and shaping for a QP (SQ for RAW QP), set
> and change the rate is done by modifying QP. This series of patch made the
> following high level changes:
>  1. Report rate limit capabilities through user data. Reported capabilities
>     include: The maximum and minimum rate limit in kbps supported by packet
>     pacing; Bitmap showing which QP types are supported by packet pacing
>     operation.
>  2. Extend modify QP interface for growing attributes. Add rate limit support
>     to the extended interface.
>  3. Enable mlx5-based hardware to be able to update the rate limit for
>     RAW QP packet.
>
> Available in the "topic/packet_pacing" topic branch of this git repo:
> git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git
>
> Or for browsing:
> https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=topic/packet_pacing

Hi Doug,

Please drop this patch series; we discovered an issue with the proposed
modify_qp implementation and will respin it.

Sorry for the inconvenience.

Thanks

>
> Thanks,
>   Bodong & Leon
>
> Bodong Wang (4):
>   IB/mlx5: Report mlx5 packet pacing capabilities when querying device
>   IB/core: Support rate limit for packet pacing
>   IB/uverbs: Extend modify_qp and support packet pacing
>   IB/mlx5: Update the rate limit according to user setting for RAW QP
>
>  drivers/infiniband/core/uverbs.h      |   1 +
>  drivers/infiniband/core/uverbs_cmd.c  | 178 +++++++++++++++++++++-------------
>  drivers/infiniband/core/uverbs_main.c |   1 +
>  drivers/infiniband/core/verbs.c       |   2 +
>  drivers/infiniband/hw/mlx5/main.c     |  16 ++-
>  drivers/infiniband/hw/mlx5/mlx5_ib.h  |   1 +
>  drivers/infiniband/hw/mlx5/qp.c       |  71 ++++++++++++--
>  include/rdma/ib_verbs.h               |   2 +
>  include/uapi/rdma/ib_user_verbs.h     |  12 +++
>  include/uapi/rdma/mlx5-abi.h          |  13 +++
>  10 files changed, 219 insertions(+), 78 deletions(-)
>
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-11-17 18:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-31 10:21 [PATCH rdma-next 0/4] Add packet pacing support for IB verbs Leon Romanovsky
     [not found] ` <1477909297-14491-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-10-31 10:21   ` [PATCH rdma-next 1/4] IB/mlx5: Report mlx5 packet pacing capabilities when querying device Leon Romanovsky
2016-10-31 10:21   ` [PATCH rdma-next 2/4] IB/core: Support rate limit for packet pacing Leon Romanovsky
     [not found]     ` <1477909297-14491-3-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-11-01 10:06       ` Yuval Shaia
     [not found]         ` <20161101100607.GB3727-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
2016-11-02 15:35           ` Leon Romanovsky
2016-11-09 17:27       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F70-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-09 21:00           ` Bodong Wang
2016-10-31 10:21   ` [PATCH rdma-next 3/4] IB/uverbs: Extend modify_qp and support " Leon Romanovsky
2016-10-31 10:21   ` [PATCH rdma-next 4/4] IB/mlx5: Update the rate limit according to user setting for RAW QP Leon Romanovsky
2016-11-08 17:49   ` [PATCH rdma-next 0/4] Add packet pacing support for IB verbs Hefty, Sean
     [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB0A7B31-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-09  6:40       ` Leon Romanovsky
     [not found]         ` <20161109064009.GE27883-2ukJVAZIZ/Y@public.gmane.org>
2016-11-09 17:06           ` Hefty, Sean
     [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB0A7F0A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-10  7:22               ` Leon Romanovsky
     [not found]                 ` <20161110072242.GC28957-2ukJVAZIZ/Y@public.gmane.org>
2016-11-10 16:07                   ` Bodong Wang
2016-11-10 16:47                   ` Jason Gunthorpe
     [not found]                     ` <CAFo2czDeGUrA8yYAJ0r5-8q5T=Y=gZojjfHrLqxribZaexmbOA@mail.gmail.com>
     [not found]                       ` <HE1PR0501MB27291D7FD8782D6650C57E5ACABF0@HE1PR0501MB2729.eurprd05.prod.outlook.com>
     [not found]                         ` <DB5PR0501MB19281A8064B283BE35EDE821B0BF0@DB5PR0501MB1928.eurprd05.prod.outlook.com>
     [not found]                           ` <DB5PR0501MB19281A8064B283BE35EDE821B0BF0-1FH/Iesddo5/SeJcUcAJq8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-11-15 14:43                             ` Rony Efraim
2016-11-17 18:15   ` Leon Romanovsky
