* [PATCH rdma-next 0/2] Packet pacing DEVX API
@ 2020-02-19 19:05 Leon Romanovsky
  2020-02-19 19:05 ` [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs Leon Romanovsky
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Leon Romanovsky @ 2020-02-19 19:05 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Yishai Hadas, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

This series from Yishai extends packet pacing to work over the DEVX
interface. The first patch refactors the mlx5_core internal logic;
the second patch adds the RDMA APIs.

Thanks

Yishai Hadas (2):
  net/mlx5: Expose raw packet pacing APIs
  IB/mlx5: Introduce UAPIs to manage packet pacing

 drivers/infiniband/hw/mlx5/Makefile          |   1 +
 drivers/infiniband/hw/mlx5/main.c            |   1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h         |   6 +
 drivers/infiniband/hw/mlx5/qos.c             | 136 +++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/rl.c | 130 +++++++++++++-----
 include/linux/mlx5/driver.h                  |  11 +-
 include/linux/mlx5/mlx5_ifc.h                |  26 ++--
 include/uapi/rdma/mlx5_user_ioctl_cmds.h     |  17 +++
 include/uapi/rdma/mlx5_user_ioctl_verbs.h    |   4 +
 9 files changed, 287 insertions(+), 45 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/qos.c

--
2.24.1



* [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs
  2020-02-19 19:05 [PATCH rdma-next 0/2] Packet pacing DEVX API Leon Romanovsky
@ 2020-02-19 19:05 ` Leon Romanovsky
  2020-02-21 19:04   ` Saeed Mahameed
  2020-02-19 19:05 ` [PATCH rdma-next 2/2] IB/mlx5: Introduce UAPIs to manage packet pacing Leon Romanovsky
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2020-02-19 19:05 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Yishai Hadas, Saeed Mahameed,
	linux-netdev

From: Yishai Hadas <yishaih@mellanox.com>

Expose raw packet pacing APIs to be used by DEVX-based applications.
The existing code was refactored to have a single flow with the new raw
APIs.

The new raw APIs take a 'pp_rate_limit_context', a uid and a
'dedicated' flag as input when looking for an existing entry.

This raw mode allows future device-specification fields to be carried
in the raw context without changing the existing logic and code.
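
For orientation, the raw context layout this mode carries (added to
mlx5_ifc.h by this patch: rate_limit in dword 0, burst_upper_bound in
dword 1, typical_packet_size in the low 16 bits of dword 2, big-endian,
0x180 bits in total) can be mimicked in a standalone sketch. The helper
names below are illustrative stand-ins for MLX5_SET(), not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 0x180 bits -> 48 bytes, matching
 * MLX5_ST_SZ_BYTES(set_pp_rate_limit_context). */
#define PP_CTX_BYTES 48

/* Illustrative stand-in for MLX5_SET(): PRM fields live in big-endian
 * dwords, with the first field of the struct in the most significant
 * bits of its dword. */
void set_be32(uint8_t *ctx, unsigned int dword, uint32_t val)
{
	ctx[dword * 4 + 0] = (uint8_t)(val >> 24);
	ctx[dword * 4 + 1] = (uint8_t)(val >> 16);
	ctx[dword * 4 + 2] = (uint8_t)(val >> 8);
	ctx[dword * 4 + 3] = (uint8_t)(val >> 0);
}

/* Pack rate_limit, burst_upper_bound and typical_packet_size the same
 * way mlx5_rl_add_rate() builds rl_raw before calling the raw API. */
void pp_pack_ctx(uint8_t ctx[PP_CTX_BYTES], uint32_t rate,
		 uint32_t burst, uint16_t typical_pkt_sz)
{
	memset(ctx, 0, PP_CTX_BYTES);
	set_be32(ctx, 0, rate);           /* rate_limit[0x20]          */
	set_be32(ctx, 1, burst);          /* burst_upper_bound[0x20]   */
	set_be32(ctx, 2, typical_pkt_sz); /* reserved[0x10] +
					     typical_packet_size[0x10] */
}
```

Packing rate 1000 Mbps, for example, places 0x000003e8 in the first
dword, which is what the raw API would then pass on to firmware.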

The ability to ask for a dedicated entry gives an application control
to allocate entries according to its needs.

A dedicated entry cannot be reused by any other process, and it also
lets a process spread its traffic across several entries so that
different hardware resources are used to enforce the rate.
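
The shared/dedicated lookup rules can be sketched as a standalone
mirror of find_rl_entry() (simplified types and illustrative names;
this is not the kernel code itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct mlx5_rl_entry. */
struct rl_entry {
	uint8_t raw[16];   /* stand-in for the raw PRM context */
	uint16_t uid;
	uint64_t refcount;
	bool dedicated;
};

/* Mirrors find_rl_entry(): a dedicated request takes any free slot and
 * never shares; a shared request first tries to reuse a live,
 * non-dedicated entry with the same raw context and uid, otherwise it
 * falls back to the first free slot. Returns NULL when the table is
 * full. */
struct rl_entry *find_entry(struct rl_entry *tbl, size_t n,
			    const uint8_t *raw, uint16_t uid, bool dedicated)
{
	struct rl_entry *free_slot = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (dedicated) {
			if (!tbl[i].refcount)
				return &tbl[i];
			continue;
		}
		if (tbl[i].refcount) {
			if (tbl[i].dedicated)
				continue;
			if (!memcmp(tbl[i].raw, raw, sizeof(tbl[i].raw)) &&
			    tbl[i].uid == uid)
				return &tbl[i];
		} else if (!free_slot) {
			free_slot = &tbl[i];
		}
	}
	return free_slot;
}
```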

The per-entry reference counter was changed to u64 to prevent any
possibility of overflow.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/rl.c | 130 ++++++++++++++-----
 include/linux/mlx5/driver.h                  |  11 +-
 include/linux/mlx5/mlx5_ifc.h                |  26 ++--
 3 files changed, 122 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rl.c b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
index 01c380425f9d..f3b29d9ade1f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/rl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
@@ -101,22 +101,39 @@ int mlx5_destroy_scheduling_element_cmd(struct mlx5_core_dev *dev, u8 hierarchy,
 	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
 }
 
+static bool mlx5_rl_are_equal_raw(struct mlx5_rl_entry *entry, void *rl_in,
+				  u16 uid)
+{
+	return (!memcmp(entry->rl_raw, rl_in, sizeof(entry->rl_raw)) &&
+		entry->uid == uid);
+}
+
 /* Finds an entry where we can register the given rate
  * If the rate already exists, return the entry where it is registered,
  * otherwise return the first available entry.
  * If the table is full, return NULL
  */
 static struct mlx5_rl_entry *find_rl_entry(struct mlx5_rl_table *table,
-					   struct mlx5_rate_limit *rl)
+					   void *rl_in, u16 uid, bool dedicated)
 {
 	struct mlx5_rl_entry *ret_entry = NULL;
 	bool empty_found = false;
 	int i;
 
 	for (i = 0; i < table->max_size; i++) {
-		if (mlx5_rl_are_equal(&table->rl_entry[i].rl, rl))
-			return &table->rl_entry[i];
-		if (!empty_found && !table->rl_entry[i].rl.rate) {
+		if (dedicated) {
+			if (!table->rl_entry[i].refcount)
+				return &table->rl_entry[i];
+			continue;
+		}
+
+		if (table->rl_entry[i].refcount) {
+			if (table->rl_entry[i].dedicated)
+				continue;
+			if (mlx5_rl_are_equal_raw(&table->rl_entry[i], rl_in,
+						  uid))
+				return &table->rl_entry[i];
+		} else if (!empty_found) {
 			empty_found = true;
 			ret_entry = &table->rl_entry[i];
 		}
@@ -126,18 +143,19 @@ static struct mlx5_rl_entry *find_rl_entry(struct mlx5_rl_table *table,
 }
 
 static int mlx5_set_pp_rate_limit_cmd(struct mlx5_core_dev *dev,
-				      u16 index,
-				      struct mlx5_rate_limit *rl)
+				      struct mlx5_rl_entry *entry, bool set)
 {
-	u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)]   = {0};
-	u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {0};
+	u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)]   = {};
+	u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {};
+	void *pp_context;
 
+	pp_context = MLX5_ADDR_OF(set_pp_rate_limit_in, in, ctx);
 	MLX5_SET(set_pp_rate_limit_in, in, opcode,
 		 MLX5_CMD_OP_SET_PP_RATE_LIMIT);
-	MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, index);
-	MLX5_SET(set_pp_rate_limit_in, in, rate_limit, rl->rate);
-	MLX5_SET(set_pp_rate_limit_in, in, burst_upper_bound, rl->max_burst_sz);
-	MLX5_SET(set_pp_rate_limit_in, in, typical_packet_size, rl->typical_pkt_sz);
+	MLX5_SET(set_pp_rate_limit_in, in, uid, entry->uid);
+	MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, entry->index);
+	if (set)
+		memcpy(pp_context, entry->rl_raw, sizeof(entry->rl_raw));
 	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
 }
 
@@ -158,23 +176,25 @@ bool mlx5_rl_are_equal(struct mlx5_rate_limit *rl_0,
 }
 EXPORT_SYMBOL(mlx5_rl_are_equal);
 
-int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
-		     struct mlx5_rate_limit *rl)
+int mlx5_rl_add_rate_raw(struct mlx5_core_dev *dev, void *rl_in, u16 uid,
+			 bool dedicated_entry, u16 *index)
 {
 	struct mlx5_rl_table *table = &dev->priv.rl_table;
 	struct mlx5_rl_entry *entry;
 	int err = 0;
+	u32 rate;
 
+	rate = MLX5_GET(set_pp_rate_limit_context, rl_in, rate_limit);
 	mutex_lock(&table->rl_lock);
 
-	if (!rl->rate || !mlx5_rl_is_in_range(dev, rl->rate)) {
+	if (!rate || !mlx5_rl_is_in_range(dev, rate)) {
 		mlx5_core_err(dev, "Invalid rate: %u, should be %u to %u\n",
-			      rl->rate, table->min_rate, table->max_rate);
+			      rate, table->min_rate, table->max_rate);
 		err = -EINVAL;
 		goto out;
 	}
 
-	entry = find_rl_entry(table, rl);
+	entry = find_rl_entry(table, rl_in, uid, dedicated_entry);
 	if (!entry) {
 		mlx5_core_err(dev, "Max number of %u rates reached\n",
 			      table->max_size);
@@ -185,16 +205,24 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
 		/* rate already configured */
 		entry->refcount++;
 	} else {
+		memcpy(entry->rl_raw, rl_in, sizeof(entry->rl_raw));
+		entry->uid = uid;
 		/* new rate limit */
-		err = mlx5_set_pp_rate_limit_cmd(dev, entry->index, rl);
+		err = mlx5_set_pp_rate_limit_cmd(dev, entry, true);
 		if (err) {
-			mlx5_core_err(dev, "Failed configuring rate limit(err %d): rate %u, max_burst_sz %u, typical_pkt_sz %u\n",
-				      err, rl->rate, rl->max_burst_sz,
-				      rl->typical_pkt_sz);
+			mlx5_core_err(
+				dev,
+				"Failed configuring rate limit(err %d): rate %u, max_burst_sz %u, typical_pkt_sz %u\n",
+				err, rate,
+				MLX5_GET(set_pp_rate_limit_context, rl_in,
+					 burst_upper_bound),
+				MLX5_GET(set_pp_rate_limit_context, rl_in,
+					 typical_packet_size));
 			goto out;
 		}
-		entry->rl = *rl;
+
 		entry->refcount = 1;
+		entry->dedicated = dedicated_entry;
 	}
 	*index = entry->index;
 
@@ -202,20 +230,61 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
 	mutex_unlock(&table->rl_lock);
 	return err;
 }
+EXPORT_SYMBOL(mlx5_rl_add_rate_raw);
+
+void mlx5_rl_remove_rate_raw(struct mlx5_core_dev *dev, u16 index)
+{
+	struct mlx5_rl_table *table = &dev->priv.rl_table;
+	struct mlx5_rl_entry *entry;
+
+	mutex_lock(&table->rl_lock);
+	entry = &table->rl_entry[index - 1];
+	entry->refcount--;
+	if (!entry->refcount)
+		/* need to remove rate */
+		mlx5_set_pp_rate_limit_cmd(dev, entry, false);
+	mutex_unlock(&table->rl_lock);
+}
+EXPORT_SYMBOL(mlx5_rl_remove_rate_raw);
+
+int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
+		     struct mlx5_rate_limit *rl)
+{
+	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)] = {};
+
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, rate_limit, rl->rate);
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, burst_upper_bound,
+		 rl->max_burst_sz);
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, typical_packet_size,
+		 rl->typical_pkt_sz);
+
+	return mlx5_rl_add_rate_raw(dev, rl_raw,
+				    MLX5_CAP_QOS(dev, packet_pacing_uid) ?
+					MLX5_SHARED_RESOURCE_UID : 0,
+				    false, index);
+}
 EXPORT_SYMBOL(mlx5_rl_add_rate);
 
 void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl)
 {
+	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)] = {};
 	struct mlx5_rl_table *table = &dev->priv.rl_table;
 	struct mlx5_rl_entry *entry = NULL;
-	struct mlx5_rate_limit reset_rl = {0};
 
 	/* 0 is a reserved value for unlimited rate */
 	if (rl->rate == 0)
 		return;
 
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, rate_limit, rl->rate);
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, burst_upper_bound,
+		 rl->max_burst_sz);
+	MLX5_SET(set_pp_rate_limit_context, rl_raw, typical_packet_size,
+		 rl->typical_pkt_sz);
+
 	mutex_lock(&table->rl_lock);
-	entry = find_rl_entry(table, rl);
+	entry = find_rl_entry(table, rl_raw,
+			      MLX5_CAP_QOS(dev, packet_pacing_uid) ?
+				MLX5_SHARED_RESOURCE_UID : 0, false);
 	if (!entry || !entry->refcount) {
 		mlx5_core_warn(dev, "Rate %u, max_burst_sz %u typical_pkt_sz %u are not configured\n",
 			       rl->rate, rl->max_burst_sz, rl->typical_pkt_sz);
@@ -223,11 +292,9 @@ void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl)
 	}
 
 	entry->refcount--;
-	if (!entry->refcount) {
+	if (!entry->refcount)
 		/* need to remove rate */
-		mlx5_set_pp_rate_limit_cmd(dev, entry->index, &reset_rl);
-		entry->rl = reset_rl;
-	}
+		mlx5_set_pp_rate_limit_cmd(dev, entry, false);
 
 out:
 	mutex_unlock(&table->rl_lock);
@@ -273,14 +340,13 @@ int mlx5_init_rl_table(struct mlx5_core_dev *dev)
 void mlx5_cleanup_rl_table(struct mlx5_core_dev *dev)
 {
 	struct mlx5_rl_table *table = &dev->priv.rl_table;
-	struct mlx5_rate_limit rl = {0};
 	int i;
 
 	/* Clear all configured rates */
 	for (i = 0; i < table->max_size; i++)
-		if (table->rl_entry[i].rl.rate)
-			mlx5_set_pp_rate_limit_cmd(dev, table->rl_entry[i].index,
-						   &rl);
+		if (table->rl_entry[i].refcount)
+			mlx5_set_pp_rate_limit_cmd(dev, &table->rl_entry[i],
+						   false);
 
 	kfree(dev->priv.rl_table.rl_entry);
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 277a51d3ec40..f2b4225ed650 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -518,9 +518,11 @@ struct mlx5_rate_limit {
 };
 
 struct mlx5_rl_entry {
-	struct mlx5_rate_limit	rl;
-	u16                     index;
-	u16                     refcount;
+	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)];
+	u16 index;
+	u64 refcount;
+	u16 uid;
+	u8 dedicated : 1;
 };
 
 struct mlx5_rl_table {
@@ -1007,6 +1009,9 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
 		     struct mlx5_rate_limit *rl);
 void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl);
 bool mlx5_rl_is_in_range(struct mlx5_core_dev *dev, u32 rate);
+int mlx5_rl_add_rate_raw(struct mlx5_core_dev *dev, void *rl_in, u16 uid,
+			 bool dedicated_entry, u16 *index);
+void mlx5_rl_remove_rate_raw(struct mlx5_core_dev *dev, u16 index);
 bool mlx5_rl_are_equal(struct mlx5_rate_limit *rl_0,
 		       struct mlx5_rate_limit *rl_1);
 int mlx5_alloc_bfreg(struct mlx5_core_dev *mdev, struct mlx5_sq_bfreg *bfreg,
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index ff8c9d527bb4..7d89ab64b372 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -810,7 +810,9 @@ struct mlx5_ifc_qos_cap_bits {
 	u8         reserved_at_4[0x1];
 	u8         packet_pacing_burst_bound[0x1];
 	u8         packet_pacing_typical_size[0x1];
-	u8         reserved_at_7[0x19];
+	u8         reserved_at_7[0x4];
+	u8         packet_pacing_uid[0x1];
+	u8         reserved_at_c[0x14];
 
 	u8         reserved_at_20[0x20];
 
@@ -8262,9 +8264,20 @@ struct mlx5_ifc_set_pp_rate_limit_out_bits {
 	u8         reserved_at_40[0x40];
 };
 
+struct mlx5_ifc_set_pp_rate_limit_context_bits {
+	u8         rate_limit[0x20];
+
+	u8	   burst_upper_bound[0x20];
+
+	u8         reserved_at_40[0x10];
+	u8	   typical_packet_size[0x10];
+
+	u8         reserved_at_60[0x120];
+};
+
 struct mlx5_ifc_set_pp_rate_limit_in_bits {
 	u8         opcode[0x10];
-	u8         reserved_at_10[0x10];
+	u8         uid[0x10];
 
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
@@ -8274,14 +8287,7 @@ struct mlx5_ifc_set_pp_rate_limit_in_bits {
 
 	u8         reserved_at_60[0x20];
 
-	u8         rate_limit[0x20];
-
-	u8	   burst_upper_bound[0x20];
-
-	u8         reserved_at_c0[0x10];
-	u8	   typical_packet_size[0x10];
-
-	u8         reserved_at_e0[0x120];
+	struct mlx5_ifc_set_pp_rate_limit_context_bits ctx;
 };
 
 struct mlx5_ifc_access_register_out_bits {
-- 
2.24.1



* [PATCH rdma-next 2/2] IB/mlx5: Introduce UAPIs to manage packet pacing
  2020-02-19 19:05 [PATCH rdma-next 0/2] Packet pacing DEVX API Leon Romanovsky
  2020-02-19 19:05 ` [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs Leon Romanovsky
@ 2020-02-19 19:05 ` Leon Romanovsky
  2020-03-04  0:33 ` [PATCH rdma-next 0/2] Packet pacing DEVX API Jason Gunthorpe
  2020-03-10 15:44 ` Jason Gunthorpe
  3 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2020-02-19 19:05 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Yishai Hadas, Saeed Mahameed,
	linux-netdev

From: Yishai Hadas <yishaih@mellanox.com>

Introduce a packet pacing uobject and its alloc and destroy methods.

This uobject holds an mlx5 packet pacing context, laid out according to
the device specification, and enables managing the packet pacing device
entries needed by DEVX applications.
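
The uid choice made by the alloc handler reduces to a tiny standalone
sketch. The flag value comes from mlx5_user_ioctl_verbs.h in this
patch; MLX5_SHARED_RESOURCE_UID's value (0xffff) is an assumption here,
taken from mlx5_ifc.h and not shown in the diff, and the function name
is illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* From include/uapi/rdma/mlx5_user_ioctl_verbs.h in this patch. */
#define MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX (1u << 0)
/* Assumed value (defined in mlx5_ifc.h, not shown in this diff). */
#define MLX5_SHARED_RESOURCE_UID 0xffff

/* Mirrors the alloc handler: a dedicated entry is owned by the
 * caller's DEVX uid; a shared entry uses the global shared-resource
 * uid so identical rates from different contexts can be matched. */
uint16_t pp_select_uid(uint32_t flags, uint16_t devx_uid)
{
	return (flags & MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX) ?
		devx_uid : MLX5_SHARED_RESOURCE_UID;
}
```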

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/Makefile       |   1 +
 drivers/infiniband/hw/mlx5/main.c         |   1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h      |   6 +
 drivers/infiniband/hw/mlx5/qos.c          | 136 ++++++++++++++++++++++
 include/uapi/rdma/mlx5_user_ioctl_cmds.h  |  17 +++
 include/uapi/rdma/mlx5_user_ioctl_verbs.h |   4 +
 6 files changed, 165 insertions(+)
 create mode 100644 drivers/infiniband/hw/mlx5/qos.c

diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
index d0a043ccbe58..2a334800f109 100644
--- a/drivers/infiniband/hw/mlx5/Makefile
+++ b/drivers/infiniband/hw/mlx5/Makefile
@@ -8,3 +8,4 @@ mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
 mlx5_ib-$(CONFIG_MLX5_ESWITCH) += ib_rep.o
 mlx5_ib-$(CONFIG_INFINIBAND_USER_ACCESS) += devx.o
 mlx5_ib-$(CONFIG_INFINIBAND_USER_ACCESS) += flow.o
+mlx5_ib-$(CONFIG_INFINIBAND_USER_ACCESS) += qos.o
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 398e9f32746e..01bddd584405 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -6250,6 +6250,7 @@ ADD_UVERBS_ATTRIBUTES_SIMPLE(
 static const struct uapi_definition mlx5_ib_defs[] = {
 	UAPI_DEF_CHAIN(mlx5_ib_devx_defs),
 	UAPI_DEF_CHAIN(mlx5_ib_flow_defs),
+	UAPI_DEF_CHAIN(mlx5_ib_qos_defs),
 
 	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_FLOW_ACTION,
 				&mlx5_ib_flow_action),
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 2d65df34a6d0..6c85ca766678 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -201,6 +201,11 @@ struct mlx5_ib_flow_matcher {
 	u8			match_criteria_enable;
 };
 
+struct mlx5_ib_pp {
+	u16 index;
+	struct mlx5_core_dev *mdev;
+};
+
 struct mlx5_ib_flow_db {
 	struct mlx5_ib_flow_prio	prios[MLX5_IB_NUM_FLOW_FT];
 	struct mlx5_ib_flow_prio	egress_prios[MLX5_IB_NUM_FLOW_FT];
@@ -1379,6 +1384,7 @@ int mlx5_ib_fill_stat_entry(struct sk_buff *msg,
 
 extern const struct uapi_definition mlx5_ib_devx_defs[];
 extern const struct uapi_definition mlx5_ib_flow_defs[];
+extern const struct uapi_definition mlx5_ib_qos_defs[];
 
 #if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
 int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user);
diff --git a/drivers/infiniband/hw/mlx5/qos.c b/drivers/infiniband/hw/mlx5/qos.c
new file mode 100644
index 000000000000..f822b06e7c9e
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/qos.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2020, Mellanox Technologies inc.  All rights reserved.
+ */
+
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/mlx5_user_ioctl_cmds.h>
+#include <rdma/mlx5_user_ioctl_verbs.h>
+#include <linux/mlx5/driver.h>
+#include "mlx5_ib.h"
+
+#define UVERBS_MODULE_NAME mlx5_ib
+#include <rdma/uverbs_named_ioctl.h>
+
+static bool pp_is_supported(struct ib_device *device)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+
+	return (MLX5_CAP_GEN(dev->mdev, qos) &&
+		MLX5_CAP_QOS(dev->mdev, packet_pacing) &&
+		MLX5_CAP_QOS(dev->mdev, packet_pacing_uid));
+}
+
+static int UVERBS_HANDLER(MLX5_IB_METHOD_PP_OBJ_ALLOC)(
+	struct uverbs_attr_bundle *attrs)
+{
+	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)] = {};
+	struct ib_uobject *uobj = uverbs_attr_get_uobject(attrs,
+		MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE);
+	struct mlx5_ib_dev *dev;
+	struct mlx5_ib_ucontext *c;
+	struct mlx5_ib_pp *pp_entry;
+	void *in_ctx;
+	u16 uid;
+	int inlen;
+	u32 flags;
+	int err;
+
+	c = to_mucontext(ib_uverbs_get_ucontext(attrs));
+	if (IS_ERR(c))
+		return PTR_ERR(c);
+
+	/* The allocated entry can be used only by a DEVX context */
+	if (!c->devx_uid)
+		return -EINVAL;
+
+	dev = to_mdev(c->ibucontext.device);
+	pp_entry = kzalloc(sizeof(*pp_entry), GFP_KERNEL);
+	if (!pp_entry)
+		return -ENOMEM;
+
+	in_ctx = uverbs_attr_get_alloced_ptr(attrs,
+					     MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX);
+	inlen = uverbs_attr_get_len(attrs,
+				    MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX);
+	memcpy(rl_raw, in_ctx, inlen);
+	err = uverbs_get_flags32(&flags, attrs,
+		MLX5_IB_ATTR_PP_OBJ_ALLOC_FLAGS,
+		MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX);
+	if (err)
+		goto err;
+
+	uid = (flags & MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX) ?
+		c->devx_uid : MLX5_SHARED_RESOURCE_UID;
+
+	err = mlx5_rl_add_rate_raw(dev->mdev, rl_raw, uid,
+			(flags & MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX),
+			&pp_entry->index);
+	if (err)
+		goto err;
+
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_PP_OBJ_ALLOC_INDEX,
+			     &pp_entry->index, sizeof(pp_entry->index));
+	if (err)
+		goto clean;
+
+	pp_entry->mdev = dev->mdev;
+	uobj->object = pp_entry;
+	return 0;
+
+clean:
+	mlx5_rl_remove_rate_raw(dev->mdev, pp_entry->index);
+err:
+	kfree(pp_entry);
+	return err;
+}
+
+static int pp_obj_cleanup(struct ib_uobject *uobject,
+			  enum rdma_remove_reason why,
+			  struct uverbs_attr_bundle *attrs)
+{
+	struct mlx5_ib_pp *pp_entry = uobject->object;
+
+	mlx5_rl_remove_rate_raw(pp_entry->mdev, pp_entry->index);
+	kfree(pp_entry);
+	return 0;
+}
+
+DECLARE_UVERBS_NAMED_METHOD(
+	MLX5_IB_METHOD_PP_OBJ_ALLOC,
+	UVERBS_ATTR_IDR(MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE,
+			MLX5_IB_OBJECT_PP,
+			UVERBS_ACCESS_NEW,
+			UA_MANDATORY),
+	UVERBS_ATTR_PTR_IN(
+		MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX,
+		UVERBS_ATTR_SIZE(1,
+			MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)),
+		UA_MANDATORY,
+		UA_ALLOC_AND_COPY),
+	UVERBS_ATTR_FLAGS_IN(MLX5_IB_ATTR_PP_OBJ_ALLOC_FLAGS,
+			enum mlx5_ib_uapi_pp_alloc_flags,
+			UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_PP_OBJ_ALLOC_INDEX,
+			   UVERBS_ATTR_TYPE(u16),
+			   UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_METHOD_DESTROY(
+	MLX5_IB_METHOD_PP_OBJ_DESTROY,
+	UVERBS_ATTR_IDR(MLX5_IB_ATTR_PP_OBJ_DESTROY_HANDLE,
+			MLX5_IB_OBJECT_PP,
+			UVERBS_ACCESS_DESTROY,
+			UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_OBJECT(MLX5_IB_OBJECT_PP,
+			    UVERBS_TYPE_ALLOC_IDR(pp_obj_cleanup),
+			    &UVERBS_METHOD(MLX5_IB_METHOD_PP_OBJ_ALLOC),
+			    &UVERBS_METHOD(MLX5_IB_METHOD_PP_OBJ_DESTROY));
+
+
+const struct uapi_definition mlx5_ib_qos_defs[] = {
+	UAPI_DEF_CHAIN_OBJ_TREE_NAMED(
+		MLX5_IB_OBJECT_PP,
+		UAPI_DEF_IS_OBJ_SUPPORTED(pp_is_supported)),
+	{},
+};
diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
index afe7da6f2b8e..8f4a417fc70a 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
@@ -143,6 +143,22 @@ enum mlx5_ib_devx_umem_dereg_attrs {
 	MLX5_IB_ATTR_DEVX_UMEM_DEREG_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
 };
 
+enum mlx5_ib_pp_obj_methods {
+	MLX5_IB_METHOD_PP_OBJ_ALLOC = (1U << UVERBS_ID_NS_SHIFT),
+	MLX5_IB_METHOD_PP_OBJ_DESTROY,
+};
+
+enum mlx5_ib_pp_alloc_attrs {
+	MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+	MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX,
+	MLX5_IB_ATTR_PP_OBJ_ALLOC_FLAGS,
+	MLX5_IB_ATTR_PP_OBJ_ALLOC_INDEX,
+};
+
+enum mlx5_ib_pp_obj_destroy_attrs {
+	MLX5_IB_ATTR_PP_OBJ_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+};
+
 enum mlx5_ib_devx_umem_methods {
 	MLX5_IB_METHOD_DEVX_UMEM_REG = (1U << UVERBS_ID_NS_SHIFT),
 	MLX5_IB_METHOD_DEVX_UMEM_DEREG,
@@ -173,6 +189,7 @@ enum mlx5_ib_objects {
 	MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD,
 	MLX5_IB_OBJECT_DEVX_ASYNC_EVENT_FD,
 	MLX5_IB_OBJECT_VAR,
+	MLX5_IB_OBJECT_PP,
 };
 
 enum mlx5_ib_flow_matcher_create_attrs {
diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h
index 88b6ca70c2fe..b4641a7865f7 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h
@@ -73,5 +73,9 @@ struct mlx5_ib_uapi_devx_async_event_hdr {
 	__u8		out_data[];
 };
 
+enum mlx5_ib_uapi_pp_alloc_flags {
+	MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX = 1 << 0,
+};
+
 #endif
 
-- 
2.24.1



* Re: [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs
  2020-02-19 19:05 ` [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs Leon Romanovsky
@ 2020-02-21 19:04   ` Saeed Mahameed
  2020-02-23  8:53     ` Yishai Hadas
  0 siblings, 1 reply; 10+ messages in thread
From: Saeed Mahameed @ 2020-02-21 19:04 UTC (permalink / raw)
  To: Jason Gunthorpe, leon, dledford
  Cc: Yishai Hadas, netdev, Leon Romanovsky, linux-rdma

On Wed, 2020-02-19 at 21:05 +0200, Leon Romanovsky wrote:
> From: Yishai Hadas <yishaih@mellanox.com>
> 
> Expose raw packet pacing APIs to be used by DEVX based applications.
> The existing code was refactored to have a single flow with the new
> raw
> APIs.
> 
> The new raw APIs considered the input of 'pp_rate_limit_context',
> uid,
> 'dedicated', upon looking for an existing entry.
> 
> This raw mode enables future device specification data in the raw
> context without changing the existing logic and code.
> 
> The ability to ask for a dedicated entry gives control for
> application
> to allocate entries according to its needs.
> 
> A dedicated entry may not be used by some other process and it also
> enables the process spreading its resources to some different entries
> for use different hardware resources as part of enforcing the rate.
> 

It sounds like "dedicated" means "no sharing", which means you don't
need to use the mlx5_core API and can go directly to FW. The problem is
that the entry indices are managed by the driver, and I guess this is
why you had to expand the mlx5_core API.

I would like to suggest some alternatives to simplify the approach and
allow using RAW PRM for DEVX properly:

1. Reserve RL entries for DEVX and let DEVX access FW directly with
PRM commands.
2. Keep the mlx5_core API simple: instead of adding this raw/non-raw
API and complicating the RL API with the dedicated bit, just add
mlx5_rl_{alloc/free}_index(). This will dedicate an RL index for you
from the end of the RL indices database, and you are then free to
access the FW with this index however you like via direct PRM commands.
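
A minimal sketch of that suggestion, with hypothetical names and an
assumed table size (nothing below is an existing mlx5 API): dedicated
indices are handed out from the end of the 1-based index range so they
never collide with shared entries allocated from the front.

```c
#include <assert.h>
#include <stdbool.h>

#define RL_TABLE_SIZE 8 /* assumed; the real size comes from FW caps */

static bool rl_index_used[RL_TABLE_SIZE + 1]; /* indices are 1-based */

/* Hypothetical mlx5_rl_alloc_index(): take the highest free index.
 * Returns -1 when the table is full. */
int rl_alloc_index(void)
{
	int i;

	for (i = RL_TABLE_SIZE; i >= 1; i--) {
		if (!rl_index_used[i]) {
			rl_index_used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Hypothetical mlx5_rl_free_index() counterpart. */
void rl_free_index(int index)
{
	if (index >= 1 && index <= RL_TABLE_SIZE)
		rl_index_used[index] = false;
}
```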

> The counter per entry mas changed to be u64 to prevent any option to
                   typo ^^^ was

> overflow.
> 
> Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 130 ++++++++++++++---
> --
>  include/linux/mlx5/driver.h                  |  11 +-
>  include/linux/mlx5/mlx5_ifc.h                |  26 ++--
>  3 files changed, 122 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rl.c
> b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
> index 01c380425f9d..f3b29d9ade1f 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/rl.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
> @@ -101,22 +101,39 @@ int mlx5_destroy_scheduling_element_cmd(struct
> mlx5_core_dev *dev, u8 hierarchy,
>  	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
>  }
>  
> +static bool mlx5_rl_are_equal_raw(struct mlx5_rl_entry *entry, void
> *rl_in,
> +				  u16 uid)
> +{
> +	return (!memcmp(entry->rl_raw, rl_in, sizeof(entry->rl_raw)) &&
> +		entry->uid == uid);
> +}
> +
>  /* Finds an entry where we can register the given rate
>   * If the rate already exists, return the entry where it is
> registered,
>   * otherwise return the first available entry.
>   * If the table is full, return NULL
>   */
>  static struct mlx5_rl_entry *find_rl_entry(struct mlx5_rl_table
> *table,
> -					   struct mlx5_rate_limit *rl)
> +					   void *rl_in, u16 uid, bool
> dedicated)
>  {
>  	struct mlx5_rl_entry *ret_entry = NULL;
>  	bool empty_found = false;
>  	int i;
>  
>  	for (i = 0; i < table->max_size; i++) {
> -		if (mlx5_rl_are_equal(&table->rl_entry[i].rl, rl))
> -			return &table->rl_entry[i];
> -		if (!empty_found && !table->rl_entry[i].rl.rate) {
> +		if (dedicated) {
> +			if (!table->rl_entry[i].refcount)
> +				return &table->rl_entry[i];
> +			continue;
> +		}
> +
> +		if (table->rl_entry[i].refcount) {
> +			if (table->rl_entry[i].dedicated)
> +				continue;
> +			if (mlx5_rl_are_equal_raw(&table->rl_entry[i],
> rl_in,
> +						  uid))
> +				return &table->rl_entry[i];
> +		} else if (!empty_found) {
>  			empty_found = true;
>  			ret_entry = &table->rl_entry[i];
>  		}
> @@ -126,18 +143,19 @@ static struct mlx5_rl_entry
> *find_rl_entry(struct mlx5_rl_table *table,
>  }
>  
>  static int mlx5_set_pp_rate_limit_cmd(struct mlx5_core_dev *dev,
> -				      u16 index,
> -				      struct mlx5_rate_limit *rl)
> +				      struct mlx5_rl_entry *entry, bool
> set)
>  {
> -	u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)]   = {0};
> -	u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {0};
> +	u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)]   = {};
> +	u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {};
> +	void *pp_context;
>  
> +	pp_context = MLX5_ADDR_OF(set_pp_rate_limit_in, in, ctx);
>  	MLX5_SET(set_pp_rate_limit_in, in, opcode,
>  		 MLX5_CMD_OP_SET_PP_RATE_LIMIT);
> -	MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, index);
> -	MLX5_SET(set_pp_rate_limit_in, in, rate_limit, rl->rate);
> -	MLX5_SET(set_pp_rate_limit_in, in, burst_upper_bound, rl-
> >max_burst_sz);
> -	MLX5_SET(set_pp_rate_limit_in, in, typical_packet_size, rl-
> >typical_pkt_sz);
> +	MLX5_SET(set_pp_rate_limit_in, in, uid, entry->uid);
> +	MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, entry-
> >index);
> +	if (set)
> +		memcpy(pp_context, entry->rl_raw, sizeof(entry-
> >rl_raw));
>  	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
>  }
>  
> @@ -158,23 +176,25 @@ bool mlx5_rl_are_equal(struct mlx5_rate_limit
> *rl_0,
>  }
>  EXPORT_SYMBOL(mlx5_rl_are_equal);
>  
> -int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
> -		     struct mlx5_rate_limit *rl)
> +int mlx5_rl_add_rate_raw(struct mlx5_core_dev *dev, void *rl_in, u16
> uid,
> +			 bool dedicated_entry, u16 *index)
>  {
>  	struct mlx5_rl_table *table = &dev->priv.rl_table;
>  	struct mlx5_rl_entry *entry;
>  	int err = 0;
> +	u32 rate;
>  
> +	rate = MLX5_GET(set_pp_rate_limit_context, rl_in, rate_limit);
>  	mutex_lock(&table->rl_lock);
>  
> -	if (!rl->rate || !mlx5_rl_is_in_range(dev, rl->rate)) {
> +	if (!rate || !mlx5_rl_is_in_range(dev, rate)) {
>  		mlx5_core_err(dev, "Invalid rate: %u, should be %u to
> %u\n",
> -			      rl->rate, table->min_rate, table-
> >max_rate);
> +			      rate, table->min_rate, table->max_rate);
>  		err = -EINVAL;
>  		goto out;
>  	}
>  
> -	entry = find_rl_entry(table, rl);
> +	entry = find_rl_entry(table, rl_in, uid, dedicated_entry);
>  	if (!entry) {
>  		mlx5_core_err(dev, "Max number of %u rates reached\n",
>  			      table->max_size);
> @@ -185,16 +205,24 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev,
> u16 *index,
>  		/* rate already configured */
>  		entry->refcount++;
>  	} else {
> +		memcpy(entry->rl_raw, rl_in, sizeof(entry->rl_raw));
> +		entry->uid = uid;
>  		/* new rate limit */
> -		err = mlx5_set_pp_rate_limit_cmd(dev, entry->index,
> rl);
> +		err = mlx5_set_pp_rate_limit_cmd(dev, entry, true);
>  		if (err) {
> -			mlx5_core_err(dev, "Failed configuring rate
> limit(err %d): rate %u, max_burst_sz %u, typical_pkt_sz %u\n",
> -				      err, rl->rate, rl->max_burst_sz,
> -				      rl->typical_pkt_sz);
> +			mlx5_core_err(
> +				dev,
> +				"Failed configuring rate limit(err %d):
> rate %u, max_burst_sz %u, typical_pkt_sz %u\n",
> +				err, rate,
> +				MLX5_GET(set_pp_rate_limit_context,
> rl_in,
> +					 burst_upper_bound),
> +				MLX5_GET(set_pp_rate_limit_context,
> rl_in,
> +					 typical_packet_size));
>  			goto out;
>  		}
> -		entry->rl = *rl;
> +
>  		entry->refcount = 1;
> +		entry->dedicated = dedicated_entry;
>  	}
>  	*index = entry->index;
>  
> @@ -202,20 +230,61 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev,
> u16 *index,
>  	mutex_unlock(&table->rl_lock);
>  	return err;
>  }
> +EXPORT_SYMBOL(mlx5_rl_add_rate_raw);
> +
> +void mlx5_rl_remove_rate_raw(struct mlx5_core_dev *dev, u16 index)
> +{
> +	struct mlx5_rl_table *table = &dev->priv.rl_table;
> +	struct mlx5_rl_entry *entry;
> +
> +	mutex_lock(&table->rl_lock);
> +	entry = &table->rl_entry[index - 1];
> +	entry->refcount--;
> +	if (!entry->refcount)
> +		/* need to remove rate */
> +		mlx5_set_pp_rate_limit_cmd(dev, entry, false);
> +	mutex_unlock(&table->rl_lock);
> +}
> +EXPORT_SYMBOL(mlx5_rl_remove_rate_raw);
> +
> +int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
> +		     struct mlx5_rate_limit *rl)
> +{
> +	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)] = {};
> +
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, rate_limit, rl->rate);
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, burst_upper_bound,
> +		 rl->max_burst_sz);
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, typical_packet_size,
> +		 rl->typical_pkt_sz);
> +
> +	return mlx5_rl_add_rate_raw(dev, rl_raw,
> +				    MLX5_CAP_QOS(dev, packet_pacing_uid) ?
> +					MLX5_SHARED_RESOURCE_UID : 0,
> +				    false, index);
> +}
>  EXPORT_SYMBOL(mlx5_rl_add_rate);
>  
>  void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl)
>  {
> +	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)] = {};
>  	struct mlx5_rl_table *table = &dev->priv.rl_table;
>  	struct mlx5_rl_entry *entry = NULL;
> -	struct mlx5_rate_limit reset_rl = {0};
>  
>  	/* 0 is a reserved value for unlimited rate */
>  	if (rl->rate == 0)
>  		return;
>  
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, rate_limit, rl->rate);
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, burst_upper_bound,
> +		 rl->max_burst_sz);
> +	MLX5_SET(set_pp_rate_limit_context, rl_raw, typical_packet_size,
> +		 rl->typical_pkt_sz);
> +
>  	mutex_lock(&table->rl_lock);
> -	entry = find_rl_entry(table, rl);
> +	entry = find_rl_entry(table, rl_raw,
> +			      MLX5_CAP_QOS(dev, packet_pacing_uid) ?
> +				MLX5_SHARED_RESOURCE_UID : 0, false);
>  	if (!entry || !entry->refcount) {
>  		mlx5_core_warn(dev, "Rate %u, max_burst_sz %u typical_pkt_sz %u are not configured\n",
>  			       rl->rate, rl->max_burst_sz, rl->typical_pkt_sz);
> @@ -223,11 +292,9 @@ void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl)
>  	}
>  
>  	entry->refcount--;
> -	if (!entry->refcount) {
> +	if (!entry->refcount)
>  		/* need to remove rate */
> -		mlx5_set_pp_rate_limit_cmd(dev, entry->index, &reset_rl);
> -		entry->rl = reset_rl;
> -	}
> +		mlx5_set_pp_rate_limit_cmd(dev, entry, false);
>  
> out:
>  	mutex_unlock(&table->rl_lock);
> @@ -273,14 +340,13 @@ int mlx5_init_rl_table(struct mlx5_core_dev *dev)
>  void mlx5_cleanup_rl_table(struct mlx5_core_dev *dev)
>  {
>  	struct mlx5_rl_table *table = &dev->priv.rl_table;
> -	struct mlx5_rate_limit rl = {0};
>  	int i;
>  
>  	/* Clear all configured rates */
>  	for (i = 0; i < table->max_size; i++)
> -		if (table->rl_entry[i].rl.rate)
> -			mlx5_set_pp_rate_limit_cmd(dev, table->rl_entry[i].index,
> -						   &rl);
> +		if (table->rl_entry[i].refcount)
> +			mlx5_set_pp_rate_limit_cmd(dev, &table->rl_entry[i],
> +						   false);
>  
>  	kfree(dev->priv.rl_table.rl_entry);
>  }
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> index 277a51d3ec40..f2b4225ed650 100644
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -518,9 +518,11 @@ struct mlx5_rate_limit {
>  };
>  
>  struct mlx5_rl_entry {
> -	struct mlx5_rate_limit	rl;
> -	u16                     index;
> -	u16                     refcount;
> +	u8 rl_raw[MLX5_ST_SZ_BYTES(set_pp_rate_limit_context)];
> +	u16 index;
> +	u64 refcount;
> +	u16 uid;
> +	u8 dedicated : 1;
>  };
>  
>  struct mlx5_rl_table {
> @@ -1007,6 +1009,9 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u16 *index,
>  		     struct mlx5_rate_limit *rl);
>  void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, struct mlx5_rate_limit *rl);
>  bool mlx5_rl_is_in_range(struct mlx5_core_dev *dev, u32 rate);
> +int mlx5_rl_add_rate_raw(struct mlx5_core_dev *dev, void *rl_in, u16 uid,
> +			 bool dedicated_entry, u16 *index);
> +void mlx5_rl_remove_rate_raw(struct mlx5_core_dev *dev, u16 index);
>  bool mlx5_rl_are_equal(struct mlx5_rate_limit *rl_0,
>  		       struct mlx5_rate_limit *rl_1);
>  int mlx5_alloc_bfreg(struct mlx5_core_dev *mdev, struct mlx5_sq_bfreg *bfreg,
> diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
> index ff8c9d527bb4..7d89ab64b372 100644
> --- a/include/linux/mlx5/mlx5_ifc.h
> +++ b/include/linux/mlx5/mlx5_ifc.h
> @@ -810,7 +810,9 @@ struct mlx5_ifc_qos_cap_bits {
>  	u8         reserved_at_4[0x1];
>  	u8         packet_pacing_burst_bound[0x1];
>  	u8         packet_pacing_typical_size[0x1];
> -	u8         reserved_at_7[0x19];
> +	u8         reserved_at_7[0x4];
> +	u8         packet_pacing_uid[0x1];
> +	u8         reserved_at_c[0x14];
>  
>  	u8         reserved_at_20[0x20];
>  
> @@ -8262,9 +8264,20 @@ struct mlx5_ifc_set_pp_rate_limit_out_bits {
>  	u8         reserved_at_40[0x40];
>  };
>  
> +struct mlx5_ifc_set_pp_rate_limit_context_bits {
> +	u8         rate_limit[0x20];
> +
> +	u8	   burst_upper_bound[0x20];
> +
> +	u8         reserved_at_40[0x10];
> +	u8	   typical_packet_size[0x10];
> +
> +	u8         reserved_at_60[0x120];
> +};
> +
>  struct mlx5_ifc_set_pp_rate_limit_in_bits {
>  	u8         opcode[0x10];
> -	u8         reserved_at_10[0x10];
> +	u8         uid[0x10];
>  
>  	u8         reserved_at_20[0x10];
>  	u8         op_mod[0x10];
> @@ -8274,14 +8287,7 @@ struct mlx5_ifc_set_pp_rate_limit_in_bits {
>  
>  	u8         reserved_at_60[0x20];
>  
> -	u8         rate_limit[0x20];
> -
> -	u8	   burst_upper_bound[0x20];
> -
> -	u8         reserved_at_c0[0x10];
> -	u8	   typical_packet_size[0x10];
> -
> -	u8         reserved_at_e0[0x120];
> +	struct mlx5_ifc_set_pp_rate_limit_context_bits ctx;
>  };
>  
>  struct mlx5_ifc_access_register_out_bits {

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs
  2020-02-21 19:04   ` Saeed Mahameed
@ 2020-02-23  8:53     ` Yishai Hadas
  2020-02-24 23:32       ` Saeed Mahameed
  0 siblings, 1 reply; 10+ messages in thread
From: Yishai Hadas @ 2020-02-23  8:53 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Jason Gunthorpe, leon, dledford, Yishai Hadas, netdev,
	Leon Romanovsky, linux-rdma, Yishai Hadas, Alex Rosenbaum

On 2/21/2020 9:04 PM, Saeed Mahameed wrote:
> On Wed, 2020-02-19 at 21:05 +0200, Leon Romanovsky wrote:
>> From: Yishai Hadas <yishaih@mellanox.com>
>>
>> Expose raw packet pacing APIs to be used by DEVX based applications.
>> The existing code was refactored to have a single flow with the new
>> raw
>> APIs.
>>
>> The new raw APIs considered the input of 'pp_rate_limit_context',
>> uid,
>> 'dedicated', upon looking for an existing entry.
>>
>> This raw mode enables future device specification data in the raw
>> context without changing the existing logic and code.
>>
>> The ability to ask for a dedicated entry gives control for
>> application
>> to allocate entries according to its needs.
>>
>> A dedicated entry may not be used by some other process and it also
>> enables the process spreading its resources to some different entries
>> for use different hardware resources as part of enforcing the rate.
>>
> 
> It sounds like the dedicated means "no sharing" which means you don't
> need to use the mlx5_core API and you can go directly to FW.. The
> problem is that the entry indices are managed by driver, and i guess
> this is the reason why you had to expand the mlx5_core API..
> 

The main reason for introducing the new mlx5_core APIs was the need to
support the "shared mode" in a "raw data" format, to avoid touching the
kernel again once the PRM supports extra fields.
As the RL indices are managed by the driver (mlx5_core), including the
sharing, we couldn't go directly to FW; the legacy API was refactored
inside the core to have one flow with the new raw APIs.
So we may need those APIs regardless of the dedicated mode.


> I would like to suggest some alternatives to simplify the approach and
> allow using RAW PRM for DEVX properly.
> 
> 1. preserve RL entries for DEVX and let DEVX access FW directly with
> PRM commands.
> 2. keep mlx5_core API simple and instead of adding this raw/non raw api
> and complicating the RL API with this dedicated bit:
> 
> just add mlx5_rl_{alloc/free}_index(), this will dedicate for you the
> RL index form the end of the RL indices database and you are free to
> access the FW with this index the way you like via direct PRM commands.
> 
As mentioned above, we may still need the new mlx5_core raw APIs for the
shared mode, which is the main usage of the API; we found it reasonable
to have the dedicated flag in the new raw alloc API instead of exposing
two more APIs only for that.

Please note that even if we go with those two extra APIs for the
dedicated mode, we would still need to maintain this information in the
core to prevent returning such an entry for other cases.

Also, the idea to preserve some entries at the end might be wasteful, as
there is no guarantee that DEVX will really be used, and even then it
may not ask for entries in dedicated mode.

Preserving them for this optional use case might prevent using them for
all other cases.


>> The counter per entry mas changed to be u64 to prevent any option to
>                     typo ^^^ was

Sure, thanks.

> 
>> overflow.


* Re: [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs
  2020-02-23  8:53     ` Yishai Hadas
@ 2020-02-24 23:32       ` Saeed Mahameed
  2020-02-25  7:43         ` Leon Romanovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Saeed Mahameed @ 2020-02-24 23:32 UTC (permalink / raw)
  To: yishaih
  Cc: Jason Gunthorpe, linux-rdma, rosenbaumalex, Yishai Hadas,
	Leon Romanovsky, netdev, leon, dledford

On Sun, 2020-02-23 at 10:53 +0200, Yishai Hadas wrote:
> On 2/21/2020 9:04 PM, Saeed Mahameed wrote:
> > On Wed, 2020-02-19 at 21:05 +0200, Leon Romanovsky wrote:
> > > From: Yishai Hadas <yishaih@mellanox.com>
> > > 
> > > Expose raw packet pacing APIs to be used by DEVX based
> > > applications.
> > > The existing code was refactored to have a single flow with the
> > > new
> > > raw
> > > APIs.
> > > 
> > > The new raw APIs considered the input of 'pp_rate_limit_context',
> > > uid,
> > > 'dedicated', upon looking for an existing entry.
> > > 
> > > This raw mode enables future device specification data in the raw
> > > context without changing the existing logic and code.
> > > 
> > > The ability to ask for a dedicated entry gives control for
> > > application
> > > to allocate entries according to its needs.
> > > 
> > > A dedicated entry may not be used by some other process and it
> > > also
> > > enables the process spreading its resources to some different
> > > entries
> > > for use different hardware resources as part of enforcing the
> > > rate.
> > > 
> > 
> > It sounds like the dedicated means "no sharing" which means you
> > don't
> > need to use the mlx5_core API and you can go directly to FW.. The
> > problem is that the entry indices are managed by driver, and i
> > guess
> > this is the reason why you had to expand the mlx5_core API..
> > 
> 
> The main reason for introducing the new mlx5_core APIs was the need
> to 
> support the "shared mode" in a "raw data" format to prevent future 
> touching the kernel once PRM will support extra fields.
> As the RL indices are managed by the driver (mlx5_core) including
> the 
> sharing, we couldn’t go directly to FW, the legacy API was
> refactored 
> inside the core to have one flow with the new raw APIs.
> So we may need those APIs regardless the dedicated mode.
> 

I'm not a fan of legacy APIs; all of these APIs are mlx5 internals, and
I would like to keep one API which is only PRM dependent, as much as
possible.

Anyway, thanks for the clarification. I think the patch is good as is;
we can improve and remove the legacy API in the future and keep the raw
API.

> 
> > I would like to suggest some alternatives to simplify the approach
> > and
> > allow using RAW PRM for DEVX properly.
> > 
> > 1. preserve RL entries for DEVX and let DEVX access FW directly
> > with
> > PRM commands.
> > 2. keep mlx5_core API simple and instead of adding this raw/non raw
> > api
> > and complicating the RL API with this dedicated bit:
> > 
> > just add mlx5_rl_{alloc/free}_index(), this will dedicate for you
> > the
> > RL index form the end of the RL indices database and you are free
> > to
> > access the FW with this index the way you like via direct PRM
> > commands.
> > 
> As mentioned above, we may still need the new mlx5_core raw APIs for
> the 
> shared mode which is the main usage of the API, we found it
> reasonable 
> to have the dedicate flag in the new raw alloc API instead of
> exposing 
> more two new APIs only for that.
> 
> Please note that even if we'll go with those 2 extra APIs for the 
> dedicated mode, we may still need to maintain in the core this 
> information to prevent returning this entry for other cases.
> 
> Also the idea to preserve some entries at the end might be wasteful
> as 
> there is no guarantee that DEVX will really be used, and even so it
> may 
> not ask for entries in a dedicated mode.
> 
> > Preserving them for this optional use case might prevent using them
> for 
> all other cases.
> 
> 
> > > The counter per entry mas changed to be u64 to prevent any option
> > > to
> >                     typo ^^^ was
> 
> Sure, thanks.
> 

Leon, other than the typo I am good with this patch.
You can fix up the patch prior to pulling into mlx5-next, no need for
v2.

Acked-by: Saeed Mahameed <saeedm@mellanox.com> 


thanks,
Saeed.



* Re: [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs
  2020-02-24 23:32       ` Saeed Mahameed
@ 2020-02-25  7:43         ` Leon Romanovsky
  0 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2020-02-25  7:43 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: yishaih, Jason Gunthorpe, linux-rdma, rosenbaumalex,
	Yishai Hadas, netdev, dledford

On Mon, Feb 24, 2020 at 11:32:58PM +0000, Saeed Mahameed wrote:
> On Sun, 2020-02-23 at 10:53 +0200, Yishai Hadas wrote:
> > On 2/21/2020 9:04 PM, Saeed Mahameed wrote:
> > > On Wed, 2020-02-19 at 21:05 +0200, Leon Romanovsky wrote:
> > > > From: Yishai Hadas <yishaih@mellanox.com>
> > > >
> > > > Expose raw packet pacing APIs to be used by DEVX based
> > > > applications.
> > > > The existing code was refactored to have a single flow with the
> > > > new
> > > > raw
> > > > APIs.
> > > >
> > > > The new raw APIs considered the input of 'pp_rate_limit_context',
> > > > uid,
> > > > 'dedicated', upon looking for an existing entry.
> > > >
> > > > This raw mode enables future device specification data in the raw
> > > > context without changing the existing logic and code.
> > > >
> > > > The ability to ask for a dedicated entry gives control for
> > > > application
> > > > to allocate entries according to its needs.
> > > >
> > > > A dedicated entry may not be used by some other process and it
> > > > also
> > > > enables the process spreading its resources to some different
> > > > entries
> > > > for use different hardware resources as part of enforcing the
> > > > rate.
> > > >
> > >
> > > It sounds like the dedicated means "no sharing" which means you
> > > don't
> > > need to use the mlx5_core API and you can go directly to FW.. The
> > > problem is that the entry indices are managed by driver, and i
> > > guess
> > > this is the reason why you had to expand the mlx5_core API..
> > >
> >
> > The main reason for introducing the new mlx5_core APIs was the need
> > to
> > support the "shared mode" in a "raw data" format to prevent future
> > touching the kernel once PRM will support extra fields.
> > As the RL indices are managed by the driver (mlx5_core) including
> > the
> > sharing, we couldn’t go directly to FW, the legacy API was
> > refactored
> > inside the core to have one flow with the new raw APIs.
> > So we may need those APIs regardless the dedicated mode.
> >
>
> I'm not a fan of legacy APIs, all of the APIs are mlx5 internals and I
> would like to keep one API which is only PRM dependent as much as
> possible.
>
> Anyway thanks for the clarification, i think the patch is good as is,
> we can improve and remove the legacy API in the future and keep the raw
> API.
>
> >
> > > I would like to suggest some alternatives to simplify the approach
> > > and
> > > allow using RAW PRM for DEVX properly.
> > >
> > > 1. preserve RL entries for DEVX and let DEVX access FW directly
> > > with
> > > PRM commands.
> > > 2. keep mlx5_core API simple and instead of adding this raw/non raw
> > > api
> > > and complicating the RL API with this dedicated bit:
> > >
> > > just add mlx5_rl_{alloc/free}_index(), this will dedicate for you
> > > the
> > > RL index form the end of the RL indices database and you are free
> > > to
> > > access the FW with this index the way you like via direct PRM
> > > commands.
> > >
> > As mentioned above, we may still need the new mlx5_core raw APIs for
> > the
> > shared mode which is the main usage of the API, we found it
> > reasonable
> > to have the dedicate flag in the new raw alloc API instead of
> > exposing
> > more two new APIs only for that.
> >
> > Please note that even if we'll go with those 2 extra APIs for the
> > dedicated mode, we may still need to maintain in the core this
> > information to prevent returning this entry for other cases.
> >
> > Also the idea to preserve some entries at the end might be wasteful
> > as
> > there is no guarantee that DEVX will really be used, and even so it
> > may
> > not ask for entries in a dedicated mode.
> >
> > Preserving them for this optional use case might prevent using them
> > for
> > all other cases.
> >
> >
> > > > The counter per entry mas changed to be u64 to prevent any option
> > > > to
> > >                     typo ^^^ was
> >
> > Sure, thanks.
> >
>
> Leon, Other than the typo i am good with this patch.
> you can fix up the patch prior to pulling into mlx5-next, no need for
> v2.

Thanks Saeed, I'll apply it once Doug/Jason ack the RDMA part of the series.

>
> Acked-by: Saeed Mahameed <saeedm@mellanox.com>
>
>
> thanks,
> Saeed.
>


* Re: [PATCH rdma-next 0/2] Packet pacing DEVX API
  2020-02-19 19:05 [PATCH rdma-next 0/2] Packet pacing DEVX API Leon Romanovsky
  2020-02-19 19:05 ` [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs Leon Romanovsky
  2020-02-19 19:05 ` [PATCH rdma-next 2/2] IB/mlx5: Introduce UAPIs to manage packet pacing Leon Romanovsky
@ 2020-03-04  0:33 ` Jason Gunthorpe
  2020-03-05 12:21   ` Leon Romanovsky
  2020-03-10 15:44 ` Jason Gunthorpe
  3 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2020-03-04  0:33 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Yishai Hadas,
	Saeed Mahameed, linux-netdev

On Wed, Feb 19, 2020 at 09:05:16PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> This series from Yishai extends packet pacing to work over DEVX
> interface. In first patch, he refactors the mlx5_core internal
> logic. In second patch, the RDMA APIs are added.
> 
> Thanks
> 
> Yishai Hadas (2):
>   net/mlx5: Expose raw packet pacing APIs
>   IB/mlx5: Introduce UAPIs to manage packet pacing

It looks Ok, can you apply this to the shared branch?

Thanks,
Jason


* Re: [PATCH rdma-next 0/2] Packet pacing DEVX API
  2020-03-04  0:33 ` [PATCH rdma-next 0/2] Packet pacing DEVX API Jason Gunthorpe
@ 2020-03-05 12:21   ` Leon Romanovsky
  0 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2020-03-05 12:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Yishai Hadas, Saeed Mahameed,
	linux-netdev

On Tue, Mar 03, 2020 at 08:33:03PM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 19, 2020 at 09:05:16PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Hi,
> >
> > This series from Yishai extends packet pacing to work over DEVX
> > interface. In first patch, he refactors the mlx5_core internal
> > logic. In second patch, the RDMA APIs are added.
> >
> > Thanks
> >
> > Yishai Hadas (2):
> >   net/mlx5: Expose raw packet pacing APIs
> >   IB/mlx5: Introduce UAPIs to manage packet pacing
>
> It looks Ok, can you apply this to the shared branch?

Applied to mlx5-next
1326034b3ce7 net/mlx5: Expose raw packet pacing APIs

>
> Thanks,
> Jason


* Re: [PATCH rdma-next 0/2] Packet pacing DEVX API
  2020-02-19 19:05 [PATCH rdma-next 0/2] Packet pacing DEVX API Leon Romanovsky
                   ` (2 preceding siblings ...)
  2020-03-04  0:33 ` [PATCH rdma-next 0/2] Packet pacing DEVX API Jason Gunthorpe
@ 2020-03-10 15:44 ` Jason Gunthorpe
  3 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2020-03-10 15:44 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Yishai Hadas,
	Saeed Mahameed, linux-netdev

On Wed, Feb 19, 2020 at 09:05:16PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> This series from Yishai extends packet pacing to work over the DEVX
> interface. In the first patch, he refactors the mlx5_core internal
> logic. In the second patch, the RDMA APIs are added.
> 
> Thanks
> 
> Yishai Hadas (2):
>   net/mlx5: Expose raw packet pacing APIs
>   IB/mlx5: Introduce UAPIs to manage packet pacing

Applied to for-next, thanks

Jason


Thread overview: 10+ messages
-- links below jump to the message on this page --
2020-02-19 19:05 [PATCH rdma-next 0/2] Packet pacing DEVX API Leon Romanovsky
2020-02-19 19:05 ` [PATCH mlx5-next 1/2] net/mlx5: Expose raw packet pacing APIs Leon Romanovsky
2020-02-21 19:04   ` Saeed Mahameed
2020-02-23  8:53     ` Yishai Hadas
2020-02-24 23:32       ` Saeed Mahameed
2020-02-25  7:43         ` Leon Romanovsky
2020-02-19 19:05 ` [PATCH rdma-next 2/2] IB/mlx5: Introduce UAPIs to manage packet pacing Leon Romanovsky
2020-03-04  0:33 ` [PATCH rdma-next 0/2] Packet pacing DEVX API Jason Gunthorpe
2020-03-05 12:21   ` Leon Romanovsky
2020-03-10 15:44 ` Jason Gunthorpe
