* [PATCH rdma-next 0/7] Add MEMIC operations support
@ 2021-03-18 11:15 Leon Romanovsky
  2021-03-18 11:15 ` [PATCH mlx5-next 1/7] net/mlx5: Add MEMIC operations related bits Leon Romanovsky
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, netdev,
	Saeed Mahameed, Yishai Hadas

From: Leon Romanovsky <leonro@nvidia.com>

Hi,

This series from Maor extends MEMIC to support atomic operations from
the host, in addition to the already supported regular read/write.

Thanks

Maor Gottlieb (7):
  net/mlx5: Add MEMIC operations related bits
  RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number
  RDMA/mlx5: Avoid use after free in allocate MEMIC bad flow
  RDMA/mlx5: Move all DM logic to separate file
  RDMA/mlx5: Add support to MODIFY_MEMIC command
  RDMA/mlx5: Add support in MEMIC operations
  RDMA/mlx5: Expose UAPI to query DM

 drivers/infiniband/hw/mlx5/Makefile      |   1 +
 drivers/infiniband/hw/mlx5/cmd.c         | 101 ----
 drivers/infiniband/hw/mlx5/cmd.h         |   3 -
 drivers/infiniband/hw/mlx5/dm.c          | 595 +++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/dm.h          |  18 +
 drivers/infiniband/hw/mlx5/main.c        | 243 +--------
 drivers/infiniband/hw/mlx5/mlx5_ib.h     |  20 +-
 include/linux/mlx5/mlx5_ifc.h            |  42 +-
 include/rdma/uverbs_named_ioctl.h        |   3 +-
 include/uapi/rdma/mlx5_user_ioctl_cmds.h |  19 +
 10 files changed, 699 insertions(+), 346 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/dm.c
 create mode 100644 drivers/infiniband/hw/mlx5/dm.h

--
2.30.2



* [PATCH mlx5-next 1/7] net/mlx5: Add MEMIC operations related bits
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 2/7] RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number Leon Romanovsky
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

Add the MEMIC operations bits and structures to the mlx5_ifc file.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/linux/mlx5/mlx5_ifc.h | 42 ++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index c0ce1c2e1e57..dd69cf1320ce 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -133,6 +133,7 @@ enum {
 	MLX5_CMD_OP_PAGE_FAULT_RESUME             = 0x204,
 	MLX5_CMD_OP_ALLOC_MEMIC                   = 0x205,
 	MLX5_CMD_OP_DEALLOC_MEMIC                 = 0x206,
+	MLX5_CMD_OP_MODIFY_MEMIC                  = 0x207,
 	MLX5_CMD_OP_CREATE_EQ                     = 0x301,
 	MLX5_CMD_OP_DESTROY_EQ                    = 0x302,
 	MLX5_CMD_OP_QUERY_EQ                      = 0x303,
@@ -1015,7 +1016,11 @@ struct mlx5_ifc_device_mem_cap_bits {

 	u8         header_modify_sw_icm_start_address[0x40];

-	u8         reserved_at_180[0x680];
+	u8         reserved_at_180[0x80];
+
+	u8         memic_operations[0x20];
+
+	u8         reserved_at_220[0x5e0];
 };

 struct mlx5_ifc_device_event_cap_bits {
@@ -10408,6 +10413,41 @@ struct mlx5_ifc_destroy_vport_lag_in_bits {
 	u8         reserved_at_40[0x40];
 };

+enum {
+	MLX5_MODIFY_MEMIC_OP_MOD_ALLOC,
+	MLX5_MODIFY_MEMIC_OP_MOD_DEALLOC,
+};
+
+struct mlx5_ifc_modify_memic_in_bits {
+	u8         opcode[0x10];
+	u8         uid[0x10];
+
+	u8         reserved_at_20[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_at_40[0x20];
+
+	u8         reserved_at_60[0x18];
+	u8         memic_operation_type[0x8];
+
+	u8         memic_start_addr[0x40];
+
+	u8         reserved_at_c0[0x140];
+};
+
+struct mlx5_ifc_modify_memic_out_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x40];
+
+	u8         memic_operation_addr[0x40];
+
+	u8         reserved_at_c0[0x140];
+};
+
 struct mlx5_ifc_alloc_memic_in_bits {
 	u8         opcode[0x10];
 	u8         reserved_at_10[0x10];
--
2.30.2



* [PATCH rdma-next 2/7] RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
  2021-03-18 11:15 ` [PATCH mlx5-next 1/7] net/mlx5: Add MEMIC operations related bits Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 3/7] RDMA/mlx5: Avoid use after free in allocate MEMIC bad flow Leon Romanovsky
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

To support multiple method declarations in the same file, make the
line number part of the generated name.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/rdma/uverbs_named_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/rdma/uverbs_named_ioctl.h b/include/rdma/uverbs_named_ioctl.h
index f04f5126f61e..f247e5d57bb1 100644
--- a/include/rdma/uverbs_named_ioctl.h
+++ b/include/rdma/uverbs_named_ioctl.h
@@ -20,7 +20,8 @@

 /* These are static so they do not need to be qualified */
 #define UVERBS_METHOD_ATTRS(method_id) _method_attrs_##method_id
-#define UVERBS_OBJECT_METHODS(object_id) _object_methods_##object_id
+#define UVERBS_OBJECT_METHODS(object_id)                                       \
+	_UVERBS_NAME(_object_methods_##object_id, __LINE__)

 #define DECLARE_UVERBS_NAMED_METHOD(_method_id, ...)                           \
 	static const struct uverbs_attr_def *const UVERBS_METHOD_ATTRS(        \
--
2.30.2



* [PATCH rdma-next 3/7] RDMA/mlx5: Avoid use after free in allocate MEMIC bad flow
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
  2021-03-18 11:15 ` [PATCH mlx5-next 1/7] net/mlx5: Add MEMIC operations related bits Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 2/7] RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 4/7] RDMA/mlx5: Move all DM logic to separate file Leon Romanovsky
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

When the driver fails to copy the MEMIC address to the user, it calls
rdma_user_mmap_entry_remove() on the mmap entry. Since at this point
the refcount of the mmap entry drops to zero, mmap_free is triggered
and releases the dm object. Therefore the explicit call to free the dm
must be avoided.

Fixes: dc2316eba73f ("IB/mlx5: Fix device memory flows")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/main.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 5226664f1bda..d652af720036 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2375,13 +2375,18 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx,

 	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
 				   dm->size, attr->alignment);
-	if (err)
+	if (err) {
+		kfree(dm);
 		return err;
+	}

 	address = dm->dev_addr & PAGE_MASK;
 	err = add_dm_mmap_entry(ctx, dm, address);
-	if (err)
-		goto err_dealloc;
+	if (err) {
+		mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);
+		kfree(dm);
+		return err;
+	}

 	page_idx = dm->mentry.rdma_entry.start_pgoff & 0xFFFF;
 	err = uverbs_copy_to(attrs,
@@ -2402,8 +2407,6 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx,

 err_copy:
 	rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
-err_dealloc:
-	mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);

 	return err;
 }
@@ -2472,9 +2475,7 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,

 	switch (type) {
 	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
-		err = handle_alloc_dm_memic(context, dm,
-					    attr,
-					    attrs);
+		err = handle_alloc_dm_memic(context, dm, attr, attrs);
 		break;
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
 		err = handle_alloc_dm_sw_icm(context, dm,
@@ -2496,7 +2497,9 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
 	return &dm->ibdm;

 err_free:
-	kfree(dm);
+	/* In MEMIC error flow, dm will be freed internally */
+	if (type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
+		kfree(dm);
 	return ERR_PTR(err);
 }

--
2.30.2



* [PATCH rdma-next 4/7] RDMA/mlx5: Move all DM logic to separate file
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
                   ` (2 preceding siblings ...)
  2021-03-18 11:15 ` [PATCH rdma-next 3/7] RDMA/mlx5: Avoid use after free in allocate MEMIC bad flow Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 5/7] RDMA/mlx5: Add support to MODIFY_MEMIC command Leon Romanovsky
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

Move all device memory related code to a separate file.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/Makefile  |   1 +
 drivers/infiniband/hw/mlx5/cmd.c     | 101 --------
 drivers/infiniband/hw/mlx5/cmd.h     |   3 -
 drivers/infiniband/hw/mlx5/dm.c      | 342 +++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/dm.h      |  14 ++
 drivers/infiniband/hw/mlx5/main.c    | 239 +------------------
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   3 +
 7 files changed, 362 insertions(+), 341 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/dm.c
 create mode 100644 drivers/infiniband/hw/mlx5/dm.h

diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
index b4c009bb0db6..f43380106bd0 100644
--- a/drivers/infiniband/hw/mlx5/Makefile
+++ b/drivers/infiniband/hw/mlx5/Makefile
@@ -6,6 +6,7 @@ mlx5_ib-y := ah.o \
 	     cong.o \
 	     counters.o \
 	     cq.o \
+	     dm.o \
 	     doorbell.o \
 	     gsi.o \
 	     ib_virt.o \
diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index 234f29912ba9..a8db8a051170 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -47,107 +47,6 @@ int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
 	return mlx5_cmd_exec_inout(dev, query_cong_params, in, out);
 }

-int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
-			 u64 length, u32 alignment)
-{
-	struct mlx5_core_dev *dev = dm->dev;
-	u64 num_memic_hw_pages = MLX5_CAP_DEV_MEM(dev, memic_bar_size)
-					>> PAGE_SHIFT;
-	u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr);
-	u32 max_alignment = MLX5_CAP_DEV_MEM(dev, log_max_memic_addr_alignment);
-	u32 num_pages = DIV_ROUND_UP(length, PAGE_SIZE);
-	u32 out[MLX5_ST_SZ_DW(alloc_memic_out)] = {};
-	u32 in[MLX5_ST_SZ_DW(alloc_memic_in)] = {};
-	u32 mlx5_alignment;
-	u64 page_idx = 0;
-	int ret = 0;
-
-	if (!length || (length & MLX5_MEMIC_ALLOC_SIZE_MASK))
-		return -EINVAL;
-
-	/* mlx5 device sets alignment as 64*2^driver_value
-	 * so normalizing is needed.
-	 */
-	mlx5_alignment = (alignment < MLX5_MEMIC_BASE_ALIGN) ? 0 :
-			 alignment - MLX5_MEMIC_BASE_ALIGN;
-	if (mlx5_alignment > max_alignment)
-		return -EINVAL;
-
-	MLX5_SET(alloc_memic_in, in, opcode, MLX5_CMD_OP_ALLOC_MEMIC);
-	MLX5_SET(alloc_memic_in, in, range_size, num_pages * PAGE_SIZE);
-	MLX5_SET(alloc_memic_in, in, memic_size, length);
-	MLX5_SET(alloc_memic_in, in, log_memic_addr_alignment,
-		 mlx5_alignment);
-
-	while (page_idx < num_memic_hw_pages) {
-		spin_lock(&dm->lock);
-		page_idx = bitmap_find_next_zero_area(dm->memic_alloc_pages,
-						      num_memic_hw_pages,
-						      page_idx,
-						      num_pages, 0);
-
-		if (page_idx < num_memic_hw_pages)
-			bitmap_set(dm->memic_alloc_pages,
-				   page_idx, num_pages);
-
-		spin_unlock(&dm->lock);
-
-		if (page_idx >= num_memic_hw_pages)
-			break;
-
-		MLX5_SET64(alloc_memic_in, in, range_start_addr,
-			   hw_start_addr + (page_idx * PAGE_SIZE));
-
-		ret = mlx5_cmd_exec_inout(dev, alloc_memic, in, out);
-		if (ret) {
-			spin_lock(&dm->lock);
-			bitmap_clear(dm->memic_alloc_pages,
-				     page_idx, num_pages);
-			spin_unlock(&dm->lock);
-
-			if (ret == -EAGAIN) {
-				page_idx++;
-				continue;
-			}
-
-			return ret;
-		}
-
-		*addr = dev->bar_addr +
-			MLX5_GET64(alloc_memic_out, out, memic_start_addr);
-
-		return 0;
-	}
-
-	return -ENOMEM;
-}
-
-void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
-{
-	struct mlx5_core_dev *dev = dm->dev;
-	u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr);
-	u32 num_pages = DIV_ROUND_UP(length, PAGE_SIZE);
-	u32 in[MLX5_ST_SZ_DW(dealloc_memic_in)] = {};
-	u64 start_page_idx;
-	int err;
-
-	addr -= dev->bar_addr;
-	start_page_idx = (addr - hw_start_addr) >> PAGE_SHIFT;
-
-	MLX5_SET(dealloc_memic_in, in, opcode, MLX5_CMD_OP_DEALLOC_MEMIC);
-	MLX5_SET64(dealloc_memic_in, in, memic_start_addr, addr);
-	MLX5_SET(dealloc_memic_in, in, memic_size, length);
-
-	err =  mlx5_cmd_exec_in(dev, dealloc_memic, in);
-	if (err)
-		return;
-
-	spin_lock(&dm->lock);
-	bitmap_clear(dm->memic_alloc_pages,
-		     start_page_idx, num_pages);
-	spin_unlock(&dm->lock);
-}
-
 void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid)
 {
 	u32 in[MLX5_ST_SZ_DW(destroy_tir_in)] = {};
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index 88ea6ef8f2cb..66c96292ed43 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -41,9 +41,6 @@ int mlx5_cmd_dump_fill_mkey(struct mlx5_core_dev *dev, u32 *mkey);
 int mlx5_cmd_null_mkey(struct mlx5_core_dev *dev, u32 *null_mkey);
 int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
 			       void *out);
-int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
-			 u64 length, u32 alignment);
-void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length);
 int mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
 void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid);
 void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid);
diff --git a/drivers/infiniband/hw/mlx5/dm.c b/drivers/infiniband/hw/mlx5/dm.c
new file mode 100644
index 000000000000..3d39d93625ad
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/dm.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2021, Mellanox Technologies inc. All rights reserved.
+ */
+
+#include <rdma/uverbs_std_types.h>
+#include "dm.h"
+
+#define UVERBS_MODULE_NAME mlx5_ib
+#include <rdma/uverbs_named_ioctl.h>
+
+static int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
+				u64 length, u32 alignment)
+{
+	struct mlx5_core_dev *dev = dm->dev;
+	u64 num_memic_hw_pages = MLX5_CAP_DEV_MEM(dev, memic_bar_size)
+					>> PAGE_SHIFT;
+	u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr);
+	u32 max_alignment = MLX5_CAP_DEV_MEM(dev, log_max_memic_addr_alignment);
+	u32 num_pages = DIV_ROUND_UP(length, PAGE_SIZE);
+	u32 out[MLX5_ST_SZ_DW(alloc_memic_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(alloc_memic_in)] = {};
+	u32 mlx5_alignment;
+	u64 page_idx = 0;
+	int ret = 0;
+
+	if (!length || (length & MLX5_MEMIC_ALLOC_SIZE_MASK))
+		return -EINVAL;
+
+	/* mlx5 device sets alignment as 64*2^driver_value
+	 * so normalizing is needed.
+	 */
+	mlx5_alignment = (alignment < MLX5_MEMIC_BASE_ALIGN) ? 0 :
+			 alignment - MLX5_MEMIC_BASE_ALIGN;
+	if (mlx5_alignment > max_alignment)
+		return -EINVAL;
+
+	MLX5_SET(alloc_memic_in, in, opcode, MLX5_CMD_OP_ALLOC_MEMIC);
+	MLX5_SET(alloc_memic_in, in, range_size, num_pages * PAGE_SIZE);
+	MLX5_SET(alloc_memic_in, in, memic_size, length);
+	MLX5_SET(alloc_memic_in, in, log_memic_addr_alignment,
+		 mlx5_alignment);
+
+	while (page_idx < num_memic_hw_pages) {
+		spin_lock(&dm->lock);
+		page_idx = bitmap_find_next_zero_area(dm->memic_alloc_pages,
+						      num_memic_hw_pages,
+						      page_idx,
+						      num_pages, 0);
+
+		if (page_idx < num_memic_hw_pages)
+			bitmap_set(dm->memic_alloc_pages,
+				   page_idx, num_pages);
+
+		spin_unlock(&dm->lock);
+
+		if (page_idx >= num_memic_hw_pages)
+			break;
+
+		MLX5_SET64(alloc_memic_in, in, range_start_addr,
+			   hw_start_addr + (page_idx * PAGE_SIZE));
+
+		ret = mlx5_cmd_exec_inout(dev, alloc_memic, in, out);
+		if (ret) {
+			spin_lock(&dm->lock);
+			bitmap_clear(dm->memic_alloc_pages,
+				     page_idx, num_pages);
+			spin_unlock(&dm->lock);
+
+			if (ret == -EAGAIN) {
+				page_idx++;
+				continue;
+			}
+
+			return ret;
+		}
+
+		*addr = dev->bar_addr +
+			MLX5_GET64(alloc_memic_out, out, memic_start_addr);
+
+		return 0;
+	}
+
+	return -ENOMEM;
+}
+
+void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
+			    u64 length)
+{
+	struct mlx5_core_dev *dev = dm->dev;
+	u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr);
+	u32 num_pages = DIV_ROUND_UP(length, PAGE_SIZE);
+	u32 in[MLX5_ST_SZ_DW(dealloc_memic_in)] = {};
+	u64 start_page_idx;
+	int err;
+
+	addr -= dev->bar_addr;
+	start_page_idx = (addr - hw_start_addr) >> PAGE_SHIFT;
+
+	MLX5_SET(dealloc_memic_in, in, opcode, MLX5_CMD_OP_DEALLOC_MEMIC);
+	MLX5_SET64(dealloc_memic_in, in, memic_start_addr, addr);
+	MLX5_SET(dealloc_memic_in, in, memic_size, length);
+
+	err =  mlx5_cmd_exec_in(dev, dealloc_memic, in);
+	if (err)
+		return;
+
+	spin_lock(&dm->lock);
+	bitmap_clear(dm->memic_alloc_pages,
+		     start_page_idx, num_pages);
+	spin_unlock(&dm->lock);
+}
+
+static int add_dm_mmap_entry(struct ib_ucontext *context,
+			     struct mlx5_ib_dm *mdm, u64 address)
+{
+	mdm->mentry.mmap_flag = MLX5_IB_MMAP_TYPE_MEMIC;
+	mdm->mentry.address = address;
+	return rdma_user_mmap_entry_insert_range(
+		context, &mdm->mentry.rdma_entry, mdm->size,
+		MLX5_IB_MMAP_DEVICE_MEM << 16,
+		(MLX5_IB_MMAP_DEVICE_MEM << 16) + (1UL << 16) - 1);
+}
+
+static inline int check_dm_type_support(struct mlx5_ib_dev *dev, u32 type)
+{
+	switch (type) {
+	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
+		if (!MLX5_CAP_DEV_MEM(dev->mdev, memic))
+			return -EOPNOTSUPP;
+		break;
+	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
+		if (!capable(CAP_SYS_RAWIO) || !capable(CAP_NET_RAW))
+			return -EPERM;
+
+		if (!(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) ||
+		      MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner) ||
+		      MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner_v2) ||
+		      MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner_v2)))
+			return -EOPNOTSUPP;
+		break;
+	}
+
+	return 0;
+}
+
+static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
+				 struct ib_dm_alloc_attr *attr,
+				 struct uverbs_attr_bundle *attrs)
+{
+	struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm;
+	u64 start_offset;
+	u16 page_idx;
+	int err;
+	u64 address;
+
+	dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE);
+
+	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
+				   dm->size, attr->alignment);
+	if (err) {
+		kfree(dm);
+		return err;
+	}
+
+	address = dm->dev_addr & PAGE_MASK;
+	err = add_dm_mmap_entry(ctx, dm, address);
+	if (err) {
+		mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);
+		kfree(dm);
+		return err;
+	}
+
+	page_idx = dm->mentry.rdma_entry.start_pgoff & 0xFFFF;
+	err = uverbs_copy_to(attrs,
+			     MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
+			     &page_idx,
+			     sizeof(page_idx));
+	if (err)
+		goto err_copy;
+
+	start_offset = dm->dev_addr & ~PAGE_MASK;
+	err = uverbs_copy_to(attrs,
+			     MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
+			     &start_offset, sizeof(start_offset));
+	if (err)
+		goto err_copy;
+
+	return 0;
+
+err_copy:
+	rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
+
+	return err;
+}
+
+static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
+				  struct mlx5_ib_dm *dm,
+				  struct ib_dm_alloc_attr *attr,
+				  struct uverbs_attr_bundle *attrs, int type)
+{
+	struct mlx5_core_dev *dev = to_mdev(ctx->device)->mdev;
+	u64 act_size;
+	int err;
+
+	/* Allocation size must a multiple of the basic block size
+	 * and a power of 2.
+	 */
+	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dev));
+	act_size = roundup_pow_of_two(act_size);
+
+	dm->size = act_size;
+	err = mlx5_dm_sw_icm_alloc(dev, type, act_size, attr->alignment,
+				   to_mucontext(ctx)->devx_uid, &dm->dev_addr,
+				   &dm->icm_dm.obj_id);
+	if (err)
+		return err;
+
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
+			     &dm->dev_addr, sizeof(dm->dev_addr));
+	if (err)
+		mlx5_dm_sw_icm_dealloc(dev, type, dm->size,
+				       to_mucontext(ctx)->devx_uid,
+				       dm->dev_addr, dm->icm_dm.obj_id);
+
+	return err;
+}
+
+struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
+			       struct ib_ucontext *context,
+			       struct ib_dm_alloc_attr *attr,
+			       struct uverbs_attr_bundle *attrs)
+{
+	struct mlx5_ib_dm *dm;
+	enum mlx5_ib_uapi_dm_type type;
+	int err;
+
+	err = uverbs_get_const_default(&type, attrs,
+				       MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
+				       MLX5_IB_UAPI_DM_TYPE_MEMIC);
+	if (err)
+		return ERR_PTR(err);
+
+	mlx5_ib_dbg(to_mdev(ibdev), "alloc_dm req: dm_type=%d user_length=0x%llx log_alignment=%d\n",
+		    type, attr->length, attr->alignment);
+
+	err = check_dm_type_support(to_mdev(ibdev), type);
+	if (err)
+		return ERR_PTR(err);
+
+	dm = kzalloc(sizeof(*dm), GFP_KERNEL);
+	if (!dm)
+		return ERR_PTR(-ENOMEM);
+
+	dm->type = type;
+
+	switch (type) {
+	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
+		err = handle_alloc_dm_memic(context, dm, attr, attrs);
+		break;
+	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+		err = handle_alloc_dm_sw_icm(context, dm,
+					     attr, attrs,
+					     MLX5_SW_ICM_TYPE_STEERING);
+		break;
+	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
+		err = handle_alloc_dm_sw_icm(context, dm,
+					     attr, attrs,
+					     MLX5_SW_ICM_TYPE_HEADER_MODIFY);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+
+	if (err)
+		goto err_free;
+
+	return &dm->ibdm;
+
+err_free:
+	/* In MEMIC error flow, dm will be freed internally */
+	if (type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
+		kfree(dm);
+	return ERR_PTR(err);
+}
+
+int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
+{
+	struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
+		&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
+	struct mlx5_core_dev *dev = to_mdev(ibdm->device)->mdev;
+	struct mlx5_ib_dm *dm = to_mdm(ibdm);
+	int ret;
+
+	switch (dm->type) {
+	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
+		rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
+		return 0;
+	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
+					     dm->size, ctx->devx_uid,
+					     dm->dev_addr, dm->icm_dm.obj_id);
+		if (ret)
+			return ret;
+		break;
+	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
+		ret = mlx5_dm_sw_icm_dealloc(dev,
+					     MLX5_SW_ICM_TYPE_HEADER_MODIFY,
+					     dm->size, ctx->devx_uid,
+					     dm->dev_addr, dm->icm_dm.obj_id);
+		if (ret)
+			return ret;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	kfree(dm);
+
+	return 0;
+}
+
+ADD_UVERBS_ATTRIBUTES_SIMPLE(
+	mlx5_ib_dm, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC,
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
+			    UVERBS_ATTR_TYPE(u64), UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
+			    UVERBS_ATTR_TYPE(u16), UA_OPTIONAL),
+	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
+			     enum mlx5_ib_uapi_dm_type, UA_OPTIONAL));
+
+const struct uapi_definition mlx5_ib_dm_defs[] = {
+	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DM, &mlx5_ib_dm),
+	{},
+};
+
+const struct ib_device_ops mlx5_ib_dev_dm_ops = {
+	.alloc_dm = mlx5_ib_alloc_dm,
+	.dealloc_dm = mlx5_ib_dealloc_dm,
+	.reg_dm_mr = mlx5_ib_reg_dm_mr,
+};
diff --git a/drivers/infiniband/hw/mlx5/dm.h b/drivers/infiniband/hw/mlx5/dm.h
new file mode 100644
index 000000000000..dbef67e38731
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/dm.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/*
+ * Copyright (c) 2021, Mellanox Technologies inc. All rights reserved.
+ */
+
+#ifndef _MLX5_IB_DM_H
+#define _MLX5_IB_DM_H
+
+#include "mlx5_ib.h"
+
+void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
+			    u64 length);
+
+#endif /* _MLX5_IB_DM_H */
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index d652af720036..49c8c60d9520 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -34,6 +34,7 @@
 #include "ib_rep.h"
 #include "cmd.h"
 #include "devx.h"
+#include "dm.h"
 #include "fs.h"
 #include "srq.h"
 #include "qp.h"
@@ -2221,19 +2222,6 @@ static int uar_mmap(struct mlx5_ib_dev *dev, enum mlx5_ib_mmap_cmd cmd,
 	return err;
 }

-static int add_dm_mmap_entry(struct ib_ucontext *context,
-			     struct mlx5_ib_dm *mdm,
-			     u64 address)
-{
-	mdm->mentry.mmap_flag = MLX5_IB_MMAP_TYPE_MEMIC;
-	mdm->mentry.address = address;
-	return rdma_user_mmap_entry_insert_range(
-			context, &mdm->mentry.rdma_entry,
-			mdm->size,
-			MLX5_IB_MMAP_DEVICE_MEM << 16,
-			(MLX5_IB_MMAP_DEVICE_MEM << 16) + (1UL << 16) - 1);
-}
-
 static unsigned long mlx5_vma_to_pgoff(struct vm_area_struct *vma)
 {
 	unsigned long idx;
@@ -2335,209 +2323,6 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm
 	return 0;
 }

-static inline int check_dm_type_support(struct mlx5_ib_dev *dev,
-					u32 type)
-{
-	switch (type) {
-	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
-		if (!MLX5_CAP_DEV_MEM(dev->mdev, memic))
-			return -EOPNOTSUPP;
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
-	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		if (!capable(CAP_SYS_RAWIO) ||
-		    !capable(CAP_NET_RAW))
-			return -EPERM;
-
-		if (!(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) ||
-		      MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner) ||
-		      MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner_v2) ||
-		      MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner_v2)))
-			return -EOPNOTSUPP;
-		break;
-	}
-
-	return 0;
-}
-
-static int handle_alloc_dm_memic(struct ib_ucontext *ctx,
-				 struct mlx5_ib_dm *dm,
-				 struct ib_dm_alloc_attr *attr,
-				 struct uverbs_attr_bundle *attrs)
-{
-	struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm;
-	u64 start_offset;
-	u16 page_idx;
-	int err;
-	u64 address;
-
-	dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE);
-
-	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
-				   dm->size, attr->alignment);
-	if (err) {
-		kfree(dm);
-		return err;
-	}
-
-	address = dm->dev_addr & PAGE_MASK;
-	err = add_dm_mmap_entry(ctx, dm, address);
-	if (err) {
-		mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);
-		kfree(dm);
-		return err;
-	}
-
-	page_idx = dm->mentry.rdma_entry.start_pgoff & 0xFFFF;
-	err = uverbs_copy_to(attrs,
-			     MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
-			     &page_idx,
-			     sizeof(page_idx));
-	if (err)
-		goto err_copy;
-
-	start_offset = dm->dev_addr & ~PAGE_MASK;
-	err = uverbs_copy_to(attrs,
-			     MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
-			     &start_offset, sizeof(start_offset));
-	if (err)
-		goto err_copy;
-
-	return 0;
-
-err_copy:
-	rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
-
-	return err;
-}
-
-static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
-				  struct mlx5_ib_dm *dm,
-				  struct ib_dm_alloc_attr *attr,
-				  struct uverbs_attr_bundle *attrs,
-				  int type)
-{
-	struct mlx5_core_dev *dev = to_mdev(ctx->device)->mdev;
-	u64 act_size;
-	int err;
-
-	/* Allocation size must a multiple of the basic block size
-	 * and a power of 2.
-	 */
-	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dev));
-	act_size = roundup_pow_of_two(act_size);
-
-	dm->size = act_size;
-	err = mlx5_dm_sw_icm_alloc(dev, type, act_size, attr->alignment,
-				   to_mucontext(ctx)->devx_uid, &dm->dev_addr,
-				   &dm->icm_dm.obj_id);
-	if (err)
-		return err;
-
-	err = uverbs_copy_to(attrs,
-			     MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
-			     &dm->dev_addr, sizeof(dm->dev_addr));
-	if (err)
-		mlx5_dm_sw_icm_dealloc(dev, type, dm->size,
-				       to_mucontext(ctx)->devx_uid, dm->dev_addr,
-				       dm->icm_dm.obj_id);
-
-	return err;
-}
-
-struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
-			       struct ib_ucontext *context,
-			       struct ib_dm_alloc_attr *attr,
-			       struct uverbs_attr_bundle *attrs)
-{
-	struct mlx5_ib_dm *dm;
-	enum mlx5_ib_uapi_dm_type type;
-	int err;
-
-	err = uverbs_get_const_default(&type, attrs,
-				       MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
-				       MLX5_IB_UAPI_DM_TYPE_MEMIC);
-	if (err)
-		return ERR_PTR(err);
-
-	mlx5_ib_dbg(to_mdev(ibdev), "alloc_dm req: dm_type=%d user_length=0x%llx log_alignment=%d\n",
-		    type, attr->length, attr->alignment);
-
-	err = check_dm_type_support(to_mdev(ibdev), type);
-	if (err)
-		return ERR_PTR(err);
-
-	dm = kzalloc(sizeof(*dm), GFP_KERNEL);
-	if (!dm)
-		return ERR_PTR(-ENOMEM);
-
-	dm->type = type;
-
-	switch (type) {
-	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
-		err = handle_alloc_dm_memic(context, dm, attr, attrs);
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
-		err = handle_alloc_dm_sw_icm(context, dm,
-					     attr, attrs,
-					     MLX5_SW_ICM_TYPE_STEERING);
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		err = handle_alloc_dm_sw_icm(context, dm,
-					     attr, attrs,
-					     MLX5_SW_ICM_TYPE_HEADER_MODIFY);
-		break;
-	default:
-		err = -EOPNOTSUPP;
-	}
-
-	if (err)
-		goto err_free;
-
-	return &dm->ibdm;
-
-err_free:
-	/* In MEMIC error flow, dm will be freed internally */
-	if (type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
-		kfree(dm);
-	return ERR_PTR(err);
-}
-
-int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
-{
-	struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
-		&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
-	struct mlx5_core_dev *dev = to_mdev(ibdm->device)->mdev;
-	struct mlx5_ib_dm *dm = to_mdm(ibdm);
-	int ret;
-
-	switch (dm->type) {
-	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
-		rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
-		return 0;
-	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
-		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
-					     dm->size, ctx->devx_uid, dm->dev_addr,
-					     dm->icm_dm.obj_id);
-		if (ret)
-			return ret;
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_HEADER_MODIFY,
-					     dm->size, ctx->devx_uid, dm->dev_addr,
-					     dm->icm_dm.obj_id);
-		if (ret)
-			return ret;
-		break;
-	default:
-		return -EOPNOTSUPP;
-	}
-
-	kfree(dm);
-
-	return 0;
-}
-
 static int mlx5_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct mlx5_ib_pd *pd = to_mpd(ibpd);
@@ -3821,20 +3606,6 @@ DECLARE_UVERBS_NAMED_OBJECT(MLX5_IB_OBJECT_UAR,
 			    &UVERBS_METHOD(MLX5_IB_METHOD_UAR_OBJ_ALLOC),
 			    &UVERBS_METHOD(MLX5_IB_METHOD_UAR_OBJ_DESTROY));

-ADD_UVERBS_ATTRIBUTES_SIMPLE(
-	mlx5_ib_dm,
-	UVERBS_OBJECT_DM,
-	UVERBS_METHOD_DM_ALLOC,
-	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
-			    UVERBS_ATTR_TYPE(u64),
-			    UA_MANDATORY),
-	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
-			    UVERBS_ATTR_TYPE(u16),
-			    UA_OPTIONAL),
-	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
-			     enum mlx5_ib_uapi_dm_type,
-			     UA_OPTIONAL));
-
 ADD_UVERBS_ATTRIBUTES_SIMPLE(
 	mlx5_ib_flow_action,
 	UVERBS_OBJECT_FLOW_ACTION,
@@ -3857,10 +3628,10 @@ static const struct uapi_definition mlx5_ib_defs[] = {
 	UAPI_DEF_CHAIN(mlx5_ib_flow_defs),
 	UAPI_DEF_CHAIN(mlx5_ib_qos_defs),
 	UAPI_DEF_CHAIN(mlx5_ib_std_types_defs),
+	UAPI_DEF_CHAIN(mlx5_ib_dm_defs),

 	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_FLOW_ACTION,
 				&mlx5_ib_flow_action),
-	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DM, &mlx5_ib_dm),
 	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DEVICE, &mlx5_ib_query_context),
 	UAPI_DEF_CHAIN_OBJ_TREE_NAMED(MLX5_IB_OBJECT_VAR,
 				UAPI_DEF_IS_OBJ_SUPPORTED(var_is_supported)),
@@ -4038,12 +3809,6 @@ static const struct ib_device_ops mlx5_ib_dev_xrc_ops = {
 	INIT_RDMA_OBJ_SIZE(ib_xrcd, mlx5_ib_xrcd, ibxrcd),
 };

-static const struct ib_device_ops mlx5_ib_dev_dm_ops = {
-	.alloc_dm = mlx5_ib_alloc_dm,
-	.dealloc_dm = mlx5_ib_dealloc_dm,
-	.reg_dm_mr = mlx5_ib_reg_dm_mr,
-};
-
 static int mlx5_ib_init_var_table(struct mlx5_ib_dev *dev)
 {
 	struct mlx5_core_dev *mdev = dev->mdev;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index a31097538dc7..ae971de6e934 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1410,6 +1410,8 @@ static inline int mlx5_ib_init_dmabuf_mr(struct mlx5_ib_mr *mr)

 extern const struct mmu_interval_notifier_ops mlx5_mn_ops;

+extern const struct ib_device_ops mlx5_ib_dev_dm_ops;
+
 /* Needed for rep profile */
 void __mlx5_ib_remove(struct mlx5_ib_dev *dev,
 		      const struct mlx5_ib_profile *profile,
@@ -1462,6 +1464,7 @@ void mlx5_ib_put_native_port_mdev(struct mlx5_ib_dev *dev,
 				  u32 port_num);

 extern const struct uapi_definition mlx5_ib_devx_defs[];
+extern const struct uapi_definition mlx5_ib_dm_defs[];
 extern const struct uapi_definition mlx5_ib_flow_defs[];
 extern const struct uapi_definition mlx5_ib_qos_defs[];
 extern const struct uapi_definition mlx5_ib_std_types_defs[];
--
2.30.2



* [PATCH rdma-next 5/7] RDMA/mlx5: Add support to MODIFY_MEMIC command
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
                   ` (3 preceding siblings ...)
  2021-03-18 11:15 ` [PATCH rdma-next 4/7] RDMA/mlx5: Move all DM logic to separate file Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations Leon Romanovsky
  2021-03-18 11:15 ` [PATCH rdma-next 7/7] RDMA/mlx5: Expose UAPI to query DM Leon Romanovsky
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

Add two functions to allocate and deallocate MEMIC operations
by using the MODIFY_MEMIC command.
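
For illustration only (not part of the patch): both functions hand the
device a MEMIC address expressed relative to the start of the BAR, and the
allocate path turns the returned operation address back into a physical
address by adding the BAR base. A minimal userspace sketch of that
translation, assuming plain 64-bit integers in place of phys_addr_t; the
helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: physical address -> offset inside the BAR,
 * mirroring "addr - dev->bar_addr" in the command input. */
static uint64_t sketch_to_bar_offset(uint64_t bar_addr, uint64_t phys_addr)
{
	assert(phys_addr >= bar_addr);
	return phys_addr - bar_addr;
}

/* Hypothetical helper: offset reported by the device -> physical address,
 * mirroring "dev->bar_addr + memic_operation_addr" on the output side. */
static uint64_t sketch_from_bar_offset(uint64_t bar_addr, uint64_t offset)
{
	return bar_addr + offset;
}
```

With a BAR at 0xf0000000, a MEMIC at 0xf0012000 would be sent to the device
as offset 0x12000 under this model.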

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/dm.c | 38 +++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/dm.h |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/dm.c b/drivers/infiniband/hw/mlx5/dm.c
index 3d39d93625ad..97a925d43312 100644
--- a/drivers/infiniband/hw/mlx5/dm.c
+++ b/drivers/infiniband/hw/mlx5/dm.c
@@ -111,6 +111,44 @@ void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
 	spin_unlock(&dm->lock);
 }

+void mlx5_cmd_dealloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
+			       u8 operation)
+{
+	u32 in[MLX5_ST_SZ_DW(modify_memic_in)] = {};
+	struct mlx5_core_dev *dev = dm->dev;
+
+	MLX5_SET(modify_memic_in, in, opcode, MLX5_CMD_OP_MODIFY_MEMIC);
+	MLX5_SET(modify_memic_in, in, op_mod, MLX5_MODIFY_MEMIC_OP_MOD_DEALLOC);
+	MLX5_SET(modify_memic_in, in, memic_operation_type,
+		 operation);
+	MLX5_SET64(modify_memic_in, in, memic_start_addr, addr - dev->bar_addr);
+
+	mlx5_cmd_exec_in(dev, modify_memic, in);
+}
+
+static int mlx5_cmd_alloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
+				   u8 operation, phys_addr_t *op_addr)
+{
+	u32 out[MLX5_ST_SZ_DW(modify_memic_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(modify_memic_in)] = {};
+	struct mlx5_core_dev *dev = dm->dev;
+	int err;
+
+	MLX5_SET(modify_memic_in, in, opcode, MLX5_CMD_OP_MODIFY_MEMIC);
+	MLX5_SET(modify_memic_in, in, op_mod, MLX5_MODIFY_MEMIC_OP_MOD_ALLOC);
+	MLX5_SET(modify_memic_in, in, memic_operation_type,
+		 operation);
+	MLX5_SET64(modify_memic_in, in, memic_start_addr, addr - dev->bar_addr);
+
+	err = mlx5_cmd_exec_inout(dev, modify_memic, in, out);
+	if (err)
+		return err;
+
+	*op_addr = dev->bar_addr +
+		   MLX5_GET64(modify_memic_out, out, memic_operation_addr);
+	return 0;
+}
+
 static int add_dm_mmap_entry(struct ib_ucontext *context,
 			     struct mlx5_ib_dm *mdm, u64 address)
 {
diff --git a/drivers/infiniband/hw/mlx5/dm.h b/drivers/infiniband/hw/mlx5/dm.h
index dbef67e38731..adb39d3d8131 100644
--- a/drivers/infiniband/hw/mlx5/dm.h
+++ b/drivers/infiniband/hw/mlx5/dm.h
@@ -10,5 +10,7 @@

 void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
 			    u64 length);
+void mlx5_cmd_dealloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
+			       u8 operation);

 #endif /* _MLX5_IB_DM_H */
--
2.30.2



* [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
                   ` (4 preceding siblings ...)
  2021-03-18 11:15 ` [PATCH rdma-next 5/7] RDMA/mlx5: Add support to MODIFY_MEMIC command Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  2021-04-01 17:47   ` Jason Gunthorpe
  2021-03-18 11:15 ` [PATCH rdma-next 7/7] RDMA/mlx5: Expose UAPI to query DM Leon Romanovsky
  6 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

A MEMIC buffer, in addition to regular read and write operations, can
support atomic operations from the host.

Introduce and implement a new UAPI to allocate address space for MEMIC
operations such as atomics. This includes:

1. Expose a new IOCTL to request the mapping of a MEMIC operation.
2. Hold the operation addresses in an xarray, so the same operation on
   the same DM is allocated only once.
3. Manage a refcount on the mlx5_ib_dm object, so it is kept valid
   until all addresses are unmapped.
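
For illustration only (not part of the patch): points 2 and 3 can be
modelled in userspace as a per-operation table plus a reference count. This
is a rough sketch under stated assumptions: a fixed-size array stands in
for the xarray, an int stands in for struct kref, and every name below is
hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MODEL_MAX_OPS 8	/* hypothetical bound on operation types */

struct model_dm {
	int refcount;			/* stands in for struct kref */
	int released;			/* set once the last reference is dropped */
	int op_mapped[MODEL_MAX_OPS];	/* stands in for the xarray of op entries */
	int op_allocs;			/* number of real allocations performed */
};

static void model_dm_init(struct model_dm *dm)
{
	memset(dm, 0, sizeof(*dm));
	dm->refcount = 1;	/* reference owned by the DM's own mapping */
}

static void model_dm_put(struct model_dm *dm)
{
	assert(dm->refcount > 0);
	if (--dm->refcount == 0)
		dm->released = 1;	/* the driver would free the MEMIC here */
}

/* Map an operation: reuse an existing entry, otherwise allocate one and
 * take a reference on the DM. Returns 0 on success, -1 on a bad op. */
static int model_dm_map_op(struct model_dm *dm, unsigned int op)
{
	if (op >= MODEL_MAX_OPS)
		return -1;
	if (dm->op_mapped[op])
		return 0;
	dm->op_mapped[op] = 1;
	dm->op_allocs++;
	dm->refcount++;
	return 0;
}

/* Unmap an operation and drop the reference it held on the DM. */
static void model_dm_unmap_op(struct model_dm *dm, unsigned int op)
{
	if (op >= MODEL_MAX_OPS || !dm->op_mapped[op])
		return;
	dm->op_mapped[op] = 0;
	model_dm_put(dm);
}
```

In this model, mapping the same operation twice performs one allocation,
and the DM is released only after both its own reference and the
operation's reference have been dropped, in either order.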

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/dm.c          | 196 +++++++++++++++++++++--
 drivers/infiniband/hw/mlx5/dm.h          |   2 +
 drivers/infiniband/hw/mlx5/main.c        |   7 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h     |  16 +-
 include/uapi/rdma/mlx5_user_ioctl_cmds.h |  11 ++
 5 files changed, 214 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/dm.c b/drivers/infiniband/hw/mlx5/dm.c
index 97a925d43312..ee4ee197a626 100644
--- a/drivers/infiniband/hw/mlx5/dm.c
+++ b/drivers/infiniband/hw/mlx5/dm.c
@@ -150,12 +150,14 @@ static int mlx5_cmd_alloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
 }

 static int add_dm_mmap_entry(struct ib_ucontext *context,
-			     struct mlx5_ib_dm *mdm, u64 address)
+			     struct mlx5_user_mmap_entry *mentry, u8 mmap_flag,
+			     size_t size, u64 address)
 {
-	mdm->mentry.mmap_flag = MLX5_IB_MMAP_TYPE_MEMIC;
-	mdm->mentry.address = address;
+	mentry->mmap_flag = mmap_flag;
+	mentry->address = address;
+
 	return rdma_user_mmap_entry_insert_range(
-		context, &mdm->mentry.rdma_entry, mdm->size,
+		context, &mentry->rdma_entry, size,
 		MLX5_IB_MMAP_DEVICE_MEM << 16,
 		(MLX5_IB_MMAP_DEVICE_MEM << 16) + (1UL << 16) - 1);
 }
@@ -183,6 +185,114 @@ static inline int check_dm_type_support(struct mlx5_ib_dev *dev, u32 type)
 	return 0;
 }

+void mlx5_ib_dm_memic_free(struct kref *kref)
+{
+	struct mlx5_ib_dm *dm =
+		container_of(kref, struct mlx5_ib_dm, memic.ref);
+	struct mlx5_ib_dev *dev = to_mdev(dm->ibdm.device);
+
+	mlx5_cmd_dealloc_memic(&dev->dm, dm->dev_addr, dm->size);
+	kfree(dm);
+}
+
+static int copy_op_to_user(struct mlx5_ib_dm_op_entry *op_entry,
+			   struct uverbs_attr_bundle *attrs)
+{
+	u64 start_offset;
+	u16 page_idx;
+	int err;
+
+	page_idx = op_entry->mentry.rdma_entry.start_pgoff & 0xFFFF;
+	start_offset = op_entry->op_addr & ~PAGE_MASK;
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
+			     &page_idx, sizeof(page_idx));
+	if (err)
+		return err;
+
+	return uverbs_copy_to(attrs,
+			      MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET,
+			      &start_offset, sizeof(start_offset));
+}
+
+static int map_existing_op(struct mlx5_ib_dm *dm, u8 op,
+			   struct uverbs_attr_bundle *attrs)
+{
+	struct mlx5_ib_dm_op_entry *op_entry;
+
+	op_entry = xa_load(&dm->memic.ops, op);
+	if (!op_entry)
+		return -ENOENT;
+
+	return copy_op_to_user(op_entry, attrs);
+}
+
+static int UVERBS_HANDLER(MLX5_IB_METHOD_DM_MAP_OP_ADDR)(
+	struct uverbs_attr_bundle *attrs)
+{
+	struct ib_uobject *uobj = uverbs_attr_get_uobject(
+		attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE);
+	struct mlx5_ib_dev *dev = to_mdev(uobj->context->device);
+	struct ib_dm *ibdm = uobj->object;
+	struct mlx5_ib_dm *dm = to_mdm(ibdm);
+	struct mlx5_ib_dm_op_entry *op_entry;
+	int err;
+	u8 op;
+
+	err = uverbs_copy_from(&op, attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP);
+	if (err)
+		return err;
+
+	if (!(MLX5_CAP_DEV_MEM(dev->mdev, memic_operations) & BIT(op)))
+		return -EOPNOTSUPP;
+
+	mutex_lock(&dm->memic.ops_xa_lock);
+	err = map_existing_op(dm, op, attrs);
+	if (!err || err != -ENOENT)
+		goto err_unlock;
+
+	op_entry = kzalloc(sizeof(*op_entry), GFP_KERNEL);
+	if (!op_entry)
+		goto err_unlock;
+
+	err = mlx5_cmd_alloc_memic_op(&dev->dm, dm->dev_addr, op,
+				      &op_entry->op_addr);
+	if (err) {
+		kfree(op_entry);
+		goto err_unlock;
+	}
+	op_entry->op = op;
+	op_entry->dm = dm;
+
+	err = add_dm_mmap_entry(uobj->context, &op_entry->mentry,
+				MLX5_IB_MMAP_TYPE_MEMIC_OP, dm->size,
+				op_entry->op_addr & PAGE_MASK);
+	if (err) {
+		mlx5_cmd_dealloc_memic_op(&dev->dm, dm->dev_addr, op);
+		kfree(op_entry);
+		goto err_unlock;
+	}
+	/* From this point, entry will be freed by mmap_free */
+	kref_get(&dm->memic.ref);
+
+	err = copy_op_to_user(op_entry, attrs);
+	if (err)
+		goto err_remove;
+
+	err = xa_insert(&dm->memic.ops, op, op_entry, GFP_KERNEL);
+	if (err)
+		goto err_remove;
+	mutex_unlock(&dm->memic.ops_xa_lock);
+
+	return 0;
+
+err_remove:
+	rdma_user_mmap_entry_remove(&op_entry->mentry.rdma_entry);
+err_unlock:
+	mutex_unlock(&dm->memic.ops_xa_lock);
+
+	return err;
+}
+
 static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
 				 struct ib_dm_alloc_attr *attr,
 				 struct uverbs_attr_bundle *attrs)
@@ -193,6 +303,9 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
 	int err;
 	u64 address;

+	kref_init(&dm->memic.ref);
+	xa_init(&dm->memic.ops);
+	mutex_init(&dm->memic.ops_xa_lock);
 	dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE);

 	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
@@ -203,18 +316,17 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
 	}

 	address = dm->dev_addr & PAGE_MASK;
-	err = add_dm_mmap_entry(ctx, dm, address);
+	err = add_dm_mmap_entry(ctx, &dm->memic.mentry, MLX5_IB_MMAP_TYPE_MEMIC,
+				dm->size, address);
 	if (err) {
 		mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);
 		kfree(dm);
 		return err;
 	}

-	page_idx = dm->mentry.rdma_entry.start_pgoff & 0xFFFF;
-	err = uverbs_copy_to(attrs,
-			     MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
-			     &page_idx,
-			     sizeof(page_idx));
+	page_idx = dm->memic.mentry.rdma_entry.start_pgoff & 0xFFFF;
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
+			     &page_idx, sizeof(page_idx));
 	if (err)
 		goto err_copy;

@@ -228,7 +340,7 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
 	return 0;

 err_copy:
-	rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
+	rdma_user_mmap_entry_remove(&dm->memic.mentry.rdma_entry);

 	return err;
 }
@@ -292,6 +404,7 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
 		return ERR_PTR(-ENOMEM);

 	dm->type = type;
+	dm->ibdm.device = ibdev;

 	switch (type) {
 	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
@@ -323,6 +436,19 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
 	return ERR_PTR(err);
 }

+static void dm_memic_remove_ops(struct mlx5_ib_dm *dm)
+{
+	struct mlx5_ib_dm_op_entry *entry;
+	unsigned long idx;
+
+	mutex_lock(&dm->memic.ops_xa_lock);
+	xa_for_each(&dm->memic.ops, idx, entry) {
+		xa_erase(&dm->memic.ops, idx);
+		rdma_user_mmap_entry_remove(&entry->mentry.rdma_entry);
+	}
+	mutex_unlock(&dm->memic.ops_xa_lock);
+}
+
 int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
 {
 	struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
@@ -333,7 +459,8 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)

 	switch (dm->type) {
 	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
-		rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
+		dm_memic_remove_ops(dm);
+		rdma_user_mmap_entry_remove(&dm->memic.mentry.rdma_entry);
 		return 0;
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
 		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
@@ -359,6 +486,31 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
 	return 0;
 }

+void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
+			  struct mlx5_user_mmap_entry *mentry)
+{
+	struct mlx5_ib_dm_op_entry *op_entry;
+	struct mlx5_ib_dm *mdm;
+
+	switch (mentry->mmap_flag) {
+	case MLX5_IB_MMAP_TYPE_MEMIC:
+		mdm = container_of(mentry, struct mlx5_ib_dm, memic.mentry);
+		kref_put(&mdm->memic.ref, mlx5_ib_dm_memic_free);
+		break;
+	case MLX5_IB_MMAP_TYPE_MEMIC_OP:
+		op_entry = container_of(mentry, struct mlx5_ib_dm_op_entry,
+					mentry);
+		mdm = op_entry->dm;
+		mlx5_cmd_dealloc_memic_op(&dev->dm, mdm->dev_addr,
+					  op_entry->op);
+		kfree(op_entry);
+		kref_put(&mdm->memic.ref, mlx5_ib_dm_memic_free);
+		break;
+	default:
+		WARN_ON(true);
+	}
+}
+
 ADD_UVERBS_ATTRIBUTES_SIMPLE(
 	mlx5_ib_dm, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC,
 	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
@@ -368,8 +520,28 @@ ADD_UVERBS_ATTRIBUTES_SIMPLE(
 	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
 			     enum mlx5_ib_uapi_dm_type, UA_OPTIONAL));

+DECLARE_UVERBS_NAMED_METHOD(
+	MLX5_IB_METHOD_DM_MAP_OP_ADDR,
+	UVERBS_ATTR_IDR(MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE,
+			UVERBS_OBJECT_DM,
+			UVERBS_ACCESS_READ,
+			UA_MANDATORY),
+	UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP,
+			   UVERBS_ATTR_TYPE(u8),
+			   UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET,
+			    UVERBS_ATTR_TYPE(u64),
+			    UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
+			    UVERBS_ATTR_TYPE(u16),
+			    UA_OPTIONAL));
+
+DECLARE_UVERBS_GLOBAL_METHODS(UVERBS_OBJECT_DM,
+			      &UVERBS_METHOD(MLX5_IB_METHOD_DM_MAP_OP_ADDR));
+
 const struct uapi_definition mlx5_ib_dm_defs[] = {
 	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DM, &mlx5_ib_dm),
+	UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_DM),
 	{},
 };

diff --git a/drivers/infiniband/hw/mlx5/dm.h b/drivers/infiniband/hw/mlx5/dm.h
index adb39d3d8131..56cf1ba9c985 100644
--- a/drivers/infiniband/hw/mlx5/dm.h
+++ b/drivers/infiniband/hw/mlx5/dm.h
@@ -8,6 +8,8 @@

 #include "mlx5_ib.h"

+void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
+			  struct mlx5_user_mmap_entry *mentry);
 void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
 			    u64 length);
 void mlx5_cmd_dealloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 49c8c60d9520..6908db28b796 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2090,14 +2090,11 @@ static void mlx5_ib_mmap_free(struct rdma_user_mmap_entry *entry)
 	struct mlx5_user_mmap_entry *mentry = to_mmmap(entry);
 	struct mlx5_ib_dev *dev = to_mdev(entry->ucontext->device);
 	struct mlx5_var_table *var_table = &dev->var_table;
-	struct mlx5_ib_dm *mdm;

 	switch (mentry->mmap_flag) {
 	case MLX5_IB_MMAP_TYPE_MEMIC:
-		mdm = container_of(mentry, struct mlx5_ib_dm, mentry);
-		mlx5_cmd_dealloc_memic(&dev->dm, mdm->dev_addr,
-				       mdm->size);
-		kfree(mdm);
+	case MLX5_IB_MMAP_TYPE_MEMIC_OP:
+		mlx5_ib_dm_mmap_free(dev, mentry);
 		break;
 	case MLX5_IB_MMAP_TYPE_VAR:
 		mutex_lock(&var_table->bitmap_lock);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index ae971de6e934..b714131f87b7 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -166,6 +166,7 @@ enum mlx5_ib_mmap_type {
 	MLX5_IB_MMAP_TYPE_VAR = 2,
 	MLX5_IB_MMAP_TYPE_UAR_WC = 3,
 	MLX5_IB_MMAP_TYPE_UAR_NC = 4,
+	MLX5_IB_MMAP_TYPE_MEMIC_OP = 5,
 };

 struct mlx5_bfreg_info {
@@ -618,18 +619,30 @@ struct mlx5_user_mmap_entry {
 	u32 page_idx;
 };

+struct mlx5_ib_dm_op_entry {
+	struct mlx5_user_mmap_entry	mentry;
+	phys_addr_t			op_addr;
+	struct mlx5_ib_dm		*dm;
+	u8				op;
+};
+
 struct mlx5_ib_dm {
 	struct ib_dm		ibdm;
 	phys_addr_t		dev_addr;
 	u32			type;
 	size_t			size;
 	union {
+		struct {
+				struct mlx5_user_mmap_entry mentry;
+				struct xarray		ops;
+				struct mutex		ops_xa_lock;
+				struct kref		ref;
+		} memic;
 		struct {
 			u32	obj_id;
 		} icm_dm;
 		/* other dm types specific params should be added here */
 	};
-	struct mlx5_user_mmap_entry mentry;
 };

 #define MLX5_IB_MTT_PRESENT (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE)
@@ -1352,6 +1365,7 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
 			       struct ib_dm_alloc_attr *attr,
 			       struct uverbs_attr_bundle *attrs);
 int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs);
+void mlx5_ib_dm_memic_free(struct kref *kref);
 struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm,
 				struct ib_dm_mr_attr *attr,
 				struct uverbs_attr_bundle *attrs);
diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
index 3f0bc7597ba7..c6fbc5211717 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
@@ -41,6 +41,17 @@ enum mlx5_ib_create_flow_action_attrs {
 	MLX5_IB_ATTR_CREATE_FLOW_ACTION_FLAGS = (1U << UVERBS_ID_NS_SHIFT),
 };

+enum mlx5_ib_dm_methods {
+	MLX5_IB_METHOD_DM_MAP_OP_ADDR  = (1U << UVERBS_ID_NS_SHIFT),
+};
+
+enum mlx5_ib_dm_map_op_addr_attrs {
+	MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+	MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP,
+	MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET,
+	MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
+};
+
 enum mlx5_ib_alloc_dm_attrs {
 	MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET = (1U << UVERBS_ID_NS_SHIFT),
 	MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
--
2.30.2



* [PATCH rdma-next 7/7] RDMA/mlx5: Expose UAPI to query DM
  2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
                   ` (5 preceding siblings ...)
  2021-03-18 11:15 ` [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations Leon Romanovsky
@ 2021-03-18 11:15 ` Leon Romanovsky
  6 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-03-18 11:15 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed, Yishai Hadas

From: Maor Gottlieb <maorg@nvidia.com>

Expose a UAPI to query a MEMIC DM. This lets a user space application
that did not allocate the DM, but has access to it by owning the matching
command FD, retrieve its information.
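
For illustration only (not part of the patch): the query returns three
values, namely the low 16 bits of the mmap page offset, the offset of the
DM inside its first page, and the length the allocator originally
requested. A minimal userspace sketch of how those values are derived,
assuming a 4 KiB page size; the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL			/* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))	/* same sense as the kernel's PAGE_MASK */

struct sketch_dm_info {
	uint16_t page_idx;	/* low 16 bits of the mmap entry's start_pgoff */
	uint64_t start_offset;	/* offset of the DM within its first page */
	uint64_t length;	/* length requested at allocation time */
};

/* Hypothetical helper mirroring the three values the handler copies out. */
static struct sketch_dm_info sketch_query_dm(uint64_t start_pgoff,
					     uint64_t dev_addr,
					     uint64_t req_length)
{
	struct sketch_dm_info info;

	info.page_idx = (uint16_t)(start_pgoff & 0xFFFF);
	info.start_offset = dev_addr & ~SKETCH_PAGE_MASK;
	info.length = req_length;
	return info;
}
```

A caller would presumably add start_offset to the base of its own mapping
to reach the first byte of the DM; that usage is an assumption, not
something this patch specifies.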

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/dm.c          | 45 +++++++++++++++++++++++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h     |  1 +
 include/uapi/rdma/mlx5_user_ioctl_cmds.h |  8 +++++
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/dm.c b/drivers/infiniband/hw/mlx5/dm.c
index ee4ee197a626..41c158216f17 100644
--- a/drivers/infiniband/hw/mlx5/dm.c
+++ b/drivers/infiniband/hw/mlx5/dm.c
@@ -307,6 +307,7 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
 	xa_init(&dm->memic.ops);
 	mutex_init(&dm->memic.ops_xa_lock);
 	dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE);
+	dm->memic.req_length = attr->length;

 	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
 				   dm->size, attr->alignment);
@@ -486,6 +487,36 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
 	return 0;
 }

+static int UVERBS_HANDLER(MLX5_IB_METHOD_DM_QUERY)(
+	struct uverbs_attr_bundle *attrs)
+{
+	struct ib_dm *ibdm =
+		uverbs_attr_get_obj(attrs, MLX5_IB_ATTR_QUERY_DM_REQ_HANDLE);
+	struct mlx5_ib_dm *dm = to_mdm(ibdm);
+	u64 start_offset;
+	u16 page_idx;
+	int err;
+
+	if (dm->type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
+		return -EOPNOTSUPP;
+
+	page_idx = dm->memic.mentry.rdma_entry.start_pgoff & 0xFFFF;
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_QUERY_DM_RESP_PAGE_INDEX,
+			     &page_idx, sizeof(page_idx));
+	if (err)
+		return err;
+
+	start_offset = dm->dev_addr & ~PAGE_MASK;
+	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_QUERY_DM_RESP_START_OFFSET,
+			     &start_offset, sizeof(start_offset));
+	if (err)
+		return err;
+
+	return uverbs_copy_to(attrs, MLX5_IB_ATTR_QUERY_DM_RESP_LENGTH,
+			      &dm->memic.req_length,
+			      sizeof(dm->memic.req_length));
+}
+
 void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
 			  struct mlx5_user_mmap_entry *mentry)
 {
@@ -511,6 +542,17 @@ void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
 	}
 }

+DECLARE_UVERBS_NAMED_METHOD(
+	MLX5_IB_METHOD_DM_QUERY,
+	UVERBS_ATTR_IDR(MLX5_IB_ATTR_QUERY_DM_REQ_HANDLE, UVERBS_OBJECT_DM,
+			UVERBS_ACCESS_READ, UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_QUERY_DM_RESP_START_OFFSET,
+			    UVERBS_ATTR_TYPE(u64), UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_QUERY_DM_RESP_PAGE_INDEX,
+			    UVERBS_ATTR_TYPE(u16), UA_MANDATORY),
+	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_QUERY_DM_RESP_LENGTH,
+			    UVERBS_ATTR_TYPE(u64), UA_MANDATORY));
+
 ADD_UVERBS_ATTRIBUTES_SIMPLE(
 	mlx5_ib_dm, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC,
 	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
@@ -537,7 +579,8 @@ DECLARE_UVERBS_NAMED_METHOD(
 			    UA_OPTIONAL));

 DECLARE_UVERBS_GLOBAL_METHODS(UVERBS_OBJECT_DM,
-			      &UVERBS_METHOD(MLX5_IB_METHOD_DM_MAP_OP_ADDR));
+			      &UVERBS_METHOD(MLX5_IB_METHOD_DM_MAP_OP_ADDR),
+			      &UVERBS_METHOD(MLX5_IB_METHOD_DM_QUERY));

 const struct uapi_definition mlx5_ib_dm_defs[] = {
 	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DM, &mlx5_ib_dm),
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index b714131f87b7..78099d95e8e9 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -637,6 +637,7 @@ struct mlx5_ib_dm {
 				struct xarray		ops;
 				struct mutex		ops_xa_lock;
 				struct kref		ref;
+				size_t			req_length;
 		} memic;
 		struct {
 			u32	obj_id;
diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
index c6fbc5211717..3798cbcb9021 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
@@ -43,6 +43,7 @@ enum mlx5_ib_create_flow_action_attrs {

 enum mlx5_ib_dm_methods {
 	MLX5_IB_METHOD_DM_MAP_OP_ADDR  = (1U << UVERBS_ID_NS_SHIFT),
+	MLX5_IB_METHOD_DM_QUERY,
 };

 enum mlx5_ib_dm_map_op_addr_attrs {
@@ -52,6 +53,13 @@ enum mlx5_ib_dm_map_op_addr_attrs {
 	MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
 };

+enum mlx5_ib_query_dm_attrs {
+	MLX5_IB_ATTR_QUERY_DM_REQ_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+	MLX5_IB_ATTR_QUERY_DM_RESP_START_OFFSET,
+	MLX5_IB_ATTR_QUERY_DM_RESP_PAGE_INDEX,
+	MLX5_IB_ATTR_QUERY_DM_RESP_LENGTH,
+};
+
 enum mlx5_ib_alloc_dm_attrs {
 	MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET = (1U << UVERBS_ID_NS_SHIFT),
 	MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
--
2.30.2



* Re: [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations
  2021-03-18 11:15 ` [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations Leon Romanovsky
@ 2021-04-01 17:47   ` Jason Gunthorpe
  2021-04-04  7:51     ` Leon Romanovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2021-04-01 17:47 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed,
	Yishai Hadas

On Thu, Mar 18, 2021 at 01:15:47PM +0200, Leon Romanovsky wrote:
> From: Maor Gottlieb <maorg@nvidia.com>
> 
> A MEMIC buffer, in addition to regular read and write operations, can
> support atomic operations from the host.
> 
> Introduce and implement a new UAPI to allocate address space for MEMIC
> operations such as atomics. This includes:
> 
> 1. Expose a new IOCTL to request the mapping of a MEMIC operation.
> 2. Hold the operation addresses in an xarray, so the same operation on
>    the same DM is allocated only once.
> 3. Manage a refcount on the mlx5_ib_dm object, so it is kept valid
>    until all addresses are unmapped.
> 
> Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/hw/mlx5/dm.c          | 196 +++++++++++++++++++++--
>  drivers/infiniband/hw/mlx5/dm.h          |   2 +
>  drivers/infiniband/hw/mlx5/main.c        |   7 +-
>  drivers/infiniband/hw/mlx5/mlx5_ib.h     |  16 +-
>  include/uapi/rdma/mlx5_user_ioctl_cmds.h |  11 ++
>  5 files changed, 214 insertions(+), 18 deletions(-)
> 
> --
> 2.30.2
> 
> diff --git a/drivers/infiniband/hw/mlx5/dm.c b/drivers/infiniband/hw/mlx5/dm.c
> index 97a925d43312..ee4ee197a626 100644
> --- a/drivers/infiniband/hw/mlx5/dm.c
> +++ b/drivers/infiniband/hw/mlx5/dm.c
> @@ -150,12 +150,14 @@ static int mlx5_cmd_alloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
>  }
> 
>  static int add_dm_mmap_entry(struct ib_ucontext *context,
> -			     struct mlx5_ib_dm *mdm, u64 address)
> +			     struct mlx5_user_mmap_entry *mentry, u8 mmap_flag,
> +			     size_t size, u64 address)
>  {
> -	mdm->mentry.mmap_flag = MLX5_IB_MMAP_TYPE_MEMIC;
> -	mdm->mentry.address = address;
> +	mentry->mmap_flag = mmap_flag;
> +	mentry->address = address;
> +
>  	return rdma_user_mmap_entry_insert_range(
> -		context, &mdm->mentry.rdma_entry, mdm->size,
> +		context, &mentry->rdma_entry, size,
>  		MLX5_IB_MMAP_DEVICE_MEM << 16,
>  		(MLX5_IB_MMAP_DEVICE_MEM << 16) + (1UL << 16) - 1);
>  }
> @@ -183,6 +185,114 @@ static inline int check_dm_type_support(struct mlx5_ib_dev *dev, u32 type)
>  	return 0;
>  }
> 
> +void mlx5_ib_dm_memic_free(struct kref *kref)
> +{
> +	struct mlx5_ib_dm *dm =
> +		container_of(kref, struct mlx5_ib_dm, memic.ref);
> +	struct mlx5_ib_dev *dev = to_mdev(dm->ibdm.device);
> +
> +	mlx5_cmd_dealloc_memic(&dev->dm, dm->dev_addr, dm->size);
> +	kfree(dm);
> +}
> +
> +static int copy_op_to_user(struct mlx5_ib_dm_op_entry *op_entry,
> +			   struct uverbs_attr_bundle *attrs)
> +{
> +	u64 start_offset;
> +	u16 page_idx;
> +	int err;
> +
> +	page_idx = op_entry->mentry.rdma_entry.start_pgoff & 0xFFFF;
> +	start_offset = op_entry->op_addr & ~PAGE_MASK;
> +	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
> +			     &page_idx, sizeof(page_idx));
> +	if (err)
> +		return err;
> +
> +	return uverbs_copy_to(attrs,
> +			      MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET,
> +			      &start_offset, sizeof(start_offset));
> +}
> +
> +static int map_existing_op(struct mlx5_ib_dm *dm, u8 op,
> +			   struct uverbs_attr_bundle *attrs)
> +{
> +	struct mlx5_ib_dm_op_entry *op_entry;
> +
> +	op_entry = xa_load(&dm->memic.ops, op);
> +	if (!op_entry)
> +		return -ENOENT;
> +
> +	return copy_op_to_user(op_entry, attrs);
> +}
> +
> +static int UVERBS_HANDLER(MLX5_IB_METHOD_DM_MAP_OP_ADDR)(
> +	struct uverbs_attr_bundle *attrs)
> +{
> +	struct ib_uobject *uobj = uverbs_attr_get_uobject(
> +		attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE);
> +	struct mlx5_ib_dev *dev = to_mdev(uobj->context->device);
> +	struct ib_dm *ibdm = uobj->object;
> +	struct mlx5_ib_dm *dm = to_mdm(ibdm);
> +	struct mlx5_ib_dm_op_entry *op_entry;
> +	int err;
> +	u8 op;
> +
> +	err = uverbs_copy_from(&op, attrs, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP);
> +	if (err)
> +		return err;
> +
> +	if (!(MLX5_CAP_DEV_MEM(dev->mdev, memic_operations) & BIT(op)))
> +		return -EOPNOTSUPP;
> +
> +	mutex_lock(&dm->memic.ops_xa_lock);
> +	err = map_existing_op(dm, op, attrs);
> +	if (!err || err != -ENOENT)
> +		goto err_unlock;
> +
> +	op_entry = kzalloc(sizeof(*op_entry), GFP_KERNEL);
> +	if (!op_entry)
> +		goto err_unlock;
> +
> +	err = mlx5_cmd_alloc_memic_op(&dev->dm, dm->dev_addr, op,
> +				      &op_entry->op_addr);
> +	if (err) {
> +		kfree(op_entry);
> +		goto err_unlock;
> +	}
> +	op_entry->op = op;
> +	op_entry->dm = dm;
> +
> +	err = add_dm_mmap_entry(uobj->context, &op_entry->mentry,
> +				MLX5_IB_MMAP_TYPE_MEMIC_OP, dm->size,
> +				op_entry->op_addr & PAGE_MASK);
> +	if (err) {
> +		mlx5_cmd_dealloc_memic_op(&dev->dm, dm->dev_addr, op);
> +		kfree(op_entry);
> +		goto err_unlock;
> +	}
> +	/* From this point, entry will be freed by mmap_free */
> +	kref_get(&dm->memic.ref);
> +
> +	err = copy_op_to_user(op_entry, attrs);
> +	if (err)
> +		goto err_remove;
> +
> +	err = xa_insert(&dm->memic.ops, op, op_entry, GFP_KERNEL);
> +	if (err)
> +		goto err_remove;
> +	mutex_unlock(&dm->memic.ops_xa_lock);
> +
> +	return 0;
> +
> +err_remove:
> +	rdma_user_mmap_entry_remove(&op_entry->mentry.rdma_entry);
> +err_unlock:
> +	mutex_unlock(&dm->memic.ops_xa_lock);
> +
> +	return err;
> +}
> +
>  static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
>  				 struct ib_dm_alloc_attr *attr,
>  				 struct uverbs_attr_bundle *attrs)
> @@ -193,6 +303,9 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
>  	int err;
>  	u64 address;
> 
> +	kref_init(&dm->memic.ref);
> +	xa_init(&dm->memic.ops);
> +	mutex_init(&dm->memic.ops_xa_lock);
>  	dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE);
> 
>  	err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr,
> @@ -203,18 +316,17 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
>  	}
> 
>  	address = dm->dev_addr & PAGE_MASK;
> -	err = add_dm_mmap_entry(ctx, dm, address);
> +	err = add_dm_mmap_entry(ctx, &dm->memic.mentry, MLX5_IB_MMAP_TYPE_MEMIC,
> +				dm->size, address);
>  	if (err) {
>  		mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size);
>  		kfree(dm);
>  		return err;
>  	}
> 
> -	page_idx = dm->mentry.rdma_entry.start_pgoff & 0xFFFF;
> -	err = uverbs_copy_to(attrs,
> -			     MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
> -			     &page_idx,
> -			     sizeof(page_idx));
> +	page_idx = dm->memic.mentry.rdma_entry.start_pgoff & 0xFFFF;
> +	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX,
> +			     &page_idx, sizeof(page_idx));
>  	if (err)
>  		goto err_copy;
> 
> @@ -228,7 +340,7 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm,
>  	return 0;
> 
>  err_copy:
> -	rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
> +	rdma_user_mmap_entry_remove(&dm->memic.mentry.rdma_entry);
> 
>  	return err;
>  }
> @@ -292,6 +404,7 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
>  		return ERR_PTR(-ENOMEM);
> 
>  	dm->type = type;
> +	dm->ibdm.device = ibdev;
> 
>  	switch (type) {
>  	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
> @@ -323,6 +436,19 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
>  	return ERR_PTR(err);
>  }
> 
> +static void dm_memic_remove_ops(struct mlx5_ib_dm *dm)
> +{
> +	struct mlx5_ib_dm_op_entry *entry;
> +	unsigned long idx;
> +
> +	mutex_lock(&dm->memic.ops_xa_lock);
> +	xa_for_each(&dm->memic.ops, idx, entry) {
> +		xa_erase(&dm->memic.ops, idx);
> +		rdma_user_mmap_entry_remove(&entry->mentry.rdma_entry);
> +	}
> +	mutex_unlock(&dm->memic.ops_xa_lock);
> +}
> +
>  int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
>  {
>  	struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
> @@ -333,7 +459,8 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
> 
>  	switch (dm->type) {
>  	case MLX5_IB_UAPI_DM_TYPE_MEMIC:
> -		rdma_user_mmap_entry_remove(&dm->mentry.rdma_entry);
> +		dm_memic_remove_ops(dm);
> +		rdma_user_mmap_entry_remove(&dm->memic.mentry.rdma_entry);
>  		return 0;
>  	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
>  		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
> @@ -359,6 +486,31 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
>  	return 0;
>  }
> 
> +void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
> +			  struct mlx5_user_mmap_entry *mentry)
> +{
> +	struct mlx5_ib_dm_op_entry *op_entry;
> +	struct mlx5_ib_dm *mdm;
> +
> +	switch (mentry->mmap_flag) {
> +	case MLX5_IB_MMAP_TYPE_MEMIC:
> +		mdm = container_of(mentry, struct mlx5_ib_dm, memic.mentry);
> +		kref_put(&mdm->memic.ref, mlx5_ib_dm_memic_free);
> +		break;
> +	case MLX5_IB_MMAP_TYPE_MEMIC_OP:
> +		op_entry = container_of(mentry, struct mlx5_ib_dm_op_entry,
> +					mentry);
> +		mdm = op_entry->dm;
> +		mlx5_cmd_dealloc_memic_op(&dev->dm, mdm->dev_addr,
> +					  op_entry->op);
> +		kfree(op_entry);
> +		kref_put(&mdm->memic.ref, mlx5_ib_dm_memic_free);
> +		break;
> +	default:
> +		WARN_ON(true);
> +	}
> +}
> +
>  ADD_UVERBS_ATTRIBUTES_SIMPLE(
>  	mlx5_ib_dm, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC,
>  	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
> @@ -368,8 +520,28 @@ ADD_UVERBS_ATTRIBUTES_SIMPLE(
>  	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE,
>  			     enum mlx5_ib_uapi_dm_type, UA_OPTIONAL));
> 
> +DECLARE_UVERBS_NAMED_METHOD(
> +	MLX5_IB_METHOD_DM_MAP_OP_ADDR,
> +	UVERBS_ATTR_IDR(MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE,
> +			UVERBS_OBJECT_DM,
> +			UVERBS_ACCESS_READ,
> +			UA_MANDATORY),
> +	UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP,
> +			   UVERBS_ATTR_TYPE(u8),
> +			   UA_MANDATORY),
> +	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET,
> +			    UVERBS_ATTR_TYPE(u64),
> +			    UA_MANDATORY),
> +	UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX,
> +			    UVERBS_ATTR_TYPE(u16),
> +			    UA_OPTIONAL));
> +
> +DECLARE_UVERBS_GLOBAL_METHODS(UVERBS_OBJECT_DM,
> +			      &UVERBS_METHOD(MLX5_IB_METHOD_DM_MAP_OP_ADDR));
> +
>  const struct uapi_definition mlx5_ib_dm_defs[] = {
>  	UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DM, &mlx5_ib_dm),
> +	UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_DM),
>  	{},
>  };
> 
> diff --git a/drivers/infiniband/hw/mlx5/dm.h b/drivers/infiniband/hw/mlx5/dm.h
> index adb39d3d8131..56cf1ba9c985 100644
> --- a/drivers/infiniband/hw/mlx5/dm.h
> +++ b/drivers/infiniband/hw/mlx5/dm.h
> @@ -8,6 +8,8 @@
> 
>  #include "mlx5_ib.h"
> 
> +void mlx5_ib_dm_mmap_free(struct mlx5_ib_dev *dev,
> +			  struct mlx5_user_mmap_entry *mentry);
>  void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr,
>  			    u64 length);
>  void mlx5_cmd_dealloc_memic_op(struct mlx5_dm *dm, phys_addr_t addr,
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index 49c8c60d9520..6908db28b796 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -2090,14 +2090,11 @@ static void mlx5_ib_mmap_free(struct rdma_user_mmap_entry *entry)
>  	struct mlx5_user_mmap_entry *mentry = to_mmmap(entry);
>  	struct mlx5_ib_dev *dev = to_mdev(entry->ucontext->device);
>  	struct mlx5_var_table *var_table = &dev->var_table;
> -	struct mlx5_ib_dm *mdm;
> 
>  	switch (mentry->mmap_flag) {
>  	case MLX5_IB_MMAP_TYPE_MEMIC:
> -		mdm = container_of(mentry, struct mlx5_ib_dm, mentry);
> -		mlx5_cmd_dealloc_memic(&dev->dm, mdm->dev_addr,
> -				       mdm->size);
> -		kfree(mdm);
> +	case MLX5_IB_MMAP_TYPE_MEMIC_OP:
> +		mlx5_ib_dm_mmap_free(dev, mentry);
>  		break;
>  	case MLX5_IB_MMAP_TYPE_VAR:
>  		mutex_lock(&var_table->bitmap_lock);
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index ae971de6e934..b714131f87b7 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -166,6 +166,7 @@ enum mlx5_ib_mmap_type {
>  	MLX5_IB_MMAP_TYPE_VAR = 2,
>  	MLX5_IB_MMAP_TYPE_UAR_WC = 3,
>  	MLX5_IB_MMAP_TYPE_UAR_NC = 4,
> +	MLX5_IB_MMAP_TYPE_MEMIC_OP = 5,
>  };
> 
>  struct mlx5_bfreg_info {
> @@ -618,18 +619,30 @@ struct mlx5_user_mmap_entry {
>  	u32 page_idx;
>  };
> 
> +struct mlx5_ib_dm_op_entry {
> +	struct mlx5_user_mmap_entry	mentry;
> +	phys_addr_t			op_addr;
> +	struct mlx5_ib_dm		*dm;
> +	u8				op;
> +};
> +
>  struct mlx5_ib_dm {
>  	struct ib_dm		ibdm;
>  	phys_addr_t		dev_addr;
>  	u32			type;
>  	size_t			size;
>  	union {
> +		struct {
> +				struct mlx5_user_mmap_entry mentry;
> +				struct xarray		ops;
> +				struct mutex		ops_xa_lock;
> +				struct kref		ref;
> +		} memic;
>  		struct {
>  			u32	obj_id;
>  		} icm_dm;

This union now makes the structure much too difficult to read and
understand. An optional kref inside a structure is going too far.

Please split it into separate types and have proper type safety throughout.
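
For illustration only, a minimal userspace sketch of what such a split
could look like. All names here (dm_base, dm_memic, dm_icm, to_memic,
to_icm) are hypothetical and not taken from the patch, and a plain int
stands in for struct kref. The idea is that the refcount exists only in
the MEMIC type and the type tag is checked in one place, the typed
downcast helpers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum dm_type { DM_TYPE_MEMIC, DM_TYPE_ICM };

/* Part shared by every DM flavour (hypothetical layout). */
struct dm_base {
	enum dm_type type;
	size_t size;
};

/* MEMIC-only state, including the refcount, lives in its own type. */
struct dm_memic {
	struct dm_base base;
	int refcount;		/* stands in for struct kref */
};

/* ICM-only state; there is no refcount to leave unused. */
struct dm_icm {
	struct dm_base base;
	unsigned int obj_id;
};

/* Typed downcasts: the only places where the type tag is checked. */
static struct dm_memic *to_memic(struct dm_base *dm)
{
	assert(dm->type == DM_TYPE_MEMIC);
	return container_of(dm, struct dm_memic, base);
}

static struct dm_icm *to_icm(struct dm_base *dm)
{
	assert(dm->type == DM_TYPE_ICM);
	return container_of(dm, struct dm_icm, base);
}

static struct dm_base *alloc_memic(size_t size)
{
	struct dm_memic *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->base.type = DM_TYPE_MEMIC;
	m->base.size = size;
	m->refcount = 1;	/* initial reference held by the creator */
	return &m->base;
}

static struct dm_base *alloc_icm(size_t size, unsigned int obj_id)
{
	struct dm_icm *i = calloc(1, sizeof(*i));

	if (!i)
		return NULL;
	i->base.type = DM_TYPE_ICM;
	i->base.size = size;
	i->obj_id = obj_id;
	return &i->base;
}
```

With this shape, code that handles one flavour takes a struct dm_memic *
or struct dm_icm * directly, so the compiler rejects mixing them up.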

It looks mostly fine otherwise, the error flows are a bit hard to read
though, when a new type is added this should also get re-organized so
we don't do stuff like:

err_free:
	/* In MEMIC error flow, dm will be freed internally */
	if (type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
		kfree(dm);

I'd inline the checks from check_dm_type_support() into their
respective allocation functions too.
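
A rough userspace sketch of that shape, under stated assumptions: the
names (dev_caps, alloc_memic, alloc_icm, alloc_dm, map_region,
MAX_DM_SIZE) are made up for the example and are not the driver's API.
Each allocator checks its own capability and frees what it allocated, so
the dispatcher needs no type-conditional kfree():

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_DM_SIZE 4096	/* hypothetical limit, for the example only */

enum dm_type { DM_TYPE_MEMIC, DM_TYPE_ICM };

/* Hypothetical capability flags standing in for device caps. */
struct dev_caps {
	bool memic;
	bool icm;
};

struct dm {
	enum dm_type type;
	size_t size;
};

/* Hypothetical later step that can fail after the object exists. */
static int map_region(const struct dm *dm)
{
	return dm->size > MAX_DM_SIZE ? -ENOSPC : 0;
}

static int alloc_memic(const struct dev_caps *caps, size_t size,
		       struct dm **out)
{
	struct dm *dm;
	int err;

	/* Support check lives next to the allocation it guards. */
	if (!caps->memic)
		return -EOPNOTSUPP;

	dm = calloc(1, sizeof(*dm));
	if (!dm)
		return -ENOMEM;
	dm->type = DM_TYPE_MEMIC;
	dm->size = size;

	err = map_region(dm);
	if (err) {
		free(dm);	/* the function that allocated also frees */
		return err;
	}
	*out = dm;
	return 0;
}

static int alloc_icm(const struct dev_caps *caps, size_t size,
		     struct dm **out)
{
	struct dm *dm;

	if (!caps->icm)
		return -EOPNOTSUPP;

	dm = calloc(1, sizeof(*dm));
	if (!dm)
		return -ENOMEM;
	dm->type = DM_TYPE_ICM;
	dm->size = size;
	*out = dm;
	return 0;
}

/* The dispatcher has no per-type cleanup left to get wrong. */
static int alloc_dm(const struct dev_caps *caps, enum dm_type type,
		    size_t size, struct dm **out)
{
	switch (type) {
	case DM_TYPE_MEMIC:
		return alloc_memic(caps, size, out);
	case DM_TYPE_ICM:
		return alloc_icm(caps, size, out);
	default:
		return -EOPNOTSUPP;
	}
}
```

On any failure *out is left untouched, so the caller only ever owns an
object after a return value of 0.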

Jason


* Re: [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations
  2021-04-01 17:47   ` Jason Gunthorpe
@ 2021-04-04  7:51     ` Leon Romanovsky
  0 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-04-04  7:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Maor Gottlieb, linux-rdma, netdev, Saeed Mahameed,
	Yishai Hadas

On Thu, Apr 01, 2021 at 02:47:04PM -0300, Jason Gunthorpe wrote:
> On Thu, Mar 18, 2021 at 01:15:47PM +0200, Leon Romanovsky wrote:
> > From: Maor Gottlieb <maorg@nvidia.com>
> > 
> > MEMIC buffer, in addition to regular read and write operations, can
> > support atomic operations from the host.
> > 
> > Introduce and implement new UAPI to allocate address space for MEMIC
> > operations such as atomic. This includes:

<...>

> It looks mostly fine otherwise, the error flows are a bit hard to read
> though, when a new type is added this should also get re-organized so
> we don't do stuff like:
> 
> err_free:
> 	/* In MEMIC error flow, dm will be freed internally */
> 	if (type != MLX5_IB_UAPI_DM_TYPE_MEMIC)
> 		kfree(dm);

I actually liked it, because the "re-organized" code was harder to read
than this simple check. But OK, let's try again.

Thanks


Thread overview: 10+ messages
2021-03-18 11:15 [PATCH rdma-next 0/7] Add MEMIC operations support Leon Romanovsky
2021-03-18 11:15 ` [PATCH mlx5-next 1/7] net/mlx5: Add MEMIC operations related bits Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 2/7] RDMA/uverbs: Make UVERBS_OBJECT_METHODS to consider line number Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 3/7] RDMA/mlx5: Avoid use after free in allocate MEMIC bad flow Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 4/7] RDMA/mlx5: Move all DM logic to separate file Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 5/7] RDMA/mlx5: Add support to MODIFY_MEMIC command Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 6/7] RDMA/mlx5: Add support in MEMIC operations Leon Romanovsky
2021-04-01 17:47   ` Jason Gunthorpe
2021-04-04  7:51     ` Leon Romanovsky
2021-03-18 11:15 ` [PATCH rdma-next 7/7] RDMA/mlx5: Expose UAPI to query DM Leon Romanovsky
