* [PATCH 0/4] RDMA/mlx5: Support DMABUF in umems and enable ATS
@ 2022-09-01 14:20 ` Jason Gunthorpe
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Kamal Heib, Mohammad Kabat

This series adds support for DMABUF when creating a devx umem. devx umems
are quite similar to MRs except they cannot be revoked, so this uses the
dmabuf pinned memory flow. Several mlx5dv flows require umem and cannot
work with MR.

The intended use case is primarily for P2P transfers using dmabuf as a
handle to the underlying PCI BAR memory from the exporter. When a PCI
switch is present the P2P transfers can bypass the host bridge completely
and go directly through the switch. ATS allows this bypass to function in
more cases, as translated TLPs issued after an ATS query allow the request
redirect setting in the switch to be bypassed.

Have mlx5 automatically use ATS in places where it makes sense.

Jason Gunthorpe (4):
  net/mlx5: Add IFC bits for mkey ATS
  RDMA/core: Add UVERBS_ATTR_RAW_FD
  RDMA/mlx5: Add support for dmabuf to devx umem
  RDMA/mlx5: Enable ATS support for MRs and umems

 drivers/infiniband/core/uverbs_ioctl.c   |  8 ++++
 drivers/infiniband/hw/mlx5/devx.c        | 55 +++++++++++++++++-------
 drivers/infiniband/hw/mlx5/mlx5_ib.h     | 36 ++++++++++++++++
 drivers/infiniband/hw/mlx5/mr.c          |  5 ++-
 include/linux/mlx5/mlx5_ifc.h            | 11 +++--
 include/rdma/uverbs_ioctl.h              | 13 ++++++
 include/uapi/rdma/mlx5_user_ioctl_cmds.h |  1 +
 7 files changed, 109 insertions(+), 20 deletions(-)


base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
-- 
2.37.2


* [PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS
  2022-09-01 14:20 ` Jason Gunthorpe
@ 2022-09-01 14:20   ` Jason Gunthorpe
  -1 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Kamal Heib, Mohammad Kabat

Allows telling a mkey to use PCI ATS for DMA that flows through it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
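A note for readers checking the layout: the three hunks below each split an
existing reserved run into named fields without changing its total width. The
sketch that follows is only an illustration of that arithmetic, not kernel
code. The struct ifc_field type and the helper names are hypothetical; the
field names, widths and base offsets are taken from the diff.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdbool.h>
#include <stddef.h>

/* One named field in an IFC-style layout: a name and a width in bits. */
struct ifc_field {
	const char *name;
	unsigned int width;
};

/* Sum of the widths of the first n fields, i.e. the offset of field n
 * relative to the start of the run. */
static unsigned int ifc_offset(const struct ifc_field *fields, size_t n)
{
	unsigned int off = 0;

	for (size_t i = 0; i < n; i++)
		off += fields[i].width;
	return off;
}

/* cmd_hca_cap: replaces reserved_at_460[0x3], run starts at bit 0x460. */
static const struct ifc_field hca_cap_run[] = {
	{ "reserved_at_460", 0x1 },
	{ "ats", 0x1 },
	{ "reserved_at_462", 0x1 },
};

/* mkc: replaces reserved_at_18[0x8], run starts at bit 0x18. */
static const struct ifc_field mkc_run[] = {
	{ "reserved_at_18", 0x2 },
	{ "ma_translation_mode", 0x2 },
	{ "reserved_at_1c", 0x4 },
};

/* umem: replaces reserved_at_80[0x1b], run starts at bit 0x80. */
static const struct ifc_field umem_run[] = {
	{ "ats", 0x1 },
	{ "reserved_at_81", 0x1a },
};

/* True when every new run is exactly as wide as the reserved run it
 * replaces, so the fields that follow keep their offsets. */
static bool ifc_runs_preserve_width(void)
{
	return ifc_offset(hca_cap_run, 3) == 0x3 &&
	       ifc_offset(mkc_run, 3) == 0x8 &&
	       ifc_offset(umem_run, 2) == 0x1b;
}
```

Adding a run's base to ifc_offset() gives the absolute bit position, which
should agree with the reserved_at_* names: 0x460 + 1 puts the capability bit
at 0x461, and 0x18 + 2 puts ma_translation_mode at 0x1a.
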
 include/linux/mlx5/mlx5_ifc.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 4acd5610e96bc0..92602e33a82c42 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1707,7 +1707,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         steering_format_version[0x4];
 	u8         create_qp_start_hint[0x18];
 
-	u8         reserved_at_460[0x3];
+	u8         reserved_at_460[0x1];
+	u8         ats[0x1];
+	u8         reserved_at_462[0x1];
 	u8         log_max_uctx[0x5];
 	u8         reserved_at_468[0x2];
 	u8         ipsec_offload[0x1];
@@ -3873,7 +3875,9 @@ struct mlx5_ifc_mkc_bits {
 	u8         lw[0x1];
 	u8         lr[0x1];
 	u8         access_mode_1_0[0x2];
-	u8         reserved_at_18[0x8];
+	u8         reserved_at_18[0x2];
+	u8         ma_translation_mode[0x2];
+	u8         reserved_at_1c[0x4];
 
 	u8         qpn[0x18];
 	u8         mkey_7_0[0x8];
@@ -11134,7 +11138,8 @@ struct mlx5_ifc_dealloc_memic_out_bits {
 struct mlx5_ifc_umem_bits {
 	u8         reserved_at_0[0x80];
 
-	u8         reserved_at_80[0x1b];
+	u8         ats[0x1];
+	u8         reserved_at_81[0x1a];
 	u8         log_page_size[0x5];
 
 	u8         page_offset[0x20];
-- 
2.37.2


* [PATCH 2/4] RDMA/core: Add UVERBS_ATTR_RAW_FD
  2022-09-01 14:20 ` Jason Gunthorpe
@ 2022-09-01 14:20   ` Jason Gunthorpe
  -1 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Kamal Heib, Mohammad Kabat

This uses the same passing protocol as UVERBS_ATTR_FD (e.g. len = 0,
data_s64 = fd), except that the FD is not required to be a uverbs object and
the core code does not convert the FD to an object handle automatically.

Access to the int fd is provided by uverbs_get_raw_fd().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
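The passing protocol from the commit message can be modelled in userspace
roughly as below. This is a minimal sketch and not the uverbs code: struct
raw_fd_attr and parse_raw_fd() are hypothetical stand-ins, with field widths
chosen for illustration. Only the three checks (reserved is zero, len is
zero, data_s64 fits in an int) come from the hunk in uverbs_ioctl.c.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the attribute userspace passes. */
struct raw_fd_attr {
	uint16_t len;		/* must be 0: the fd travels inline */
	uint16_t reserved;	/* must be 0 */
	int64_t data_s64;	/* carries the fd value */
};

/*
 * Apply the same checks as the UVERBS_ATTR_TYPE_RAW_FD case in the patch.
 * As in the patch, only the int range is validated here; whether the value
 * names an open file is left to whoever consumes the fd.
 */
static int parse_raw_fd(const struct raw_fd_attr *attr, int *fd)
{
	if (attr->reserved || attr->len != 0 ||
	    attr->data_s64 < INT_MIN || attr->data_s64 > INT_MAX)
		return -EINVAL;

	*fd = (int)attr->data_s64;
	return 0;
}
```

In this model an attribute with len = 0 and data_s64 = 7 yields fd 7, while a
non-zero len or a value outside the int range is rejected with -EINVAL.
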
 drivers/infiniband/core/uverbs_ioctl.c |  8 ++++++++
 include/rdma/uverbs_ioctl.h            | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
index 990f0724acc6b6..d9799706c58e99 100644
--- a/drivers/infiniband/core/uverbs_ioctl.c
+++ b/drivers/infiniband/core/uverbs_ioctl.c
@@ -337,6 +337,14 @@ static int uverbs_process_attr(struct bundle_priv *pbundle,
 
 		break;
 
+	case UVERBS_ATTR_TYPE_RAW_FD:
+		if (uattr->attr_data.reserved || uattr->len != 0 ||
+		    uattr->data_s64 < INT_MIN || uattr->data_s64 > INT_MAX)
+			return -EINVAL;
+		/* _uverbs_get_const_signed() is the accessor */
+		e->ptr_attr.data = uattr->data_s64;
+		break;
+
 	case UVERBS_ATTR_TYPE_IDRS_ARRAY:
 		return uverbs_process_idrs_array(pbundle, attr_uapi,
 						 &e->objs_arr_attr, uattr,
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index 23bb404aba12c0..9d45a5b203169e 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -24,6 +24,7 @@ enum uverbs_attr_type {
 	UVERBS_ATTR_TYPE_PTR_OUT,
 	UVERBS_ATTR_TYPE_IDR,
 	UVERBS_ATTR_TYPE_FD,
+	UVERBS_ATTR_TYPE_RAW_FD,
 	UVERBS_ATTR_TYPE_ENUM_IN,
 	UVERBS_ATTR_TYPE_IDRS_ARRAY,
 };
@@ -521,6 +522,11 @@ struct uapi_definition {
 			  .u.obj.access = _access,                             \
 			  __VA_ARGS__ } })
 
+#define UVERBS_ATTR_RAW_FD(_attr_id, ...)                                      \
+	(&(const struct uverbs_attr_def){                                      \
+		.id = (_attr_id),                                              \
+		.attr = { .type = UVERBS_ATTR_TYPE_RAW_FD, __VA_ARGS__ } })
+
 #define UVERBS_ATTR_PTR_IN(_attr_id, _type, ...)                               \
 	(&(const struct uverbs_attr_def){                                      \
 		.id = _attr_id,                                                \
@@ -999,4 +1005,11 @@ _uverbs_get_const_unsigned(u64 *to,
 		 uverbs_get_const_default_unsigned(_to, _attrs_bundle, _idx,   \
 						    _default))
 
+static inline int
+uverbs_get_raw_fd(int *to, const struct uverbs_attr_bundle *attrs_bundle,
+		  size_t idx)
+{
+	return uverbs_get_const_signed(to, attrs_bundle, idx);
+}
+
 #endif
-- 
2.37.2


* [PATCH 3/4] RDMA/mlx5: Add support for dmabuf to devx umem
  2022-09-01 14:20 ` Jason Gunthorpe
@ 2022-09-01 14:20   ` Jason Gunthorpe
  -1 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Kamal Heib, Mohammad Kabat

This is modeled after the similar EFA enablement in commit
66f4817b5712 ("RDMA/efa: Add support for dmabuf memory regions").

Like EFA, there is no support for revocation, so we simply call
ib_umem_dmabuf_get_pinned() to obtain a umem instead of the normal
ib_umem_get().  Everything else stays the same.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/mlx5/devx.c        | 24 +++++++++++++++++++++---
 include/uapi/rdma/mlx5_user_ioctl_cmds.h |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index 2a2a9e9afc9dad..291e73d7928276 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -2181,9 +2181,25 @@ static int devx_umem_get(struct mlx5_ib_dev *dev, struct ib_ucontext *ucontext,
 	if (err)
 		return err;
 
-	obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access);
-	if (IS_ERR(obj->umem))
-		return PTR_ERR(obj->umem);
+	if (uverbs_attr_is_valid(attrs, MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD)) {
+		struct ib_umem_dmabuf *umem_dmabuf;
+		int dmabuf_fd;
+
+		err = uverbs_get_raw_fd(&dmabuf_fd, attrs,
+					MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD);
+		if (err)
+			return -EFAULT;
+
+		umem_dmabuf = ib_umem_dmabuf_get_pinned(
+			&dev->ib_dev, addr, size, dmabuf_fd, access);
+		if (IS_ERR(umem_dmabuf))
+			return PTR_ERR(umem_dmabuf);
+		obj->umem = &umem_dmabuf->umem;
+	} else {
+		obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access);
+		if (IS_ERR(obj->umem))
+			return PTR_ERR(obj->umem);
+	}
 	return 0;
 }
 
@@ -2833,6 +2849,8 @@ DECLARE_UVERBS_NAMED_METHOD(
 	UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_DEVX_UMEM_REG_LEN,
 			   UVERBS_ATTR_TYPE(u64),
 			   UA_MANDATORY),
+	UVERBS_ATTR_RAW_FD(MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD,
+			   UA_OPTIONAL),
 	UVERBS_ATTR_FLAGS_IN(MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
 			     enum ib_access_flags),
 	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_DEVX_UMEM_REG_PGSZ_BITMAP,
diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
index 3bee490eb5857f..595edad03dfe54 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
@@ -174,6 +174,7 @@ enum mlx5_ib_devx_umem_reg_attrs {
 	MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
 	MLX5_IB_ATTR_DEVX_UMEM_REG_OUT_ID,
 	MLX5_IB_ATTR_DEVX_UMEM_REG_PGSZ_BITMAP,
+	MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD,
 };
 
 enum mlx5_ib_devx_umem_dereg_attrs {
-- 
2.37.2


* [PATCH 4/4] RDMA/mlx5: Enable ATS support for MRs and umems
  2022-09-01 14:20 ` Jason Gunthorpe
@ 2022-09-01 14:20   ` Jason Gunthorpe
  -1 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Kamal Heib, Mohammad Kabat

For mlx5, if ATS is enabled in the PCI config then the device will use ATS
requests only for certain DMA operations. Software has to opt in through
the mkey or umem settings.

ATS slows down PCI performance, so it should only be set in cases where
it is needed. All of these cases revolve around optimizing PCI P2P
transfers and avoiding bad cases where the bus just doesn't work.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
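To make the opt-in rule easier to follow, here is a rough userspace model of
the decision this patch adds in mlx5_umem_needs_ats(), combined with the
CR/RR/DT = 111 rows of the table in the mlx5_ib.h comment. It is a sketch
under stated assumptions: ACCESS_RELAXED_ORDERING is a stand-in for
IB_ACCESS_RELAXED_ORDERING with an assumed bit value, the enum and function
names are hypothetical, and the RO column of the table is assumed to match
that access flag. The table has no row for 111 with ATS off and RO on, so the
model reports that combination as unlisted rather than guessing.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdbool.h>

/* Stand-in for IB_ACCESS_RELAXED_ORDERING; the bit value is assumed. */
#define ACCESS_RELAXED_ORDERING (1u << 20)

enum p2p_result { P2P_OK, P2P_SLOW, P2P_FAILS, P2P_UNLISTED };

/* Same decision as mlx5_umem_needs_ats(), expressed on plain values. */
static bool umem_needs_ats(bool ats_cap, bool is_dmabuf, unsigned int access)
{
	if (!ats_cap || !is_dmabuf)
		return false;
	return (access & ACCESS_RELAXED_ORDERING) != 0;
}

/* The three table rows for CR/RR/DT = 111. */
static enum p2p_result p2p_outcome_acs111(bool ats, bool ro)
{
	if (ats)
		return ro ? P2P_OK : P2P_FAILS;
	return ro ? P2P_UNLISTED : P2P_SLOW;
}

/* Outcome for a dmabuf umem on an ATS-capable device when ATS is chosen
 * by umem_needs_ats() and ACS is assumed to be 111. */
static enum p2p_result dmabuf_outcome(unsigned int access)
{
	bool ro = (access & ACCESS_RELAXED_ORDERING) != 0;

	return p2p_outcome_acs111(umem_needs_ats(true, true, access), ro);
}
```

Because ATS is only requested together with relaxed ordering, this model
lands on the OK row when the flag is set and on the SLOW row when it is not,
and never selects the failing 111/ATS=1/RO=0 row.
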
 drivers/infiniband/hw/mlx5/devx.c    | 37 ++++++++++++++++------------
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 36 +++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/mr.c      |  5 +++-
 3 files changed, 61 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index 291e73d7928276..c900977e6ccdb7 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -2158,26 +2158,17 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
 
 static int devx_umem_get(struct mlx5_ib_dev *dev, struct ib_ucontext *ucontext,
 			 struct uverbs_attr_bundle *attrs,
-			 struct devx_umem *obj)
+			 struct devx_umem *obj, u32 access_flags)
 {
 	u64 addr;
 	size_t size;
-	u32 access;
 	int err;
 
 	if (uverbs_copy_from(&addr, attrs, MLX5_IB_ATTR_DEVX_UMEM_REG_ADDR) ||
 	    uverbs_copy_from(&size, attrs, MLX5_IB_ATTR_DEVX_UMEM_REG_LEN))
 		return -EFAULT;
 
-	err = uverbs_get_flags32(&access, attrs,
-				 MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
-				 IB_ACCESS_LOCAL_WRITE |
-				 IB_ACCESS_REMOTE_WRITE |
-				 IB_ACCESS_REMOTE_READ);
-	if (err)
-		return err;
-
-	err = ib_check_mr_access(&dev->ib_dev, access);
+	err = ib_check_mr_access(&dev->ib_dev, access_flags);
 	if (err)
 		return err;
 
@@ -2191,12 +2182,12 @@ static int devx_umem_get(struct mlx5_ib_dev *dev, struct ib_ucontext *ucontext,
 			return -EFAULT;
 
 		umem_dmabuf = ib_umem_dmabuf_get_pinned(
-			&dev->ib_dev, addr, size, dmabuf_fd, access);
+			&dev->ib_dev, addr, size, dmabuf_fd, access_flags);
 		if (IS_ERR(umem_dmabuf))
 			return PTR_ERR(umem_dmabuf);
 		obj->umem = &umem_dmabuf->umem;
 	} else {
-		obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access);
+		obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access_flags);
 		if (IS_ERR(obj->umem))
 			return PTR_ERR(obj->umem);
 	}
@@ -2238,7 +2229,8 @@ static unsigned int devx_umem_find_best_pgsize(struct ib_umem *umem,
 static int devx_umem_reg_cmd_alloc(struct mlx5_ib_dev *dev,
 				   struct uverbs_attr_bundle *attrs,
 				   struct devx_umem *obj,
-				   struct devx_umem_reg_cmd *cmd)
+				   struct devx_umem_reg_cmd *cmd,
+				   int access)
 {
 	unsigned long pgsz_bitmap;
 	unsigned int page_size;
@@ -2287,6 +2279,9 @@ static int devx_umem_reg_cmd_alloc(struct mlx5_ib_dev *dev,
 	MLX5_SET(umem, umem, page_offset,
 		 ib_umem_dma_offset(obj->umem, page_size));
 
+	if (mlx5_umem_needs_ats(dev, obj->umem, access))
+		MLX5_SET(umem, umem, ats, 1);
+
 	mlx5_ib_populate_pas(obj->umem, page_size, mtt,
 			     (obj->umem->writable ? MLX5_IB_MTT_WRITE : 0) |
 				     MLX5_IB_MTT_READ);
@@ -2304,20 +2299,30 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_UMEM_REG)(
 	struct mlx5_ib_ucontext *c = rdma_udata_to_drv_context(
 		&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
 	struct mlx5_ib_dev *dev = to_mdev(c->ibucontext.device);
+	int access_flags;
 	int err;
 
 	if (!c->devx_uid)
 		return -EINVAL;
 
+	err = uverbs_get_flags32(&access_flags, attrs,
+				 MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
+				 IB_ACCESS_LOCAL_WRITE |
+				 IB_ACCESS_REMOTE_WRITE |
+				 IB_ACCESS_REMOTE_READ |
+				 IB_ACCESS_RELAXED_ORDERING);
+	if (err)
+		return err;
+
 	obj = kzalloc(sizeof(struct devx_umem), GFP_KERNEL);
 	if (!obj)
 		return -ENOMEM;
 
-	err = devx_umem_get(dev, &c->ibucontext, attrs, obj);
+	err = devx_umem_get(dev, &c->ibucontext, attrs, obj, access_flags);
 	if (err)
 		goto err_obj_free;
 
-	err = devx_umem_reg_cmd_alloc(dev, attrs, obj, &cmd);
+	err = devx_umem_reg_cmd_alloc(dev, attrs, obj, &cmd, access_flags);
 	if (err)
 		goto err_umem_release;
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 2e2ad391838583..7e2c4a3782209d 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1550,4 +1550,40 @@ static inline bool rt_supported(int ts_cap)
 	return ts_cap == MLX5_TIMESTAMP_FORMAT_CAP_REAL_TIME ||
 	       ts_cap == MLX5_TIMESTAMP_FORMAT_CAP_FREE_RUNNING_AND_REAL_TIME;
 }
+
+/*
+ * PCI Peer to Peer is a trainwreck. If no switch is present then things
+ * sometimes work, depending on the pci_distance_p2p logic for excluding broken
+ * root complexes. However if a switch is present in the path, then things get
+ * really ugly depending on how the switch is setup. This table assumes that the
+ * root complex is strict and is validating that all req/reps are matched
+ * perfectly - so any scenario where it sees only half the transaction is a
+ * failure.
+ *
+ * CR/RR/DT  ATS RO P2P
+ * 00X       X   X  OK
+ * 010       X   X  fails (request is routed to root but root never sees comp)
+ * 011       0   X  fails (request is routed to root but root never sees comp)
+ * 011       1   X  OK
+ * 10X       X   1  OK
+ * 101       X   0  fails (completion is routed to root but root didn't see req)
+ * 110       X   0  SLOW
+ * 111       0   0  SLOW
+ * 111       1   0  fails (completion is routed to root but root didn't see req)
+ * 111       1   1  OK
+ *
+ * Unfortunately we cannot reliably know if a switch is present or what the
+ * CR/RR/DT ACS settings are, as in a VM that is all hidden. Assume that
+ * CR/RR/DT is 111 if the ATS cap is enabled and follow the last three rows.
+ *
+ * For now assume if the umem is a dma_buf then it is P2P.
+ */
+static inline bool mlx5_umem_needs_ats(struct mlx5_ib_dev *dev,
+				       struct ib_umem *umem, int access_flags)
+{
+	if (!MLX5_CAP_GEN(dev->mdev, ats) || !umem->is_dmabuf)
+		return false;
+	return access_flags & IB_ACCESS_RELAXED_ORDERING;
+}
+
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 129d531bd01bc8..7fd3adea370290 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -937,7 +937,8 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
 	 * cache then synchronously create an uncached one.
 	 */
 	if (!ent || ent->limit == 0 ||
-	    !mlx5r_umr_can_reconfig(dev, 0, access_flags)) {
+	    !mlx5r_umr_can_reconfig(dev, 0, access_flags) ||
+	    mlx5_umem_needs_ats(dev, umem, access_flags)) {
 		mutex_lock(&dev->slow_path_mutex);
 		mr = reg_create(pd, umem, iova, access_flags, page_size, false);
 		mutex_unlock(&dev->slow_path_mutex);
@@ -1018,6 +1019,8 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem,
 	MLX5_SET(mkc, mkc, translations_octword_size,
 		 get_octo_len(iova, umem->length, mr->page_shift));
 	MLX5_SET(mkc, mkc, log_page_size, mr->page_shift);
+	if (mlx5_umem_needs_ats(dev, umem, access_flags))
+		MLX5_SET(mkc, mkc, ma_translation_mode, 1);
 	if (populate) {
 		MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
 			 get_octo_len(iova, umem->length, mr->page_shift));
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/4] RDMA/mlx5: Enable ATS support for MRs and umems
@ 2022-09-01 14:20   ` Jason Gunthorpe
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-01 14:20 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Mohammad Kabat, Kamal Heib

For mlx5 if ATS is enabled in the PCI config then the device will use ATS
requests for only certain DMA operations. This has to be opted in by the
SW side based on the mkey or umem settings.

ATS slows down the PCI performance, so it should only be set in cases when
it is needed. All of these cases revolve around optimizing PCI P2P
transfers and avoiding bad cases where the bus just doesn't work.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/mlx5/devx.c    | 37 ++++++++++++++++------------
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 36 +++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/mr.c      |  5 +++-
 3 files changed, 61 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index 291e73d7928276..c900977e6ccdb7 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -2158,26 +2158,17 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
 
 static int devx_umem_get(struct mlx5_ib_dev *dev, struct ib_ucontext *ucontext,
 			 struct uverbs_attr_bundle *attrs,
-			 struct devx_umem *obj)
+			 struct devx_umem *obj, u32 access_flags)
 {
 	u64 addr;
 	size_t size;
-	u32 access;
 	int err;
 
 	if (uverbs_copy_from(&addr, attrs, MLX5_IB_ATTR_DEVX_UMEM_REG_ADDR) ||
 	    uverbs_copy_from(&size, attrs, MLX5_IB_ATTR_DEVX_UMEM_REG_LEN))
 		return -EFAULT;
 
-	err = uverbs_get_flags32(&access, attrs,
-				 MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
-				 IB_ACCESS_LOCAL_WRITE |
-				 IB_ACCESS_REMOTE_WRITE |
-				 IB_ACCESS_REMOTE_READ);
-	if (err)
-		return err;
-
-	err = ib_check_mr_access(&dev->ib_dev, access);
+	err = ib_check_mr_access(&dev->ib_dev, access_flags);
 	if (err)
 		return err;
 
@@ -2191,12 +2182,12 @@ static int devx_umem_get(struct mlx5_ib_dev *dev, struct ib_ucontext *ucontext,
 			return -EFAULT;
 
 		umem_dmabuf = ib_umem_dmabuf_get_pinned(
-			&dev->ib_dev, addr, size, dmabuf_fd, access);
+			&dev->ib_dev, addr, size, dmabuf_fd, access_flags);
 		if (IS_ERR(umem_dmabuf))
 			return PTR_ERR(umem_dmabuf);
 		obj->umem = &umem_dmabuf->umem;
 	} else {
-		obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access);
+		obj->umem = ib_umem_get(&dev->ib_dev, addr, size, access_flags);
 		if (IS_ERR(obj->umem))
 			return PTR_ERR(obj->umem);
 	}
@@ -2238,7 +2229,8 @@ static unsigned int devx_umem_find_best_pgsize(struct ib_umem *umem,
 static int devx_umem_reg_cmd_alloc(struct mlx5_ib_dev *dev,
 				   struct uverbs_attr_bundle *attrs,
 				   struct devx_umem *obj,
-				   struct devx_umem_reg_cmd *cmd)
+				   struct devx_umem_reg_cmd *cmd,
+				   int access)
 {
 	unsigned long pgsz_bitmap;
 	unsigned int page_size;
@@ -2287,6 +2279,9 @@ static int devx_umem_reg_cmd_alloc(struct mlx5_ib_dev *dev,
 	MLX5_SET(umem, umem, page_offset,
 		 ib_umem_dma_offset(obj->umem, page_size));
 
+	if (mlx5_umem_needs_ats(dev, obj->umem, access))
+		MLX5_SET(umem, umem, ats, 1);
+
 	mlx5_ib_populate_pas(obj->umem, page_size, mtt,
 			     (obj->umem->writable ? MLX5_IB_MTT_WRITE : 0) |
 				     MLX5_IB_MTT_READ);
@@ -2304,20 +2299,30 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_UMEM_REG)(
 	struct mlx5_ib_ucontext *c = rdma_udata_to_drv_context(
 		&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
 	struct mlx5_ib_dev *dev = to_mdev(c->ibucontext.device);
+	int access_flags;
 	int err;
 
 	if (!c->devx_uid)
 		return -EINVAL;
 
+	err = uverbs_get_flags32(&access_flags, attrs,
+				 MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS,
+				 IB_ACCESS_LOCAL_WRITE |
+				 IB_ACCESS_REMOTE_WRITE |
+				 IB_ACCESS_REMOTE_READ |
+				 IB_ACCESS_RELAXED_ORDERING);
+	if (err)
+		return err;
+
 	obj = kzalloc(sizeof(struct devx_umem), GFP_KERNEL);
 	if (!obj)
 		return -ENOMEM;
 
-	err = devx_umem_get(dev, &c->ibucontext, attrs, obj);
+	err = devx_umem_get(dev, &c->ibucontext, attrs, obj, access_flags);
 	if (err)
 		goto err_obj_free;
 
-	err = devx_umem_reg_cmd_alloc(dev, attrs, obj, &cmd);
+	err = devx_umem_reg_cmd_alloc(dev, attrs, obj, &cmd, access_flags);
 	if (err)
 		goto err_umem_release;
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 2e2ad391838583..7e2c4a3782209d 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1550,4 +1550,40 @@ static inline bool rt_supported(int ts_cap)
 	return ts_cap == MLX5_TIMESTAMP_FORMAT_CAP_REAL_TIME ||
 	       ts_cap == MLX5_TIMESTAMP_FORMAT_CAP_FREE_RUNNING_AND_REAL_TIME;
 }
+
+/*
+ * PCI Peer to Peer is a trainwreck. If no switch is present then things
+ * sometimes work, depending on the pci_distance_p2p logic for excluding broken
+ * root complexes. However if a switch is present in the path, then things get
+ * really ugly depending on how the switch is set up. This table assumes that
+ * the root complex is strict and validates that every request is matched
+ * perfectly with its completion - so any scenario where it sees only half the
+ * transaction is a failure.
+ *
+ * CR/RR/DT  ATS RO P2P
+ * 00X       X   X  OK
+ * 010       X   X  fails (request is routed to root but root never sees comp)
+ * 011       0   X  fails (request is routed to root but root never sees comp)
+ * 011       1   X  OK
+ * 10X       X   1  OK
+ * 101       X   0  fails (completion is routed to root but root didn't see req)
+ * 110       X   0  SLOW
+ * 111       0   0  SLOW
+ * 111       1   0  fails (completion is routed to root but root didn't see req)
+ * 111       1   1  OK
+ *
+ * Unfortunately we cannot reliably know if a switch is present or what the
+ * CR/RR/DT ACS settings are, as in a VM that is all hidden. Assume that
+ * CR/RR/DT is 111 if the ATS cap is enabled and follow the last three rows.
+ *
+ * For now assume if the umem is a dma_buf then it is P2P.
+ */
+static inline bool mlx5_umem_needs_ats(struct mlx5_ib_dev *dev,
+				       struct ib_umem *umem, int access_flags)
+{
+	if (!MLX5_CAP_GEN(dev->mdev, ats) || !umem->is_dmabuf)
+		return false;
+	return access_flags & IB_ACCESS_RELAXED_ORDERING;
+}
+
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 129d531bd01bc8..7fd3adea370290 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -937,7 +937,8 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
 	 * cache then synchronously create an uncached one.
 	 */
 	if (!ent || ent->limit == 0 ||
-	    !mlx5r_umr_can_reconfig(dev, 0, access_flags)) {
+	    !mlx5r_umr_can_reconfig(dev, 0, access_flags) ||
+	    mlx5_umem_needs_ats(dev, umem, access_flags)) {
 		mutex_lock(&dev->slow_path_mutex);
 		mr = reg_create(pd, umem, iova, access_flags, page_size, false);
 		mutex_unlock(&dev->slow_path_mutex);
@@ -1018,6 +1019,8 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem,
 	MLX5_SET(mkc, mkc, translations_octword_size,
 		 get_octo_len(iova, umem->length, mr->page_shift));
 	MLX5_SET(mkc, mkc, log_page_size, mr->page_shift);
+	if (mlx5_umem_needs_ats(dev, umem, access_flags))
+		MLX5_SET(mkc, mkc, ma_translation_mode, 1);
 	if (populate) {
 		MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
 			 get_octo_len(iova, umem->length, mr->page_shift));
-- 
2.37.2
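
As a rough illustration of the policy in mlx5_umem_needs_ats() above, the sketch below models the predicate and the three table rows for CR/RR/DT == 111 in plain userspace C. All sketch_* names and the flag value are hypothetical stand-ins, not kernel definitions. The ATS=0, RO=1 combination does not appear in the table, so the sketch reports it as unlisted rather than guessing an outcome.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IB_ACCESS_RELAXED_ORDERING; value is illustrative. */
#define SKETCH_ACCESS_RELAXED_ORDERING (1u << 0)

enum sketch_p2p_result {
	SKETCH_P2P_OK,
	SKETCH_P2P_SLOW,
	SKETCH_P2P_FAILS,
	SKETCH_P2P_UNLISTED,	/* combination not covered by the table */
};

/*
 * Same decision as mlx5_umem_needs_ats(), with the device capability and
 * the umem type passed as plain booleans (an assumption of this sketch).
 */
static inline bool sketch_needs_ats(bool ats_cap, bool is_dmabuf,
				    unsigned int access_flags)
{
	if (!ats_cap || !is_dmabuf)
		return false;
	return (access_flags & SKETCH_ACCESS_RELAXED_ORDERING) != 0;
}

/* The table rows for CR/RR/DT == 111, keyed by ATS and RO. */
static inline enum sketch_p2p_result sketch_row_111(bool ats, bool ro)
{
	if (!ats && !ro)
		return SKETCH_P2P_SLOW;
	if (ats && !ro)
		return SKETCH_P2P_FAILS;
	if (ats && ro)
		return SKETCH_P2P_OK;
	return SKETCH_P2P_UNLISTED;	/* ATS=0, RO=1 */
}
```

With an ATS-capable device and a dmabuf umem, the predicate turns ATS on exactly when relaxed ordering is requested. Under the sketch's assumptions only the SLOW (ATS=0, RO=0) and OK (ATS=1, RO=1) rows are reachable, and the failing ATS=1, RO=0 row is avoided.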



* Re: [PATCH 0/4] RDMA/mlx5: Support DMABUF in umems and enable ATS
  2022-09-01 14:20 ` Jason Gunthorpe
@ 2022-09-26 17:51   ` Jason Gunthorpe
  -1 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2022-09-26 17:51 UTC (permalink / raw)
  To: Christian König, dri-devel, Leon Romanovsky, linaro-mm-sig,
	linux-media, linux-rdma, netdev, Saeed Mahameed, Sumit Semwal
  Cc: Mohammad Kabat, Kamal Heib

On Thu, Sep 01, 2022 at 11:20:52AM -0300, Jason Gunthorpe wrote:
> This series adds support for DMABUF when creating a devx umem. devx umems
> are quite similar to MRs except they cannot be revoked, so this uses the
> dmabuf pinned memory flow. Several mlx5dv flows require umem and cannot
> work with MR.
> 
> The intended use case is primarily for P2P transfers using dmabuf as a
> handle to the underlying PCI BAR memory from the exporter. When a PCI
> switch is present the P2P transfers can bypass the host bridge completely
> and go directly through the switch. ATS allows this bypass to function in
> more cases as translated TLPs issued after an ATS query allow the request
> redirect setting to be bypassed in the switch.
> 
> Have mlx5 automatically use ATS in places where it makes sense.
> 
> Jason Gunthorpe (4):
>   net/mlx5: Add IFC bits for mkey ATS
>   RDMA/core: Add UVERBS_ATTR_RAW_FD
>   RDMA/mlx5: Add support for dmabuf to devx umem
>   RDMA/mlx5: Enable ATS support for MRs and umems
> 
>  drivers/infiniband/core/uverbs_ioctl.c   |  8 ++++
>  drivers/infiniband/hw/mlx5/devx.c        | 55 +++++++++++++++++-------
>  drivers/infiniband/hw/mlx5/mlx5_ib.h     | 36 ++++++++++++++++
>  drivers/infiniband/hw/mlx5/mr.c          |  5 ++-
>  include/linux/mlx5/mlx5_ifc.h            | 11 +++--
>  include/rdma/uverbs_ioctl.h              | 13 ++++++
>  include/uapi/rdma/mlx5_user_ioctl_cmds.h |  1 +
>  7 files changed, 109 insertions(+), 20 deletions(-)

Applied to for-next, thanks

Jason


Thread overview: 12+ messages
2022-09-01 14:20 [PATCH 0/4] RDMA/mlx5: Support DMABUF in umems and enable ATS Jason Gunthorpe
2022-09-01 14:20 ` [PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS Jason Gunthorpe
2022-09-01 14:20 ` [PATCH 2/4] RDMA/core: Add UVERBS_ATTR_RAW_FD Jason Gunthorpe
2022-09-01 14:20 ` [PATCH 3/4] RDMA/mlx5: Add support for dmabuf to devx umem Jason Gunthorpe
2022-09-01 14:20 ` [PATCH 4/4] RDMA/mlx5: Enable ATS support for MRs and umems Jason Gunthorpe
2022-09-26 17:51 ` [PATCH 0/4] RDMA/mlx5: Support DMABUF in umems and enable ATS Jason Gunthorpe