Linux-RDMA Archive on lore.kernel.org
* [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions
@ 2020-01-09 14:04 Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 1/7] Update kernel headers Yishai Hadas
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

This series exposes an IBV_ACCESS_RELAXED_ORDERING optional MR access flag.
This optional flag allows creation of relaxed ordering memory regions.
Access through such MRs can improve performance by allowing the system to reorder
certain accesses.

The series uses the new ioctl command to get a device context. This command
enables reading core generic capabilities, such as support for optional MR
access flags by IB core and its related drivers.

This capability enables transparent masking of the optional flags in libibverbs
when the kernel doesn't support the MR optional access mode.
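
The masking described above can be sketched as follows. This is a minimal
illustration, not the code from the series: the constants are copied from
the kernel-header hunk in patch 1/7, while mask_optional_access() is a
hypothetical helper name introduced only for this example.

```c
#include <stdint.h>

/* Values copied from the kernel-header hunk in patch 1/7. */
#define IB_UVERBS_ACCESS_OPTIONAL_FIRST (1 << 20)
#define IB_UVERBS_ACCESS_OPTIONAL_LAST  (1 << 29)
#define IB_UVERBS_ACCESS_OPTIONAL_RANGE \
	(((IB_UVERBS_ACCESS_OPTIONAL_LAST << 1) - 1) & \
	 ~(IB_UVERBS_ACCESS_OPTIONAL_FIRST - 1))
#define IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS (1 << 0)

/*
 * Hypothetical helper (not a libibverbs symbol): return the access flags
 * that should be passed on to the kernel, given the core_support word
 * reported when the device context was obtained.
 */
static inline unsigned int mask_optional_access(uint64_t core_support,
						unsigned int access)
{
	/* Kernel understands optional flags: pass them through untouched. */
	if (core_support & IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS)
		return access;

	/* Otherwise drop every bit in the optional range. */
	return access & ~IB_UVERBS_ACCESS_OPTIONAL_RANGE;
}
```

With core_support equal to 0, a request for local write plus relaxed
ordering is reduced to local write only; with the support bit set, the
request is passed on unchanged.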

The series is based on an RFC that was sent to the ML [1]. The matching kernel
series was sent to 'for-next'.
[1] https://www.spinics.net/lists/linux-rdma/msg86188.html

PR was sent:
https://github.com/linux-rdma/rdma-core/pull/660

Yishai

Michael Guralnik (6):
  verbs: Move free_context from verbs_device_ops to verbs_context_ops
  verbs: Move alloc_context to ioctl
  verbs: Relaxed ordering memory regions
  mlx5: Add optional access flags range to DM
  pyverbs: Add relaxed ordering access flag
  tests: Add relaxed ordering access test

Yishai Hadas (1):
  Update kernel headers

 debian/libibverbs1.symbols                |  2 +
 kernel-headers/rdma/ib_user_ioctl_cmds.h  | 15 ++++++
 kernel-headers/rdma/ib_user_ioctl_verbs.h | 12 +++++
 libibverbs/CMakeLists.txt                 |  2 +-
 libibverbs/cmd.c                          | 18 -------
 libibverbs/cmd_device.c                   | 79 +++++++++++++++++++++++++++++++
 libibverbs/device.c                       |  5 +-
 libibverbs/driver.h                       |  3 +-
 libibverbs/dummy_ops.c                    |  7 +++
 libibverbs/libibverbs.map.in              |  5 ++
 libibverbs/man/ibv_reg_mr.3               |  2 +
 libibverbs/verbs.c                        | 13 +++++
 libibverbs/verbs.h                        | 51 +++++++++++++++++++-
 libibverbs/verbs_api.h                    |  2 +
 providers/bnxt_re/main.c                  |  6 ++-
 providers/cxgb4/dev.c                     |  4 +-
 providers/efa/efa.c                       |  4 +-
 providers/hfi1verbs/hfiverbs.c            |  4 +-
 providers/hns/hns_roce_u.c                |  4 +-
 providers/i40iw/i40iw_umain.c             |  6 ++-
 providers/ipathverbs/ipathverbs.c         |  4 +-
 providers/mlx4/mlx4.c                     |  4 +-
 providers/mlx5/mlx5.c                     |  4 +-
 providers/mlx5/verbs.c                    |  3 +-
 providers/mthca/mthca.c                   |  6 ++-
 providers/ocrdma/ocrdma_main.c            |  6 ++-
 providers/qedr/qelr_main.c                |  4 +-
 providers/rxe/rxe.c                       |  6 ++-
 providers/siw/siw.c                       |  3 +-
 providers/vmw_pvrdma/pvrdma_main.c        |  4 +-
 pyverbs/libibverbs_enums.pxd              |  1 +
 tests/CMakeLists.txt                      |  5 +-
 tests/test_relaxed_ordering.py            | 55 +++++++++++++++++++++
 33 files changed, 301 insertions(+), 48 deletions(-)
 create mode 100644 tests/test_relaxed_ordering.py

-- 
1.8.3.1



* [PATCH rdma-core 1/7] Update kernel headers
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 2/7] verbs: Move free_context from verbs_device_ops to verbs_context_ops Yishai Hadas
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

to commit 336883935151 ("RDMA/mlx5: Set relaxed ordering when requested")

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 kernel-headers/rdma/ib_user_ioctl_cmds.h  | 15 +++++++++++++++
 kernel-headers/rdma/ib_user_ioctl_verbs.h | 12 ++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
index 64f0e3a..d4ddbe4 100644
--- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
+++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
@@ -56,6 +56,7 @@ enum uverbs_default_objects {
 	UVERBS_OBJECT_FLOW_ACTION,
 	UVERBS_OBJECT_DM,
 	UVERBS_OBJECT_COUNTERS,
+	UVERBS_OBJECT_ASYNC_EVENT,
 };
 
 enum {
@@ -67,6 +68,7 @@ enum uverbs_methods_device {
 	UVERBS_METHOD_INVOKE_WRITE,
 	UVERBS_METHOD_INFO_HANDLES,
 	UVERBS_METHOD_QUERY_PORT,
+	UVERBS_METHOD_GET_CONTEXT,
 };
 
 enum uverbs_attrs_invoke_write_cmd_attr_ids {
@@ -80,6 +82,11 @@ enum uverbs_attrs_query_port_cmd_attr_ids {
 	UVERBS_ATTR_QUERY_PORT_RESP,
 };
 
+enum uverbs_attrs_get_context_attr_ids {
+	UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS,
+	UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT,
+};
+
 enum uverbs_attrs_create_cq_cmd_attr_ids {
 	UVERBS_ATTR_CREATE_CQ_HANDLE,
 	UVERBS_ATTR_CREATE_CQ_CQE,
@@ -241,4 +248,12 @@ enum uverbs_attrs_flow_destroy_ids {
 	UVERBS_ATTR_DESTROY_FLOW_HANDLE,
 };
 
+enum uverbs_method_async_event {
+	UVERBS_METHOD_ASYNC_EVENT_ALLOC,
+};
+
+enum uverbs_attrs_async_event_create {
+	UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE,
+};
+
 #endif
diff --git a/kernel-headers/rdma/ib_user_ioctl_verbs.h b/kernel-headers/rdma/ib_user_ioctl_verbs.h
index 9019b2d..a640bb8 100644
--- a/kernel-headers/rdma/ib_user_ioctl_verbs.h
+++ b/kernel-headers/rdma/ib_user_ioctl_verbs.h
@@ -41,6 +41,13 @@
 #define RDMA_UAPI_PTR(_type, _name)	__aligned_u64 _name
 #endif
 
+#define IB_UVERBS_ACCESS_OPTIONAL_FIRST (1 << 20)
+#define IB_UVERBS_ACCESS_OPTIONAL_LAST (1 << 29)
+
+enum ib_uverbs_core_support {
+	IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS = 1 << 0,
+};
+
 enum ib_uverbs_access_flags {
 	IB_UVERBS_ACCESS_LOCAL_WRITE = 1 << 0,
 	IB_UVERBS_ACCESS_REMOTE_WRITE = 1 << 1,
@@ -50,6 +57,11 @@ enum ib_uverbs_access_flags {
 	IB_UVERBS_ACCESS_ZERO_BASED = 1 << 5,
 	IB_UVERBS_ACCESS_ON_DEMAND = 1 << 6,
 	IB_UVERBS_ACCESS_HUGETLB = 1 << 7,
+
+	IB_UVERBS_ACCESS_RELAXED_ORDERING = IB_UVERBS_ACCESS_OPTIONAL_FIRST,
+	IB_UVERBS_ACCESS_OPTIONAL_RANGE =
+		((IB_UVERBS_ACCESS_OPTIONAL_LAST << 1) - 1) &
+		~(IB_UVERBS_ACCESS_OPTIONAL_FIRST - 1)
 };
 
 enum ib_uverbs_query_port_cap_flags {
-- 
1.8.3.1



* [PATCH rdma-core 2/7] verbs: Move free_context from verbs_device_ops to verbs_context_ops
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 1/7] Update kernel headers Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 3/7] verbs: Move alloc_context to ioctl Yishai Hadas
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

From: Michael Guralnik <michaelgur@mellanox.com>

As free_context is always called after alloc_context has been called, we
can have the operation in the verbs_context_ops struct.

This is needed for a later patch in this series, where alloc_context is
moved to use the ioctl interface.
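
A minimal sketch of the resulting dispatch pattern, assuming simplified
stand-in types: every sketch_* name and provider_free_context() below are
hypothetical and only mirror the idea of a per-context ops table with a
no-op default, not the real libibverbs structures.

```c
struct sketch_ctx;

/* Per-context operations table (stand-in for verbs_context_ops). */
struct sketch_ctx_ops {
	void (*free_context)(struct sketch_ctx *ctx);
};

struct sketch_ctx {
	struct sketch_ctx_ops ops;
	int *freed_flag;	/* lets the example observe the call */
};

/* No-op default, used when a provider does not set the op. */
static void dummy_free_context(struct sketch_ctx *ctx)
{
	(void)ctx;
}

/* Install provider ops on a context, falling back to the dummy. */
void sketch_set_ops(struct sketch_ctx *ctx, const struct sketch_ctx_ops *ops)
{
	ctx->ops.free_context = (ops && ops->free_context) ?
		ops->free_context : dummy_free_context;
}

/* Closing a context dispatches through the context's own ops table. */
int sketch_close(struct sketch_ctx *ctx)
{
	ctx->ops.free_context(ctx);
	return 0;
}

/* Example provider implementation. */
void provider_free_context(struct sketch_ctx *ctx)
{
	*ctx->freed_flag = 1;
}
```

Because a context only exists after it has been allocated, the close path
can always reach free_context through the context itself and no longer
needs to look up the device ops.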

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 libibverbs/device.c                | 5 ++---
 libibverbs/driver.h                | 2 +-
 libibverbs/dummy_ops.c             | 7 +++++++
 providers/bnxt_re/main.c           | 6 ++++--
 providers/cxgb4/dev.c              | 4 +++-
 providers/efa/efa.c                | 4 +++-
 providers/hfi1verbs/hfiverbs.c     | 4 +++-
 providers/hns/hns_roce_u.c         | 4 +++-
 providers/i40iw/i40iw_umain.c      | 6 ++++--
 providers/ipathverbs/ipathverbs.c  | 4 +++-
 providers/mlx4/mlx4.c              | 4 +++-
 providers/mlx5/mlx5.c              | 4 +++-
 providers/mthca/mthca.c            | 6 ++++--
 providers/ocrdma/ocrdma_main.c     | 6 ++++--
 providers/qedr/qelr_main.c         | 4 +++-
 providers/rxe/rxe.c                | 6 ++++--
 providers/siw/siw.c                | 3 ++-
 providers/vmw_pvrdma/pvrdma_main.c | 4 +++-
 18 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/libibverbs/device.c b/libibverbs/device.c
index edd8f33..bc7df1b 100644
--- a/libibverbs/device.c
+++ b/libibverbs/device.c
@@ -379,10 +379,9 @@ LATEST_SYMVER_FUNC(ibv_close_device, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_context *context)
 {
-	struct verbs_device *verbs_device = verbs_get_device(context->device);
-
-	verbs_device->ops->free_context(context);
+	const struct verbs_context_ops *ops = get_ops(context);
 
+	ops->free_context(context);
 	return 0;
 }
 
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 88ed2b5..88603ce 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -218,7 +218,6 @@ struct verbs_device_ops {
 	struct verbs_context *(*alloc_context)(struct ibv_device *device,
 					       int cmd_fd,
 					       void *private_data);
-	void (*free_context)(struct ibv_context *context);
 
 	struct verbs_device *(*alloc_device)(struct verbs_sysfs_dev *sysfs_dev);
 	void (*uninit_device)(struct verbs_device *device);
@@ -315,6 +314,7 @@ struct verbs_context_ops {
 	int (*destroy_wq)(struct ibv_wq *wq);
 	int (*detach_mcast)(struct ibv_qp *qp, const union ibv_gid *gid,
 			    uint16_t lid);
+	void (*free_context)(struct ibv_context *context);
 	int (*free_dm)(struct ibv_dm *dm);
 	int (*get_srq_num)(struct ibv_srq *srq, uint32_t *srq_num);
 	int (*modify_cq)(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr);
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index 6560371..d949275 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -272,6 +272,11 @@ static int detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid,
 	return ENOSYS;
 }
 
+static void free_context(struct ibv_context *ctx)
+{
+	return;
+}
+
 static int free_dm(struct ibv_dm *dm)
 {
 	return ENOSYS;
@@ -485,6 +490,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	destroy_srq,
 	destroy_wq,
 	detach_mcast,
+	free_context,
 	free_dm,
 	get_srq_num,
 	modify_cq,
@@ -600,6 +606,7 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_PRIV_OP(ctx, destroy_srq);
 	SET_OP(vctx, destroy_wq);
 	SET_PRIV_OP(ctx, detach_mcast);
+	SET_PRIV_OP_IC(ctx, free_context);
 	SET_OP(vctx, free_dm);
 	SET_OP(vctx, get_srq_num);
 	SET_OP(vctx, modify_cq);
diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index 803eff7..8893673 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -50,6 +50,8 @@
 #include "main.h"
 #include "verbs.h"
 
+static void bnxt_re_free_context(struct ibv_context *ibvctx);
+
 #define PCI_VENDOR_ID_BROADCOM		0x14E4
 
 #define CNA(v, d) VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, d, NULL)
@@ -113,7 +115,8 @@ static const struct verbs_context_ops bnxt_re_cntx_ops = {
 	.post_send     = bnxt_re_post_send,
 	.post_recv     = bnxt_re_post_recv,
 	.create_ah     = bnxt_re_create_ah,
-	.destroy_ah    = bnxt_re_destroy_ah
+	.destroy_ah    = bnxt_re_destroy_ah,
+	.free_context  = bnxt_re_free_context,
 };
 
 bool bnxt_re_is_chip_gen_p5(struct bnxt_re_chip_ctx *cctx)
@@ -218,6 +221,5 @@ static const struct verbs_device_ops bnxt_re_dev_ops = {
 	.match_table = cna_table,
 	.alloc_device = bnxt_re_device_alloc,
 	.alloc_context = bnxt_re_alloc_context,
-	.free_context = bnxt_re_free_context,
 };
 PROVIDER_DRIVER(bnxt_re, bnxt_re_dev_ops);
diff --git a/providers/cxgb4/dev.c b/providers/cxgb4/dev.c
index ecd87e6..1e99ee3 100644
--- a/providers/cxgb4/dev.c
+++ b/providers/cxgb4/dev.c
@@ -43,6 +43,8 @@
 #include "libcxgb4.h"
 #include "cxgb4-abi.h"
 
+static void c4iw_free_context(struct ibv_context *ibctx);
+
 #define PCI_VENDOR_ID_CHELSIO		0x1425
 
 /*
@@ -96,6 +98,7 @@ static const struct verbs_context_ops  c4iw_ctx_common_ops = {
 	.detach_mcast = c4iw_detach_mcast,
 	.post_srq_recv = c4iw_post_srq_recv,
 	.req_notify_cq = c4iw_arm_cq,
+	.free_context = c4iw_free_context,
 };
 
 static const struct verbs_context_ops c4iw_ctx_t4_ops = {
@@ -456,7 +459,6 @@ static const struct verbs_device_ops c4iw_dev_ops = {
 	.alloc_device = c4iw_device_alloc,
 	.uninit_device = c4iw_uninit_device,
 	.alloc_context = c4iw_alloc_context,
-	.free_context = c4iw_free_context,
 };
 PROVIDER_DRIVER(cxgb4, c4iw_dev_ops);
 
diff --git a/providers/efa/efa.c b/providers/efa/efa.c
index 645a29b..a8ba14e 100644
--- a/providers/efa/efa.c
+++ b/providers/efa/efa.c
@@ -12,6 +12,8 @@
 #include "efa.h"
 #include "verbs.h"
 
+static void efa_free_context(struct ibv_context *ibvctx);
+
 #define PCI_VENDOR_ID_AMAZON 0x1d0f
 
 static const struct verbs_match_ent efa_table[] = {
@@ -40,6 +42,7 @@ static const struct verbs_context_ops efa_ctx_ops = {
 	.query_port = efa_query_port,
 	.query_qp = efa_query_qp,
 	.reg_mr = efa_reg_mr,
+	.free_context = efa_free_context,
 };
 
 static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
@@ -127,7 +130,6 @@ static const struct verbs_device_ops efa_dev_ops = {
 	.alloc_device = efa_device_alloc,
 	.uninit_device = efa_uninit_device,
 	.alloc_context = efa_alloc_context,
-	.free_context = efa_free_context,
 };
 
 bool is_efa_dev(struct ibv_device *device)
diff --git a/providers/hfi1verbs/hfiverbs.c b/providers/hfi1verbs/hfiverbs.c
index 02e15d7..9bfb967 100644
--- a/providers/hfi1verbs/hfiverbs.c
+++ b/providers/hfi1verbs/hfiverbs.c
@@ -65,6 +65,8 @@
 #include "hfiverbs.h"
 #include "hfi-abi.h"
 
+static void hfi1_free_context(struct ibv_context *ibctx);
+
 #ifndef PCI_VENDOR_ID_INTEL
 #define PCI_VENDOR_ID_INTEL			0x8086
 #endif
@@ -87,6 +89,7 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops hfi1_ctx_common_ops = {
+	.free_context	= hfi1_free_context,
 	.query_device	= hfi1_query_device,
 	.query_port	= hfi1_query_port,
 
@@ -205,6 +208,5 @@ static const struct verbs_device_ops hfi1_dev_ops = {
 	.alloc_device = hfi1_device_alloc,
 	.uninit_device  = hf11_uninit_device,
 	.alloc_context = hfi1_alloc_context,
-	.free_context = hfi1_free_context,
 };
 PROVIDER_DRIVER(hfi1verbs, hfi1_dev_ops);
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 5872599..fffc9ff 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -40,6 +40,8 @@
 #include "hns_roce_u.h"
 #include "hns_roce_u_abi.h"
 
+static void hns_roce_free_context(struct ibv_context *ibctx);
+
 #define HID_LEN			15
 #define DEV_MATCH_LEN		128
 
@@ -81,6 +83,7 @@ static const struct verbs_context_ops hns_common_ops = {
 	.modify_srq = hns_roce_u_modify_srq,
 	.query_srq = hns_roce_u_query_srq,
 	.destroy_srq = hns_roce_u_destroy_srq,
+	.free_context = hns_roce_free_context,
 };
 
 static struct verbs_context *hns_roce_alloc_context(struct ibv_device *ibdev,
@@ -206,6 +209,5 @@ static const struct verbs_device_ops hns_roce_dev_ops = {
 	.alloc_device = hns_device_alloc,
 	.uninit_device = hns_uninit_device,
 	.alloc_context = hns_roce_alloc_context,
-	.free_context = hns_roce_free_context,
 };
 PROVIDER_DRIVER(hns, hns_roce_dev_ops);
diff --git a/providers/i40iw/i40iw_umain.c b/providers/i40iw/i40iw_umain.c
index 33f3c57..b418d11 100644
--- a/providers/i40iw/i40iw_umain.c
+++ b/providers/i40iw/i40iw_umain.c
@@ -50,6 +50,8 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 
+static void i40iw_ufree_context(struct ibv_context *ibctx);
+
 #define INTEL_HCA(v, d) VERBS_PCI_MATCH(v, d, NULL)
 static const struct verbs_match_ent hca_table[] = {
 	VERBS_DRIVER_ID(RDMA_DRIVER_I40IW),
@@ -115,7 +117,8 @@ static const struct verbs_context_ops i40iw_uctx_ops = {
 	.destroy_ah	= i40iw_udestroy_ah,
 	.attach_mcast	= i40iw_uattach_mcast,
 	.detach_mcast	= i40iw_udetach_mcast,
-	.async_event	= i40iw_async_event
+	.async_event	= i40iw_async_event,
+	.free_context	= i40iw_ufree_context,
 };
 
 /**
@@ -224,6 +227,5 @@ static const struct verbs_device_ops i40iw_udev_ops = {
 	.alloc_device = i40iw_device_alloc,
 	.uninit_device  = i40iw_uninit_device,
 	.alloc_context = i40iw_ualloc_context,
-	.free_context = i40iw_ufree_context,
 };
 PROVIDER_DRIVER(i40iw, i40iw_udev_ops);
diff --git a/providers/ipathverbs/ipathverbs.c b/providers/ipathverbs/ipathverbs.c
index c22571a..0e1a584 100644
--- a/providers/ipathverbs/ipathverbs.c
+++ b/providers/ipathverbs/ipathverbs.c
@@ -45,6 +45,8 @@
 #include "ipathverbs.h"
 #include "ipath-abi.h"
 
+static void ipath_free_context(struct ibv_context *ibctx);
+
 #ifndef PCI_VENDOR_ID_PATHSCALE
 #define PCI_VENDOR_ID_PATHSCALE			0x1fc1
 #endif
@@ -86,6 +88,7 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops ipath_ctx_common_ops = {
+	.free_context	= ipath_free_context,
 	.query_device	= ipath_query_device,
 	.query_port	= ipath_query_port,
 
@@ -203,6 +206,5 @@ static const struct verbs_device_ops ipath_dev_ops = {
 	.alloc_device = ipath_device_alloc,
 	.uninit_device  = ipath_uninit_device,
 	.alloc_context = ipath_alloc_context,
-	.free_context = ipath_free_context,
 };
 PROVIDER_DRIVER(ipathverbs, ipath_dev_ops);
diff --git a/providers/mlx4/mlx4.c b/providers/mlx4/mlx4.c
index 0afe59c..0842ff0 100644
--- a/providers/mlx4/mlx4.c
+++ b/providers/mlx4/mlx4.c
@@ -43,6 +43,8 @@
 #include "mlx4.h"
 #include "mlx4-abi.h"
 
+static void mlx4_free_context(struct ibv_context *ibv_ctx);
+
 #ifndef PCI_VENDOR_ID_MELLANOX
 #define PCI_VENDOR_ID_MELLANOX			0x15b3
 #endif
@@ -131,6 +133,7 @@ static const struct verbs_context_ops mlx4_ctx_ops = {
 	.open_xrcd = mlx4_open_xrcd,
 	.query_device_ex = mlx4_query_device_ex,
 	.query_rt_values = mlx4_query_rt_values,
+	.free_context = mlx4_free_context,
 };
 
 static int mlx4_map_internal_clock(struct mlx4_device *mdev,
@@ -302,7 +305,6 @@ static const struct verbs_device_ops mlx4_dev_ops = {
 	.alloc_device = mlx4_device_alloc,
 	.uninit_device = mlx4_uninit_device,
 	.alloc_context = mlx4_alloc_context,
-	.free_context = mlx4_free_context,
 };
 PROVIDER_DRIVER(mlx4, mlx4_dev_ops);
 
diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 7ea725d..1a54e0e 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -49,6 +49,8 @@
 #include "wqe.h"
 #include "mlx5_ifc.h"
 
+static void mlx5_free_context(struct ibv_context *ibctx);
+
 #ifndef PCI_VENDOR_ID_MELLANOX
 #define PCI_VENDOR_ID_MELLANOX			0x15b3
 #endif
@@ -156,6 +158,7 @@ static const struct verbs_context_ops mlx5_ctx_common_ops = {
 	.read_counters = mlx5_read_counters,
 	.reg_dm_mr = mlx5_reg_dm_mr,
 	.alloc_null_mr = mlx5_alloc_null_mr,
+	.free_context = mlx5_free_context,
 };
 
 static const struct verbs_context_ops mlx5_ctx_cqev1_ops = {
@@ -1451,7 +1454,6 @@ static const struct verbs_device_ops mlx5_dev_ops = {
 	.alloc_device = mlx5_device_alloc,
 	.uninit_device = mlx5_uninit_device,
 	.alloc_context = mlx5_alloc_context,
-	.free_context = mlx5_free_context,
 };
 
 bool is_mlx5_dev(struct ibv_device *device)
diff --git a/providers/mthca/mthca.c b/providers/mthca/mthca.c
index c3293d8..abce486 100644
--- a/providers/mthca/mthca.c
+++ b/providers/mthca/mthca.c
@@ -44,6 +44,8 @@
 #include "mthca.h"
 #include "mthca-abi.h"
 
+static void mthca_free_context(struct ibv_context *ibctx);
+
 #ifndef PCI_VENDOR_ID_MELLANOX
 #define PCI_VENDOR_ID_MELLANOX			0x15b3
 #endif
@@ -111,7 +113,8 @@ static const struct verbs_context_ops mthca_ctx_common_ops = {
 	.create_ah     = mthca_create_ah,
 	.destroy_ah    = mthca_destroy_ah,
 	.attach_mcast  = ibv_cmd_attach_mcast,
-	.detach_mcast  = ibv_cmd_detach_mcast
+	.detach_mcast  = ibv_cmd_detach_mcast,
+	.free_context = mthca_free_context,
 };
 
 static const struct verbs_context_ops mthca_ctx_arbel_ops = {
@@ -237,6 +240,5 @@ static const struct verbs_device_ops mthca_dev_ops = {
 	.alloc_device = mthca_device_alloc,
 	.uninit_device = mthca_uninit_device,
 	.alloc_context = mthca_alloc_context,
-	.free_context = mthca_free_context,
 };
 PROVIDER_DRIVER(mthca, mthca_dev_ops);
diff --git a/providers/ocrdma/ocrdma_main.c b/providers/ocrdma/ocrdma_main.c
index 31fefe9..f7ed629 100644
--- a/providers/ocrdma/ocrdma_main.c
+++ b/providers/ocrdma/ocrdma_main.c
@@ -50,6 +50,8 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 
+static void ocrdma_free_context(struct ibv_context *ibctx);
+
 #define PCI_VENDOR_ID_EMULEX		0x10DF
 #define PCI_DEVICE_ID_EMULEX_GEN1	0xe220
 #define PCI_DEVICE_ID_EMULEX_GEN2        0x720
@@ -93,7 +95,8 @@ static const struct verbs_context_ops ocrdma_ctx_ops = {
 	.destroy_srq = ocrdma_destroy_srq,
 	.post_srq_recv = ocrdma_post_srq_recv,
 	.attach_mcast = ocrdma_attach_mcast,
-	.detach_mcast = ocrdma_detach_mcast
+	.detach_mcast = ocrdma_detach_mcast,
+	.free_context = ocrdma_free_context,
 };
 
 static void ocrdma_uninit_device(struct verbs_device *verbs_device)
@@ -194,6 +197,5 @@ static const struct verbs_device_ops ocrdma_dev_ops = {
 	.alloc_device = ocrdma_device_alloc,
 	.uninit_device = ocrdma_uninit_device,
 	.alloc_context = ocrdma_alloc_context,
-	.free_context = ocrdma_free_context,
 };
 PROVIDER_DRIVER(ocrdma, ocrdma_dev_ops);
diff --git a/providers/qedr/qelr_main.c b/providers/qedr/qelr_main.c
index bbe9b02..da31456 100644
--- a/providers/qedr/qelr_main.c
+++ b/providers/qedr/qelr_main.c
@@ -48,6 +48,8 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 
+static void qelr_free_context(struct ibv_context *ibctx);
+
 #define PCI_VENDOR_ID_QLOGIC           (0x1077)
 #define PCI_DEVICE_ID_QLOGIC_57980S    (0x1629)
 #define PCI_DEVICE_ID_QLOGIC_57980S_40 (0x1634)
@@ -104,6 +106,7 @@ static const struct verbs_context_ops qelr_ctx_ops = {
 	.post_send = qelr_post_send,
 	.post_recv = qelr_post_recv,
 	.async_event = qelr_async_event,
+	.free_context = qelr_free_context,
 };
 
 static void qelr_uninit_device(struct verbs_device *verbs_device)
@@ -249,6 +252,5 @@ static const struct verbs_device_ops qelr_dev_ops = {
 	.alloc_device = qelr_device_alloc,
 	.uninit_device = qelr_uninit_device,
 	.alloc_context = qelr_alloc_context,
-	.free_context = qelr_free_context,
 };
 PROVIDER_DRIVER(qedr, qelr_dev_ops);
diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c
index 4e05d5b..3af58bf 100644
--- a/providers/rxe/rxe.c
+++ b/providers/rxe/rxe.c
@@ -56,6 +56,8 @@
 #include "rxe-abi.h"
 #include "rxe.h"
 
+static void rxe_free_context(struct ibv_context *ibctx);
+
 static const struct verbs_match_ent hca_table[] = {
 	VERBS_DRIVER_ID(RDMA_DRIVER_RXE),
 	VERBS_NAME_MATCH("rxe", NULL),
@@ -856,7 +858,8 @@ static const struct verbs_context_ops rxe_ctx_ops = {
 	.create_ah = rxe_create_ah,
 	.destroy_ah = rxe_destroy_ah,
 	.attach_mcast = ibv_cmd_attach_mcast,
-	.detach_mcast = ibv_cmd_detach_mcast
+	.detach_mcast = ibv_cmd_detach_mcast,
+	.free_context = rxe_free_context,
 };
 
 static struct verbs_context *rxe_alloc_context(struct ibv_device *ibdev,
@@ -926,6 +929,5 @@ static const struct verbs_device_ops rxe_dev_ops = {
 	.alloc_device = rxe_device_alloc,
 	.uninit_device = rxe_uninit_device,
 	.alloc_context = rxe_alloc_context,
-	.free_context = rxe_free_context,
 };
 PROVIDER_DRIVER(rxe, rxe_dev_ops);
diff --git a/providers/siw/siw.c b/providers/siw/siw.c
index df00fc5..9530833 100644
--- a/providers/siw/siw.c
+++ b/providers/siw/siw.c
@@ -18,6 +18,7 @@
 #include "siw.h"
 
 static const int siw_debug;
+static void siw_free_context(struct ibv_context *ibv_ctx);
 
 static int siw_query_device(struct ibv_context *ctx,
 			    struct ibv_device_attr *attr)
@@ -841,6 +842,7 @@ static const struct verbs_context_ops siw_context_ops = {
 	.destroy_cq = siw_destroy_cq,
 	.destroy_qp = siw_destroy_qp,
 	.destroy_srq = siw_destroy_srq,
+	.free_context = siw_free_context,
 	.modify_qp = siw_modify_qp,
 	.modify_srq = siw_modify_srq,
 	.poll_cq = siw_poll_cq,
@@ -919,7 +921,6 @@ static const struct verbs_device_ops siw_dev_ops = {
 	.alloc_device = siw_device_alloc,
 	.uninit_device = siw_device_free,
 	.alloc_context = siw_alloc_context,
-	.free_context = siw_free_context
 };
 
 PROVIDER_DRIVER(siw, siw_dev_ops);
diff --git a/providers/vmw_pvrdma/pvrdma_main.c b/providers/vmw_pvrdma/pvrdma_main.c
index 4d64d96..14a67c1 100644
--- a/providers/vmw_pvrdma/pvrdma_main.c
+++ b/providers/vmw_pvrdma/pvrdma_main.c
@@ -45,6 +45,8 @@
 
 #include "pvrdma.h"
 
+static void pvrdma_free_context(struct ibv_context *ibctx);
+
 /*
  * VMware PVRDMA vendor id and PCI device id.
  */
@@ -52,6 +54,7 @@
 #define PCI_DEVICE_ID_VMWARE_PVRDMA	0x0820
 
 static const struct verbs_context_ops pvrdma_ctx_ops = {
+	.free_context = pvrdma_free_context,
 	.query_device = pvrdma_query_device,
 	.query_port = pvrdma_query_port,
 	.alloc_pd = pvrdma_alloc_pd,
@@ -208,6 +211,5 @@ static const struct verbs_device_ops pvrdma_dev_ops = {
 	.alloc_device = pvrdma_device_alloc,
 	.uninit_device = pvrdma_uninit_device,
 	.alloc_context = pvrdma_alloc_context,
-	.free_context  = pvrdma_free_context,
 };
 PROVIDER_DRIVER(vmw_pvrdma, pvrdma_dev_ops);
-- 
1.8.3.1



* [PATCH rdma-core 3/7] verbs: Move alloc_context to ioctl
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 1/7] Update kernel headers Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 2/7] verbs: Move free_context from verbs_device_ops to verbs_context_ops Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 4/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

From: Michael Guralnik <michaelgur@mellanox.com>

Execute alloc_context through the ioctl mechanism, falling back to the
write method when the ioctl path is not available.

In the ioctl flow, split the async_fd allocation into a separate ioctl call
and read the 'core_support' field returned by the kernel.
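
The control flow can be sketched roughly as below. This is a simplified
model under stated assumptions, not the libibverbs implementation: the
kernel is replaced by a plain struct, the file descriptor values are
placeholders, and every sketch_* / EXEC_* name is hypothetical.

```c
#include <stdint.h>

enum exec_result { EXEC_SUCCESS, EXEC_TRY_WRITE, EXEC_ERROR };

/* Stand-in for the kernel side of the interface. */
struct sketch_kernel {
	int supports_ioctl;		/* is the ioctl method implemented? */
	uint64_t core_support;
	uint32_t num_comp_vectors;
};

struct sketch_context {
	int async_fd;
	uint32_t num_comp_vectors;
	uint64_t core_support;
};

/* Model of the ioctl attempt: reports TRY_WRITE on kernels without it. */
enum exec_result sketch_ioctl_get_context(const struct sketch_kernel *k,
					  uint32_t *ncv, uint64_t *cs)
{
	if (!k->supports_ioctl)
		return EXEC_TRY_WRITE;
	*ncv = k->num_comp_vectors;
	*cs = k->core_support;
	return EXEC_SUCCESS;
}

int sketch_get_context(const struct sketch_kernel *k,
		       struct sketch_context *ctx)
{
	uint32_t ncv = 0;
	uint64_t cs = 0;

	switch (sketch_ioctl_get_context(k, &ncv, &cs)) {
	case EXEC_TRY_WRITE:
		/* Legacy write: one response carries async_fd and the
		 * completion vector count; no capability word is read. */
		ctx->async_fd = 3;	/* placeholder fd */
		ctx->num_comp_vectors = k->num_comp_vectors;
		return 0;
	case EXEC_SUCCESS:
		/* ioctl flow: async_fd comes from a second, separate call. */
		ctx->async_fd = 4;	/* placeholder fd */
		break;
	default:
		return -1;
	}

	ctx->num_comp_vectors = ncv;
	ctx->core_support = cs;
	return 0;
}
```

In this model, core_support stays at its initial value on the write path,
which is what lets the library treat older kernels as not supporting the
optional MR access flags.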

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 libibverbs/cmd.c        | 18 -----------
 libibverbs/cmd_device.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++
 libibverbs/driver.h     |  1 +
 3 files changed, 80 insertions(+), 18 deletions(-)

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 26eaa47..b68f3e3 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -47,24 +47,6 @@
 
 bool verbs_allow_disassociate_destroy;
 
-int ibv_cmd_get_context(struct verbs_context *context_ex,
-			struct ibv_get_context *cmd, size_t cmd_size,
-			struct ib_uverbs_get_context_resp *resp, size_t resp_size)
-{
-	int ret;
-
-	ret = execute_cmd_write(&context_ex->context,
-				IB_USER_VERBS_CMD_GET_CONTEXT, cmd, cmd_size,
-				resp, resp_size);
-	if (ret)
-		return ret;
-
-	context_ex->context.async_fd = resp->async_fd;
-	context_ex->context.num_comp_vectors = resp->num_comp_vectors;
-
-	return 0;
-}
-
 static void copy_query_dev_fields(struct ibv_device_attr *device_attr,
 				  struct ib_uverbs_query_device_resp *resp,
 				  uint64_t *raw_fw_ver)
diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index d806351..4de59c0 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -99,3 +99,82 @@ int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num,
 	return 0;
 }
 
+static int cmd_alloc_async_fd(struct ibv_context *context)
+{
+	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_ASYNC_EVENT,
+			       UVERBS_METHOD_ASYNC_EVENT_ALLOC, 1);
+	struct ib_uverbs_attr *handle;
+	int ret;
+
+	handle = fill_attr_out_fd(cmdb, UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE,
+				  0);
+
+	ret = execute_ioctl(context, cmdb);
+	if (ret)
+		return ret;
+
+	context->async_fd =
+		read_attr_fd(UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE, handle);
+	return 0;
+}
+
+static int cmd_get_context(struct verbs_context *context_ex,
+				struct ibv_command_buffer *link)
+{
+	DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_DEVICE,
+			     UVERBS_METHOD_GET_CONTEXT, 2, link);
+
+	struct ibv_context *context = &context_ex->context;
+	struct verbs_device *verbs_device;
+	uint64_t core_support;
+	uint32_t num_comp_vectors;
+	int ret;
+
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS,
+			  &num_comp_vectors);
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT,
+			  &core_support);
+
+	/* Using free_context cmd_name as alloc context is not in
+	 * verbs_context_ops while free_context is and doesn't use ioctl
+	 */
+	switch (execute_ioctl_fallback(context, free_context, cmdb, &ret)) {
+	case TRY_WRITE: {
+		DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_GET_CONTEXT);
+
+		ret = execute_write_bufs(context, IB_USER_VERBS_CMD_GET_CONTEXT,
+					 req, resp);
+		if (ret)
+			return ret;
+
+		context->async_fd = resp->async_fd;
+		context->num_comp_vectors = resp->num_comp_vectors;
+
+		return 0;
+	}
+	case SUCCESS:
+		ret = cmd_alloc_async_fd(context);
+		if (ret)
+			return ret;
+		break;
+	default:
+		return ret;
+	};
+
+	context->num_comp_vectors = num_comp_vectors;
+	verbs_device = verbs_get_device(context->device);
+	verbs_device->core_support = core_support;
+	return 0;
+}
+
+int ibv_cmd_get_context(struct verbs_context *context_ex,
+			struct ibv_get_context *cmd, size_t cmd_size,
+			struct ib_uverbs_get_context_resp *resp,
+			size_t resp_size)
+{
+	DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_DEVICE,
+				  UVERBS_METHOD_GET_CONTEXT, cmd, cmd_size,
+				  resp, resp_size);
+
+	return cmd_get_context(context_ex, cmdb);
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 88603ce..09974d9 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -230,6 +230,7 @@ struct verbs_device {
 	atomic_int refcount;
 	struct list_node entry;
 	struct verbs_sysfs_dev *sysfs;
+	uint64_t core_support;
 };
 
 struct verbs_counters {
-- 
1.8.3.1



* [PATCH rdma-core 4/7] verbs: Relaxed ordering memory regions
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (2 preceding siblings ...)
  2020-01-09 14:04 ` [PATCH rdma-core 3/7] verbs: Move alloc_context to ioctl Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 5/7] mlx5: Add optional access flags range to DM Yishai Hadas
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

From: Michael Guralnik <michaelgur@mellanox.com>

Add a flag to allow creation of relaxed ordering memory regions.
Access through such MRs can improve performance by allowing the system
to reorder certain accesses.

As relaxed ordering is an optimization, drivers that do not support it
can simply ignore it.

An optional MR access bit range is defined to match the kernel's definition,
and its first entry is IBV_ACCESS_RELAXED_ORDERING.

If an application uses one of the bits from the optional range and the
kernel does not support the 'MR optional mode', the library masks that bit
out.
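
As a worked example of the range arithmetic, using the first and last bit
values from the kernel-header hunk in patch 1/7 (1 << 20 and 1 << 29): the
range covers bits 20 through 29, which is the mask 0x3ff00000. The helper
names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a mask covering every bit from 'first' to
 * 'last' inclusive, where both arguments are single-bit values. This is
 * the same expression the kernel header uses for the optional range.
 */
static inline uint32_t optional_range(uint32_t first, uint32_t last)
{
	/* (last << 1) - 1 sets all bits up to and including 'last';   */
	/* ~(first - 1) then clears every bit below 'first'.           */
	return ((last << 1) - 1) & ~(first - 1);
}

/* Hypothetical helper: does a non-zero flag lie fully inside the range? */
static inline int is_optional_flag(uint32_t flag)
{
	uint32_t range = optional_range(UINT32_C(1) << 20, UINT32_C(1) << 29);

	return flag != 0 && (flag & ~range) == 0;
}
```

Here (1 << 30) - 1 is 0x3fffffff and ~((1 << 20) - 1) clears the low 20
bits, leaving 0x3ff00000. The relaxed ordering flag, being the first bit of
the range, is classified as optional, while the pre-existing access flags
in bits 0 through 7 are not.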

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 debian/libibverbs1.symbols   |  2 ++
 libibverbs/CMakeLists.txt    |  2 +-
 libibverbs/libibverbs.map.in |  5 +++++
 libibverbs/man/ibv_reg_mr.3  |  2 ++
 libibverbs/verbs.c           | 13 +++++++++++
 libibverbs/verbs.h           | 51 ++++++++++++++++++++++++++++++++++++++++++--
 libibverbs/verbs_api.h       |  2 ++
 7 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/debian/libibverbs1.symbols b/debian/libibverbs1.symbols
index 51c5407..ec40b29 100644
--- a/debian/libibverbs1.symbols
+++ b/debian/libibverbs1.symbols
@@ -5,6 +5,7 @@ libibverbs.so.1 libibverbs1 #MINVER#
  IBVERBS_1.5@IBVERBS_1.5 20
  IBVERBS_1.6@IBVERBS_1.6 24
  IBVERBS_1.7@IBVERBS_1.7 25
+ IBVERBS_1.8@IBVERBS_1.8 28
  (symver)IBVERBS_PRIVATE_25 25
  ibv_ack_async_event@IBVERBS_1.0 1.1.6
  ibv_ack_async_event@IBVERBS_1.1 1.1.6
@@ -91,6 +92,7 @@ libibverbs.so.1 libibverbs1 #MINVER#
  ibv_reg_mr@IBVERBS_1.0 1.1.6
  ibv_reg_mr@IBVERBS_1.1 1.1.6
  ibv_reg_mr_iova@IBVERBS_1.7 25
+ ibv_reg_mr_iova2@IBVERBS_1.8 28
  ibv_register_driver@IBVERBS_1.1 1.1.6
  ibv_rereg_mr@IBVERBS_1.1 1.2.1
  ibv_resize_cq@IBVERBS_1.0 1.1.6
diff --git a/libibverbs/CMakeLists.txt b/libibverbs/CMakeLists.txt
index a5926bb..4328548 100644
--- a/libibverbs/CMakeLists.txt
+++ b/libibverbs/CMakeLists.txt
@@ -21,7 +21,7 @@ configure_file("libibverbs.map.in"
 
 rdma_library(ibverbs "${CMAKE_CURRENT_BINARY_DIR}/libibverbs.map"
   # See Documentation/versioning.md
-  1 1.7.${PACKAGE_VERSION}
+  1 1.8.${PACKAGE_VERSION}
   all_providers.c
   cmd.c
   cmd_ah.c
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index c1b4537..5280cfe 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -121,6 +121,11 @@ IBVERBS_1.7 {
 		ibv_reg_mr_iova;
 } IBVERBS_1.6;
 
+IBVERBS_1.8 {
+	global:
+		ibv_reg_mr_iova2;
+} IBVERBS_1.7;
+
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
    version. See the top level CMakeLists.txt for this setting. */
 
diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
index aa0fe48..2bfc955 100644
--- a/libibverbs/man/ibv_reg_mr.3
+++ b/libibverbs/man/ibv_reg_mr.3
@@ -45,6 +45,8 @@ describes the desired memory protection attributes; it is either 0 or the bitwis
 .B IBV_ACCESS_ON_DEMAND\fR    Create an on-demand paging MR
 .TP
 .B IBV_ACCESS_HUGETLB\fR      Huge pages are guaranteed to be used for this MR, applicable with IBV_ACCESS_ON_DEMAND in explicit mode only
+.TP
+.B IBV_ACCESS_RELAXED_ORDERING\fR Allow system to reorder accesses to the MR to improve performance
 .PP
 If
 .B IBV_ACCESS_REMOTE_WRITE
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index e5063af..b5efd63 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -296,6 +296,7 @@ LATEST_SYMVER_FUNC(ibv_dealloc_pd, 1_1, "IBVERBS_1.1",
 	return get_ops(pd->context)->dealloc_pd(pd);
 }
 
+#undef ibv_reg_mr
 LATEST_SYMVER_FUNC(ibv_reg_mr, 1_1, "IBVERBS_1.1",
 		   struct ibv_mr *,
 		   struct ibv_pd *pd, void *addr,
@@ -319,6 +320,7 @@ LATEST_SYMVER_FUNC(ibv_reg_mr, 1_1, "IBVERBS_1.1",
 	return mr;
 }
 
+#undef ibv_reg_mr_iova
 struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length,
 			       uint64_t iova, int access)
 {
@@ -339,6 +341,17 @@ struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length,
 	return mr;
 }
 
+struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length,
+				uint64_t iova, int access)
+{
+	struct verbs_device *device = verbs_get_device(pd->context->device);
+
+	if (!(device->core_support & IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS))
+		access &= ~(typeof(access))IBV_ACCESS_OPTIONAL_RANGE;
+
+	return ibv_reg_mr_iova(pd, addr, length, iova, access);
+}
+
 LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_mr *mr, int flags,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index fa9833a..13509aa 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -581,6 +581,7 @@ enum ibv_access_flags {
 	IBV_ACCESS_ZERO_BASED		= (1<<5),
 	IBV_ACCESS_ON_DEMAND		= (1<<6),
 	IBV_ACCESS_HUGETLB		= (1<<7),
+	IBV_ACCESS_RELAXED_ORDERING	= IBV_ACCESS_OPTIONAL_FIRST,
 };
 
 struct ibv_mw_bind_info {
@@ -2383,11 +2384,41 @@ static inline int ibv_close_xrcd(struct ibv_xrcd *xrcd)
 	return vctx->close_xrcd(xrcd);
 }
 
+#define _IBV_IS_OPTIONAL_ACCESS(access, is_access_const)                       \
+	(!is_access_const || ((access) & IBV_ACCESS_OPTIONAL_RANGE))
+/**
+ * ibv_reg_mr_iova2 - Register memory region with a virtual offset address
+ *
+ * This version will be called if ibv_reg_mr or ibv_reg_mr_iova were called
+ * with at least one potential access flag from the IBV_ACCESS_OPTIONAL_RANGE
+ * flags range. The optional access flags will be masked if running over a
+ * kernel that does not support passing them.
+ */
+struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length,
+				uint64_t iova, int access);
+
 /**
  * ibv_reg_mr - Register a memory region
  */
-struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
-			  size_t length, int access);
+struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+			  int access);
+/* use new ibv_reg_mr version only if access flags that require it are used */
+static inline struct ibv_mr *__ibv_reg_mr(struct ibv_pd *pd, void *addr,
+					  size_t length, int access,
+					  int is_access_const)
+{
+	struct ibv_mr *__mr;
+
+	if (_IBV_IS_OPTIONAL_ACCESS(access, is_access_const))
+		__mr = ibv_reg_mr_iova2(pd, addr, length, (uintptr_t)addr,
+					access);
+	else
+		__mr = ibv_reg_mr(pd, addr, length, access);
+	return __mr;
+}
+
+#define ibv_reg_mr(pd, addr, length, access)                                   \
+	__ibv_reg_mr(pd, addr, length, access, __builtin_constant_p(access))
 
 /**
  * ibv_reg_mr_iova - Register a memory region with a virtual offset
@@ -2395,7 +2426,23 @@ struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
  */
 struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length,
 			       uint64_t iova, int access);
+/* use new ibv_reg_mr version only if access flags that require it are used */
+static inline struct ibv_mr *__ibv_reg_mr_iova(struct ibv_pd *pd, void *addr,
+					       size_t length, uint64_t iova,
+					       int access, int is_access_const)
+{
+	struct ibv_mr *__mr;
+
+	if (_IBV_IS_OPTIONAL_ACCESS(access, is_access_const))
+		__mr = ibv_reg_mr_iova2(pd, addr, length, iova, access);
+	else
+		__mr = ibv_reg_mr_iova(pd, addr, length, iova, access);
+	return __mr;
+}
 
+#define ibv_reg_mr_iova(pd, addr, length, iova, access)                        \
+	__ibv_reg_mr_iova(pd, addr, length, iova, access,                      \
+			  __builtin_constant_p(access))
 
 enum ibv_rereg_mr_err_code {
 	/* Old MR is valid, invalid input */
diff --git a/libibverbs/verbs_api.h b/libibverbs/verbs_api.h
index bdfd677..ded6fa4 100644
--- a/libibverbs/verbs_api.h
+++ b/libibverbs/verbs_api.h
@@ -93,5 +93,7 @@
 
 #define IBV_QPF_GRH_REQUIRED				IB_UVERBS_QPF_GRH_REQUIRED
 
+#define IBV_ACCESS_OPTIONAL_RANGE			IB_UVERBS_ACCESS_OPTIONAL_RANGE
+#define IBV_ACCESS_OPTIONAL_FIRST			IB_UVERBS_ACCESS_OPTIONAL_FIRST
 #endif
 
-- 
1.8.3.1
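
The compile-time dispatch that the verbs.h hunk above builds from
__builtin_constant_p() can be looked at on its own. The sketch below is a
reduced model under stated assumptions: the DEMO_* names and the mask value
are illustrative, and the helper only reports which entry point would be
chosen instead of registering anything.

```c
/* Illustrative mask for the optional access range; the value is assumed. */
#define DEMO_ACCESS_OPTIONAL_RANGE 0x3ff00000

/*
 * Returns 1 when the newer, masking entry point would be used and 0 when
 * the legacy symbol is enough. Flags that are not a compile-time constant
 * always take the new entry point, because they might carry optional bits
 * at run time.
 */
static inline int demo_needs_new_entry(int access, int is_access_const)
{
	return !is_access_const || (access & DEMO_ACCESS_OPTIONAL_RANGE) != 0;
}

/*
 * The macro is what lets the compiler see whether the caller passed a
 * constant; __builtin_constant_p is a GCC/Clang builtin.
 */
#define DEMO_PICK_ENTRY(access) \
	demo_needs_new_entry((access), __builtin_constant_p(access))
```

A plausible reading of this design is that a caller passing constant flags
with no optional bits keeps referencing only the older symbol, so such a
binary does not pick up a dependency on the new IBVERBS_1.8 symbol version.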



* [PATCH rdma-core 5/7] mlx5: Add optional access flags range to DM
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (3 preceding siblings ...)
  2020-01-09 14:04 ` [PATCH rdma-core 4/7] verbs: Relaxed ordering memory regions Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 6/7] pyverbs: Add relaxed ordering access flag Yishai Hadas
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur

From: Michael Guralnik <michaelgur@mellanox.com>

Enable passing access flags from the optional access flags range when
registering a DM.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
---
 providers/mlx5/verbs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index d680777..a9b8d56 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -454,7 +454,8 @@ enum {
 				 IBV_ACCESS_REMOTE_WRITE	|
 				 IBV_ACCESS_REMOTE_READ		|
 				 IBV_ACCESS_REMOTE_ATOMIC	|
-				 IBV_ACCESS_ZERO_BASED
+				 IBV_ACCESS_ZERO_BASED		|
+				 IBV_ACCESS_OPTIONAL_RANGE
 };
 
 struct ibv_mr *mlx5_reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *ibdm,
-- 
1.8.3.1
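
The effect of widening the allowed-flags mask can be sketched as follows.
This is a hedged model, not the mlx5 provider code: the DEMO_* names, the
inclusion of LOCAL_WRITE in the mask, and the EINVAL-style rejection are
assumptions made for illustration, since the hunk above shows only part of
the enum and none of the checking code.

```c
#include <errno.h>

/* Illustrative flag values; the optional range mask is assumed. */
enum {
	DEMO_ACCESS_LOCAL_WRITE    = 1 << 0,
	DEMO_ACCESS_REMOTE_WRITE   = 1 << 1,
	DEMO_ACCESS_REMOTE_READ    = 1 << 2,
	DEMO_ACCESS_REMOTE_ATOMIC  = 1 << 3,
	DEMO_ACCESS_ZERO_BASED     = 1 << 5,
	DEMO_ACCESS_ON_DEMAND      = 1 << 6,
	DEMO_ACCESS_OPTIONAL_RANGE = 0x3ff00000,

	/* Hypothetical allowed mask for DM registration, after the change. */
	DEMO_DM_ALLOWED_ACCESS = DEMO_ACCESS_LOCAL_WRITE |
				 DEMO_ACCESS_REMOTE_WRITE |
				 DEMO_ACCESS_REMOTE_READ |
				 DEMO_ACCESS_REMOTE_ATOMIC |
				 DEMO_ACCESS_ZERO_BASED |
				 DEMO_ACCESS_OPTIONAL_RANGE,
};

/* Returns 0 if every requested bit is allowed, EINVAL otherwise. */
static int demo_check_dm_access(unsigned int access)
{
	if (access & ~(unsigned int)DEMO_DM_ALLOWED_ACCESS)
		return EINVAL;
	return 0;
}
```

Under these assumptions, a flag from the optional range (bit 20 here) is
accepted for a DM registration, while a flag outside the mask, such as the
on-demand bit, is still rejected.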



* [PATCH rdma-core 6/7] pyverbs: Add relaxed ordering access flag
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (4 preceding siblings ...)
  2020-01-09 14:04 ` [PATCH rdma-core 5/7] mlx5: Add optional access flags range to DM Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-09 14:04 ` [PATCH rdma-core 7/7] tests: Add relaxed ordering access test Yishai Hadas
  2020-01-22  9:50 ` [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur, Edward Srouji

From: Michael Guralnik <michaelgur@mellanox.com>

Add to access flags enum the value for enabling relaxed ordering on
memory regions.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Edward Srouji <edwards@mellanox.com>
---
 pyverbs/libibverbs_enums.pxd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pyverbs/libibverbs_enums.pxd b/pyverbs/libibverbs_enums.pxd
index 74ee16b..67a1b6a 100755
--- a/pyverbs/libibverbs_enums.pxd
+++ b/pyverbs/libibverbs_enums.pxd
@@ -112,6 +112,7 @@ cdef extern from '<infiniband/verbs.h>':
         IBV_ACCESS_ZERO_BASED
         IBV_ACCESS_ON_DEMAND
         IBV_ACCESS_HUGETLB
+        IBV_ACCESS_RELAXED_ORDERING
 
     cpdef enum ibv_wr_opcode:
         IBV_WR_RDMA_WRITE
-- 
1.8.3.1



* [PATCH rdma-core 7/7] tests: Add relaxed ordering access test
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (5 preceding siblings ...)
  2020-01-09 14:04 ` [PATCH rdma-core 6/7] pyverbs: Add relaxed ordering access flag Yishai Hadas
@ 2020-01-09 14:04 ` Yishai Hadas
  2020-01-22  9:50 ` [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-09 14:04 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, dledford, yishaih, maorg, michaelgur, Edward Srouji

From: Michael Guralnik <michaelgur@mellanox.com>

Test traffic with MRs with relaxed ordering access set.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Edward Srouji <edwards@mellanox.com>
---
 tests/CMakeLists.txt           |  5 ++--
 tests/test_relaxed_ordering.py | 55 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 2 deletions(-)
 create mode 100644 tests/test_relaxed_ordering.py

diff --git a/tests/CMakeLists.txt b/tests/CMakeLists.txt
index 6d70242..cacfc52 100755
--- a/tests/CMakeLists.txt
+++ b/tests/CMakeLists.txt
@@ -10,11 +10,12 @@ rdma_python_test(tests
   test_cqex.py
   test_device.py
   test_mr.py
-  test_pd.py
-  test_qp.py
   test_odp.py
+  test_pd.py
   test_parent_domain.py
+  test_qp.py
   test_rdmacm.py
+  test_relaxed_ordering.py
   utils.py
   )
 
diff --git a/tests/test_relaxed_ordering.py b/tests/test_relaxed_ordering.py
new file mode 100644
index 0000000..27af992
--- /dev/null
+++ b/tests/test_relaxed_ordering.py
@@ -0,0 +1,55 @@
+from tests.base import RCResources, UDResources, XRCResources
+from tests.utils import traffic, xrc_traffic
+from tests.base import RDMATestCase
+from pyverbs.mr import MR
+import pyverbs.enums as e
+
+
+class RoUD(UDResources):
+    def create_mr(self):
+        self.mr = MR(self.pd, self.msg_size + self.GRH_SIZE,
+                     e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING)
+
+
+class RoRC(RCResources):
+    def create_mr(self):
+        self.mr = MR(self.pd, self.msg_size,
+                     e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING)
+
+
+class RoXRC(XRCResources):
+    def create_mr(self):
+        self.mr = MR(self.pd, self.msg_size,
+                     e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING)
+
+
+class RoTestCase(RDMATestCase):
+    def setUp(self):
+        super(RoTestCase, self).setUp()
+        self.iters = 100
+        self.qp_dict = {'rc': RoRC, 'ud': RoUD, 'xrc': RoXRC}
+
+    def create_players(self, qp_type):
+        client = self.qp_dict[qp_type](self.dev_name, self.ib_port,
+                                       self.gid_index)
+        server = self.qp_dict[qp_type](self.dev_name, self.ib_port,
+                                       self.gid_index)
+        if qp_type == 'xrc':
+            client.pre_run(server.psns, server.qps_num)
+            server.pre_run(client.psns, client.qps_num)
+        else:
+            client.pre_run(server.psn, server.qpn)
+            server.pre_run(client.psn, client.qpn)
+        return client, server
+
+    def test_ro_rc_traffic(self):
+        client, server = self.create_players('rc')
+        traffic(client, server, self.iters, self.gid_index, self.ib_port)
+
+    def test_ro_ud_traffic(self):
+        client, server = self.create_players('ud')
+        traffic(client, server, self.iters, self.gid_index, self.ib_port)
+
+    def test_ro_xrc_traffic(self):
+        client, server = self.create_players('xrc')
+        xrc_traffic(client, server)
-- 
1.8.3.1



* Re: [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions
  2020-01-09 14:04 [PATCH rdma-core 0/7] verbs: Relaxed ordering memory regions Yishai Hadas
                   ` (6 preceding siblings ...)
  2020-01-09 14:04 ` [PATCH rdma-core 7/7] tests: Add relaxed ordering access test Yishai Hadas
@ 2020-01-22  9:50 ` Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2020-01-22  9:50 UTC (permalink / raw)
  To: linux-rdma, michaelgur; +Cc: Yishai Hadas, jgg, dledford, maorg

On 1/9/2020 4:04 PM, Yishai Hadas wrote:
> This series exposes an IBV_ACCESS_RELAXED_ORDERING optional MR access flag.
> This optional flag allows creation of relaxed ordering memory regions.
> Access through such MRs can improve performance by allowing the system to reorder
> certain accesses.
> 
> The series uses the new ioctl command to get a device context, this command
> enables reading some core generic capabilities such as supporting an optional
> MR access flags by IB core and its related drivers.
> 
> This capability enables transparent masking of the optional flags in libibverbs
> when the kernel doesn't support the MR optional access mode.
> 
> The series is based on an RFC that was sent to the ML [1], the matching kernel
> series was sent to 'for-next'.
> [1] https://www.spinics.net/lists/linux-rdma/msg86188.html
> 
> PR was sent:
> https://github.com/linux-rdma/rdma-core/pull/660
> 
> Yishai
> 
> Michael Guralnik (6):
>    verbs: Move free_context from verbs_device_ops to verbs_context_ops
>    verbs: Move alloc_context to ioctl
>    verbs: Relaxed ordering memory regions
>    mlx5: Add optional access flags range to DM
>    pyverbs: Add relaxed ordering access flag
>    tests: Add relaxed ordering access test
> 
> Yishai Hadas (1):
>    Update kernel headers
> 
>   debian/libibverbs1.symbols                |  2 +
>   kernel-headers/rdma/ib_user_ioctl_cmds.h  | 15 ++++++
>   kernel-headers/rdma/ib_user_ioctl_verbs.h | 12 +++++
>   libibverbs/CMakeLists.txt                 |  2 +-
>   libibverbs/cmd.c                          | 18 -------
>   libibverbs/cmd_device.c                   | 79 +++++++++++++++++++++++++++++++
>   libibverbs/device.c                       |  5 +-
>   libibverbs/driver.h                       |  3 +-
>   libibverbs/dummy_ops.c                    |  7 +++
>   libibverbs/libibverbs.map.in              |  5 ++
>   libibverbs/man/ibv_reg_mr.3               |  2 +
>   libibverbs/verbs.c                        | 13 +++++
>   libibverbs/verbs.h                        | 51 +++++++++++++++++++-
>   libibverbs/verbs_api.h                    |  2 +
>   providers/bnxt_re/main.c                  |  6 ++-
>   providers/cxgb4/dev.c                     |  4 +-
>   providers/efa/efa.c                       |  4 +-
>   providers/hfi1verbs/hfiverbs.c            |  4 +-
>   providers/hns/hns_roce_u.c                |  4 +-
>   providers/i40iw/i40iw_umain.c             |  6 ++-
>   providers/ipathverbs/ipathverbs.c         |  4 +-
>   providers/mlx4/mlx4.c                     |  4 +-
>   providers/mlx5/mlx5.c                     |  4 +-
>   providers/mlx5/verbs.c                    |  3 +-
>   providers/mthca/mthca.c                   |  6 ++-
>   providers/ocrdma/ocrdma_main.c            |  6 ++-
>   providers/qedr/qelr_main.c                |  4 +-
>   providers/rxe/rxe.c                       |  6 ++-
>   providers/siw/siw.c                       |  3 +-
>   providers/vmw_pvrdma/pvrdma_main.c        |  4 +-
>   pyverbs/libibverbs_enums.pxd              |  1 +
>   tests/CMakeLists.txt                      |  5 +-
>   tests/test_relaxed_ordering.py            | 55 +++++++++++++++++++++
>   33 files changed, 301 insertions(+), 48 deletions(-)
>   create mode 100644 tests/test_relaxed_ordering.py
> 

The PR was merged, thanks.

Yishai

