linux-rdma.vger.kernel.org archive mirror
* [PATCH v1 rdma-core 00/12] Shared PD and MR
@ 2019-08-21 14:26 Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 01/12] verbs: Introduce new inline helpers Yuval Shaia
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma

The following patch set introduces the shared object feature.

The shared object feature allows one process to create HW objects (currently
PD and MR) so that a second process can import them.

The patch set is logically split into two parts, one for PD and one for MR
(patches 2 and 7 are the changes to the man pages).

v0 -> v1:
	* Fix typo in comment
	* Rebase to latest upstream branch

Shamir Rabinovitch (2):
  verbs: Introduce new inline helpers
  verbs: pinpong test using shared objects API

Yuval Shaia (10):
  man: Add description to ibv_import_pd function
  verbs: Introduce new verb to import PD object
  mlx4: Implementation of import PD callback
  mlx5: Implementation of import PD callback
  rxe: Implementation of import PD callback
  man: Add description to ibv_import_mr function
  verbs: Introduce new verb to import MR object
  mlx4: Implementation of import MR callback
  mlx5: Implementation of import MR callback
  rxe: Implementation of import MR callback

 kernel-headers/rdma/ib_user_verbs.h |   26 +
 libibverbs/cmd.c                    |   39 +
 libibverbs/driver.h                 |   12 +
 libibverbs/dummy_ops.c              |   18 +
 libibverbs/examples/CMakeLists.txt  |    3 +
 libibverbs/examples/shpd_pingpong.c | 1142 +++++++++++++++++++++++++++
 libibverbs/kern-abi.h               |    3 +-
 libibverbs/libibverbs.map.in        |    3 +
 libibverbs/man/ibv_alloc_pd.3       |   22 +-
 libibverbs/man/ibv_reg_mr.3         |   17 +-
 libibverbs/verbs.h                  |   55 ++
 providers/mlx4/mlx4-abi.h           |    2 +
 providers/mlx4/mlx4.c               |    2 +
 providers/mlx4/mlx4.h               |    4 +
 providers/mlx4/verbs.c              |   56 ++
 providers/mlx5/mlx5-abi.h           |    2 +
 providers/mlx5/mlx5.c               |    2 +
 providers/mlx5/mlx5.h               |    4 +
 providers/mlx5/verbs.c              |   54 ++
 providers/rxe/rxe-abi.h             |    2 +
 providers/rxe/rxe.c                 |   55 ++
 21 files changed, 1520 insertions(+), 3 deletions(-)
 create mode 100644 libibverbs/examples/shpd_pingpong.c

-- 
2.20.1
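
The cover letter does not say how the importing process obtains a file
descriptor that refers to the exporter's context. One common POSIX mechanism,
shown here purely as an assumption about how an application might do it, is to
pass the descriptor over a Unix-domain socket with SCM_RIGHTS. The helper names
send_fd() and recv_fd() are hypothetical and are not part of this series.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send `fd` over the connected Unix-domain socket
 * `sock` as SCM_RIGHTS ancillary data. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
	char byte = 0;	/* one byte of ordinary data accompanies the fd */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		struct cmsghdr align;	/* forces correct alignment of buf */
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive a descriptor sent by send_fd().
 * Returns the new descriptor in this process, or -1 on error. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

Under this assumption, the exporter would send the value returned by
ibv_context_to_fd() together with the object handle, and the importer would
pass the descriptor number it received (which may differ from the exporter's
number) to the import verb.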



* [PATCH v1 rdma-core 01/12] verbs: Introduce new inline helpers
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 02/12] man: Add description to ibv_import_pd function Yuval Shaia
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

From: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>

To share an object, an application needs access to the object's handle
(such as a PD handle).

Add helpers to do that.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 libibverbs/verbs.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index 1e01b5db..eb9df3a4 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -3270,6 +3270,21 @@ static inline int ibv_read_counters(struct ibv_counters *counters,
 	return vctx->read_counters(counters, counters_value, ncounters, flags);
 }
 
+static inline uint32_t ibv_context_to_fd(struct ibv_context *context)
+{
+	return context->cmd_fd;
+}
+
+static inline uint32_t ibv_pd_to_handle(struct ibv_pd *pd)
+{
+	return pd->handle;
+}
+
+static inline uint32_t ibv_mr_to_handle(struct ibv_mr *mr)
+{
+	return mr->handle;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.20.1
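
As a rough illustration of what the exporting side might do with these
helpers, the sketch below gathers the three values into one record that could
be handed to the importer. The mock_* types and struct export_info are
hypothetical stand-ins that model only the fields the helpers read; they are
not the libibverbs definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins: only the fields read by the helpers are modeled. */
struct mock_context { int cmd_fd; };
struct mock_pd { struct mock_context *context; uint32_t handle; };
struct mock_mr { struct mock_context *context; uint32_t handle; };

/* Mirrors of the three helpers added by the patch, on the mock types. */
static inline uint32_t mock_context_to_fd(const struct mock_context *ctx)
{
	return ctx->cmd_fd;	/* int converted to uint32_t, as in the patch */
}

static inline uint32_t mock_pd_to_handle(const struct mock_pd *pd)
{
	return pd->handle;
}

static inline uint32_t mock_mr_to_handle(const struct mock_mr *mr)
{
	return mr->handle;
}

/* Hypothetical record the exporter could pass to the importer. */
struct export_info {
	uint32_t fd;
	uint32_t pd_handle;
	uint32_t mr_handle;
};

struct export_info make_export_info(const struct mock_pd *pd,
				    const struct mock_mr *mr)
{
	struct export_info info = {
		.fd = mock_context_to_fd(pd->context),
		.pd_handle = mock_pd_to_handle(pd),
		.mr_handle = mock_mr_to_handle(mr),
	};

	return info;
}
```

Note that the fd value is only meaningful inside the exporting process, so the
record alone is not enough; the descriptor itself still has to be made
available to the importer by some other means.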



* [PATCH v1 rdma-core 02/12] man: Add description to ibv_import_pd function
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 01/12] verbs: Introduce new inline helpers Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 03/12] verbs: Introduce new verb to import PD object Yuval Shaia
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

A new ibv_import_pd verb is introduced to allow a process to import a PD
created by another process.
Add a description of the API to the ibv_alloc_pd man page.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 libibverbs/man/ibv_alloc_pd.3 | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/libibverbs/man/ibv_alloc_pd.3 b/libibverbs/man/ibv_alloc_pd.3
index cc475f4a..aed72ff4 100644
--- a/libibverbs/man/ibv_alloc_pd.3
+++ b/libibverbs/man/ibv_alloc_pd.3
@@ -3,13 +3,16 @@
 .\"
 .TH IBV_ALLOC_PD 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
 .SH "NAME"
-ibv_alloc_pd, ibv_dealloc_pd \- allocate or deallocate a protection domain (PDs)
+ibv_alloc_pd, ibv_import_pd, ibv_dealloc_pd \- allocate, import or deallocate a protection domain (PDs)
 .SH "SYNOPSIS"
 .nf
 .B #include <infiniband/verbs.h>
 .sp
 .BI "struct ibv_pd *ibv_alloc_pd(struct ibv_context " "*context" );
 .sp
+.BI "struct ibv_pd *ibv_import_pd(struct ibv_context " "*context" ",
+.BI "                             uint32_t" " fd" ", uint32_t" " handle" );
+.sp
 .BI "int ibv_dealloc_pd(struct ibv_pd " "*pd" );
 .fi
 .SH "DESCRIPTION"
@@ -17,6 +20,14 @@ ibv_alloc_pd, ibv_dealloc_pd \- allocate or deallocate a protection domain (PDs)
 allocates a PD for the RDMA device context 
 .I context\fR.
 .PP
+.B ibv_import_pd()
+imports PD identified by
+.I handle\fR
+from context identified by file descriptor
+.I fd\fR
+to context
+.I context\fR.
+.PP
 .B ibv_dealloc_pd()
 deallocates the PD
 .I pd\fR.
@@ -24,9 +35,18 @@ deallocates the PD
 .B ibv_alloc_pd()
 returns a pointer to the allocated PD, or NULL if the request fails.
 .PP
+.B ibv_import_pd()
+returns a pointer to the imported PD, or NULL if the request fails.
+.PP
 .B ibv_dealloc_pd()
 returns 0 on success, or the value of errno on failure (which indicates the failure reason).
 .SH "NOTES"
+.B ibv_import_pd()
+once a PD is imported, the process which created it stays on hold until all
+references to it are deallocated (a PD can be imported more than once). The
+results of importing a PD from the same process that created it are
+undefined.
+.PP
 .B ibv_dealloc_pd()
 may fail if any other resource is still associated with the PD being
 freed.
-- 
2.20.1
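
The NOTES text above can be read as reference-counting semantics: each import
takes a reference, and the underlying object goes away only after the last
reference is dropped. The toy model below illustrates that reading only. The
toy_* names are hypothetical, and this is not the kernel implementation.

```c
#include <stdbool.h>

/* Toy model of a shared PD: a reference count plus a liveness flag. */
struct toy_shared_pd {
	unsigned int refs;
	bool live;
};

/* The creating process holds the first reference. */
void toy_pd_alloc(struct toy_shared_pd *pd)
{
	pd->refs = 1;
	pd->live = true;
}

/* Each import takes one more reference; a dead object cannot be imported. */
int toy_pd_import(struct toy_shared_pd *pd)
{
	if (!pd->live)
		return -1;
	pd->refs++;
	return 0;
}

/* Each dealloc drops one reference; the object dies with the last one. */
void toy_pd_dealloc(struct toy_shared_pd *pd)
{
	if (pd->refs > 0 && --pd->refs == 0)
		pd->live = false;
}
```

In this model a PD that was imported twice survives the creator's dealloc and
is destroyed only after both importers have deallocated it as well.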



* [PATCH v1 rdma-core 03/12] verbs: Introduce new verb to import PD object
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 01/12] verbs: Introduce new inline helpers Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 02/12] man: Add description to ibv_import_pd function Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 04/12] mlx4: Implementation of import PD callback Yuval Shaia
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The second step in sharing an IB object is to import the exported object.

A new IB verb is introduced to import an IB object from one context
(identified by its fd) into a second one.

Importing an IB object increases the reference count of that object in
the kernel.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 kernel-headers/rdma/ib_user_verbs.h | 16 ++++++++++++++++
 libibverbs/cmd.c                    | 18 ++++++++++++++++++
 libibverbs/driver.h                 |  6 ++++++
 libibverbs/dummy_ops.c              |  9 +++++++++
 libibverbs/kern-abi.h               |  2 +-
 libibverbs/libibverbs.map.in        |  2 ++
 libibverbs/verbs.h                  | 20 ++++++++++++++++++++
 7 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/kernel-headers/rdma/ib_user_verbs.h b/kernel-headers/rdma/ib_user_verbs.h
index 0474c740..872298bf 100644
--- a/kernel-headers/rdma/ib_user_verbs.h
+++ b/kernel-headers/rdma/ib_user_verbs.h
@@ -88,6 +88,8 @@ enum ib_uverbs_write_cmds {
 	IB_USER_VERBS_CMD_CLOSE_XRCD,
 	IB_USER_VERBS_CMD_CREATE_XSRQ,
 	IB_USER_VERBS_CMD_OPEN_QP,
+	IB_USER_VERBS_CMD_IMPORT_FROM_FD,
+	IB_USER_VERBS_CMD_IMPORT_PD = IB_USER_VERBS_CMD_IMPORT_FROM_FD,
 };
 
 enum {
@@ -1299,6 +1301,20 @@ struct ib_uverbs_ex_modify_cq {
 	__u32 reserved;
 };
 
+struct ib_uverbs_import_pd {
+	__aligned_u64 response;
+	__u32 fd;
+	__u32 handle;
+	__u16 type;
+	__u8  reserved[6];
+};
+
+struct ib_uverbs_import_fr_fd_resp {
+	union {
+		struct ib_uverbs_alloc_pd_resp alloc_pd;
+	} u;
+};
+
 #define IB_DEVICE_NAME_MAX 64
 
 #endif /* IB_USER_VERBS_H */
diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 3936e69b..75efcf75 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -294,6 +294,24 @@ int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd,
 	return 0;
 }
 
+int ibv_cmd_import_pd(struct ibv_context *context, struct ibv_pd *pd,
+		      struct ibv_import_pd *cmd, size_t cmd_size,
+		      struct ib_uverbs_import_fr_fd_resp *resp,
+		      size_t resp_size)
+{
+	int ret;
+
+	ret = execute_cmd_write(context, IB_USER_VERBS_CMD_IMPORT_PD, cmd,
+				cmd_size, resp, resp_size);
+	if (ret)
+		return ret;
+
+	pd->handle  = resp->u.alloc_pd.pd_handle;
+	pd->context = context;
+
+	return 0;
+}
+
 int ibv_cmd_open_xrcd(struct ibv_context *context, struct verbs_xrcd *xrcd,
 		      int vxrcd_size,
 		      struct ibv_xrcd_init_attr *attr,
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 88ed2b5e..f2e2f11c 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -317,6 +317,8 @@ struct verbs_context_ops {
 			    uint16_t lid);
 	int (*free_dm)(struct ibv_dm *dm);
 	int (*get_srq_num)(struct ibv_srq *srq, uint32_t *srq_num);
+	struct ibv_pd *(*import_pd)(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle);
 	int (*modify_cq)(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr);
 	int (*modify_flow_action_esp)(struct ibv_flow_action *action,
 				      struct ibv_flow_action_esp_attr *attr);
@@ -456,6 +458,10 @@ int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd,
 		     struct ibv_alloc_pd *cmd, size_t cmd_size,
 		     struct ib_uverbs_alloc_pd_resp *resp, size_t resp_size);
 int ibv_cmd_dealloc_pd(struct ibv_pd *pd);
+int ibv_cmd_import_pd(struct ibv_context *context, struct ibv_pd *pd,
+		      struct ibv_import_pd *cmd, size_t cmd_size,
+		      struct ib_uverbs_import_fr_fd_resp *resp,
+		      size_t resp_size);
 int ibv_cmd_open_xrcd(struct ibv_context *context, struct verbs_xrcd *xrcd,
 		      int vxrcd_size,
 		      struct ibv_xrcd_init_attr *attr,
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index 6560371a..295e0732 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -282,6 +282,13 @@ static int get_srq_num(struct ibv_srq *srq, uint32_t *srq_num)
 	return ENOSYS;
 }
 
+static struct ibv_pd *import_pd(struct ibv_context *context, uint32_t fd,
+				uint32_t handle)
+{
+	errno = ENOSYS;
+	return NULL;
+}
+
 static int modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr)
 {
 	return ENOSYS;
@@ -487,6 +494,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	detach_mcast,
 	free_dm,
 	get_srq_num,
+	import_pd,
 	modify_cq,
 	modify_flow_action_esp,
 	modify_qp,
@@ -627,6 +635,7 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_OP(ctx, req_notify_cq);
 	SET_PRIV_OP(ctx, rereg_mr);
 	SET_PRIV_OP(ctx, resize_cq);
+	SET_OP(vctx, import_pd);
 
 #undef SET_OP
 #undef SET_OP2
diff --git a/libibverbs/kern-abi.h b/libibverbs/kern-abi.h
index dc2f33d3..714be0c8 100644
--- a/libibverbs/kern-abi.h
+++ b/libibverbs/kern-abi.h
@@ -207,7 +207,7 @@ DECLARE_CMD(IB_USER_VERBS_CMD_REG_MR, ibv_reg_mr, ib_uverbs_reg_mr);
 DECLARE_CMDX(IB_USER_VERBS_CMD_REQ_NOTIFY_CQ, ibv_req_notify_cq, ib_uverbs_req_notify_cq, empty);
 DECLARE_CMD(IB_USER_VERBS_CMD_REREG_MR, ibv_rereg_mr, ib_uverbs_rereg_mr);
 DECLARE_CMD(IB_USER_VERBS_CMD_RESIZE_CQ, ibv_resize_cq, ib_uverbs_resize_cq);
-
+DECLARE_CMDX(IB_USER_VERBS_CMD_IMPORT_PD, ibv_import_pd, ib_uverbs_import_pd, ib_uverbs_import_fr_fd_resp);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_CQ, ibv_create_cq_ex, ib_uverbs_ex_create_cq);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_FLOW, ibv_create_flow, ib_uverbs_create_flow);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_QP, ibv_create_qp_ex, ib_uverbs_ex_create_qp);
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index c1b4537a..6fff7065 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -114,6 +114,7 @@ IBVERBS_1.5 {
 IBVERBS_1.6 {
 	global:
 		ibv_qp_to_qp_ex;
+		ibv_import_pd;
 } IBVERBS_1.5;
 
 IBVERBS_1.7 {
@@ -164,6 +165,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_detach_mcast;
 		ibv_cmd_free_dm;
 		ibv_cmd_get_context;
+		ibv_cmd_import_pd;
 		ibv_cmd_modify_flow_action_esp;
 		ibv_cmd_modify_qp;
 		ibv_cmd_modify_qp_ex;
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index eb9df3a4..f3f7200a 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -2015,6 +2015,8 @@ struct ibv_values_ex {
 
 struct verbs_context {
 	/*  "grows up" - new fields go here */
+	struct ibv_pd *(*import_pd)(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle);
 	int (*query_port)(struct ibv_context *context, uint8_t port_num,
 			  struct ibv_port_attr *port_attr,
 			  size_t port_attr_len);
@@ -3285,6 +3287,24 @@ static inline uint32_t ibv_mr_to_handle(struct ibv_mr *mr)
 	return mr->handle;
 }
 
+static inline struct ibv_pd *ibv_import_pd(struct ibv_context *context,
+					   uint32_t fd, uint32_t handle)
+{
+	struct verbs_context *vctx = verbs_get_ctx_op(context, import_pd);
+	struct ibv_pd *pd;
+
+	if (!vctx) {
+		errno = ENOSYS;
+		return NULL;
+	}
+
+	pd = vctx->import_pd(context, fd, handle);
+	if (pd)
+		pd->context = context;
+
+	return pd;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.20.1
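
The inline ibv_import_pd() wrapper follows the usual optional-op pattern:
if the provider does not supply the callback, fail with ENOSYS; otherwise call
it and stamp the caller's context on the result. The sketch below reproduces
that control flow on hypothetical mock_* types, with a NULL function pointer
standing in for verbs_get_ctx_op() returning NULL. It is an illustration, not
the library code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct mock_context;

struct mock_pd {
	struct mock_context *context;
	uint32_t handle;
};

typedef struct mock_pd *(*import_pd_fn)(struct mock_context *ctx,
					uint32_t fd, uint32_t handle);

/* Hypothetical context: import_pd is NULL when the provider lacks the op. */
struct mock_context {
	import_pd_fn import_pd;
};

/* Same control flow as the inline wrapper in the patch. */
struct mock_pd *mock_import_pd(struct mock_context *ctx, uint32_t fd,
			       uint32_t handle)
{
	struct mock_pd *pd;

	if (!ctx->import_pd) {
		errno = ENOSYS;
		return NULL;
	}

	pd = ctx->import_pd(ctx, fd, handle);
	if (pd)
		pd->context = ctx;

	return pd;
}

/* Fake provider callback, for demonstration only. */
static struct mock_pd fake_pd;

struct mock_pd *fake_provider_import(struct mock_context *ctx, uint32_t fd,
				     uint32_t handle)
{
	(void)ctx;
	(void)fd;
	fake_pd.handle = handle;
	return &fake_pd;
}
```

A caller therefore needs to check for NULL and inspect errno to distinguish an
unsupported provider (ENOSYS) from other failures.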



* [PATCH v1 rdma-core 04/12] mlx4: Implementation of import PD callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (2 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 03/12] verbs: Introduce new verb to import PD object Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 05/12] mlx5: " Yuval Shaia
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import PD verb takes care of importing the generic part of the PD
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for the mlx4-related PD attributes.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 providers/mlx4/mlx4-abi.h |  2 ++
 providers/mlx4/mlx4.c     |  1 +
 providers/mlx4/mlx4.h     |  2 ++
 providers/mlx4/verbs.c    | 30 ++++++++++++++++++++++++++++++
 4 files changed, 35 insertions(+)

diff --git a/providers/mlx4/mlx4-abi.h b/providers/mlx4/mlx4-abi.h
index e1d8327e..f43c512d 100644
--- a/providers/mlx4/mlx4-abi.h
+++ b/providers/mlx4/mlx4-abi.h
@@ -70,5 +70,7 @@ DECLARE_DRV_CMD(mlx4_query_device_ex, IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
 		empty, mlx4_uverbs_ex_query_device_resp);
 DECLARE_DRV_CMD(mlx4_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ,
 		mlx4_ib_resize_cq, empty);
+DECLARE_DRV_CMD(mlx4_import_pd, IB_USER_VERBS_CMD_IMPORT_PD,
+		empty, mlx4_ib_alloc_pd_resp);
 
 #endif /* MLX4_ABI_H */
diff --git a/providers/mlx4/mlx4.c b/providers/mlx4/mlx4.c
index 0afe59ca..62ea5539 100644
--- a/providers/mlx4/mlx4.c
+++ b/providers/mlx4/mlx4.c
@@ -86,6 +86,7 @@ static const struct verbs_context_ops mlx4_ctx_ops = {
 	.query_port    = mlx4_query_port,
 	.alloc_pd      = mlx4_alloc_pd,
 	.dealloc_pd    = mlx4_free_pd,
+	.import_pd     = mlx4_import_pd,
 	.reg_mr	       = mlx4_reg_mr,
 	.rereg_mr      = mlx4_rereg_mr,
 	.dereg_mr      = mlx4_dereg_mr,
diff --git a/providers/mlx4/mlx4.h b/providers/mlx4/mlx4.h
index 3c161e8e..9f171d09 100644
--- a/providers/mlx4/mlx4.h
+++ b/providers/mlx4/mlx4.h
@@ -316,6 +316,8 @@ int mlx4_query_rt_values(struct ibv_context *context,
 			 struct ibv_values_ex *values);
 struct ibv_pd *mlx4_alloc_pd(struct ibv_context *context);
 int mlx4_free_pd(struct ibv_pd *pd);
+struct ibv_pd *mlx4_import_pd(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle);
 struct ibv_xrcd *mlx4_open_xrcd(struct ibv_context *context,
 				struct ibv_xrcd_init_attr *attr);
 int mlx4_close_xrcd(struct ibv_xrcd *xrcd);
diff --git a/providers/mlx4/verbs.c b/providers/mlx4/verbs.c
index d814a2bc..87fbf2e1 100644
--- a/providers/mlx4/verbs.c
+++ b/providers/mlx4/verbs.c
@@ -41,6 +41,8 @@
 
 #include <util/mmio.h>
 
+#include <rdma/ib_user_ioctl_cmds.h>
+
 #include "mlx4.h"
 #include "mlx4-abi.h"
 
@@ -237,6 +239,34 @@ int mlx4_free_pd(struct ibv_pd *pd)
 	return 0;
 }
 
+struct ibv_pd *mlx4_import_pd(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle)
+{
+	struct ibv_import_pd cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_PD,
+		.fd = fd,
+	};
+	struct mlx4_import_pd_resp resp;
+	struct mlx4_pd *pd;
+	int ret;
+
+	pd = malloc(sizeof(*pd));
+	if (!pd)
+		return NULL;
+
+	ret = ibv_cmd_import_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+				&resp.ibv_resp, sizeof(resp));
+	if (ret) {
+		free(pd);
+		return NULL;
+	}
+
+	pd->pdn = resp.pdn;
+
+	return &pd->ibv_pd;
+}
+
 struct ibv_xrcd *mlx4_open_xrcd(struct ibv_context *context,
 				struct ibv_xrcd_init_attr *attr)
 {
-- 
2.20.1
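
The provider callbacks in this series share one shape: allocate the
driver-private PD, issue the import command, free the allocation on failure,
copy the driver-specific field (pdn) from the response, and return the
embedded generic PD. The sketch below shows that shape with hypothetical
mock_* types and a fake command function in place of ibv_cmd_import_pd(). The
pdn value it produces is invented for the demonstration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct mock_pd { uint32_t handle; };		/* generic part */

struct mock_drv_pd {				/* driver-private wrapper */
	struct mock_pd ibv_pd;
	uint32_t pdn;
};

struct mock_resp {
	uint32_t pd_handle;
	uint32_t pdn;
};

/* Fake stand-in for the import command: fails with `fail_with` if nonzero,
 * otherwise fills in the response and the generic PD handle. */
static int mock_cmd_import(struct mock_pd *pd, uint32_t handle, int fail_with,
			   struct mock_resp *resp)
{
	if (fail_with)
		return fail_with;

	resp->pd_handle = handle;
	resp->pdn = handle + 100;	/* invented value, demo only */
	pd->handle = resp->pd_handle;
	return 0;
}

/* Allocate, issue the command, clean up on failure, copy driver data. */
struct mock_pd *mock_drv_import_pd(uint32_t handle, int fail_with)
{
	struct mock_resp resp = { 0 };
	struct mock_drv_pd *pd;

	pd = calloc(1, sizeof(*pd));
	if (!pd)
		return NULL;

	if (mock_cmd_import(&pd->ibv_pd, handle, fail_with, &resp)) {
		free(pd);
		return NULL;
	}

	pd->pdn = resp.pdn;
	return &pd->ibv_pd;
}

/* Recover the wrapper from the embedded generic PD (container_of idiom). */
struct mock_drv_pd *to_drv_pd(struct mock_pd *pd)
{
	return (struct mock_drv_pd *)((char *)pd -
				      offsetof(struct mock_drv_pd, ibv_pd));
}
```

The sketch uses calloc() so that any field not set by the command starts out
zeroed.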



* [PATCH v1 rdma-core 05/12] mlx5: Implementation of import PD callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (3 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 04/12] mlx4: Implementation of import PD callback Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 06/12] rxe: " Yuval Shaia
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import PD verb takes care of importing the generic part of the PD
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for the mlx5-related PD attributes.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 providers/mlx5/mlx5-abi.h |  2 ++
 providers/mlx5/mlx5.c     |  1 +
 providers/mlx5/mlx5.h     |  2 ++
 providers/mlx5/verbs.c    | 28 ++++++++++++++++++++++++++++
 4 files changed, 33 insertions(+)

diff --git a/providers/mlx5/mlx5-abi.h b/providers/mlx5/mlx5-abi.h
index 2b66e820..1faeb4ba 100644
--- a/providers/mlx5/mlx5-abi.h
+++ b/providers/mlx5/mlx5-abi.h
@@ -85,6 +85,8 @@ DECLARE_DRV_CMD(mlx5_query_device_ex, IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
 		empty, mlx5_ib_query_device_resp);
 DECLARE_DRV_CMD(mlx5_modify_qp_ex, IB_USER_VERBS_EX_CMD_MODIFY_QP,
 		empty, mlx5_ib_modify_qp_resp);
+DECLARE_DRV_CMD(mlx5_import_pd, IB_USER_VERBS_CMD_IMPORT_PD,
+		empty, mlx5_ib_alloc_pd_resp);
 
 struct mlx5_modify_qp {
 	struct ibv_modify_qp_ex		ibv_cmd;
diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 291e7ee0..c16b30b3 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -91,6 +91,7 @@ static const struct verbs_context_ops mlx5_ctx_common_ops = {
 	.alloc_pd      = mlx5_alloc_pd,
 	.async_event   = mlx5_async_event,
 	.dealloc_pd    = mlx5_free_pd,
+	.import_pd     = mlx5_import_pd,
 	.reg_mr	       = mlx5_reg_mr,
 	.rereg_mr      = mlx5_rereg_mr,
 	.dereg_mr      = mlx5_dereg_mr,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index ab3c2c1a..06e2b471 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -816,6 +816,8 @@ int mlx5_query_port(struct ibv_context *context, uint8_t port,
 
 struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context);
 int mlx5_free_pd(struct ibv_pd *pd);
+struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle);
 
 void mlx5_async_event(struct ibv_context *context,
 		      struct ibv_async_event *event);
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 714c5f7e..3d2510c3 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -178,6 +178,34 @@ struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context)
 	return &pd->ibv_pd;
 }
 
+struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle)
+{
+	struct ibv_import_pd cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_PD,
+		.fd = fd,
+	};
+	struct mlx5_import_pd_resp resp;
+	struct mlx5_pd *pd;
+	int ret;
+
+	pd = calloc(1, sizeof(*pd));
+	if (!pd)
+		return NULL;
+
+	ret = ibv_cmd_import_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+				&resp.ibv_resp, sizeof(resp));
+	if (ret) {
+		free(pd);
+		return NULL;
+	}
+
+	pd->pdn = resp.pdn;
+
+	return &pd->ibv_pd;
+}
+
 static void mlx5_put_bfreg_index(struct mlx5_context *ctx, uint32_t bfreg_dyn_index)
 {
 	pthread_mutex_lock(&ctx->dyn_bfregs_mutex);
-- 
2.20.1



* [PATCH v1 rdma-core 06/12] rxe: Implementation of import PD callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (4 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 05/12] mlx5: " Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 07/12] man: Add description to ibv_import_mr function Yuval Shaia
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import PD verb takes care of importing the generic part of the PD
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for rxe.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 providers/rxe/rxe-abi.h |  2 ++
 providers/rxe/rxe.c     | 28 ++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/providers/rxe/rxe-abi.h b/providers/rxe/rxe-abi.h
index b4680a24..1b1d7248 100644
--- a/providers/rxe/rxe-abi.h
+++ b/providers/rxe/rxe-abi.h
@@ -49,5 +49,7 @@ DECLARE_DRV_CMD(urxe_modify_srq, IB_USER_VERBS_CMD_MODIFY_SRQ,
 		rxe_modify_srq_cmd, empty);
 DECLARE_DRV_CMD(urxe_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ,
 		empty, rxe_resize_cq_resp);
+DECLARE_DRV_CMD(urxe_import_pd, IB_USER_VERBS_CMD_IMPORT_PD,
+		empty, ib_uverbs_alloc_pd_resp);
 
 #endif /* RXE_ABI_H */
diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c
index 4e05d5b9..3ea4ff08 100644
--- a/providers/rxe/rxe.c
+++ b/providers/rxe/rxe.c
@@ -49,6 +49,7 @@
 #include <pthread.h>
 #include <stddef.h>
 
+#include <rdma/ib_user_ioctl_cmds.h>
 #include <infiniband/driver.h>
 #include <infiniband/verbs.h>
 
@@ -111,6 +112,32 @@ static struct ibv_pd *rxe_alloc_pd(struct ibv_context *context)
 	return pd;
 }
 
+static struct ibv_pd *rxe_import_pd(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle)
+{
+	struct ibv_import_pd cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_PD,
+		.fd = fd,
+	};
+	struct urxe_import_pd_resp resp;
+	struct ibv_pd *pd;
+	int ret;
+
+	pd = calloc(1, sizeof(*pd));
+	if (!pd)
+		return NULL;
+
+	ret = ibv_cmd_import_pd(context, pd, &cmd, sizeof(cmd), &resp.ibv_resp,
+				sizeof(resp));
+	if (ret) {
+		free(pd);
+		return NULL;
+	}
+
+	return pd;
+}
+
 static int rxe_dealloc_pd(struct ibv_pd *pd)
 {
 	int ret;
@@ -835,6 +862,7 @@ static const struct verbs_context_ops rxe_ctx_ops = {
 	.query_port = rxe_query_port,
 	.alloc_pd = rxe_alloc_pd,
 	.dealloc_pd = rxe_dealloc_pd,
+	.import_pd = rxe_import_pd,
 	.reg_mr = rxe_reg_mr,
 	.dereg_mr = rxe_dereg_mr,
 	.create_cq = rxe_create_cq,
-- 
2.20.1



* [PATCH v1 rdma-core 07/12] man: Add description to ibv_import_mr function
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (5 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 06/12] rxe: " Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 08/12] verbs: Introduce new verb to import MR object Yuval Shaia
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

A new ibv_import_mr verb is introduced to allow a process to import an MR
created by another process.
Add a description of the API to the ibv_reg_mr man page.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 libibverbs/man/ibv_reg_mr.3 | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
index be90a57b..85d5b9e1 100644
--- a/libibverbs/man/ibv_reg_mr.3
+++ b/libibverbs/man/ibv_reg_mr.3
@@ -3,7 +3,7 @@
 .\"
 .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
 .SH "NAME"
-ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory region (MR)
+ibv_reg_mr, ibv_import_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register, import or deregister a memory region (MR)
 .SH "SYNOPSIS"
 .nf
 .B #include <infiniband/verbs.h>
@@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
 .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
 .BI "                               int " "access" );
 .sp
+.BI "struct ibv_mr *ibv_import_mr(struct ibv_context " "*context" ",
+.BI "                             uint32_t" " fd" ", uint32_t" " handle" );
+.sp
 .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
 .fi
 .SH "DESCRIPTION"
@@ -63,6 +66,14 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
 (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
 IBV_ACCESS_ZERO_BASED.
 .PP
+.B ibv_import_mr()
+imports MR identified by
+.I handle\fR
+from context identified by file descriptor
+.I fd\fR
+to device context
+.I context\fR.
+.PP
 .B ibv_dereg_mr()
 deregisters the MR
 .I mr\fR.
@@ -79,9 +90,13 @@ is used by remote processes to perform Atomic and RDMA operations.  The remote p
 .B rkey
 as the rkey field of struct ibv_send_wr passed to the ibv_post_send function.
 .PP
+.B ibv_import_mr()
+returns a pointer to the imported MR, or NULL if the request fails.
+.PP
 .B ibv_dereg_mr()
 returns 0 on success, or the value of errno on failure (which indicates the failure reason).
 .SH "NOTES"
+.PP
 .B ibv_dereg_mr()
 fails if any memory window is still bound to this MR.
 .SH "SEE ALSO"
-- 
2.20.1



* [PATCH v1 rdma-core 08/12] verbs: Introduce new verb to import MR object
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (6 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 07/12] man: Add description to ibv_import_mr function Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 09/12] mlx4: Implementation of import MR callback Yuval Shaia
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The second step in sharing an IB object is to import the exported object.

A new IB verb is introduced to import an IB object from one context
(identified by its fd) into a second one.

Importing an IB object increases the reference count of that object in
the kernel.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 kernel-headers/rdma/ib_user_verbs.h | 10 ++++++++++
 libibverbs/cmd.c                    | 21 +++++++++++++++++++++
 libibverbs/driver.h                 |  6 ++++++
 libibverbs/dummy_ops.c              |  9 +++++++++
 libibverbs/kern-abi.h               |  1 +
 libibverbs/libibverbs.map.in        |  1 +
 libibverbs/verbs.h                  | 20 ++++++++++++++++++++
 7 files changed, 68 insertions(+)

diff --git a/kernel-headers/rdma/ib_user_verbs.h b/kernel-headers/rdma/ib_user_verbs.h
index 872298bf..5e55979c 100644
--- a/kernel-headers/rdma/ib_user_verbs.h
+++ b/kernel-headers/rdma/ib_user_verbs.h
@@ -90,6 +90,7 @@ enum ib_uverbs_write_cmds {
 	IB_USER_VERBS_CMD_OPEN_QP,
 	IB_USER_VERBS_CMD_IMPORT_FROM_FD,
 	IB_USER_VERBS_CMD_IMPORT_PD = IB_USER_VERBS_CMD_IMPORT_FROM_FD,
+	IB_USER_VERBS_CMD_IMPORT_MR = IB_USER_VERBS_CMD_IMPORT_FROM_FD,
 };
 
 enum {
@@ -1309,9 +1310,18 @@ struct ib_uverbs_import_pd {
 	__u8  reserved[6];
 };
 
+struct ib_uverbs_import_mr {
+	__aligned_u64 response;
+	__u32 fd;
+	__u32 handle;
+	__u16 type;
+	__u8  reserved[6];
+};
+
 struct ib_uverbs_import_fr_fd_resp {
 	union {
 		struct ib_uverbs_alloc_pd_resp alloc_pd;
+		struct ib_uverbs_reg_mr_resp reg_mr;
 	} u;
 };
 
diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 75efcf75..f7dc5597 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -382,6 +382,27 @@ int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 	return 0;
 }
 
+int ibv_cmd_import_mr(struct ibv_context *context, struct verbs_mr *vmr,
+		      struct ibv_import_mr *cmd, size_t cmd_size,
+		      struct ib_uverbs_import_fr_fd_resp *resp,
+		      size_t resp_size)
+{
+	int ret;
+
+	ret = execute_cmd_write(context, IB_USER_VERBS_CMD_IMPORT_MR, cmd,
+				cmd_size, resp, resp_size);
+	if (ret)
+		return ret;
+
+	vmr->ibv_mr.handle  = resp->u.reg_mr.mr_handle;
+	vmr->ibv_mr.lkey    = resp->u.reg_mr.lkey;
+	vmr->ibv_mr.rkey    = resp->u.reg_mr.rkey;
+	vmr->ibv_mr.context = context;
+	vmr->mr_type        = IBV_MR_TYPE_MR;
+
+	return 0;
+}
+
 int ibv_cmd_rereg_mr(struct verbs_mr *vmr, uint32_t flags, void *addr,
 		     size_t length, uint64_t hca_va, int access,
 		     struct ibv_pd *pd, struct ibv_rereg_mr *cmd,
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index f2e2f11c..bb7ac1eb 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -317,6 +317,8 @@ struct verbs_context_ops {
 			    uint16_t lid);
 	int (*free_dm)(struct ibv_dm *dm);
 	int (*get_srq_num)(struct ibv_srq *srq, uint32_t *srq_num);
+	struct ibv_mr *(*import_mr)(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle);
 	struct ibv_pd *(*import_pd)(struct ibv_context *context, uint32_t fd,
 				    uint32_t handle);
 	int (*modify_cq)(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr);
@@ -479,6 +481,10 @@ int ibv_cmd_rereg_mr(struct verbs_mr *vmr, uint32_t flags, void *addr,
 		     size_t cmd_sz, struct ib_uverbs_rereg_mr_resp *resp,
 		     size_t resp_sz);
 int ibv_cmd_dereg_mr(struct verbs_mr *vmr);
+int ibv_cmd_import_mr(struct ibv_context *context, struct verbs_mr *vmr,
+		      struct ibv_import_mr *cmd, size_t cmd_size,
+		      struct ib_uverbs_import_fr_fd_resp *resp,
+		      size_t resp_size);
 int ibv_cmd_advise_mr(struct ibv_pd *pd,
 		      enum ibv_advise_mr_advice advice,
 		      uint32_t flags,
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index 295e0732..e577e12f 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -282,6 +282,13 @@ static int get_srq_num(struct ibv_srq *srq, uint32_t *srq_num)
 	return ENOSYS;
 }
 
+static struct ibv_mr *import_mr(struct ibv_context *context, uint32_t fd,
+				uint32_t handle)
+{
+	errno = ENOSYS;
+	return NULL;
+}
+
 static struct ibv_pd *import_pd(struct ibv_context *context, uint32_t fd,
 				uint32_t handle)
 {
@@ -494,6 +501,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	detach_mcast,
 	free_dm,
 	get_srq_num,
+	import_mr,
 	import_pd,
 	modify_cq,
 	modify_flow_action_esp,
@@ -636,6 +644,7 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_PRIV_OP(ctx, rereg_mr);
 	SET_PRIV_OP(ctx, resize_cq);
 	SET_OP(vctx, import_pd);
+	SET_OP(vctx, import_mr);
 
 #undef SET_OP
 #undef SET_OP2
diff --git a/libibverbs/kern-abi.h b/libibverbs/kern-abi.h
index 714be0c8..52eb2694 100644
--- a/libibverbs/kern-abi.h
+++ b/libibverbs/kern-abi.h
@@ -208,6 +208,7 @@ DECLARE_CMDX(IB_USER_VERBS_CMD_REQ_NOTIFY_CQ, ibv_req_notify_cq, ib_uverbs_req_n
 DECLARE_CMD(IB_USER_VERBS_CMD_REREG_MR, ibv_rereg_mr, ib_uverbs_rereg_mr);
 DECLARE_CMD(IB_USER_VERBS_CMD_RESIZE_CQ, ibv_resize_cq, ib_uverbs_resize_cq);
 DECLARE_CMDX(IB_USER_VERBS_CMD_IMPORT_PD, ibv_import_pd, ib_uverbs_import_pd, ib_uverbs_import_fr_fd_resp);
+DECLARE_CMDX(IB_USER_VERBS_CMD_IMPORT_MR, ibv_import_mr, ib_uverbs_import_mr, ib_uverbs_import_fr_fd_resp);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_CQ, ibv_create_cq_ex, ib_uverbs_ex_create_cq);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_FLOW, ibv_create_flow, ib_uverbs_create_flow);
 DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_QP, ibv_create_qp_ex, ib_uverbs_ex_create_qp);
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index 6fff7065..ee26d8a1 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -165,6 +165,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_detach_mcast;
 		ibv_cmd_free_dm;
 		ibv_cmd_get_context;
+		ibv_cmd_import_mr;
 		ibv_cmd_import_pd;
 		ibv_cmd_modify_flow_action_esp;
 		ibv_cmd_modify_qp;
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index f3f7200a..259dd2c0 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -2015,6 +2015,8 @@ struct ibv_values_ex {
 
 struct verbs_context {
 	/*  "grows up" - new fields go here */
+	struct ibv_mr *(*import_mr)(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle);
 	struct ibv_pd *(*import_pd)(struct ibv_context *context, uint32_t fd,
 				    uint32_t handle);
 	int (*query_port)(struct ibv_context *context, uint8_t port_num,
@@ -3305,6 +3307,24 @@ static inline struct ibv_pd *ibv_import_pd(struct ibv_context *context,
 	return pd;
 }
 
+static inline struct ibv_mr *ibv_import_mr(struct ibv_context *context,
+					   uint32_t fd, uint32_t handle)
+{
+	struct verbs_context *vctx = verbs_get_ctx_op(context, import_mr);
+	struct ibv_mr *mr;
+
+	if (!vctx) {
+		errno = ENOSYS;
+		return NULL;
+	}
+
+	mr = vctx->import_mr(context, fd, handle);
+	if (mr)
+		mr->context = context;
+
+	return mr;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.20.1



* [PATCH v1 rdma-core 09/12] mlx4: Implementation of import MR callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (7 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 08/12] verbs: Introduce new verb to import MR object Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 10/12] mlx5: " Yuval Shaia
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import MR verb takes care of importing the generic part of the MR
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for the mlx4-related MR attributes.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
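Not part of the patch: a minimal sketch of the allocate, import,
free-on-failure shape that the provider callback follows. toy_mr,
toy_provider_mr, toy_cmd_import_mr and the `fail` switch are hypothetical;
the key values are made up, and copying the return value into errno is
this sketch's own choice, not something the patch does.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic MR object. */
struct toy_mr {
	uint32_t handle;
	uint32_t lkey;
	uint32_t rkey;
};

/* Hypothetical provider wrapper embedding the generic part first. */
struct toy_provider_mr {
	struct toy_mr mr;	/* generic part, filled by the common code */
	int provider_private;	/* provider-specific state would live here */
};

/* Stand-in for the generic command: returns 0 or an errno value. */
static int toy_cmd_import_mr(struct toy_mr *mr, uint32_t handle, int fail)
{
	if (fail)
		return EINVAL;

	mr->handle = handle;
	mr->lkey = handle + 1;	/* made-up key values */
	mr->rkey = handle + 2;
	return 0;
}

/* Allocate the wrapper, run the generic import, undo on failure. */
static struct toy_mr *toy_provider_import_mr(uint32_t handle, int fail)
{
	struct toy_provider_mr *pmr;
	int ret;

	pmr = calloc(1, sizeof(*pmr));
	if (!pmr)
		return NULL;

	ret = toy_cmd_import_mr(&pmr->mr, handle, fail);
	if (ret) {
		free(pmr);
		errno = ret;
		return NULL;
	}

	return &pmr->mr;
}
```

Because the generic object is the first member of the wrapper, the pointer
returned to the caller is also the pointer that must eventually be freed.
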
 providers/mlx4/mlx4.c  |  1 +
 providers/mlx4/mlx4.h  |  2 ++
 providers/mlx4/verbs.c | 26 ++++++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/providers/mlx4/mlx4.c b/providers/mlx4/mlx4.c
index 62ea5539..40935ca0 100644
--- a/providers/mlx4/mlx4.c
+++ b/providers/mlx4/mlx4.c
@@ -86,6 +86,7 @@ static const struct verbs_context_ops mlx4_ctx_ops = {
 	.query_port    = mlx4_query_port,
 	.alloc_pd      = mlx4_alloc_pd,
 	.dealloc_pd    = mlx4_free_pd,
+	.import_mr     = mlx4_import_mr,
 	.import_pd     = mlx4_import_pd,
 	.reg_mr	       = mlx4_reg_mr,
 	.rereg_mr      = mlx4_rereg_mr,
diff --git a/providers/mlx4/mlx4.h b/providers/mlx4/mlx4.h
index 9f171d09..d919e30c 100644
--- a/providers/mlx4/mlx4.h
+++ b/providers/mlx4/mlx4.h
@@ -316,6 +316,8 @@ int mlx4_query_rt_values(struct ibv_context *context,
 			 struct ibv_values_ex *values);
 struct ibv_pd *mlx4_alloc_pd(struct ibv_context *context);
 int mlx4_free_pd(struct ibv_pd *pd);
+struct ibv_mr *mlx4_import_mr(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle);
 struct ibv_pd *mlx4_import_pd(struct ibv_context *context, uint32_t fd,
 			      uint32_t handle);
 struct ibv_xrcd *mlx4_open_xrcd(struct ibv_context *context,
diff --git a/providers/mlx4/verbs.c b/providers/mlx4/verbs.c
index 87fbf2e1..13b2799c 100644
--- a/providers/mlx4/verbs.c
+++ b/providers/mlx4/verbs.c
@@ -239,6 +239,32 @@ int mlx4_free_pd(struct ibv_pd *pd)
 	return 0;
 }
 
+struct ibv_mr *mlx4_import_mr(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle)
+{
+	struct ibv_import_mr cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_MR,
+		.fd = fd,
+	};
+	struct ib_uverbs_import_fr_fd_resp resp = {};
+	struct verbs_mr *vmr;
+	int ret;
+
+	vmr = calloc(1, sizeof(*vmr));
+	if (!vmr)
+		return NULL;
+
+	ret = ibv_cmd_import_mr(context, vmr, &cmd, sizeof(cmd), &resp,
+				sizeof(resp));
+	if (ret) {
+		free(vmr);
+		return NULL;
+	}
+
+	return &vmr->ibv_mr;
+}
+
 struct ibv_pd *mlx4_import_pd(struct ibv_context *context, uint32_t fd,
 			      uint32_t handle)
 {
-- 
2.20.1



* [PATCH v1 rdma-core 10/12] mlx5: Implementation of import MR callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (8 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 09/12] mlx4: Implementation of import MR callback Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 11/12] rxe: " Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 12/12] verbs: pinpong test using shared objects API Yuval Shaia
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import MR verb takes care of importing the generic part of the MR
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for the mlx5-related MR attributes.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 providers/mlx5/mlx5.c  |  1 +
 providers/mlx5/mlx5.h  |  2 ++
 providers/mlx5/verbs.c | 26 ++++++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index c16b30b3..8d1fa232 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -91,6 +91,7 @@ static const struct verbs_context_ops mlx5_ctx_common_ops = {
 	.alloc_pd      = mlx5_alloc_pd,
 	.async_event   = mlx5_async_event,
 	.dealloc_pd    = mlx5_free_pd,
+	.import_mr     = mlx5_import_mr,
 	.import_pd     = mlx5_import_pd,
 	.reg_mr	       = mlx5_reg_mr,
 	.rereg_mr      = mlx5_rereg_mr,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index 06e2b471..858ae7c2 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -816,6 +816,8 @@ int mlx5_query_port(struct ibv_context *context, uint8_t port,
 
 struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context);
 int mlx5_free_pd(struct ibv_pd *pd);
+struct ibv_mr *mlx5_import_mr(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle);
 struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t fd,
 			      uint32_t handle);
 
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 3d2510c3..b4964b17 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -178,6 +178,32 @@ struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context)
 	return &pd->ibv_pd;
 }
 
+struct ibv_mr *mlx5_import_mr(struct ibv_context *context, uint32_t fd,
+			      uint32_t handle)
+{
+	struct ibv_import_mr cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_MR,
+		.fd = fd,
+	};
+	struct ib_uverbs_import_fr_fd_resp resp = {0};
+	struct mlx5_mr *mr;
+	int ret;
+
+	mr = calloc(1, sizeof(*mr));
+	if (!mr)
+		return NULL;
+
+	ret = ibv_cmd_import_mr(context, &mr->vmr, &cmd, sizeof(cmd), &resp,
+				sizeof(resp));
+	if (ret) {
+		free(mr);
+		return NULL;
+	}
+
+	return &mr->vmr.ibv_mr;
+}
+
 struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t fd,
 			      uint32_t handle)
 {
-- 
2.20.1



* [PATCH v1 rdma-core 11/12] rxe: Implementation of import MR callback
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (9 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 10/12] mlx5: " Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  2019-08-21 14:26 ` [PATCH v1 rdma-core 12/12] verbs: pinpong test using shared objects API Yuval Shaia
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

The import MR verb takes care of importing the generic part of the MR
object and then triggers the provider-specific callback to take care of
the provider-specific attributes.
Add the implementation for the rxe-related MR attributes.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
 providers/rxe/rxe.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c
index 3ea4ff08..c1cdde07 100644
--- a/providers/rxe/rxe.c
+++ b/providers/rxe/rxe.c
@@ -112,6 +112,32 @@ static struct ibv_pd *rxe_alloc_pd(struct ibv_context *context)
 	return pd;
 }
 
+static struct ibv_mr *rxe_import_mr(struct ibv_context *context, uint32_t fd,
+				    uint32_t handle)
+{
+	struct ibv_import_mr cmd = {
+		.handle = handle,
+		.type = UVERBS_OBJECT_MR,
+		.fd = fd,
+	};
+	struct ib_uverbs_import_fr_fd_resp resp = {};
+	struct verbs_mr *vmr;
+	int ret;
+
+	vmr = calloc(1, sizeof(*vmr));
+	if (!vmr)
+		return NULL;
+
+	ret = ibv_cmd_import_mr(context, vmr, &cmd, sizeof(cmd), &resp,
+				sizeof(resp));
+	if (ret) {
+		free(vmr);
+		return NULL;
+	}
+
+	return &vmr->ibv_mr;
+}
+
 static struct ibv_pd *rxe_import_pd(struct ibv_context *context, uint32_t fd,
 				    uint32_t handle)
 {
@@ -862,6 +888,7 @@ static const struct verbs_context_ops rxe_ctx_ops = {
 	.query_port = rxe_query_port,
 	.alloc_pd = rxe_alloc_pd,
 	.dealloc_pd = rxe_dealloc_pd,
+	.import_mr = rxe_import_mr,
 	.import_pd = rxe_import_pd,
 	.reg_mr = rxe_reg_mr,
 	.dereg_mr = rxe_dereg_mr,
-- 
2.20.1



* [PATCH v1 rdma-core 12/12] verbs: pinpong test using shared objects API
  2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
                   ` (10 preceding siblings ...)
  2019-08-21 14:26 ` [PATCH v1 rdma-core 11/12] rxe: " Yuval Shaia
@ 2019-08-21 14:26 ` Yuval Shaia
  11 siblings, 0 replies; 13+ messages in thread
From: Yuval Shaia @ 2019-08-21 14:26 UTC (permalink / raw)
  To: dledford, jgg, leon, monis, parav, danielj, kamalheib1, markz,
	swise, shamir.rabinovitch, johannes.berg, willy, michaelgur,
	markb, yuval.shaia, dan.carpenter, bvanassche, maxg, israelr,
	galpress, denisd, yuvalav, dennis.dalessandro, will, ereza, jgg,
	linux-rdma
  Cc: Shamir Rabinovitch

From: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>

Implementation of a pingpong test using the shared object API.
The example is composed of two processes: one creates all the resources,
shares them with the second process via an SCM_RIGHTS socket message and
then acts as a server, waiting for incoming messages. The second process
imports the shared objects and uses them to communicate with the server
process.

This commit adds the ibv_shpd_pingpong sample, which demonstrates the use
of the new shared PD capability added to libibverbs and the kernel.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Shamir Rabinovitch <srabinov7@gmail.com>
---
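Not part of the patch: a minimal sketch of passing a file descriptor
between processes with SCM_RIGHTS, the mechanism the sample uses to hand
the server's device context fd to the client. send_fd() and recv_fd() are
hypothetical helper names; the union follows the alignment idiom shown in
cmsg(3).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one descriptor over a connected AF_UNIX socket. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
	char dummy = 0;	/* at least one byte of real data must be sent */
	struct iovec vec = { .iov_base = &dummy, .iov_len = sizeof(dummy) };
	union {
		struct cmsghdr align;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &vec,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}

/* Receive one descriptor. Returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec vec = { .iov_base = &dummy, .iov_len = sizeof(dummy) };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg = {
		.msg_iov = &vec,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	/* Walk the control messages looking for the SCM_RIGHTS payload. */
	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
			return fd;
		}
	}

	return -1;
}
```

The value returned by recv_fd() is a new descriptor in the receiving
process that refers to the same open file, so its number generally differs
from the one the sender passed in.
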
 libibverbs/examples/CMakeLists.txt  |    3 +
 libibverbs/examples/shpd_pingpong.c | 1142 +++++++++++++++++++++++++++
 2 files changed, 1145 insertions(+)
 create mode 100644 libibverbs/examples/shpd_pingpong.c
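
Not part of the patch: a small sketch of the two address computations the
sample relies on, rounding a buffer address up to a page boundary and
translating a client-side address into the server's mapping of the shared
segment (the mapping the MR was registered against). page_align() and
to_server_va() are hypothetical names; the sample itself uses a PAGE_ALIGN
macro and an shmoffset field.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of page; page must be a power of two. */
static uintptr_t page_align(uintptr_t addr, uintptr_t page)
{
	assert(page && (page & (page - 1)) == 0);
	return (addr + page - 1) & ~(page - 1);
}

/*
 * Translate an address inside the client's mapping of the segment into
 * the matching address inside the server's mapping. Unsigned wraparound
 * makes this work whichever mapping sits higher in memory.
 */
static uintptr_t to_server_va(uintptr_t client_addr, uintptr_t client_base,
			      uintptr_t server_base)
{
	return client_addr + (server_base - client_base);
}
```

An address that is 0x10 bytes into the client's mapping translates to the
address 0x10 bytes into the server's mapping, whatever the two bases are.
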

diff --git a/libibverbs/examples/CMakeLists.txt b/libibverbs/examples/CMakeLists.txt
index dc4c4978..d738aa41 100644
--- a/libibverbs/examples/CMakeLists.txt
+++ b/libibverbs/examples/CMakeLists.txt
@@ -26,3 +26,6 @@ target_link_libraries(ibv_ud_pingpong LINK_PRIVATE ibverbs ibverbs_tools)
 
 rdma_executable(ibv_xsrq_pingpong xsrq_pingpong.c)
 target_link_libraries(ibv_xsrq_pingpong LINK_PRIVATE ibverbs ibverbs_tools)
+
+rdma_executable(ibv_shpd_pingpong shpd_pingpong.c)
+target_link_libraries(ibv_shpd_pingpong LINK_PRIVATE ibverbs ibverbs_tools)
diff --git a/libibverbs/examples/shpd_pingpong.c b/libibverbs/examples/shpd_pingpong.c
new file mode 100644
index 00000000..c1a30b22
--- /dev/null
+++ b/libibverbs/examples/shpd_pingpong.c
@@ -0,0 +1,1142 @@
+/*
+ * Copyright (c) 2019 Oracle.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors:
+ *	Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
+ *	Yuval Shaia <yuval.shaia@oracle.com>
+ *	Shamir Rabinovitch <srabinov7@gmail.com>
+ *
+ */
+
+#if HAVE_CONFIG_H
+#  include <config.h>
+#endif /* HAVE_CONFIG_H */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/time.h>
+#include <netdb.h>
+#include <malloc.h>
+#include <getopt.h>
+#include <arpa/inet.h>
+#include <time.h>
+#include <sys/ipc.h>
+#include <sys/shm.h>
+#include <errno.h>
+#include <signal.h>
+#include <infiniband/driver.h>
+#include <rdma/ib_user_ioctl_cmds.h>
+
+#include "pingpong.h"
+
+enum {
+	PINGPONG_RECV_WRID = 1,
+	PINGPONG_SEND_WRID = 2,
+};
+
+static int page_size;
+
+struct ppshm {
+	void		*shmaddr;
+	volatile int	status;
+	uint32_t	shared_pd;
+	uint32_t	shared_mr;
+	char		buf[1];
+};
+
+struct pingpong_context {
+	struct ibv_context	*context;
+	struct ibv_comp_channel	*channel;
+	struct ibv_pd		*pd;
+	struct ibv_mr		*mr;
+	struct ibv_cq		*cq;
+	struct ibv_qp		*qp;
+	void			*buf;
+	int			size;
+	int			rx_depth;
+	int			pending;
+	struct ibv_port_attr	portinfo;
+	int			is_server;
+	key_t			key;
+	int			shmsize;
+	int			shmid;
+	uintptr_t		shmoffset;
+	struct ppshm		*shm;
+	int			sock;
+};
+
+struct pingpong_dest {
+	int lid;
+	int qpn;
+	int psn;
+	union ibv_gid gid;
+};
+
+static int pp_connect_ctx(struct pingpong_context *ctx, int port, int my_psn,
+			  enum ibv_mtu mtu, int sl,
+			  struct pingpong_dest *dest, int sgid_idx)
+{
+	struct ibv_qp_attr attr = {
+		.qp_state		= IBV_QPS_RTR,
+		.path_mtu		= mtu,
+		.dest_qp_num		= dest->qpn,
+		.rq_psn			= dest->psn,
+		.max_dest_rd_atomic	= 1,
+		.min_rnr_timer		= 12,
+		.ah_attr		= {
+			.is_global	= 0,
+			.dlid		= dest->lid,
+			.sl		= sl,
+			.src_path_bits	= 0,
+			.port_num	= port
+		}
+	};
+
+	if (dest->gid.global.interface_id) {
+		attr.ah_attr.is_global = 1;
+		attr.ah_attr.grh.hop_limit = 1;
+		attr.ah_attr.grh.dgid = dest->gid;
+		attr.ah_attr.grh.sgid_index = sgid_idx;
+	}
+	if (ibv_modify_qp(ctx->qp, &attr,
+			  IBV_QP_STATE              |
+			  IBV_QP_AV                 |
+			  IBV_QP_PATH_MTU           |
+			  IBV_QP_DEST_QPN           |
+			  IBV_QP_RQ_PSN             |
+			  IBV_QP_MAX_DEST_RD_ATOMIC |
+			  IBV_QP_MIN_RNR_TIMER)) {
+		fprintf(stderr, "Failed to modify QP to RTR\n");
+		return 1;
+	}
+
+	attr.qp_state	    = IBV_QPS_RTS;
+	attr.timeout	    = 14;
+	attr.retry_cnt	    = 7;
+	attr.rnr_retry	    = 7;
+	attr.sq_psn	    = my_psn;
+	attr.max_rd_atomic  = 1;
+	if (ibv_modify_qp(ctx->qp, &attr,
+			  IBV_QP_STATE              |
+			  IBV_QP_TIMEOUT            |
+			  IBV_QP_RETRY_CNT          |
+			  IBV_QP_RNR_RETRY          |
+			  IBV_QP_SQ_PSN             |
+			  IBV_QP_MAX_QP_RD_ATOMIC)) {
+		fprintf(stderr, "Failed to modify QP to RTS\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+static struct
+pingpong_dest *pp_client_exch_dest(const char *servername,
+				   int port,
+				   const struct pingpong_dest *my_dest)
+{
+	struct addrinfo *res, *t;
+	struct addrinfo hints = {
+		.ai_family   = AF_UNSPEC,
+		.ai_socktype = SOCK_STREAM
+	};
+	char *service;
+	char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"];
+	int n;
+	int sockfd = -1;
+	struct pingpong_dest *rem_dest = NULL;
+	char gid[33];
+
+	if (asprintf(&service, "%d", port) < 0)
+		return NULL;
+
+	n = getaddrinfo(servername, service, &hints, &res);
+
+	if (n < 0) {
+		fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername,
+			port);
+		free(service);
+		return NULL;
+	}
+
+	for (t = res; t; t = t->ai_next) {
+		sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol);
+		if (sockfd >= 0) {
+			if (!connect(sockfd, t->ai_addr, t->ai_addrlen))
+				break;
+			close(sockfd);
+			sockfd = -1;
+		}
+	}
+
+	freeaddrinfo(res);
+	free(service);
+
+	if (sockfd < 0) {
+		fprintf(stderr, "Couldn't connect to %s:%d\n", servername,
+			port);
+		return NULL;
+	}
+
+	gid_to_wire_gid(&my_dest->gid, gid);
+	sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn,
+		my_dest->psn, gid);
+	if (write(sockfd, msg, sizeof(msg)) != sizeof(msg)) {
+		fprintf(stderr, "Couldn't send local address\n");
+		goto out;
+	}
+
+	if (read(sockfd, msg, sizeof(msg)) != sizeof(msg)) {
+		perror("client read");
+		fprintf(stderr, "Couldn't read remote address\n");
+		goto out;
+	}
+
+	write(sockfd, "done", sizeof("done"));
+
+	rem_dest = malloc(sizeof(*rem_dest));
+	if (!rem_dest)
+		goto out;
+
+	sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn,
+	       &rem_dest->psn, gid);
+	wire_gid_to_gid(gid, &rem_dest->gid);
+
+out:
+	close(sockfd);
+	return rem_dest;
+}
+
+static struct
+pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx, int ib_port,
+				   enum ibv_mtu mtu, int port, int sl,
+				   const struct pingpong_dest *my_dest,
+				   int sgid_idx)
+{
+	struct addrinfo *res, *t;
+	struct addrinfo hints = {
+		.ai_flags    = AI_PASSIVE,
+		.ai_family   = AF_UNSPEC,
+		.ai_socktype = SOCK_STREAM
+	};
+	char *service;
+	char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"];
+	int n;
+	int sockfd = -1, connfd;
+	struct pingpong_dest *rem_dest = NULL;
+	char gid[33];
+
+	if (asprintf(&service, "%d", port) < 0)
+		return NULL;
+
+	n = getaddrinfo(NULL, service, &hints, &res);
+
+	if (n < 0) {
+		fprintf(stderr, "%s for port %d\n", gai_strerror(n), port);
+		free(service);
+		return NULL;
+	}
+
+	for (t = res; t; t = t->ai_next) {
+		sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol);
+		if (sockfd >= 0) {
+			n = 1;
+
+			setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n,
+				   sizeof(n));
+
+			if (!bind(sockfd, t->ai_addr, t->ai_addrlen))
+				break;
+			close(sockfd);
+			sockfd = -1;
+		}
+	}
+
+	freeaddrinfo(res);
+	free(service);
+
+	if (sockfd < 0) {
+		fprintf(stderr, "Couldn't listen to port %d\n", port);
+		return NULL;
+	}
+
+	listen(sockfd, 1);
+	connfd = accept(sockfd, NULL, 0);
+	close(sockfd);
+	if (connfd < 0) {
+		fprintf(stderr, "accept() failed\n");
+		return NULL;
+	}
+
+	n = read(connfd, msg, sizeof(msg));
+	if (n != sizeof(msg)) {
+		perror("server read");
+		fprintf(stderr, "%d/%d: Couldn't read remote address\n", n,
+			(int) sizeof(msg));
+		goto out;
+	}
+
+	rem_dest = malloc(sizeof(*rem_dest));
+	if (!rem_dest)
+		goto out;
+
+	sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn,
+	       &rem_dest->psn, gid);
+	wire_gid_to_gid(gid, &rem_dest->gid);
+
+	if (pp_connect_ctx(ctx, ib_port, my_dest->psn, mtu, sl, rem_dest,
+			   sgid_idx)) {
+		fprintf(stderr, "Couldn't connect to remote QP\n");
+		free(rem_dest);
+		rem_dest = NULL;
+		goto out;
+	}
+
+
+	gid_to_wire_gid(&my_dest->gid, gid);
+	sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn,
+		my_dest->psn, gid);
+	if (write(connfd, msg, sizeof(msg)) != sizeof(msg)) {
+		fprintf(stderr, "Couldn't send local address\n");
+		free(rem_dest);
+		rem_dest = NULL;
+		goto out;
+	}
+
+	read(connfd, msg, sizeof(msg));
+
+out:
+	close(connfd);
+	return rem_dest;
+}
+
+static int pp_setup_shm(struct pingpong_context *ctx)
+{
+	ctx->shmid = shmget(ctx->key, ctx->shmsize, IPC_CREAT|IPC_EXCL|0666);
+	if (ctx->shmid == -1) {
+		fprintf(stderr, "shm with id %d already exists\n", ctx->key);
+		return 1;
+	}
+
+	ctx->shm = shmat(ctx->shmid, NULL, 0);
+	if (ctx->shm == (void *)-1) {
+		fprintf(stderr, "attach failed\n");
+		return 1;
+	}
+
+	ctx->shm->shmaddr = ctx->shm;
+
+	return 0;
+}
+
+static int pp_waitfor_shm(struct pingpong_context *ctx)
+{
+retry:
+	ctx->shmid = shmget(ctx->key, ctx->shmsize, 0666);
+	if (ctx->shmid == -1) {
+		sleep(1);
+		goto retry;
+	}
+	ctx->shm = shmat(ctx->shmid, NULL, 0);
+	if (ctx->shm == (void *)-1) {
+		fprintf(stderr, "attach failed\n");
+		return 1;
+	}
+
+	/* wait for status 2 */
+
+	while (ctx->shm->status == 0)
+		sleep(1);
+
+	if (ctx->shm->status == 1)
+		return 1;
+
+	return 0;
+}
+
+static int pp_delete_shm(struct pingpong_context *ctx)
+{
+	if (shmdt(ctx->shm)) {
+		fprintf(stderr, "Couldn't detach shm\n");
+		return 1;
+	}
+
+	if (ctx->is_server)
+		shmctl(ctx->shmid, IPC_RMID, 0);
+
+	return 0;
+}
+
+static int pp_share_context(struct pingpong_context *ctx)
+{
+	struct	 msghdr msg;
+	struct	 cmsghdr *cmsghdr;
+	char	 buf[CMSG_SPACE(sizeof(int))];
+	int	 ret, *fd, tmp;
+	struct	 iovec vec = {
+		.iov_base = &tmp,
+		.iov_len = sizeof(tmp),
+	};
+
+	memset(buf, 0, sizeof(buf));
+	cmsghdr = (struct cmsghdr *)buf;
+	cmsghdr->cmsg_len = CMSG_LEN(sizeof(int));
+	cmsghdr->cmsg_level = SOL_SOCKET;
+	cmsghdr->cmsg_type = SCM_RIGHTS;
+	msg.msg_name = NULL;
+	msg.msg_namelen = 0;
+	msg.msg_iov = &vec;
+	msg.msg_iovlen = 1;
+	msg.msg_control = cmsghdr;
+	msg.msg_controllen = CMSG_LEN(sizeof(int));
+	msg.msg_flags = 0;
+	fd = (int *)CMSG_DATA(cmsghdr);
+
+	if (ctx->is_server) {
+		*fd = ibv_context_to_fd(ctx->context);
+		ret = sendmsg(ctx->sock, &msg, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Couldn't share fd. ret %d\n", ret);
+			return -1;
+		}
+	} else {
+		ret = recvmsg(ctx->sock, &msg, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Couldn't receive fd. ret %d\n", ret);
+			return -1;
+		}
+		for (cmsghdr = CMSG_FIRSTHDR(&msg); cmsghdr != NULL;
+		     cmsghdr = CMSG_NXTHDR(&msg, cmsghdr)) {
+			if (cmsghdr->cmsg_level == SOL_SOCKET &&
+			    cmsghdr->cmsg_type == SCM_RIGHTS)
+				break;
+		}
+		if (!cmsghdr) {
+			fprintf(stderr, "Couldn't find cmsg\n");
+			return -1;
+		}
+		fd = (int *)CMSG_DATA(cmsghdr);
+
+		ctx->pd = ibv_import_pd(ctx->context, *fd, ctx->shm->shared_pd);
+		if (!ctx->pd) {
+			fprintf(stderr, "Couldn't import PD\n");
+			return -1;
+		}
+
+		ctx->mr = ibv_import_mr(ctx->context, *fd, ctx->shm->shared_mr);
+		if (!ctx->mr) {
+			fprintf(stderr, "Couldn't import MR\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int pp_open_unix_socket(int port, struct pingpong_context *ctx)
+{
+	struct sockaddr_un addr = {0};
+	int srv_sock;
+	int ret;
+
+	addr.sun_family = AF_LOCAL;
+
+	ret = snprintf(addr.sun_path, sizeof(addr.sun_path),
+		       "/tmp/shpd_pingpong.%d", port);
+	if (ret < 0 || ret >= sizeof(addr.sun_path)) {
+		fprintf(stderr, "Couldn't format unix socket name\n");
+		return -1;
+	}
+
+	if (ctx->is_server) {
+		unlink(addr.sun_path);
+
+		srv_sock = socket(PF_LOCAL, SOCK_STREAM, 0);
+		if (srv_sock < 0) {
+			perror("Couldn't create unix socket");
+			return -1;
+		}
+		ret = bind(srv_sock, (struct sockaddr *)&addr, sizeof(addr));
+		if (ret < 0) {
+			perror("Couldn't bind unix socket");
+			return -1;
+		}
+		ret = listen(srv_sock, 1);
+		if (ret < 0) {
+			perror("Couldn't listen on unix socket");
+			return -1;
+		}
+		ctx->sock = accept(srv_sock, NULL, 0);
+		if (ctx->sock < 0) {
+			perror("Couldn't accept on unix socket");
+			return -1;
+		}
+	} else {
+		ctx->sock = socket(PF_LOCAL, SOCK_STREAM, 0);
+		if (ctx->sock < 0) {
+			perror("Couldn't create unix socket");
+			return -1;
+		}
+		ret = connect(ctx->sock, (struct sockaddr *)&addr,
+			      sizeof(addr));
+		if (ret < 0) {
+			perror("Couldn't connect unix socket");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev,
+					    int size, int rx_depth,
+					    int ib_port, int use_event,
+					    key_t key, int is_server,
+					    int port)
+{
+	struct pingpong_context *ctx;
+
+	ctx = calloc(1, sizeof(*ctx));
+	if (!ctx)
+		return NULL;
+
+	ctx->size     = size;
+	ctx->rx_depth = rx_depth;
+	ctx->is_server = is_server;
+	ctx->key = key;
+	ctx->shmsize = sizeof(*(ctx->shm)) + ctx->size * 2 + page_size * 2;
+	ctx->shmoffset = 0;
+
+	ctx->context = ibv_open_device(ib_dev);
+	if (!ctx->context) {
+		fprintf(stderr, "Couldn't get context for %s\n",
+			ibv_get_device_name(ib_dev));
+		return NULL;
+	}
+
+	if (use_event) {
+		ctx->channel = ibv_create_comp_channel(ctx->context);
+		if (!ctx->channel) {
+			fprintf(stderr, "Couldn't create completion channel\n");
+			goto err;
+		}
+	} else
+		ctx->channel = NULL;
+
+	if (is_server) {
+		if (pp_setup_shm(ctx))
+			goto err;
+
+		ctx->pd = ibv_alloc_pd(ctx->context);
+		if (!ctx->pd) {
+			fprintf(stderr, "Couldn't allocate PD\n");
+			goto err;
+		}
+
+		ctx->shm->shared_pd = ibv_pd_to_handle(ctx->pd);
+
+#define PAGE_ALIGN(addr, page) (uintptr_t)(((uintptr_t)addr + page - 1) \
+					 & ~(page - 1))
+
+		/* use shared memory as buffer */
+		ctx->buf = (char *)PAGE_ALIGN(ctx->shm->buf, page_size);
+
+		ctx->mr = ibv_reg_mr(ctx->pd, ctx->shm, ctx->shmsize,
+				     IBV_ACCESS_LOCAL_WRITE);
+		if (!ctx->mr) {
+			fprintf(stderr, "Couldn't register MR\n");
+			ctx->shm->status = 1;
+			goto err;
+		}
+
+		ctx->shm->shared_mr = ibv_mr_to_handle(ctx->mr);
+
+		/* all details initialized, ready to go */
+		ctx->shm->status = 2;
+	} else {
+		if (pp_waitfor_shm(ctx)) {
+			fprintf(stderr, "Couldn't get shm working\n");
+			goto err;
+		}
+
+		/* The client may map the shm at a different address than
+		 * the server. All WRs posted to the HCA must carry VAs
+		 * relative to the server's shared memory address.
+		 */
+		ctx->shmoffset = (uintptr_t)(ctx->shm->shmaddr) -
+				 (uintptr_t)ctx->shm;
+		ctx->buf = (char *)PAGE_ALIGN(ctx->shm->buf, page_size) +
+			   PAGE_ALIGN(size, page_size);
+	}
+
+	memset(ctx->buf, 0x7b + is_server, size);
+
+	ctx->cq = ibv_create_cq(ctx->context, rx_depth + 1, NULL,
+				ctx->channel, 0);
+	if (!ctx->cq) {
+		fprintf(stderr, "Couldn't create CQ\n");
+		goto err;
+	}
+
+	if (pp_open_unix_socket(port, ctx)) {
+		fprintf(stderr, "Couldn't open UNIX socket\n");
+		goto err;
+	}
+
+	if (pp_share_context(ctx)) {
+		fprintf(stderr, "Couldn't share PD\n");
+		goto err;
+	}
+
+	{
+		struct ibv_qp_init_attr attr = {
+			.send_cq = ctx->cq,
+			.recv_cq = ctx->cq,
+			.cap     = {
+				.max_send_wr  = 1,
+				.max_recv_wr  = rx_depth,
+				.max_send_sge = 1,
+				.max_recv_sge = 1
+			},
+			.qp_type = IBV_QPT_RC
+		};
+
+		ctx->qp = ibv_create_qp(ctx->pd, &attr);
+		if (!ctx->qp)  {
+			fprintf(stderr, "Couldn't create QP\n");
+			goto err;
+		}
+	}
+
+	{
+		struct ibv_qp_attr attr = {
+			.qp_state        = IBV_QPS_INIT,
+			.pkey_index      = 0,
+			.port_num        = ib_port,
+			.qp_access_flags = 0
+		};
+
+		if (ibv_modify_qp(ctx->qp, &attr,
+				  IBV_QP_STATE              |
+				  IBV_QP_PKEY_INDEX         |
+				  IBV_QP_PORT               |
+				  IBV_QP_ACCESS_FLAGS)) {
+			fprintf(stderr, "Failed to modify QP to INIT\n");
+			goto err;
+		}
+	}
+
+	return ctx;
+
+err:
+	if (ctx->qp)
+		ibv_destroy_qp(ctx->qp);
+	if (ctx->cq)
+		ibv_destroy_cq(ctx->cq);
+	if (ctx->mr)
+		ibv_dereg_mr(ctx->mr);
+	if (ctx->pd)
+		ibv_dealloc_pd(ctx->pd);
+	if (ctx->channel)
+		ibv_destroy_comp_channel(ctx->channel);
+	if (ctx->context)
+		ibv_close_device(ctx->context);
+
+	free(ctx);
+
+	return NULL;
+}
+
+static int pp_close_ctx(struct pingpong_context *ctx)
+{
+	if (pp_delete_shm(ctx)) {
+		fprintf(stderr, "couldn't destroy shared memory\n");
+		return 1;
+	}
+
+	if (ibv_destroy_qp(ctx->qp)) {
+		fprintf(stderr, "Couldn't destroy QP\n");
+		return 1;
+	}
+
+	if (ibv_destroy_cq(ctx->cq)) {
+		fprintf(stderr, "Couldn't destroy CQ\n");
+		return 1;
+	}
+
+	if (ibv_dereg_mr(ctx->mr)) {
+		fprintf(stderr, "Couldn't deregister MR\n");
+		return 1;
+	}
+
+	if (ibv_dealloc_pd(ctx->pd)) {
+		fprintf(stderr, "Couldn't deallocate PD\n");
+		return 1;
+	}
+
+	if (ctx->channel) {
+		if (ibv_destroy_comp_channel(ctx->channel)) {
+			fprintf(stderr,
+				"Couldn't destroy completion channel\n");
+			return 1;
+		}
+	}
+
+	if (ibv_close_device(ctx->context)) {
+		fprintf(stderr, "Couldn't release context\n");
+		return 1;
+	}
+
+	free(ctx);
+
+	return 0;
+}
+
+static int pp_post_recv(struct pingpong_context *ctx, int n)
+{
+	struct ibv_sge list = {
+		.addr	= (uintptr_t) ctx->buf + ctx->shmoffset,
+		.length = ctx->size,
+		.lkey	= ctx->mr->lkey
+	};
+	struct ibv_recv_wr wr = {
+		.wr_id	    = PINGPONG_RECV_WRID,
+		.sg_list    = &list,
+		.num_sge    = 1,
+	};
+	struct ibv_recv_wr *bad_wr;
+	int i;
+
+	for (i = 0; i < n; ++i)
+		if (ibv_post_recv(ctx->qp, &wr, &bad_wr))
+			break;
+
+	return i;
+}
+
+static int pp_post_send(struct pingpong_context *ctx)
+{
+	struct ibv_sge list = {
+		.addr	= (uintptr_t) ctx->buf + ctx->shmoffset,
+		.length = ctx->size,
+		.lkey	= ctx->mr->lkey
+	};
+	struct ibv_send_wr wr = {
+		.wr_id	    = PINGPONG_SEND_WRID,
+		.sg_list    = &list,
+		.num_sge    = 1,
+		.opcode     = IBV_WR_SEND,
+		.send_flags = IBV_SEND_SIGNALED,
+	};
+	struct ibv_send_wr *bad_wr;
+
+	return ibv_post_send(ctx->qp, &wr, &bad_wr);
+}
+
+static void usage(const char *argv0)
+{
+	printf("Usage:\n");
+	printf("  %s            start a server and wait for connection\n",
+	       argv0);
+	printf("  %s <host>     connect to server at <host>\n", argv0);
+	printf("\n");
+	printf("Options:\n");
+	printf("  -p, --port=<port>      listen on/connect to port <port> (default 18515)\n");
+	printf("  -d, --ib-dev=<dev>     use IB device <dev> (default first device found)\n");
+	printf("  -i, --ib-port=<port>   use port <port> of IB device (default 1)\n");
+	printf("  -s, --size=<size>      size of message to exchange (default 4096)\n");
+	printf("  -m, --mtu=<size>       path MTU (default 1024)\n");
+	printf("  -r, --rx-depth=<dep>   number of receives to post at a time (default 500)\n");
+	printf("  -n, --iters=<iters>    number of exchanges (default 1000)\n");
+	printf("  -l, --sl=<sl>          service level value\n");
+	printf("  -e, --events           sleep on CQ events (default poll)\n");
+	printf("  -g, --gid-idx=<gid index> local port gid index\n");
+	printf("  -S, --shm-key=<shm key> shared memory key for the test (default 18515)\n");
+}
+
+int main(int argc, char *argv[])
+{
+	struct pingpong_context *ctx;
+	struct ibv_device      **dev_list;
+	struct ibv_device	*ib_dev;
+	struct pingpong_dest     my_dest;
+	struct pingpong_dest    *rem_dest = NULL;
+	struct timeval           start, end;
+	char                    *ib_devname = NULL;
+	char                    *servername = NULL;
+	int                      port = 18515;
+	int                      ib_port = 1;
+	int                      size = 4096;
+	enum ibv_mtu		 mtu = IBV_MTU_1024;
+	int                      rx_depth = 500;
+	int                      iters = 1000;
+	int                      use_event = 0;
+	int                      routs;
+	int                      rcnt, scnt;
+	int                      num_cq_events = 0;
+	int                      sl = 0;
+	int			 gidx = -1;
+	char			 gid[INET6_ADDRSTRLEN];
+	key_t			 key = 18515;
+
+	srand48(getpid() * time(NULL));
+
+	while (1) {
+		int c;
+
+		static struct option long_options[] = {
+			{ .name = "port",     .has_arg = 1, .val = 'p' },
+			{ .name = "ib-dev",   .has_arg = 1, .val = 'd' },
+			{ .name = "ib-port",  .has_arg = 1, .val = 'i' },
+			{ .name = "size",     .has_arg = 1, .val = 's' },
+			{ .name = "mtu",      .has_arg = 1, .val = 'm' },
+			{ .name = "rx-depth", .has_arg = 1, .val = 'r' },
+			{ .name = "iters",    .has_arg = 1, .val = 'n' },
+			{ .name = "sl",       .has_arg = 1, .val = 'l' },
+			{ .name = "events",   .has_arg = 0, .val = 'e' },
+			{ .name = "gid-idx",  .has_arg = 1, .val = 'g' },
+			{ .name = "shm-key",  .has_arg = 1, .val = 'S' },
+			{ 0 }
+		};
+
+		c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:eg:S:",
+				long_options, NULL);
+		if (c == -1)
+			break;
+
+		switch (c) {
+		case 'p':
+			port = strtol(optarg, NULL, 0);
+			if (port < 0 || port > 65535) {
+				usage(argv[0]);
+				return 1;
+			}
+			break;
+
+		case 'd':
+			ib_devname = strdupa(optarg);
+			break;
+
+		case 'i':
+			ib_port = strtol(optarg, NULL, 0);
+			if (ib_port < 0) {
+				usage(argv[0]);
+				return 1;
+			}
+			break;
+
+		case 's':
+			size = strtol(optarg, NULL, 0);
+			break;
+
+		case 'm':
+			mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0));
+			break;
+
+		case 'r':
+			rx_depth = strtol(optarg, NULL, 0);
+			break;
+
+		case 'n':
+			iters = strtol(optarg, NULL, 0);
+			break;
+
+		case 'l':
+			sl = strtol(optarg, NULL, 0);
+			break;
+
+		case 'e':
+			++use_event;
+			break;
+
+		case 'g':
+			gidx = strtol(optarg, NULL, 0);
+			break;
+
+		case 'S':
+			key = strtol(optarg, NULL, 0);
+			break;
+
+		default:
+			usage(argv[0]);
+			return 1;
+		}
+	}
+
+	if (optind == argc - 1)
+		servername = strdupa(argv[optind]);
+	else if (optind < argc) {
+		usage(argv[0]);
+		return 1;
+	}
+
+	page_size = sysconf(_SC_PAGESIZE);
+
+	dev_list = ibv_get_device_list(NULL);
+	if (!dev_list) {
+		perror("Failed to get IB devices list");
+		return 1;
+	}
+
+	if (!ib_devname) {
+		ib_dev = *dev_list;
+		if (!ib_dev) {
+			fprintf(stderr, "No IB devices found\n");
+			goto err_dev_list;
+		}
+	} else {
+		int i;
+
+		for (i = 0; dev_list[i]; ++i)
+			if (!strcmp(ibv_get_device_name(dev_list[i]),
+				    ib_devname))
+				break;
+		ib_dev = dev_list[i];
+		if (!ib_dev) {
+			fprintf(stderr, "IB device %s not found\n", ib_devname);
+			goto err_dev_list;
+		}
+	}
+
+	ctx = pp_init_ctx(ib_dev, size, rx_depth, ib_port, use_event, key,
+			  !servername, port);
+	if (!ctx)
+		goto err_dev_list;
+
+	routs = pp_post_recv(ctx, ctx->rx_depth);
+	if (routs < ctx->rx_depth) {
+		fprintf(stderr, "Couldn't post receive (%d)\n", routs);
+		goto err_ctx;
+	}
+
+	if (use_event)
+		if (ibv_req_notify_cq(ctx->cq, 0)) {
+			fprintf(stderr, "Couldn't request CQ notification\n");
+			goto err_ctx;
+		}
+
+
+	if (pp_get_port_info(ctx->context, ib_port, &ctx->portinfo)) {
+		fprintf(stderr, "Couldn't get port info\n");
+		goto err_ctx;
+	}
+
+	my_dest.lid = ctx->portinfo.lid;
+	if (ctx->portinfo.link_layer == IBV_LINK_LAYER_INFINIBAND &&
+	    !my_dest.lid) {
+		fprintf(stderr, "Couldn't get local LID\n");
+		goto err_ctx;
+	}
+
+	if (gidx >= 0) {
+		if (ibv_query_gid(ctx->context, ib_port, gidx, &my_dest.gid)) {
+			fprintf(stderr,
+				"Could not get local gid for gid index %d\n",
+				gidx);
+			goto err_ctx;
+		}
+	} else
+		memset(&my_dest.gid, 0, sizeof(my_dest.gid));
+
+	my_dest.qpn = ctx->qp->qp_num;
+	my_dest.psn = lrand48() & 0xffffff;
+	inet_ntop(AF_INET6, &my_dest.gid, gid, sizeof(gid));
+	printf("  local address:  LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n",
+	       my_dest.lid, my_dest.qpn, my_dest.psn, gid);
+
+	if (servername)
+		rem_dest = pp_client_exch_dest(servername, port, &my_dest);
+	else
+		rem_dest = pp_server_exch_dest(ctx, ib_port, mtu, port, sl,
+					       &my_dest, gidx);
+
+	if (!rem_dest)
+		goto err_ctx;
+
+	inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof(gid));
+	printf("  remote address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n",
+	       rem_dest->lid, rem_dest->qpn, rem_dest->psn, gid);
+
+	if (servername)
+		if (pp_connect_ctx(ctx, ib_port, my_dest.psn, mtu, sl, rem_dest,
+				   gidx))
+			goto err_rem_dest;
+
+	ctx->pending = PINGPONG_RECV_WRID;
+
+	if (servername) {
+		if (pp_post_send(ctx)) {
+			fprintf(stderr, "Couldn't post send\n");
+			goto err_rem_dest;
+		}
+		ctx->pending |= PINGPONG_SEND_WRID;
+	}
+
+	if (gettimeofday(&start, NULL)) {
+		perror("gettimeofday");
+		goto err_rem_dest;
+	}
+
+	rcnt = scnt = 0;
+	while (rcnt < iters || scnt < iters) {
+		if (use_event) {
+			struct ibv_cq *ev_cq;
+			void          *ev_ctx;
+
+			if (ibv_get_cq_event(ctx->channel, &ev_cq, &ev_ctx)) {
+				fprintf(stderr, "Failed to get cq_event\n");
+				goto err_rem_dest;
+			}
+
+			++num_cq_events;
+
+			if (ev_cq != ctx->cq) {
+				fprintf(stderr, "CQ event for unknown CQ %p\n",
+					ev_cq);
+				goto err_rem_dest;
+			}
+
+			if (ibv_req_notify_cq(ctx->cq, 0)) {
+				fprintf(stderr,
+					"Couldn't request CQ notification\n");
+				goto err_rem_dest;
+			}
+		}
+
+		{
+			struct ibv_wc wc[2];
+			int ne, i;
+
+			do {
+				ne = ibv_poll_cq(ctx->cq, 2, wc);
+				if (ne < 0) {
+					fprintf(stderr, "poll CQ failed %d\n",
+						ne);
+					goto err_rem_dest;
+				}
+
+			} while (!use_event && ne < 1);
+
+			for (i = 0; i < ne; ++i) {
+				if (wc[i].status != IBV_WC_SUCCESS) {
+					fprintf(stderr,
+						"Failed status %s (%d) for wr_id %d\n",
+						ibv_wc_status_str(wc[i].status),
+						wc[i].status,
+						(int) wc[i].wr_id);
+					goto err_rem_dest;
+				}
+
+				switch ((int) wc[i].wr_id) {
+				case PINGPONG_SEND_WRID:
+					++scnt;
+					break;
+
+				case PINGPONG_RECV_WRID:
+					if (--routs <= 1) {
+						routs += pp_post_recv(ctx,
+								ctx->rx_depth -
+								routs);
+						if (routs < ctx->rx_depth) {
+							fprintf(stderr,
+								"Couldn't post receive (%d)\n",
+								routs);
+							goto err_rem_dest;
+						}
+					}
+
+					++rcnt;
+					break;
+
+				default:
+					fprintf(stderr,
+						"Completion for unknown wr_id %d\n",
+						(int) wc[i].wr_id);
+					goto err_rem_dest;
+				}
+
+				ctx->pending &= ~(int) wc[i].wr_id;
+				if (scnt < iters && !ctx->pending) {
+					if (pp_post_send(ctx)) {
+						fprintf(stderr,
+							"Couldn't post send\n");
+						goto err_rem_dest;
+					}
+					ctx->pending = PINGPONG_RECV_WRID |
+						       PINGPONG_SEND_WRID;
+				}
+			}
+		}
+	}
+
+	if (gettimeofday(&end, NULL)) {
+		perror("gettimeofday");
+		goto err_rem_dest;
+	}
+
+	{
+		float usec = (end.tv_sec - start.tv_sec) * 1000000 +
+			(end.tv_usec - start.tv_usec);
+		long long bytes = (long long) size * iters * 2;
+
+		printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n",
+		       bytes, usec / 1000000., bytes * 8. / usec);
+		printf("%d iters in %.2f seconds = %.2f usec/iter\n",
+		       iters, usec / 1000000., usec / iters);
+	}
+
+	ibv_ack_cq_events(ctx->cq, num_cq_events);
+
+	if (pp_close_ctx(ctx))
+		fprintf(stderr, "Couldn't close context\n");
+
+	ibv_free_device_list(dev_list);
+	free(rem_dest);
+
+	return 0;
+
+err_rem_dest:
+	free(rem_dest);
+
+err_ctx:
+	pp_close_ctx(ctx);
+
+err_dev_list:
+	ibv_free_device_list(dev_list);
+
+	return 1;
+}
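
The buffer setup in pp_init_ctx() relies on two pieces of address arithmetic: rounding up to a page boundary, and rebasing a client-local pointer onto the server's mapping (what the patch stores in ctx->shmoffset). A minimal standalone sketch of both follows. The names page_align() and to_server_va() are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of page.
 * Assumption: page is a non-zero power of two, as page sizes are.
 */
static inline uintptr_t page_align(uintptr_t addr, uintptr_t page)
{
	assert(page != 0 && (page & (page - 1)) == 0);
	return (addr + page - 1) & ~(page - 1);
}

/*
 * Rebase an address inside the local mapping onto the server's mapping.
 * local_base is where this process attached the segment and server_base
 * is where the server attached it (ctx->shm->shmaddr in the patch).
 * Unsigned arithmetic keeps this well defined whichever base is larger.
 */
static inline uint64_t to_server_va(uintptr_t local_addr,
				    uintptr_t local_base,
				    uintptr_t server_base)
{
	return (uint64_t)(local_addr - local_base + server_base);
}
```

With these helpers, the .addr field of an SGE on the client side would be to_server_va((uintptr_t)buf, (uintptr_t)shm, (uintptr_t)shm->shmaddr), which is the same value as buf + shmoffset in the patch.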
-- 
2.20.1
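
The main loop tracks outstanding work in ctx->pending as a bitmask of work request IDs and posts the next send only once both the previous send and the matching receive have completed. A minimal sketch of that bookkeeping is below. The enum values are an assumption (the real PINGPONG_*_WRID constants are defined in an earlier part of the patch), and sketch_complete() is a hypothetical helper.

```c
#include <assert.h>

/* Assumed values: distinct single bits, so they can be OR-ed together. */
enum {
	SKETCH_RECV_WRID = 1,
	SKETCH_SEND_WRID = 2,
};

/*
 * Clear the completed work request's bit from *pending.
 * Returns 1 when nothing is outstanding any more, which is the point
 * at which the loop may post the next send, and 0 otherwise.
 */
static int sketch_complete(int *pending, int wr_id)
{
	assert(wr_id == SKETCH_RECV_WRID || wr_id == SKETCH_SEND_WRID);
	*pending &= ~wr_id;
	return *pending == 0;
}
```

In the patch the same test is additionally guarded by scnt < iters, so no send is posted after the last iteration.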



end of thread, other threads:[~2019-08-21 14:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2019-08-21 14:26 [PATCH v1 rdma-core 00/12] Shared PD and MR Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 01/12] verbs: Introduce new inline helpers Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 02/12] man: Add description to ibv_import_pd function Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 03/12] verbs: Introduce new verb to import PD object Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 04/12] mlx4: Implementation of import PD callback Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 05/12] mlx5: " Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 06/12] rxe: " Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 07/12] man: Add description to ibv_import_mr function Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 08/12] verbs: Introduce new verb to import MR object Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 09/12] mlx4: Implementation of import MR callback Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 10/12] mlx5: " Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 11/12] rxe: " Yuval Shaia
2019-08-21 14:26 ` [PATCH v1 rdma-core 12/12] verbs: pinpong test using shared objects API Yuval Shaia
