* [PATCH 0/3] libibverbs: On-demand paging support
@ 2015-08-27 15:22 Haggai Eran
       [not found] ` <1440688955-7709-1-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Haggai Eran @ 2015-08-27 15:22 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Haggai Eran

This series adds userspace support for on-demand paging. The first patch adds
support for the new extended query device verb. Patch 2 adds the capability and
interface bits related to on-demand paging, and patch 3 adds example code to
the rc_pingpong program to use on-demand paging.
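
For readers skimming the series, the check that patch 3 performs before
registering an on-demand paging MR can be sketched roughly as below. This is
an illustrative, self-contained sketch and not code from the series: the
SKETCH_* constants and the sketch_* names are hypothetical stand-ins that
mirror the values proposed in patch 2 (plus the usual value of
IBV_ACCESS_LOCAL_WRITE, which is assumed here), so that the logic can be
compiled without libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the values proposed in patch 2; a real
 * program would include <infiniband/verbs.h> and use the IBV_* names. */
enum {
	SKETCH_ODP_SUPPORT        = 1 << 0,	/* general_caps bit */
	SKETCH_ODP_SUPPORT_SEND   = 1 << 0,	/* per-transport bits */
	SKETCH_ODP_SUPPORT_RECV   = 1 << 1,
	SKETCH_ACCESS_LOCAL_WRITE = 1 << 0,	/* assumed value */
	SKETCH_ACCESS_ON_DEMAND   = 1 << 6,
};

/* Only the fields the RC ping-pong check looks at. */
struct sketch_odp_caps {
	uint64_t general_caps;
	uint32_t rc_odp_caps;
};

/* Returns the access flags to register the MR with, or -1 if ODP was
 * requested but the device lacks ODP or RC send/receive support for it. */
int sketch_pick_access_flags(const struct sketch_odp_caps *caps, int use_odp)
{
	const uint32_t rc_mask = SKETCH_ODP_SUPPORT_SEND |
				 SKETCH_ODP_SUPPORT_RECV;
	int flags = SKETCH_ACCESS_LOCAL_WRITE;

	assert(caps != NULL);
	if (!use_odp)
		return flags;

	/* Both the general bit and every required RC bit must be set. */
	if (!(caps->general_caps & SKETCH_ODP_SUPPORT) ||
	    (caps->rc_odp_caps & rc_mask) != rc_mask)
		return -1;

	return flags | SKETCH_ACCESS_ON_DEMAND;
}
```

Comparing the masked value against the full mask (rather than testing for
non-zero) is what rejects a device that supports ODP for sends but not for
receives.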

Eli Cohen (1):
  Add support for extended query device capabilities

Haggai Eran (1):
  Add on-demand paging support

Majd Dibbiny (1):
  libibverbs/examples: Support odp in rc_pingpong

 Makefile.am                   |   3 +-
 examples/devinfo.c            |  67 ++++++++++++++++++++--
 examples/rc_pingpong.c        |  31 +++++++++-
 include/infiniband/driver.h   |   9 +++
 include/infiniband/kern-abi.h |  36 +++++++++++-
 include/infiniband/verbs.h    |  53 ++++++++++++++++-
 man/ibv_query_device_ex.3     |  70 +++++++++++++++++++++++
 man/ibv_reg_mr.3              |   2 +
 src/cmd.c                     | 129 +++++++++++++++++++++++++++++-------------
 src/libibverbs.map            |   2 +
 10 files changed, 352 insertions(+), 50 deletions(-)
 create mode 100644 man/ibv_query_device_ex.3

-- 
1.7.11.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 1/3] Add support for extended query device capabilities
       [not found] ` <1440688955-7709-1-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-08-27 15:22   ` Haggai Eran
       [not found]     ` <1440688955-7709-2-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-27 15:22   ` [PATCH 2/3] Add on-demand paging support Haggai Eran
  2015-08-27 15:22   ` [PATCH 3/3] libibverbs/examples: Support odp in rc_pingpong Haggai Eran
  2 siblings, 1 reply; 7+ messages in thread
From: Haggai Eran @ 2015-08-27 15:22 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Haggai Eran

From: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add the verb ibv_query_device_ex, which is extensible and allows later
commits to report additional device properties.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
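Illustrative note, not part of the patch: the way a caller can try the
extended verb and fall back to the legacy one (as the devinfo.c hunk below
does) might be sketched as follows. Everything named sketch_* is a
hypothetical stand-in so the example compiles without libibverbs; the real
dispatch goes through verbs_get_ctx_op() and struct verbs_context.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_attr {
	uint8_t phys_port_cnt;
};

struct sketch_attr_ex {
	struct sketch_attr orig_attr;
	uint32_t comp_mask;
};

struct sketch_ctx {
	int (*query_device)(struct sketch_ctx *ctx, struct sketch_attr *attr);
	/* NULL when the provider predates the extended verb. */
	int (*query_device_ex)(struct sketch_ctx *ctx,
			       struct sketch_attr_ex *attr);
};

/* Toy legacy provider callback, present only to exercise the sketch. */
int sketch_legacy_query(struct sketch_ctx *ctx, struct sketch_attr *attr)
{
	(void)ctx;
	attr->phys_port_cnt = 2;
	return 0;
}

/* Dispatch to the extended op, reporting -ENOSYS when it is missing. */
int sketch_query_device_ex(struct sketch_ctx *ctx, struct sketch_attr_ex *attr)
{
	if (!ctx->query_device_ex)
		return -ENOSYS;
	return ctx->query_device_ex(ctx, attr);
}

/* Try the extended query first; fall back to the legacy query. */
int sketch_query_with_fallback(struct sketch_ctx *ctx,
			       struct sketch_attr_ex *out)
{
	assert(ctx != NULL && out != NULL);
	if (sketch_query_device_ex(ctx, out) == 0)
		return 0;

	/* No extended field is valid on this path, so clear them all. */
	memset(out, 0, sizeof(*out));
	return ctx->query_device(ctx, &out->orig_attr);
}
```

Zeroing the whole extended struct on the fallback path is a choice made in
this sketch, so that fields added by later extensions never hold
uninitialized data when only the legacy query succeeded.
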
 Makefile.am                   |   3 +-
 examples/devinfo.c            |  16 ++++--
 include/infiniband/driver.h   |   9 ++++
 include/infiniband/kern-abi.h |  26 +++++++++-
 include/infiniband/verbs.h    |  28 ++++++++++
 man/ibv_query_device_ex.3     |  47 +++++++++++++++++
 src/cmd.c                     | 118 ++++++++++++++++++++++++++++--------------
 src/libibverbs.map            |   2 +
 8 files changed, 202 insertions(+), 47 deletions(-)
 create mode 100644 man/ibv_query_device_ex.3

diff --git a/Makefile.am b/Makefile.am
index ef4df033581d..c85e98ae0662 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -62,7 +62,8 @@ man_MANS = man/ibv_asyncwatch.1 man/ibv_devices.1 man/ibv_devinfo.1	\
     man/ibv_query_srq.3 man/ibv_rate_to_mult.3 man/ibv_reg_mr.3		\
     man/ibv_req_notify_cq.3 man/ibv_resize_cq.3 man/ibv_rate_to_mbps.3  \
     man/ibv_create_qp_ex.3 man/ibv_create_srq_ex.3 man/ibv_open_xrcd.3  \
-    man/ibv_get_srq_num.3 man/ibv_open_qp.3
+    man/ibv_get_srq_num.3 man/ibv_open_qp.3 \
+    man/ibv_query_device_ex.3
 
 DEBIAN = debian/changelog debian/compat debian/control debian/copyright \
     debian/ibverbs-utils.install debian/libibverbs1.install \
diff --git a/examples/devinfo.c b/examples/devinfo.c
index afa8c853868f..95e8f83753ca 100644
--- a/examples/devinfo.c
+++ b/examples/devinfo.c
@@ -208,6 +208,7 @@ static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 {
 	struct ibv_context *ctx;
 	struct ibv_device_attr device_attr;
+	struct ibv_device_attr_ex attrx;
 	struct ibv_port_attr port_attr;
 	int rc = 0;
 	uint8_t port;
@@ -219,11 +220,18 @@ static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 		rc = 1;
 		goto cleanup;
 	}
-	if (ibv_query_device(ctx, &device_attr)) {
-		fprintf(stderr, "Failed to query device props\n");
-		rc = 2;
-		goto cleanup;
+
+	if (ibv_query_device_ex(ctx, &attrx)) {
+		attrx.comp_mask = 0;
+		if (ibv_query_device(ctx, &device_attr)) {
+			fprintf(stderr, "Failed to query device props\n");
+			rc = 2;
+			goto cleanup;
+		}
+	} else {
+		device_attr = attrx.orig_attr;
 	}
+
 	if (ib_port && ib_port > device_attr.phys_port_cnt) {
 		fprintf(stderr, "Invalid port requested for device\n");
 		/* rc = 3 is taken by failure to clean up */
diff --git a/include/infiniband/driver.h b/include/infiniband/driver.h
index 5cc092bf9bd5..b78093ae6a8e 100644
--- a/include/infiniband/driver.h
+++ b/include/infiniband/driver.h
@@ -105,6 +105,15 @@ int ibv_cmd_query_device(struct ibv_context *context,
 			 struct ibv_device_attr *device_attr,
 			 uint64_t *raw_fw_ver,
 			 struct ibv_query_device *cmd, size_t cmd_size);
+int ibv_cmd_query_device_ex(struct ibv_context *context,
+			    struct ibv_device_attr_ex *attr,
+			    uint64_t *raw_fw_ver,
+			    struct ibv_query_device_ex *cmd,
+			    size_t cmd_core_size,
+			    size_t cmd_size,
+			    struct ibv_query_device_resp_ex *resp,
+			    size_t resp_core_size,
+			    size_t resp_size);
 int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num,
 		       struct ibv_port_attr *port_attr,
 		       struct ibv_query_port *cmd, size_t cmd_size);
diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h
index 91b45d837239..af2a1bebf683 100644
--- a/include/infiniband/kern-abi.h
+++ b/include/infiniband/kern-abi.h
@@ -101,12 +101,20 @@ enum {
 
 #define IB_USER_VERBS_CMD_FLAG_EXTENDED		0x80ul
 
+/* use this mask for creating extended commands that
+   correspond to old commands */
+#define IB_USER_VERBS_CMD_EXTENDED_MASK \
+	(IB_USER_VERBS_CMD_FLAG_EXTENDED << \
+	 IB_USER_VERBS_CMD_FLAGS_SHIFT)
+
 
 enum {
 	IB_USER_VERBS_CMD_CREATE_FLOW = (IB_USER_VERBS_CMD_FLAG_EXTENDED <<
 					 IB_USER_VERBS_CMD_FLAGS_SHIFT) +
 					IB_USER_VERBS_CMD_THRESHOLD,
-	IB_USER_VERBS_CMD_DESTROY_FLOW
+	IB_USER_VERBS_CMD_DESTROY_FLOW,
+	IB_USER_VERBS_CMD_QUERY_DEVICE_EX = IB_USER_VERBS_CMD_EXTENDED_MASK |
+						IB_USER_VERBS_CMD_QUERY_DEVICE,
 };
 
 /*
@@ -240,6 +248,19 @@ struct ibv_query_device_resp {
 	__u8  reserved[4];
 };
 
+struct ibv_query_device_ex {
+	struct ex_hdr	hdr;
+	__u32		comp_mask;
+	__u32		reserved;
+};
+
+struct ibv_query_device_resp_ex {
+	struct ibv_query_device_resp base;
+	__u32 comp_mask;
+	__u32 response_length;
+	__u64 reserved[3];
+};
+
 struct ibv_query_port {
 	__u32 command;
 	__u16 in_words;
@@ -1001,7 +1022,8 @@ enum {
 	IB_USER_VERBS_CMD_CREATE_XSRQ_V2 = -1,
 	IB_USER_VERBS_CMD_OPEN_QP_V2 = -1,
 	IB_USER_VERBS_CMD_CREATE_FLOW_V2 = -1,
-	IB_USER_VERBS_CMD_DESTROY_FLOW_V2 = -1
+	IB_USER_VERBS_CMD_DESTROY_FLOW_V2 = -1,
+	IB_USER_VERBS_CMD_QUERY_DEVICE_EX_V2 = -1
 };
 
 struct ibv_modify_srq_v3 {
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index 28e1586b0c96..ff806bf8555d 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -168,6 +168,16 @@ struct ibv_device_attr {
 	uint8_t			phys_port_cnt;
 };
 
+struct ibv_device_attr_ex {
+	struct ibv_device_attr	orig_attr;
+	uint32_t		comp_mask;
+};
+
+struct ibv_device_attr_ex_resp {
+	struct ibv_device_attr	orig_attr;
+	uint32_t		comp_mask;
+};
+
 enum ibv_mtu {
 	IBV_MTU_256  = 1,
 	IBV_MTU_512  = 2,
@@ -977,6 +987,8 @@ enum verbs_context_mask {
 
 struct verbs_context {
 	/*  "grows up" - new fields go here */
+	int (*query_device_ex)(struct ibv_context *context,
+			       struct ibv_device_attr_ex *attr);
 	int (*drv_ibv_destroy_flow) (struct ibv_flow *flow);
 	int (*lib_ibv_destroy_flow) (struct ibv_flow *flow);
 	struct ibv_flow * (*drv_ibv_create_flow) (struct ibv_qp *qp,
@@ -1400,6 +1412,22 @@ ibv_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_ini
 }
 
 /**
+ * ibv_query_device_ex - Get extended device properties
+ */
+static inline int
+ibv_query_device_ex(struct ibv_context *context,
+		    struct ibv_device_attr_ex *attr)
+{
+	struct verbs_context *vctx;
+
+	vctx = verbs_get_ctx_op(context, query_device_ex);
+	if (!vctx || !vctx->query_device_ex)
+		return -ENOSYS;
+
+	return vctx->query_device_ex(context, attr);
+}
+
+/**
  * ibv_open_qp - Open a shareable queue pair.
  */
 static inline struct ibv_qp *
diff --git a/man/ibv_query_device_ex.3 b/man/ibv_query_device_ex.3
new file mode 100644
index 000000000000..6b33f9f92ab1
--- /dev/null
+++ b/man/ibv_query_device_ex.3
@@ -0,0 +1,47 @@
+.\" -*- nroff -*-
+.\"
+.TH IBV_QUERY_DEVICE_EX 3 2014-12-17 libibverbs "Libibverbs Programmer's Manual"
+.SH "NAME"
+ibv_query_device_ex \- query an RDMA device's attributes
+.SH "SYNOPSIS"
+.nf
+.B #include <infiniband/verbs.h>
+.sp
+.BI "int ibv_query_device_ex(struct ibv_context " "*context",
+.BI "                        struct ibv_device_attr_ex " "*attr" );
+.fi
+.SH "DESCRIPTION"
+.B ibv_query_device_ex()
+returns the attributes of the device with context
+.I context\fR.
+The argument
+.I attr
+is a pointer to an ibv_device_attr_ex struct, as defined in <infiniband/verbs.h>.
+.PP
+.nf
+struct ibv_device_attr_ex {
+.in +8
+struct ibv_device_attr orig_attr;
+uint32_t               comp_mask;              /* Compatibility mask that defines which of the following variables are valid */
+.in -8
+};
+.fi
+.SH "RETURN VALUE"
+.B ibv_query_device_ex()
+returns 0 on success, or the value of errno on failure (which indicates the failure reason).
+.SH "NOTES"
+The maximum values returned by this function are the upper limits of
+supported resources by the device.  However, it may not be possible to
+use these maximum values, since the actual number of any resource that
+can be created may be limited by the machine configuration, the amount
+of host memory, user permissions, and the amount of resources already
+in use by other users/processes.
+.SH "SEE ALSO"
+.BR ibv_query_device (3),
+.BR ibv_open_device (3),
+.BR ibv_query_port (3),
+.BR ibv_query_pkey (3),
+.BR ibv_query_gid (3)
+.SH "AUTHORS"
+.TP
+Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
diff --git a/src/cmd.c b/src/cmd.c
index 45ea06ff4705..47f1acd33d68 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -66,6 +66,52 @@ int ibv_cmd_get_context(struct ibv_context *context, struct ibv_get_context *cmd
 	return 0;
 }
 
+static void copy_query_dev_fields(struct ibv_device_attr *device_attr,
+				  struct ibv_query_device_resp *resp,
+				  uint64_t *raw_fw_ver)
+{
+	*raw_fw_ver				= resp->fw_ver;
+	device_attr->node_guid			= resp->node_guid;
+	device_attr->sys_image_guid		= resp->sys_image_guid;
+	device_attr->max_mr_size		= resp->max_mr_size;
+	device_attr->page_size_cap		= resp->page_size_cap;
+	device_attr->vendor_id			= resp->vendor_id;
+	device_attr->vendor_part_id		= resp->vendor_part_id;
+	device_attr->hw_ver			= resp->hw_ver;
+	device_attr->max_qp			= resp->max_qp;
+	device_attr->max_qp_wr			= resp->max_qp_wr;
+	device_attr->device_cap_flags		= resp->device_cap_flags;
+	device_attr->max_sge			= resp->max_sge;
+	device_attr->max_sge_rd			= resp->max_sge_rd;
+	device_attr->max_cq			= resp->max_cq;
+	device_attr->max_cqe			= resp->max_cqe;
+	device_attr->max_mr			= resp->max_mr;
+	device_attr->max_pd			= resp->max_pd;
+	device_attr->max_qp_rd_atom		= resp->max_qp_rd_atom;
+	device_attr->max_ee_rd_atom		= resp->max_ee_rd_atom;
+	device_attr->max_res_rd_atom		= resp->max_res_rd_atom;
+	device_attr->max_qp_init_rd_atom	= resp->max_qp_init_rd_atom;
+	device_attr->max_ee_init_rd_atom	= resp->max_ee_init_rd_atom;
+	device_attr->atomic_cap			= resp->atomic_cap;
+	device_attr->max_ee			= resp->max_ee;
+	device_attr->max_rdd			= resp->max_rdd;
+	device_attr->max_mw			= resp->max_mw;
+	device_attr->max_raw_ipv6_qp		= resp->max_raw_ipv6_qp;
+	device_attr->max_raw_ethy_qp		= resp->max_raw_ethy_qp;
+	device_attr->max_mcast_grp		= resp->max_mcast_grp;
+	device_attr->max_mcast_qp_attach	= resp->max_mcast_qp_attach;
+	device_attr->max_total_mcast_qp_attach	= resp->max_total_mcast_qp_attach;
+	device_attr->max_ah			= resp->max_ah;
+	device_attr->max_fmr			= resp->max_fmr;
+	device_attr->max_map_per_fmr		= resp->max_map_per_fmr;
+	device_attr->max_srq			= resp->max_srq;
+	device_attr->max_srq_wr			= resp->max_srq_wr;
+	device_attr->max_srq_sge		= resp->max_srq_sge;
+	device_attr->max_pkeys			= resp->max_pkeys;
+	device_attr->local_ca_ack_delay		= resp->local_ca_ack_delay;
+	device_attr->phys_port_cnt		= resp->phys_port_cnt;
+}
+
 int ibv_cmd_query_device(struct ibv_context *context,
 			 struct ibv_device_attr *device_attr,
 			 uint64_t *raw_fw_ver,
@@ -81,46 +127,38 @@ int ibv_cmd_query_device(struct ibv_context *context,
 	(void) VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp);
 
 	memset(device_attr->fw_ver, 0, sizeof device_attr->fw_ver);
-	*raw_fw_ver			       = resp.fw_ver;
-	device_attr->node_guid 		       = resp.node_guid;
-	device_attr->sys_image_guid 	       = resp.sys_image_guid;
-	device_attr->max_mr_size 	       = resp.max_mr_size;
-	device_attr->page_size_cap 	       = resp.page_size_cap;
-	device_attr->vendor_id 		       = resp.vendor_id;
-	device_attr->vendor_part_id 	       = resp.vendor_part_id;
-	device_attr->hw_ver 		       = resp.hw_ver;
-	device_attr->max_qp 		       = resp.max_qp;
-	device_attr->max_qp_wr 		       = resp.max_qp_wr;
-	device_attr->device_cap_flags 	       = resp.device_cap_flags;
-	device_attr->max_sge 		       = resp.max_sge;
-	device_attr->max_sge_rd 	       = resp.max_sge_rd;
-	device_attr->max_cq 		       = resp.max_cq;
-	device_attr->max_cqe 		       = resp.max_cqe;
-	device_attr->max_mr 		       = resp.max_mr;
-	device_attr->max_pd 		       = resp.max_pd;
-	device_attr->max_qp_rd_atom 	       = resp.max_qp_rd_atom;
-	device_attr->max_ee_rd_atom 	       = resp.max_ee_rd_atom;
-	device_attr->max_res_rd_atom 	       = resp.max_res_rd_atom;
-	device_attr->max_qp_init_rd_atom       = resp.max_qp_init_rd_atom;
-	device_attr->max_ee_init_rd_atom       = resp.max_ee_init_rd_atom;
-	device_attr->atomic_cap 	       = resp.atomic_cap;
-	device_attr->max_ee 		       = resp.max_ee;
-	device_attr->max_rdd 		       = resp.max_rdd;
-	device_attr->max_mw 		       = resp.max_mw;
-	device_attr->max_raw_ipv6_qp 	       = resp.max_raw_ipv6_qp;
-	device_attr->max_raw_ethy_qp 	       = resp.max_raw_ethy_qp;
-	device_attr->max_mcast_grp 	       = resp.max_mcast_grp;
-	device_attr->max_mcast_qp_attach       = resp.max_mcast_qp_attach;
-	device_attr->max_total_mcast_qp_attach = resp.max_total_mcast_qp_attach;
-	device_attr->max_ah 		       = resp.max_ah;
-	device_attr->max_fmr 		       = resp.max_fmr;
-	device_attr->max_map_per_fmr 	       = resp.max_map_per_fmr;
-	device_attr->max_srq 		       = resp.max_srq;
-	device_attr->max_srq_wr 	       = resp.max_srq_wr;
-	device_attr->max_srq_sge 	       = resp.max_srq_sge;
-	device_attr->max_pkeys 		       = resp.max_pkeys;
-	device_attr->local_ca_ack_delay        = resp.local_ca_ack_delay;
-	device_attr->phys_port_cnt	       = resp.phys_port_cnt;
+	copy_query_dev_fields(device_attr, &resp, raw_fw_ver);
+
+	return 0;
+}
+
+int ibv_cmd_query_device_ex(struct ibv_context *context,
+			    struct ibv_device_attr_ex *attr,
+			    uint64_t *raw_fw_ver,
+			    struct ibv_query_device_ex *cmd,
+			    size_t cmd_core_size,
+			    size_t cmd_size,
+			    struct ibv_query_device_resp_ex *resp,
+			    size_t resp_core_size,
+			    size_t resp_size)
+{
+	int err;
+
+	IBV_INIT_CMD_RESP_EX_V(cmd, cmd_core_size, cmd_size,
+			       QUERY_DEVICE_EX, resp, resp_core_size,
+			       resp_size);
+	cmd->comp_mask = 0;
+	cmd->reserved = 0;
+	memset(attr->orig_attr.fw_ver, 0, sizeof(attr->orig_attr.fw_ver));
+	err = write(context->cmd_fd, cmd, cmd_size);
+	if (err != cmd_size)
+		return errno;
+
+	(void)VALGRIND_MAKE_MEM_DEFINED(resp, resp_size);
+	copy_query_dev_fields(&attr->orig_attr,
+			      (struct ibv_query_device_resp *)resp,
+			      raw_fw_ver);
+	attr->comp_mask = 0;
 
 	return 0;
 }
diff --git a/src/libibverbs.map b/src/libibverbs.map
index 9f0ec69de183..3b40a0fbb80f 100644
--- a/src/libibverbs.map
+++ b/src/libibverbs.map
@@ -9,6 +9,7 @@ IBVERBS_1.0 {
 		ibv_get_async_event;
 		ibv_ack_async_event;
 		ibv_query_device;
+		ibv_query_device_ex;
 		ibv_query_port;
 		ibv_query_gid;
 		ibv_query_pkey;
@@ -37,6 +38,7 @@ IBVERBS_1.0 {
 		ibv_detach_mcast;
 		ibv_cmd_get_context;
 		ibv_cmd_query_device;
+		ibv_cmd_query_device_ex;
 		ibv_cmd_query_port;
 		ibv_cmd_query_gid;
 		ibv_cmd_query_pkey;
-- 
1.7.11.2


* [PATCH 2/3] Add on-demand paging support
       [not found] ` <1440688955-7709-1-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-27 15:22   ` [PATCH 1/3] Add support for extended query device capabilities Haggai Eran
@ 2015-08-27 15:22   ` Haggai Eran
       [not found]     ` <1440688955-7709-3-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-27 15:22   ` [PATCH 3/3] libibverbs/examples: Support odp in rc_pingpong Haggai Eran
  2 siblings, 1 reply; 7+ messages in thread
From: Haggai Eran @ 2015-08-27 15:22 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Haggai Eran,
	Shachar Raindel, Majd Dibbiny

The on-demand paging feature allows registering memory regions without pinning
their pages. Unfortunately, the feature does not work with all transports and
all operations. This patch adds the ability to report on-demand paging
capabilities through ibv_query_device_ex.

The patch also adds the IBV_ACCESS_ON_DEMAND access flag to allow registration
of on-demand paging enabled memory regions.

Signed-off-by: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
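Illustrative note, not part of the patch: the src/cmd.c hunk below copies the
ODP capabilities only when the kernel reports a response long enough to
contain them, and zeroes them otherwise. A minimal, self-contained sketch of
that pattern follows. The sketch_* types are hypothetical and much smaller
than the real struct ibv_query_device_resp_ex; only the length check is meant
to match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Library-side view of the capabilities. */
struct sketch_odp_caps {
	uint64_t general_caps;
	uint32_t rc_odp_caps;
	uint32_t uc_odp_caps;
	uint32_t ud_odp_caps;
};

/* Simplified response: a short header followed by the ODP extension. */
struct sketch_resp {
	uint32_t comp_mask;
	uint32_t response_length;	/* bytes the kernel actually filled */
	uint64_t general_caps;
	uint32_t rc_odp_caps;
	uint32_t uc_odp_caps;
	uint32_t ud_odp_caps;
	uint32_t reserved;
};

/* Copy the extension only if the response is long enough to contain it;
 * otherwise report "no ODP support" by zeroing the destination. */
void sketch_copy_odp_caps(struct sketch_odp_caps *dst,
			  const struct sketch_resp *resp)
{
	assert(dst != NULL && resp != NULL);
	if (resp->response_length >= sizeof(*resp)) {
		dst->general_caps = resp->general_caps;
		dst->rc_odp_caps  = resp->rc_odp_caps;
		dst->uc_odp_caps  = resp->uc_odp_caps;
		dst->ud_odp_caps  = resp->ud_odp_caps;
	} else {
		memset(dst, 0, sizeof(*dst));
	}
}
```

With this shape, an older kernel that fills in only the header leaves callers
with all-zero capabilities rather than stale buffer contents.
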
 examples/devinfo.c            | 51 +++++++++++++++++++++++++++++++++++++++++++
 include/infiniband/kern-abi.h | 12 +++++++++-
 include/infiniband/verbs.h    | 25 ++++++++++++++++++++-
 man/ibv_query_device_ex.3     | 23 +++++++++++++++++++
 man/ibv_reg_mr.3              |  2 ++
 src/cmd.c                     | 11 ++++++++++
 6 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/examples/devinfo.c b/examples/devinfo.c
index 95e8f83753ca..61cfdf520be6 100644
--- a/examples/devinfo.c
+++ b/examples/devinfo.c
@@ -43,6 +43,7 @@
 #include <netinet/in.h>
 #include <endian.h>
 #include <byteswap.h>
+#include <inttypes.h>
 
 #include <infiniband/verbs.h>
 #include <infiniband/driver.h>
@@ -204,6 +205,54 @@ static const char *link_layer_str(uint8_t link_layer)
 	}
 }
 
+void print_odp_trans_caps(uint32_t trans)
+{
+	uint32_t unknown_transport_caps = ~(IBV_ODP_SUPPORT_SEND |
+					    IBV_ODP_SUPPORT_RECV |
+					    IBV_ODP_SUPPORT_WRITE |
+					    IBV_ODP_SUPPORT_READ |
+					    IBV_ODP_SUPPORT_ATOMIC);
+
+	if (!trans) {
+		printf("\t\t\t\t\tNO SUPPORT\n");
+	} else {
+		if (trans & IBV_ODP_SUPPORT_SEND)
+			printf("\t\t\t\t\tSUPPORT_SEND\n");
+		if (trans & IBV_ODP_SUPPORT_RECV)
+			printf("\t\t\t\t\tSUPPORT_RECV\n");
+		if (trans & IBV_ODP_SUPPORT_WRITE)
+			printf("\t\t\t\t\tSUPPORT_WRITE\n");
+		if (trans & IBV_ODP_SUPPORT_READ)
+			printf("\t\t\t\t\tSUPPORT_READ\n");
+		if (trans & IBV_ODP_SUPPORT_ATOMIC)
+			printf("\t\t\t\t\tSUPPORT_ATOMIC\n");
+		if (trans & unknown_transport_caps)
+			printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n",
+			       trans & unknown_transport_caps);
+	}
+}
+
+void print_odp_caps(struct ibv_odp_caps caps)
+{
+	uint64_t unknown_general_caps = ~(IBV_ODP_SUPPORT);
+
+	/* general odp caps */
+	printf("\tgeneral_odp_caps:\n");
+	if (caps.general_caps & IBV_ODP_SUPPORT)
+		printf("\t\t\t\t\tODP_SUPPORT\n");
+	if (caps.general_caps & unknown_general_caps)
+		printf("\t\t\t\t\tUnknown flags: 0x%" PRIX64 "\n",
+		       caps.general_caps & unknown_general_caps);
+
+	/* RC transport */
+	printf("\trc_odp_caps:\n");
+	print_odp_trans_caps(caps.per_transport_caps.rc_odp_caps);
+	printf("\tuc_odp_caps:\n");
+	print_odp_trans_caps(caps.per_transport_caps.uc_odp_caps);
+	printf("\tud_odp_caps:\n");
+	print_odp_trans_caps(caps.per_transport_caps.ud_odp_caps);
+}
+
 static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 {
 	struct ibv_context *ctx;
@@ -296,6 +345,8 @@ static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 		}
 		printf("\tmax_pkeys:\t\t\t%d\n", device_attr.max_pkeys);
 		printf("\tlocal_ca_ack_delay:\t\t%d\n", device_attr.local_ca_ack_delay);
+
+		print_odp_caps(attrx.odp_caps);
 	}
 
 	for (port = 1; port <= device_attr.phys_port_cnt; ++port) {
diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h
index af2a1bebf683..1c0d0d30c612 100644
--- a/include/infiniband/kern-abi.h
+++ b/include/infiniband/kern-abi.h
@@ -254,11 +254,21 @@ struct ibv_query_device_ex {
 	__u32		reserved;
 };
 
+struct ibv_odp_caps_resp {
+	__u64 general_caps;
+	struct {
+		__u32 rc_odp_caps;
+		__u32 uc_odp_caps;
+		__u32 ud_odp_caps;
+	} per_transport_caps;
+	__u32 reserved;
+};
+
 struct ibv_query_device_resp_ex {
 	struct ibv_query_device_resp base;
 	__u32 comp_mask;
 	__u32 response_length;
-	__u64 reserved[3];
+	struct ibv_odp_caps_resp odp_caps;
 };
 
 struct ibv_query_port {
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index ff806bf8555d..ce56315b236e 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -168,9 +168,31 @@ struct ibv_device_attr {
 	uint8_t			phys_port_cnt;
 };
 
+enum ibv_odp_transport_cap_bits {
+	IBV_ODP_SUPPORT_SEND     = 1 << 0,
+	IBV_ODP_SUPPORT_RECV     = 1 << 1,
+	IBV_ODP_SUPPORT_WRITE    = 1 << 2,
+	IBV_ODP_SUPPORT_READ     = 1 << 3,
+	IBV_ODP_SUPPORT_ATOMIC   = 1 << 4,
+};
+
+struct ibv_odp_caps {
+	uint64_t general_caps;
+	struct {
+		uint32_t rc_odp_caps;
+		uint32_t uc_odp_caps;
+		uint32_t ud_odp_caps;
+	} per_transport_caps;
+};
+
+enum ibv_odp_general_caps {
+	IBV_ODP_SUPPORT = 1 << 0,
+};
+
 struct ibv_device_attr_ex {
 	struct ibv_device_attr	orig_attr;
 	uint32_t		comp_mask;
+	struct ibv_odp_caps	odp_caps;
 };
 
 struct ibv_device_attr_ex_resp {
@@ -350,7 +372,8 @@ enum ibv_access_flags {
 	IBV_ACCESS_REMOTE_WRITE		= (1<<1),
 	IBV_ACCESS_REMOTE_READ		= (1<<2),
 	IBV_ACCESS_REMOTE_ATOMIC	= (1<<3),
-	IBV_ACCESS_MW_BIND		= (1<<4)
+	IBV_ACCESS_MW_BIND		= (1<<4),
+	IBV_ACCESS_ON_DEMAND		= (1<<6),
 };
 
 struct ibv_pd {
diff --git a/man/ibv_query_device_ex.3 b/man/ibv_query_device_ex.3
index 6b33f9f92ab1..1f483d276628 100644
--- a/man/ibv_query_device_ex.3
+++ b/man/ibv_query_device_ex.3
@@ -23,8 +23,31 @@ struct ibv_device_attr_ex {
 .in +8
 struct ibv_device_attr orig_attr;
 uint32_t               comp_mask;              /* Compatibility mask that defines which of the following variables are valid */
+struct ibv_odp_caps    odp_caps;               /* On-Demand Paging capabilities */
 .in -8
 };
+
+struct ibv_odp_caps {
+	uint64_t	general_caps;      /* Mask with enum ibv_odp_general_caps */
+	struct {
+		uint32_t	rc_odp_caps;      /* Mask with enum ibv_odp_transport_cap_bits to know which operations are supported. */
+		uint32_t	uc_odp_caps;      /* Mask with enum ibv_odp_transport_cap_bits to know which operations are supported. */
+		uint32_t	ud_odp_caps;      /* Mask with enum ibv_odp_transport_cap_bits to know which operations are supported. */
+	} per_transport_caps;
+};
+
+enum ibv_odp_general_caps {
+        IBV_ODP_SUPPORT = 1 << 0, /* On demand paging is supported */
+};
+
+enum ibv_odp_transport_cap_bits {
+        IBV_ODP_SUPPORT_SEND     = 1 << 0, /* Send operations support on-demand paging */
+        IBV_ODP_SUPPORT_RECV     = 1 << 1, /* Receive operations support on-demand paging */
+        IBV_ODP_SUPPORT_WRITE    = 1 << 2, /* RDMA-Write operations support on-demand paging */
+        IBV_ODP_SUPPORT_READ     = 1 << 3, /* RDMA-Read operations support on-demand paging */
+        IBV_ODP_SUPPORT_ATOMIC   = 1 << 4, /* RDMA-Atomic operations support on-demand paging */
+};
+
 .fi
 .SH "RETURN VALUE"
 .B ibv_query_device_ex()
diff --git a/man/ibv_reg_mr.3 b/man/ibv_reg_mr.3
index 77237716b47c..cf151113070c 100644
--- a/man/ibv_reg_mr.3
+++ b/man/ibv_reg_mr.3
@@ -34,6 +34,8 @@ describes the desired memory protection attributes; it is either 0 or the bitwis
 .B IBV_ACCESS_REMOTE_ATOMIC\fR Enable Remote Atomic Operation Access (if supported)
 .TP
 .B IBV_ACCESS_MW_BIND\fR       Enable Memory Window Binding
+.TP
+.B IBV_ACCESS_ON_DEMAND\fR    Create an on-demand paging MR
 .PP
 If
 .B IBV_ACCESS_REMOTE_WRITE
diff --git a/src/cmd.c b/src/cmd.c
index 47f1acd33d68..215dc0159a2c 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -159,6 +159,17 @@ int ibv_cmd_query_device_ex(struct ibv_context *context,
 			      (struct ibv_query_device_resp *)resp,
 			      raw_fw_ver);
 	attr->comp_mask = 0;
+	if (resp->response_length >= sizeof(*resp)) {
+		attr->odp_caps.general_caps = resp->odp_caps.general_caps;
+		attr->odp_caps.per_transport_caps.rc_odp_caps =
+			resp->odp_caps.per_transport_caps.rc_odp_caps;
+		attr->odp_caps.per_transport_caps.uc_odp_caps =
+			resp->odp_caps.per_transport_caps.uc_odp_caps;
+		attr->odp_caps.per_transport_caps.ud_odp_caps =
+			resp->odp_caps.per_transport_caps.ud_odp_caps;
+	} else {
+		memset(&attr->odp_caps, 0, sizeof(attr->odp_caps));
+	}
 
 	return 0;
 }
-- 
1.7.11.2


* [PATCH 3/3] libibverbs/examples: Support odp in rc_pingpong
       [not found] ` <1440688955-7709-1-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-27 15:22   ` [PATCH 1/3] Add support for extended query device capabilities Haggai Eran
  2015-08-27 15:22   ` [PATCH 2/3] Add on-demand paging support Haggai Eran
@ 2015-08-27 15:22   ` Haggai Eran
  2 siblings, 0 replies; 7+ messages in thread
From: Haggai Eran @ 2015-08-27 15:22 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Majd Dibbiny,
	Haggai Eran

From: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 examples/rc_pingpong.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c
index ddfe8d007e1a..904ec83a633f 100644
--- a/examples/rc_pingpong.c
+++ b/examples/rc_pingpong.c
@@ -55,6 +55,7 @@ enum {
 };
 
 static int page_size;
+static int use_odp;
 
 struct pingpong_context {
 	struct ibv_context	*context;
@@ -315,6 +316,7 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size,
 					    int use_event)
 {
 	struct pingpong_context *ctx;
+	int access_flags = IBV_ACCESS_LOCAL_WRITE;
 
 	ctx = calloc(1, sizeof *ctx);
 	if (!ctx)
@@ -355,7 +357,25 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size,
 		goto clean_comp_channel;
 	}
 
-	ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size, IBV_ACCESS_LOCAL_WRITE);
+	if (use_odp) {
+		const uint32_t rc_caps_mask = IBV_ODP_SUPPORT_SEND |
+					      IBV_ODP_SUPPORT_RECV;
+		struct ibv_device_attr_ex attrx = {};
+
+		if (ibv_query_device_ex(ctx->context, &attrx)) {
+			fprintf(stderr, "Couldn't query device for its features\n");
+			goto clean_comp_channel;
+		}
+
+		if (!(attrx.odp_caps.general_caps & IBV_ODP_SUPPORT) ||
+		    (attrx.odp_caps.per_transport_caps.rc_odp_caps & rc_caps_mask) != rc_caps_mask) {
+			fprintf(stderr, "The device isn't ODP capable or does not support RC send and receive with ODP\n");
+			goto clean_comp_channel;
+		}
+		access_flags |= IBV_ACCESS_ON_DEMAND;
+	}
+	ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size, access_flags);
+
 	if (!ctx->mr) {
 		fprintf(stderr, "Couldn't register MR\n");
 		goto clean_pd;
@@ -540,6 +560,7 @@ static void usage(const char *argv0)
 	printf("  -l, --sl=<sl>          service level value\n");
 	printf("  -e, --events           sleep on CQ events (default poll)\n");
 	printf("  -g, --gid-idx=<gid index> local port gid index\n");
+	printf("  -o, --odp		    use on demand paging\n");
 }
 
 int main(int argc, char *argv[])
@@ -582,11 +603,13 @@ int main(int argc, char *argv[])
 			{ .name = "sl",       .has_arg = 1, .val = 'l' },
 			{ .name = "events",   .has_arg = 0, .val = 'e' },
 			{ .name = "gid-idx",  .has_arg = 1, .val = 'g' },
+			{ .name = "odp",      .has_arg = 0, .val = 'o' },
 			{ 0 }
 		};
 
-		c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:eg:",
+		c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:eg:o",
 							long_options, NULL);
+
 		if (c == -1)
 			break;
 
@@ -643,6 +666,10 @@ int main(int argc, char *argv[])
 			gidx = strtol(optarg, NULL, 0);
 			break;
 
+		case 'o':
+			use_odp = 1;
+			break;
+
 		default:
 			usage(argv[0]);
 			return 1;
-- 
1.7.11.2


* Re: [PATCH 1/3] Add support for extended query device capabilities
       [not found]     ` <1440688955-7709-2-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-09-01  8:28       ` Sagi Grimberg
  0 siblings, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2015-09-01  8:28 UTC
  To: Haggai Eran, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer

On 8/27/2015 6:22 PM, Haggai Eran wrote:
> From: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> Add the verb ibv_query_device_ex, which is extensible and allows later
> commits to report additional device properties.
>
> Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

This looks good, and I'll need it soon too.

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

* Re: [PATCH 2/3] Add on-demand paging support
       [not found]     ` <1440688955-7709-3-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-09-02 19:17       ` Sagi Grimberg
       [not found]         ` <55E74B5B.90805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2015-09-02 19:17 UTC (permalink / raw)
  To: Haggai Eran, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Shachar Raindel,
	Majd Dibbiny

On 8/27/2015 6:22 PM, Haggai Eran wrote:
> The on-demand paging feature allows registering memory regions without pinning
> their pages. Unfortunately, the feature doesn't work with all transports and
> all operations. This patch adds the ability to report on-demand paging
> capabilities through ibv_query_device_ex.
>
> The patch also adds the IBV_ACCESS_ON_DEMAND access flag to allow registration
> of on-demand paging enabled memory regions.
>
> Signed-off-by: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
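
As a rough illustration of the capability check described in the quoted commit
message, here is a minimal, self-contained sketch. Every name in it (the EX_*
constants, struct ex_odp_caps, ex_choose_access_flags) is a hypothetical
stand-in defined locally, not the libibverbs definitions from the patch, and
the bit values are assumptions chosen for the example.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the libibverbs API. */
enum { EX_ODP_SUPPORT = 1u << 0 };              /* device-wide ODP support */

enum {                                          /* per-transport operations */
	EX_ODP_SUPPORT_SEND  = 1u << 0,
	EX_ODP_SUPPORT_RECV  = 1u << 1,
	EX_ODP_SUPPORT_WRITE = 1u << 2,
	EX_ODP_SUPPORT_READ  = 1u << 3
};

enum ex_transport { EX_RC, EX_UC, EX_UD, EX_TRANSPORT_COUNT };

enum {                                          /* assumed access-flag bits */
	EX_ACCESS_LOCAL_WRITE = 1u << 0,
	EX_ACCESS_ON_DEMAND   = 1u << 6
};

struct ex_odp_caps {
	uint64_t general_caps;
	uint32_t per_transport_caps[EX_TRANSPORT_COUNT];
};

/* Return 1 only if ODP is supported and covers every operation we need. */
int ex_odp_usable(const struct ex_odp_caps *caps, enum ex_transport t,
		  uint32_t needed_ops)
{
	if (!(caps->general_caps & EX_ODP_SUPPORT))
		return 0;
	return (caps->per_transport_caps[t] & needed_ops) == needed_ops;
}

/* Request on-demand registration only when the transport can handle it;
 * otherwise fall back to an ordinary (pinned) registration. */
uint32_t ex_choose_access_flags(const struct ex_odp_caps *caps,
				enum ex_transport t, uint32_t needed_ops)
{
	uint32_t access = EX_ACCESS_LOCAL_WRITE;

	if (ex_odp_usable(caps, t, needed_ops))
		access |= EX_ACCESS_ON_DEMAND;
	return access;
}
```

The point of the sketch is that an application would query the capabilities
first and add the on-demand flag only for a transport and operation set the
device reports as supported.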

Looks good,

Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

I have a patch to add ODP support to TGT user-space target.
The performance gain is clear-cut.

Doug, can we get this in?

Sagi.

* Re: [PATCH 2/3] Add on-demand paging support
       [not found]         ` <55E74B5B.90805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-09-03  6:46           ` Haggai Eran
  0 siblings, 0 replies; 7+ messages in thread
From: Haggai Eran @ 2015-09-03  6:46 UTC (permalink / raw)
  To: Sagi Grimberg, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Eli Cohen, Matan Barak,
	Yevgeny Petrilin, Eran Ben Elisha, Moshe Lazer, Shachar Raindel,
	Majd Dibbiny

On 02/09/2015 22:17, Sagi Grimberg wrote:
> On 8/27/2015 6:22 PM, Haggai Eran wrote:
>> The on-demand paging feature allows registering memory regions without
>> pinning their pages. Unfortunately, the feature doesn't work with all
>> transports and all operations. This patch adds the ability to report
>> on-demand paging capabilities through ibv_query_device_ex.
>>
>> The patch also adds the IBV_ACCESS_ON_DEMAND access flag to allow
>> registration of on-demand paging enabled memory regions.
>>
>> Signed-off-by: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
> 
> Looks good,
> 
> Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> I have a patch to add ODP support to TGT user-space target.
> The performance gain is clear-cut.
> 
> Doug, can we get this in?

I received some comments from Moshe Lazer about the first patch in this
series that I should address. Mainly, the ibv_query_device_ex() verb as
defined by the patch takes only an output struct: it has no extensible
input struct, and it is not passed the length of the output struct.
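
To make the two missing pieces concrete, here is a minimal, self-contained
sketch of a query call that takes both an extensible input struct and the
length of the caller's output struct. All names (ex_query_device, the
ex_query_* structs, EX_OUT_HAS_ODP_CAPS) and all values are hypothetical and
defined locally for illustration; this is not the interface from the patch or
from the planned v1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration only; not the libibverbs API. */
#define EX_OUT_HAS_ODP_CAPS (1u << 0)

struct ex_query_input {
	uint32_t comp_mask;     /* marks optional input fields; none defined yet */
};

struct ex_query_output_v0 {     /* layout an older caller was compiled against */
	uint32_t comp_mask;
	uint32_t max_qp;
};

struct ex_query_output {        /* current layout: new fields only at the end */
	uint32_t comp_mask;     /* marks which optional fields were filled */
	uint32_t max_qp;
	uint64_t odp_caps;      /* added after v0 */
};

/* Fill at most out_len bytes, so old and new callers can share one entry
 * point. A real implementation would copy whole fields only. */
int ex_query_device(const struct ex_query_input *in, void *out, size_t out_len)
{
	struct ex_query_output full = { 0, 1024, 0x1 };  /* made-up values */
	size_t n;

	if (in != NULL && in->comp_mask != 0)
		return -1;      /* input extension this version does not know */
	if (out == NULL || out_len < sizeof(struct ex_query_output_v0))
		return -1;      /* too short even for the base fields */

	if (out_len >= sizeof(full))
		full.comp_mask |= EX_OUT_HAS_ODP_CAPS;

	memset(out, 0, out_len);        /* zero any tail we do not know about */
	n = out_len < sizeof(full) ? out_len : sizeof(full);
	memcpy(out, &full, n);
	return 0;
}
```

A caller built against the older layout passes sizeof(struct
ex_query_output_v0) and gets only the base fields, while a newer caller passes
the full size and checks comp_mask before reading odp_caps.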

I'll address these comments and send a v1.

Haggai


end of thread, other threads:[~2015-09-03  6:46 UTC | newest]

Thread overview: 7+ messages
2015-08-27 15:22 [PATCH 0/3] libibverbs: On-demand paging support Haggai Eran
     [not found] ` <1440688955-7709-1-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-27 15:22   ` [PATCH 1/3] Add support for extended query device capabilities Haggai Eran
     [not found]     ` <1440688955-7709-2-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-09-01  8:28       ` Sagi Grimberg
2015-08-27 15:22   ` [PATCH 2/3] Add on-demand paging support Haggai Eran
     [not found]     ` <1440688955-7709-3-git-send-email-haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-09-02 19:17       ` Sagi Grimberg
     [not found]         ` <55E74B5B.90805-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-09-03  6:46           ` Haggai Eran
2015-08-27 15:22   ` [PATCH 3/3] libibverbs/examples: Support odp in rc_pingpong Haggai Eran