All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rdma-core 0/5] Add user space dma-buf support
@ 2020-11-23 17:52 ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:52 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

This is the user space counter-part of the kernel patch set to add
dma-buf support to the RDMA subsystem. This patch series adds user
space API for registering dma-buf based memory regions, updates
pyverbs with the new API, and adds new tests.

This series consists of five patches. The first patch adds the new API
function and updates the man pages. Patch 2 implements the new API in
the mlx5 provider. Patch 3 adds new class definitions to pyverbs for
the new API. Patch 4 adds new tests for the new API. Patch 5 fixes bug
in the utility code of the tests.

Jianxin Xiong (5):
  verbs: Support dma-buf based memory region
  mlx5: Support dma-buf based memory region
  pyverbs: Add dma-buf based MR support
  tests: Add tests for dma-buf based memory regions
  tests: Bug fix for get_access_flags()

 kernel-headers/rdma/ib_user_ioctl_cmds.h |  14 ++++
 libibverbs/cmd_mr.c                      |  38 +++++++++
 libibverbs/driver.h                      |   7 ++
 libibverbs/dummy_ops.c                   |  11 +++
 libibverbs/libibverbs.map.in             |   6 ++
 libibverbs/man/ibv_reg_mr.3              |  21 ++++-
 libibverbs/verbs.c                       |  19 +++++
 libibverbs/verbs.h                       |  10 +++
 providers/mlx5/mlx5.c                    |   2 +
 providers/mlx5/mlx5.h                    |   3 +
 providers/mlx5/verbs.c                   |  23 ++++++
 pyverbs/CMakeLists.txt                   |   2 +
 pyverbs/dmabuf.pxd                       |  13 ++++
 pyverbs/dmabuf.pyx                       |  58 ++++++++++++++
 pyverbs/libibverbs.pxd                   |   2 +
 pyverbs/mr.pxd                           |   5 ++
 pyverbs/mr.pyx                           |  77 +++++++++++++++++-
 tests/test_mr.py                         | 130 ++++++++++++++++++++++++++++++-
 tests/utils.py                           |  31 +++++++-
 19 files changed, 464 insertions(+), 8 deletions(-)
 create mode 100644 pyverbs/dmabuf.pxd
 create mode 100644 pyverbs/dmabuf.pyx

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 0/5] Add user space dma-buf support
@ 2020-11-23 17:52 ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:52 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

This is the user space counter-part of the kernel patch set to add
dma-buf support to the RDMA subsystem. This patch series adds user
space API for registering dma-buf based memory regions, updates
pyverbs with the new API, and adds new tests.

This series consists of five patches. The first patch adds the new API
function and updates the man pages. Patch 2 implements the new API in
the mlx5 provider. Patch 3 adds new class definitions to pyverbs for
the new API. Patch 4 adds new tests for the new API. Patch 5 fixes bug
in the utility code of the tests.

Jianxin Xiong (5):
  verbs: Support dma-buf based memory region
  mlx5: Support dma-buf based memory region
  pyverbs: Add dma-buf based MR support
  tests: Add tests for dma-buf based memory regions
  tests: Bug fix for get_access_flags()

 kernel-headers/rdma/ib_user_ioctl_cmds.h |  14 ++++
 libibverbs/cmd_mr.c                      |  38 +++++++++
 libibverbs/driver.h                      |   7 ++
 libibverbs/dummy_ops.c                   |  11 +++
 libibverbs/libibverbs.map.in             |   6 ++
 libibverbs/man/ibv_reg_mr.3              |  21 ++++-
 libibverbs/verbs.c                       |  19 +++++
 libibverbs/verbs.h                       |  10 +++
 providers/mlx5/mlx5.c                    |   2 +
 providers/mlx5/mlx5.h                    |   3 +
 providers/mlx5/verbs.c                   |  23 ++++++
 pyverbs/CMakeLists.txt                   |   2 +
 pyverbs/dmabuf.pxd                       |  13 ++++
 pyverbs/dmabuf.pyx                       |  58 ++++++++++++++
 pyverbs/libibverbs.pxd                   |   2 +
 pyverbs/mr.pxd                           |   5 ++
 pyverbs/mr.pyx                           |  77 +++++++++++++++++-
 tests/test_mr.py                         | 130 ++++++++++++++++++++++++++++++-
 tests/utils.py                           |  31 +++++++-
 19 files changed, 464 insertions(+), 8 deletions(-)
 create mode 100644 pyverbs/dmabuf.pxd
 create mode 100644 pyverbs/dmabuf.pyx

-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
  2020-11-23 17:52 ` Jianxin Xiong
@ 2020-11-23 17:53   ` Jianxin Xiong
  -1 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

Add new API function and new provider method for registering dma-buf
based memory region. Update the man page and bump the API version.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
 libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
 libibverbs/driver.h                      |  7 ++++++
 libibverbs/dummy_ops.c                   | 11 +++++++++
 libibverbs/libibverbs.map.in             |  6 +++++
 libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
 libibverbs/verbs.c                       | 19 ++++++++++++++++
 libibverbs/verbs.h                       | 10 +++++++++
 8 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
index 7968a18..dafc7eb 100644
--- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
+++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
+ * Copyright (c) 2020, Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -251,6 +252,7 @@ enum uverbs_methods_mr {
 	UVERBS_METHOD_MR_DESTROY,
 	UVERBS_METHOD_ADVISE_MR,
 	UVERBS_METHOD_QUERY_MR,
+	UVERBS_METHOD_REG_DMABUF_MR,
 };
 
 enum uverbs_attrs_mr_destroy_ids {
@@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
 	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
 };
 
+enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
+	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
+	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
+	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
+	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
+	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
+	UVERBS_ATTR_REG_DMABUF_MR_FD,
+	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
+	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
+	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
+};
+
 enum uverbs_attrs_create_counters_cmd_attr_ids {
 	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
 };
diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
index 42dbe42..91ce2ef 100644
--- a/libibverbs/cmd_mr.c
+++ b/libibverbs/cmd_mr.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
 	return 0;
 }
 
+int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+			  uint64_t iova, int fd, int access,
+			  struct verbs_mr *vmr)
+{
+	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
+			       UVERBS_METHOD_REG_DMABUF_MR,
+			       9);
+	struct ib_uverbs_attr *handle;
+	uint32_t lkey, rkey;
+	int ret;
+
+	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
+
+	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
+
+	ret = execute_ioctl(pd->context, cmdb);
+	if (ret)
+		return errno;
+
+	vmr->ibv_mr.handle =
+		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
+	vmr->ibv_mr.context = pd->context;
+	vmr->ibv_mr.lkey    = lkey;
+	vmr->ibv_mr.rkey    = rkey;
+	vmr->ibv_mr.pd	    = pd;
+	vmr->ibv_mr.addr    = (void *)offset;
+	vmr->ibv_mr.length  = length;
+	vmr->mr_type        = IBV_MR_TYPE_MR;
+	return 0;
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index ab80f4b..d6a9d0a 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -2,6 +2,7 @@
  * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
  * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
  * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -373,6 +374,9 @@ struct verbs_context_ops {
 	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
 				    uint64_t dm_offset, size_t length,
 				    unsigned int access);
+	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
+					size_t length, uint64_t iova,
+					int fd, int access);
 	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
 				 uint64_t hca_va, int access);
 	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only);
@@ -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
 		      uint32_t flags,
 		      struct ibv_sge *sg_list,
 		      uint32_t num_sge);
+int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+			  uint64_t iova, int fd, int access,
+			  struct verbs_mr *vmr);
 int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
 		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
 		     size_t cmd_size,
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index e5af9e4..dca96d2 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 	return NULL;
 }
 
+static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
+				    size_t length, uint64_t iova,
+				    int fd, int access)
+{
+	errno = ENOSYS;
+	return NULL;
+}
+
 static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
 {
 	return EOPNOTSUPP;
@@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	query_srq,
 	read_counters,
 	reg_dm_mr,
+	reg_dmabuf_mr,
 	reg_mr,
 	req_notify_cq,
 	rereg_mr,
@@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_PRIV_OP_IC(vctx, set_ece);
 	SET_PRIV_OP_IC(vctx, unimport_mr);
 	SET_PRIV_OP_IC(vctx, unimport_pd);
+	SET_OP(vctx, reg_dmabuf_mr);
 
 #undef SET_OP
 #undef SET_OP2
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index b5ccaca..f67e1ef 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -148,6 +148,11 @@ IBVERBS_1.11 {
 		_ibv_query_gid_table;
 } IBVERBS_1.10;
 
+IBVERBS_1.12 {
+	global:
+		ibv_reg_dmabuf_mr;
+} IBVERBS_1.11;
+
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
    version. See the top level CMakeLists.txt for this setting. */
 
@@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_query_srq;
 		ibv_cmd_read_counters;
 		ibv_cmd_reg_dm_mr;
+		ibv_cmd_reg_dmabuf_mr;
 		ibv_cmd_reg_mr;
 		ibv_cmd_req_notify_cq;
 		ibv_cmd_rereg_mr;
diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
index 2bfc955..4975c79 100644
--- a/libibverbs/man/ibv_reg_mr.3
+++ b/libibverbs/man/ibv_reg_mr.3
@@ -3,7 +3,7 @@
 .\"
 .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
 .SH "NAME"
-ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory region (MR)
+ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \- register or deregister a memory region (MR)
 .SH "SYNOPSIS"
 .nf
 .B #include <infiniband/verbs.h>
@@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
 .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
 .BI "                               int " "access" );
 .sp
+.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
+.BI "                                 size_t " "length" ", int " "access" );
+.sp
 .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
 .fi
 .SH "DESCRIPTION"
@@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
 (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
 IBV_ACCESS_ZERO_BASED.
 .PP
+.B ibv_reg_dmabuf_mr()
+registers a dma-buf based memory region (MR) associated with the protection domain
+.I pd\fR.
+The MR starts at
+.I offset
+of the dma-buf and its size is
+.I length\fR.
+The dma-buf is identified by the file descriptor
+.I fd\fR.
+The argument
+.I access
+describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are supported:
+.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC, IBV_ACCESS_RELAXED_ORDERING.
+.PP
 .B ibv_dereg_mr()
 deregisters the MR
 .I mr\fR.
 .SH "RETURN VALUE"
-.B ibv_reg_mr() / ibv_reg_mr_iova()
+.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
 returns a pointer to the registered MR, or NULL if the request fails.
 The local key (\fBL_Key\fR) field
 .B lkey
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 2b0ede8..84ddac7 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (c) 2005 Topspin Communications.  All rights reserved.
  * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corperation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
 	get_ops(mr->context)->unimport_mr(mr);
 }
 
+LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
+		   struct ibv_mr *,
+		   struct ibv_pd *pd, uint64_t offset,
+		   size_t length, int fd, int access)
+{
+	struct ibv_mr *mr;
+
+	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
+						 fd, access);
+	if (mr) {
+		mr->context = pd->context;
+		mr->pd      = pd;
+		mr->addr    = (void *)offset;
+		mr->length  = length;
+	}
+	return mr;
+}
+
 LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_mr *mr, int flags,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index ee57e05..3961b1e 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -3,6 +3,7 @@
  * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
  * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
  * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -2133,6 +2134,9 @@ struct verbs_context {
 	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
 					     struct ibv_xrcd_init_attr *xrcd_init_attr);
 	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
+	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
+						 size_t length, uint64_t iova,
+						 int fd, int access);
 	uint64_t _ABI_placeholder3;
 	size_t   sz;			/* Must be immediately before struct ibv_context */
 	struct ibv_context context;	/* Must be last field in the struct */
@@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
 			  __builtin_constant_p(                                \
 				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
 
+/**
+ * ibv_reg_dmabuf_mr - Register a dambuf-based memory region
+ */
+struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				 int fd, int access);
+
 enum ibv_rereg_mr_err_code {
 	/* Old MR is valid, invalid input */
 	IBV_REREG_MR_ERR_INPUT = -1,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
@ 2020-11-23 17:53   ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

Add new API function and new provider method for registering dma-buf
based memory region. Update the man page and bump the API version.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
 libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
 libibverbs/driver.h                      |  7 ++++++
 libibverbs/dummy_ops.c                   | 11 +++++++++
 libibverbs/libibverbs.map.in             |  6 +++++
 libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
 libibverbs/verbs.c                       | 19 ++++++++++++++++
 libibverbs/verbs.h                       | 10 +++++++++
 8 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
index 7968a18..dafc7eb 100644
--- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
+++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
+ * Copyright (c) 2020, Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -251,6 +252,7 @@ enum uverbs_methods_mr {
 	UVERBS_METHOD_MR_DESTROY,
 	UVERBS_METHOD_ADVISE_MR,
 	UVERBS_METHOD_QUERY_MR,
+	UVERBS_METHOD_REG_DMABUF_MR,
 };
 
 enum uverbs_attrs_mr_destroy_ids {
@@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
 	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
 };
 
+enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
+	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
+	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
+	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
+	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
+	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
+	UVERBS_ATTR_REG_DMABUF_MR_FD,
+	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
+	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
+	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
+};
+
 enum uverbs_attrs_create_counters_cmd_attr_ids {
 	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
 };
diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
index 42dbe42..91ce2ef 100644
--- a/libibverbs/cmd_mr.c
+++ b/libibverbs/cmd_mr.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
 	return 0;
 }
 
+int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+			  uint64_t iova, int fd, int access,
+			  struct verbs_mr *vmr)
+{
+	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
+			       UVERBS_METHOD_REG_DMABUF_MR,
+			       9);
+	struct ib_uverbs_attr *handle;
+	uint32_t lkey, rkey;
+	int ret;
+
+	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
+
+	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
+	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
+
+	ret = execute_ioctl(pd->context, cmdb);
+	if (ret)
+		return errno;
+
+	vmr->ibv_mr.handle =
+		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
+	vmr->ibv_mr.context = pd->context;
+	vmr->ibv_mr.lkey    = lkey;
+	vmr->ibv_mr.rkey    = rkey;
+	vmr->ibv_mr.pd	    = pd;
+	vmr->ibv_mr.addr    = (void *)offset;
+	vmr->ibv_mr.length  = length;
+	vmr->mr_type        = IBV_MR_TYPE_MR;
+	return 0;
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index ab80f4b..d6a9d0a 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -2,6 +2,7 @@
  * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
  * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
  * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -373,6 +374,9 @@ struct verbs_context_ops {
 	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
 				    uint64_t dm_offset, size_t length,
 				    unsigned int access);
+	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
+					size_t length, uint64_t iova,
+					int fd, int access);
 	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
 				 uint64_t hca_va, int access);
 	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only);
@@ -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
 		      uint32_t flags,
 		      struct ibv_sge *sg_list,
 		      uint32_t num_sge);
+int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+			  uint64_t iova, int fd, int access,
+			  struct verbs_mr *vmr);
 int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
 		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
 		     size_t cmd_size,
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index e5af9e4..dca96d2 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 	return NULL;
 }
 
+static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
+				    size_t length, uint64_t iova,
+				    int fd, int access)
+{
+	errno = ENOSYS;
+	return NULL;
+}
+
 static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
 {
 	return EOPNOTSUPP;
@@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	query_srq,
 	read_counters,
 	reg_dm_mr,
+	reg_dmabuf_mr,
 	reg_mr,
 	req_notify_cq,
 	rereg_mr,
@@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_PRIV_OP_IC(vctx, set_ece);
 	SET_PRIV_OP_IC(vctx, unimport_mr);
 	SET_PRIV_OP_IC(vctx, unimport_pd);
+	SET_OP(vctx, reg_dmabuf_mr);
 
 #undef SET_OP
 #undef SET_OP2
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index b5ccaca..f67e1ef 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -148,6 +148,11 @@ IBVERBS_1.11 {
 		_ibv_query_gid_table;
 } IBVERBS_1.10;
 
+IBVERBS_1.12 {
+	global:
+		ibv_reg_dmabuf_mr;
+} IBVERBS_1.11;
+
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
    version. See the top level CMakeLists.txt for this setting. */
 
@@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_query_srq;
 		ibv_cmd_read_counters;
 		ibv_cmd_reg_dm_mr;
+		ibv_cmd_reg_dmabuf_mr;
 		ibv_cmd_reg_mr;
 		ibv_cmd_req_notify_cq;
 		ibv_cmd_rereg_mr;
diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
index 2bfc955..4975c79 100644
--- a/libibverbs/man/ibv_reg_mr.3
+++ b/libibverbs/man/ibv_reg_mr.3
@@ -3,7 +3,7 @@
 .\"
 .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
 .SH "NAME"
-ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory region (MR)
+ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \- register or deregister a memory region (MR)
 .SH "SYNOPSIS"
 .nf
 .B #include <infiniband/verbs.h>
@@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
 .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
 .BI "                               int " "access" );
 .sp
+.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
+.BI "                                 size_t " "length" ", int " "access" );
+.sp
 .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
 .fi
 .SH "DESCRIPTION"
@@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
 (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
 IBV_ACCESS_ZERO_BASED.
 .PP
+.B ibv_reg_dmabuf_mr()
+registers a dma-buf based memory region (MR) associated with the protection domain
+.I pd\fR.
+The MR starts at
+.I offset
+of the dma-buf and its size is
+.I length\fR.
+The dma-buf is identified by the file descriptor
+.I fd\fR.
+The argument
+.I access
+describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are supported:
+.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC, IBV_ACCESS_RELAXED_ORDERING.
+.PP
 .B ibv_dereg_mr()
 deregisters the MR
 .I mr\fR.
 .SH "RETURN VALUE"
-.B ibv_reg_mr() / ibv_reg_mr_iova()
+.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
 returns a pointer to the registered MR, or NULL if the request fails.
 The local key (\fBL_Key\fR) field
 .B lkey
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 2b0ede8..84ddac7 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (c) 2005 Topspin Communications.  All rights reserved.
  * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corperation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
 	get_ops(mr->context)->unimport_mr(mr);
 }
 
+LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
+		   struct ibv_mr *,
+		   struct ibv_pd *pd, uint64_t offset,
+		   size_t length, int fd, int access)
+{
+	struct ibv_mr *mr;
+
+	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
+						 fd, access);
+	if (mr) {
+		mr->context = pd->context;
+		mr->pd      = pd;
+		mr->addr    = (void *)offset;
+		mr->length  = length;
+	}
+	return mr;
+}
+
 LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_mr *mr, int flags,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index ee57e05..3961b1e 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -3,6 +3,7 @@
  * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
  * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
  * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -2133,6 +2134,9 @@ struct verbs_context {
 	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
 					     struct ibv_xrcd_init_attr *xrcd_init_attr);
 	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
+	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
+						 size_t length, uint64_t iova,
+						 int fd, int access);
 	uint64_t _ABI_placeholder3;
 	size_t   sz;			/* Must be immediately before struct ibv_context */
 	struct ibv_context context;	/* Must be last field in the struct */
@@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
 			  __builtin_constant_p(                                \
 				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
 
+/**
+ * ibv_reg_dmabuf_mr - Register a dambuf-based memory region
+ */
+struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				 int fd, int access);
+
 enum ibv_rereg_mr_err_code {
 	/* Old MR is valid, invalid input */
 	IBV_REREG_MR_ERR_INPUT = -1,
-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
  2020-11-23 17:52 ` Jianxin Xiong
@ 2020-11-23 17:53   ` Jianxin Xiong
  -1 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

Implement the new provider method for registering dma-buf based memory
regions.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 providers/mlx5/mlx5.c  |  2 ++
 providers/mlx5/mlx5.h  |  3 +++
 providers/mlx5/verbs.c | 23 +++++++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 1378acf..b3e2d57 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -96,6 +97,7 @@ static const struct verbs_context_ops mlx5_ctx_common_ops = {
 	.async_event   = mlx5_async_event,
 	.dealloc_pd    = mlx5_free_pd,
 	.reg_mr	       = mlx5_reg_mr,
+	.reg_dmabuf_mr = mlx5_reg_dmabuf_mr,
 	.rereg_mr      = mlx5_rereg_mr,
 	.dereg_mr      = mlx5_dereg_mr,
 	.alloc_mw      = mlx5_alloc_mw,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index 8c94f72..17a2470 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -903,6 +904,8 @@ void mlx5_async_event(struct ibv_context *context,
 struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd);
 struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 			   uint64_t hca_va, int access);
+struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				  uint64_t iova, int fd, int access);
 int mlx5_rereg_mr(struct verbs_mr *mr, int flags, struct ibv_pd *pd, void *addr,
 		  size_t length, int access);
 int mlx5_dereg_mr(struct verbs_mr *mr);
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index b956156..6279993 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -647,6 +648,28 @@ struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 	return &mr->vmr.ibv_mr;
 }
 
+struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				  uint64_t iova, int fd, int acc)
+{
+	struct mlx5_mr *mr;
+	int ret;
+	enum ibv_access_flags access = (enum ibv_access_flags)acc;
+
+	mr = calloc(1, sizeof(*mr));
+	if (!mr)
+		return NULL;
+
+	ret = ibv_cmd_reg_dmabuf_mr(pd, offset, length, iova, fd, access,
+				    &mr->vmr);
+	if (ret) {
+		free(mr);
+		return NULL;
+	}
+	mr->alloc_flags = acc;
+
+	return &mr->vmr.ibv_mr;
+}
+
 struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd)
 {
 	struct mlx5_mr *mr;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
@ 2020-11-23 17:53   ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

Implement the new provider method for registering dma-buf based memory
regions.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 providers/mlx5/mlx5.c  |  2 ++
 providers/mlx5/mlx5.h  |  3 +++
 providers/mlx5/verbs.c | 23 +++++++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 1378acf..b3e2d57 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -96,6 +97,7 @@ static const struct verbs_context_ops mlx5_ctx_common_ops = {
 	.async_event   = mlx5_async_event,
 	.dealloc_pd    = mlx5_free_pd,
 	.reg_mr	       = mlx5_reg_mr,
+	.reg_dmabuf_mr = mlx5_reg_dmabuf_mr,
 	.rereg_mr      = mlx5_rereg_mr,
 	.dereg_mr      = mlx5_dereg_mr,
 	.alloc_mw      = mlx5_alloc_mw,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index 8c94f72..17a2470 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -903,6 +904,8 @@ void mlx5_async_event(struct ibv_context *context,
 struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd);
 struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 			   uint64_t hca_va, int access);
+struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				  uint64_t iova, int fd, int access);
 int mlx5_rereg_mr(struct verbs_mr *mr, int flags, struct ibv_pd *pd, void *addr,
 		  size_t length, int access);
 int mlx5_dereg_mr(struct verbs_mr *mr);
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index b956156..6279993 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2012 Mellanox Technologies, Inc.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -647,6 +648,28 @@ struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
 	return &mr->vmr.ibv_mr;
 }
 
+struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
+				  uint64_t iova, int fd, int acc)
+{
+	struct mlx5_mr *mr;
+	int ret;
+	enum ibv_access_flags access = (enum ibv_access_flags)acc;
+
+	mr = calloc(1, sizeof(*mr));
+	if (!mr)
+		return NULL;
+
+	ret = ibv_cmd_reg_dmabuf_mr(pd, offset, length, iova, fd, access,
+				    &mr->vmr);
+	if (ret) {
+		free(mr);
+		return NULL;
+	}
+	mr->alloc_flags = acc;
+
+	return &mr->vmr.ibv_mr;
+}
+
 struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd)
 {
 	struct mlx5_mr *mr;
-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-23 17:52 ` Jianxin Xiong
@ 2020-11-23 17:53   ` Jianxin Xiong
  -1 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

Define a new sub-class of 'MR' that uses dma-buf object for the memory
region. Define a new class 'DmaBuf' for dma-buf object allocation.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 pyverbs/CMakeLists.txt |  2 ++
 pyverbs/dmabuf.pxd     | 13 +++++++++
 pyverbs/dmabuf.pyx     | 58 +++++++++++++++++++++++++++++++++++++
 pyverbs/libibverbs.pxd |  2 ++
 pyverbs/mr.pxd         |  5 ++++
 pyverbs/mr.pyx         | 77 ++++++++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 pyverbs/dmabuf.pxd
 create mode 100644 pyverbs/dmabuf.pyx

diff --git a/pyverbs/CMakeLists.txt b/pyverbs/CMakeLists.txt
index 9542c4b..5aee02b 100644
--- a/pyverbs/CMakeLists.txt
+++ b/pyverbs/CMakeLists.txt
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 rdma_cython_module(pyverbs ""
   addr.pyx
@@ -16,6 +17,7 @@ rdma_cython_module(pyverbs ""
   wr.pyx
   xrcd.pyx
   srq.pyx
+  dmabuf.pyx
   )
 
 rdma_python_module(pyverbs
diff --git a/pyverbs/dmabuf.pxd b/pyverbs/dmabuf.pxd
new file mode 100644
index 0000000..040db4b
--- /dev/null
+++ b/pyverbs/dmabuf.pxd
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+
+#cython: language_level=3
+
+cdef class DmaBuf:
+    cdef int dri_fd
+    cdef int handle
+    cdef int fd
+    cdef unsigned long size
+    cdef unsigned long map_offset
+    cdef object dmabuf_mrs
+    cdef add_ref(self, obj)
diff --git a/pyverbs/dmabuf.pyx b/pyverbs/dmabuf.pyx
new file mode 100644
index 0000000..6c7622d
--- /dev/null
+++ b/pyverbs/dmabuf.pyx
@@ -0,0 +1,58 @@
+# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+
+#cython: language_level=3
+
+import weakref
+
+from os import open, close, O_RDWR
+from fcntl import ioctl
+from struct import pack_into, unpack
+from pyverbs.base cimport close_weakrefs
+from pyverbs.mr cimport DmaBufMR
+
+cdef extern from "drm/drm.h":
+    cdef int DRM_IOCTL_MODE_CREATE_DUMB
+    cdef int DRM_IOCTL_MODE_MAP_DUMB
+    cdef int DRM_IOCTL_MODE_DESTROY_DUMB
+    cdef int DRM_IOCTL_PRIME_HANDLE_TO_FD
+
+cdef class DmaBuf:
+    def __init__(self, size, unit=0):
+        """
+        Allocate DmaBuf object from a GPU device. This is done through the
+        DRI device interface (/dev/dri/card*). Usually this requires the
+        effective user id being root or being a member of the 'video' group.
+        :param size: The size (in number of bytes) of the buffer.
+        :param unit: The unit number of the GPU to allocate the buffer from.
+        :return: The newly created DmaBuf object on success.
+        """
+        self.dmabuf_mrs = weakref.WeakSet()
+        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
+
+        args = bytearray(32)
+        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
+        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
+
+        args = bytearray(12)
+        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
+        a, b, self.fd = unpack('=iii', args)
+
+        args = bytearray(16)
+        pack_into('=iiq', args, 0, self.handle, 0, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
+        a, b, self.map_offset = unpack('=iiq', args);
+
+    def __dealloc__(self):
+        close_weakrefs([self.dmabuf_mrs])
+        args = bytearray(4)
+        pack_into('=i', args, 0, self.handle)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_DESTROY_DUMB, args)
+        close(self.dri_fd)
+
+    cdef add_ref(self, obj):
+        if isinstance(obj, DmaBufMR):
+            self.dmabuf_mrs.add(obj)
+
diff --git a/pyverbs/libibverbs.pxd b/pyverbs/libibverbs.pxd
index 6fbba54..95e51e1 100644
--- a/pyverbs/libibverbs.pxd
+++ b/pyverbs/libibverbs.pxd
@@ -507,6 +507,8 @@ cdef extern from 'infiniband/verbs.h':
     ibv_pd *ibv_alloc_pd(ibv_context *context)
     int ibv_dealloc_pd(ibv_pd *pd)
     ibv_mr *ibv_reg_mr(ibv_pd *pd, void *addr, size_t length, int access)
+    ibv_mr *ibv_reg_dmabuf_mr(ibv_pd *pd, uint64_t offset, size_t length,
+			      int fd, int access)
     int ibv_dereg_mr(ibv_mr *mr)
     int ibv_advise_mr(ibv_pd *pd, uint32_t advice, uint32_t flags,
                       ibv_sge *sg_list, uint32_t num_sge)
diff --git a/pyverbs/mr.pxd b/pyverbs/mr.pxd
index ebe8ada..b89cf02 100644
--- a/pyverbs/mr.pxd
+++ b/pyverbs/mr.pxd
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 #cython: language_level=3
 
@@ -33,3 +34,7 @@ cdef class MW(PyverbsCM):
 
 cdef class DMMR(MR):
     cdef object dm
+
+cdef class DmaBufMR(MR):
+    cdef object dmabuf
+    cdef unsigned long offset
diff --git a/pyverbs/mr.pyx b/pyverbs/mr.pyx
index 7011da1..4102d3c 100644
--- a/pyverbs/mr.pyx
+++ b/pyverbs/mr.pyx
@@ -1,11 +1,12 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 import resource
 import logging
 
 from posix.mman cimport mmap, munmap, MAP_PRIVATE, PROT_READ, PROT_WRITE, \
-    MAP_ANONYMOUS, MAP_HUGETLB
+    MAP_ANONYMOUS, MAP_HUGETLB, MAP_SHARED
 from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError, \
     PyverbsUserError
 from libc.stdint cimport uintptr_t, SIZE_MAX
@@ -14,9 +15,10 @@ from posix.stdlib cimport posix_memalign
 from libc.string cimport memcpy, memset
 cimport pyverbs.libibverbs_enums as e
 from pyverbs.device cimport DM
-from libc.stdlib cimport free
+from libc.stdlib cimport free, malloc
 from .cmid cimport CMID
 from .pd cimport PD
+from .dmabuf cimport DmaBuf
 
 cdef extern from 'sys/mman.h':
     cdef void* MAP_FAILED
@@ -348,6 +350,77 @@ cdef class DMMR(MR):
     cpdef read(self, length, offset):
         return self.dm.copy_from_dm(offset, length)
 
+cdef class DmaBufMR(MR):
+    def __init__(self, PD pd not None, length=0, access=0, DmaBuf dmabuf=None, offset=0):
+        """
+        Initializes a DmaBufMR (DMA-BUF Memory Region) of the given length
+        and access flags using the given PD and DmaBuf objects.
+        :param pd: A PD object
+        :param length: Length in bytes
+        :param access: Access flags, see ibv_access_flags enum
+        :param dmabuf: A DmaBuf object
+        :param offset: Byte offset from the beginning of the dma-buf
+        :return: The newly create DMABUFMR
+        """
+        self.logger = logging.getLogger(self.__class__.__name__)
+        if dmabuf is None:
+            dmabuf = DmaBuf(length + offset)
+        self.mr = v.ibv_reg_dmabuf_mr(pd.pd, offset, length, dmabuf.fd, access)
+        if self.mr == NULL:
+            raise PyverbsRDMAErrno('Failed to register a dma-buf MR. length: {len}, access flags: {flags}'.
+                                   format(len=length, flags=access,))
+        super().__init__(pd, length, access)
+        self.pd = pd
+        self.dmabuf = dmabuf
+        self.offset = offset
+        pd.add_ref(self)
+        dmabuf.add_ref(self)
+        self.logger.debug('Registered dma-buf ibv_mr. Length: {len}, access flags {flags}'.
+                          format(len=length, flags=access))
+
+    def write(self, data, length, offset=0):
+        """
+        Write user data to the dma-buf backing the MR
+        :param data: User data to write
+        :param length: Length of the data to write
+        :param offset: Writing offset
+        :return: None
+        """
+        if isinstance(data, str):
+            data = data.encode()
+        # can't access the attributes if self.dmabuf is used w/o cast
+        dmabuf = <DmaBuf>self.dmabuf
+        cdef int off = offset + self.offset
+        cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE,
+                              MAP_SHARED, dmabuf.dri_fd, dmabuf.map_offset)
+        if buf == MAP_FAILED:
+            raise PyverbsError('Failed to map dma-buf of size {l}'.
+                               format(l=length))
+        memcpy(<char*>(buf + off), <char *>data, length)
+        munmap(buf, length + off)
+
+    cpdef read(self, length, offset):
+        """
+        Reads data from the dma-buf backing the MR
+        :param length: Length of data to read
+        :param offset: Reading offset
+        :return: The data on the buffer in the requested offset
+        """
+        # can't access the attributes if self.dmabuf is used w/o cast
+        dmabuf = <DmaBuf>self.dmabuf
+        cdef int off = offset + self.offset
+        cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE,
+                              MAP_SHARED, dmabuf.dri_fd, dmabuf.map_offset)
+        if buf == MAP_FAILED:
+            raise PyverbsError('Failed to map dma-buf of size {l}'.
+                               format(l=length))
+        cdef char *data =<char*>malloc(length)
+        memset(data, 0, length)
+        memcpy(data, <char*>(buf + off), length)
+        munmap(buf, length + off)
+        res = data[:length]
+        free(data)
+        return res
 
 def mwtype2str(mw_type):
     mw_types = {1:'IBV_MW_TYPE_1', 2:'IBV_MW_TYPE_2'}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-23 17:53   ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

Define a new sub-class of 'MR' that uses dma-buf object for the memory
region. Define a new class 'DmaBuf' for dma-buf object allocation.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 pyverbs/CMakeLists.txt |  2 ++
 pyverbs/dmabuf.pxd     | 13 +++++++++
 pyverbs/dmabuf.pyx     | 58 +++++++++++++++++++++++++++++++++++++
 pyverbs/libibverbs.pxd |  2 ++
 pyverbs/mr.pxd         |  5 ++++
 pyverbs/mr.pyx         | 77 ++++++++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 pyverbs/dmabuf.pxd
 create mode 100644 pyverbs/dmabuf.pyx

diff --git a/pyverbs/CMakeLists.txt b/pyverbs/CMakeLists.txt
index 9542c4b..5aee02b 100644
--- a/pyverbs/CMakeLists.txt
+++ b/pyverbs/CMakeLists.txt
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 rdma_cython_module(pyverbs ""
   addr.pyx
@@ -16,6 +17,7 @@ rdma_cython_module(pyverbs ""
   wr.pyx
   xrcd.pyx
   srq.pyx
+  dmabuf.pyx
   )
 
 rdma_python_module(pyverbs
diff --git a/pyverbs/dmabuf.pxd b/pyverbs/dmabuf.pxd
new file mode 100644
index 0000000..040db4b
--- /dev/null
+++ b/pyverbs/dmabuf.pxd
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+
+#cython: language_level=3
+
+cdef class DmaBuf:
+    cdef int dri_fd
+    cdef int handle
+    cdef int fd
+    cdef unsigned long size
+    cdef unsigned long map_offset
+    cdef object dmabuf_mrs
+    cdef add_ref(self, obj)
diff --git a/pyverbs/dmabuf.pyx b/pyverbs/dmabuf.pyx
new file mode 100644
index 0000000..6c7622d
--- /dev/null
+++ b/pyverbs/dmabuf.pyx
@@ -0,0 +1,58 @@
+# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+
+#cython: language_level=3
+
+import weakref
+
+from os import open, close, O_RDWR
+from fcntl import ioctl
+from struct import pack_into, unpack
+from pyverbs.base cimport close_weakrefs
+from pyverbs.mr cimport DmaBufMR
+
+cdef extern from "drm/drm.h":
+    cdef int DRM_IOCTL_MODE_CREATE_DUMB
+    cdef int DRM_IOCTL_MODE_MAP_DUMB
+    cdef int DRM_IOCTL_MODE_DESTROY_DUMB
+    cdef int DRM_IOCTL_PRIME_HANDLE_TO_FD
+
+cdef class DmaBuf:
+    def __init__(self, size, unit=0):
+        """
+        Allocate DmaBuf object from a GPU device. This is done through the
+        DRI device interface (/dev/dri/card*). Usually this requires the
+        effective user id being root or being a member of the 'video' group.
+        :param size: The size (in number of bytes) of the buffer.
+        :param unit: The unit number of the GPU to allocate the buffer from.
+        :return: The newly created DmaBuf object on success.
+        """
+        self.dmabuf_mrs = weakref.WeakSet()
+        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
+
+        args = bytearray(32)
+        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
+        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
+
+        args = bytearray(12)
+        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
+        a, b, self.fd = unpack('=iii', args)
+
+        args = bytearray(16)
+        pack_into('=iiq', args, 0, self.handle, 0, 0)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
+        a, b, self.map_offset = unpack('=iiq', args);
+
+    def __dealloc__(self):
+        close_weakrefs([self.dmabuf_mrs])
+        args = bytearray(4)
+        pack_into('=i', args, 0, self.handle)
+        ioctl(self.dri_fd, DRM_IOCTL_MODE_DESTROY_DUMB, args)
+        close(self.dri_fd)
+
+    cdef add_ref(self, obj):
+        if isinstance(obj, DmaBufMR):
+            self.dmabuf_mrs.add(obj)
+
diff --git a/pyverbs/libibverbs.pxd b/pyverbs/libibverbs.pxd
index 6fbba54..95e51e1 100644
--- a/pyverbs/libibverbs.pxd
+++ b/pyverbs/libibverbs.pxd
@@ -507,6 +507,8 @@ cdef extern from 'infiniband/verbs.h':
     ibv_pd *ibv_alloc_pd(ibv_context *context)
     int ibv_dealloc_pd(ibv_pd *pd)
     ibv_mr *ibv_reg_mr(ibv_pd *pd, void *addr, size_t length, int access)
+    ibv_mr *ibv_reg_dmabuf_mr(ibv_pd *pd, uint64_t offset, size_t length,
+			      int fd, int access)
     int ibv_dereg_mr(ibv_mr *mr)
     int ibv_advise_mr(ibv_pd *pd, uint32_t advice, uint32_t flags,
                       ibv_sge *sg_list, uint32_t num_sge)
diff --git a/pyverbs/mr.pxd b/pyverbs/mr.pxd
index ebe8ada..b89cf02 100644
--- a/pyverbs/mr.pxd
+++ b/pyverbs/mr.pxd
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 #cython: language_level=3
 
@@ -33,3 +34,7 @@ cdef class MW(PyverbsCM):
 
 cdef class DMMR(MR):
     cdef object dm
+
+cdef class DmaBufMR(MR):
+    cdef object dmabuf
+    cdef unsigned long offset
diff --git a/pyverbs/mr.pyx b/pyverbs/mr.pyx
index 7011da1..4102d3c 100644
--- a/pyverbs/mr.pyx
+++ b/pyverbs/mr.pyx
@@ -1,11 +1,12 @@
 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
 # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
 
 import resource
 import logging
 
 from posix.mman cimport mmap, munmap, MAP_PRIVATE, PROT_READ, PROT_WRITE, \
-    MAP_ANONYMOUS, MAP_HUGETLB
+    MAP_ANONYMOUS, MAP_HUGETLB, MAP_SHARED
 from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError, \
     PyverbsUserError
 from libc.stdint cimport uintptr_t, SIZE_MAX
@@ -14,9 +15,10 @@ from posix.stdlib cimport posix_memalign
 from libc.string cimport memcpy, memset
 cimport pyverbs.libibverbs_enums as e
 from pyverbs.device cimport DM
-from libc.stdlib cimport free
+from libc.stdlib cimport free, malloc
 from .cmid cimport CMID
 from .pd cimport PD
+from .dmabuf cimport DmaBuf
 
 cdef extern from 'sys/mman.h':
     cdef void* MAP_FAILED
@@ -348,6 +350,77 @@ cdef class DMMR(MR):
     cpdef read(self, length, offset):
         return self.dm.copy_from_dm(offset, length)
 
+cdef class DmaBufMR(MR):
+    def __init__(self, PD pd not None, length=0, access=0, DmaBuf dmabuf=None, offset=0):
+        """
+        Initializes a DmaBufMR (DMA-BUF Memory Region) of the given length
+        and access flags using the given PD and DmaBuf objects.
+        :param pd: A PD object
+        :param length: Length in bytes
+        :param access: Access flags, see ibv_access_flags enum
+        :param dmabuf: A DmaBuf object
+        :param offset: Byte offset from the beginning of the dma-buf
+        :return: The newly create DMABUFMR
+        """
+        self.logger = logging.getLogger(self.__class__.__name__)
+        if dmabuf is None:
+            dmabuf = DmaBuf(length + offset)
+        self.mr = v.ibv_reg_dmabuf_mr(pd.pd, offset, length, dmabuf.fd, access)
+        if self.mr == NULL:
+            raise PyverbsRDMAErrno('Failed to register a dma-buf MR. length: {len}, access flags: {flags}'.
+                                   format(len=length, flags=access,))
+        super().__init__(pd, length, access)
+        self.pd = pd
+        self.dmabuf = dmabuf
+        self.offset = offset
+        pd.add_ref(self)
+        dmabuf.add_ref(self)
+        self.logger.debug('Registered dma-buf ibv_mr. Length: {len}, access flags {flags}'.
+                          format(len=length, flags=access))
+
+    def write(self, data, length, offset=0):
+        """
+        Write user data to the dma-buf backing the MR
+        :param data: User data to write
+        :param length: Length of the data to write
+        :param offset: Writing offset
+        :return: None
+        """
+        if isinstance(data, str):
+            data = data.encode()
+        # can't access the attributes if self.dmabuf is used w/o cast
+        dmabuf = <DmaBuf>self.dmabuf
+        cdef int off = offset + self.offset
+        cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE,
+                              MAP_SHARED, dmabuf.dri_fd, dmabuf.map_offset)
+        if buf == MAP_FAILED:
+            raise PyverbsError('Failed to map dma-buf of size {l}'.
+                               format(l=length))
+        memcpy(<char*>(buf + off), <char *>data, length)
+        munmap(buf, length + off)
+
+    cpdef read(self, length, offset):
+        """
+        Reads data from the dma-buf backing the MR
+        :param length: Length of data to read
+        :param offset: Reading offset
+        :return: The data on the buffer in the requested offset
+        """
+        # can't access the attributes if self.dmabuf is used w/o cast
+        dmabuf = <DmaBuf>self.dmabuf
+        cdef int off = offset + self.offset
+        cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE,
+                              MAP_SHARED, dmabuf.dri_fd, dmabuf.map_offset)
+        if buf == MAP_FAILED:
+            raise PyverbsError('Failed to map dma-buf of size {l}'.
+                               format(l=length))
+        cdef char *data =<char*>malloc(length)
+        memset(data, 0, length)
+        memcpy(data, <char*>(buf + off), length)
+        munmap(buf, length + off)
+        res = data[:length]
+        free(data)
+        return res
 
 def mwtype2str(mw_type):
     mw_types = {1:'IBV_MW_TYPE_1', 2:'IBV_MW_TYPE_2'}
-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 4/5] tests: Add tests for dma-buf based memory regions
  2020-11-23 17:52 ` Jianxin Xiong
@ 2020-11-23 17:53   ` Jianxin Xiong
  -1 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

Define a full set of tests similar to regular MR tests. Add a utility
function to generate access flags for dma-buf based MRs because the
set of supported flags is smaller.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 tests/test_mr.py | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 tests/utils.py   |  25 +++++++++++
 2 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/tests/test_mr.py b/tests/test_mr.py
index adc649c..8d7f002 100644
--- a/tests/test_mr.py
+++ b/tests/test_mr.py
@@ -9,9 +9,10 @@ import errno
 
 from tests.base import PyverbsAPITestCase, RCResources, RDMATestCase
 from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError
-from pyverbs.mr import MR, MW, DMMR, MWBindInfo, MWBind
+from pyverbs.mr import MR, MW, DMMR, DmaBufMR, MWBindInfo, MWBind
 from pyverbs.qp import QPCap, QPInitAttr, QPAttr, QP
 from pyverbs.wr import SendWR
+from pyverbs.dmabuf import DmaBuf
 import pyverbs.device as d
 from pyverbs.pd import PD
 import pyverbs.enums as e
@@ -366,3 +367,130 @@ class DMMRTest(PyverbsAPITestCase):
                         dm_mr = DMMR(pd, dm_mr_len, e.IBV_ACCESS_ZERO_BASED,
                                      dm=dm, offset=dm_mr_offset)
                         dm_mr.close()
+
+class DmaBufMRTest(PyverbsAPITestCase):
+    """
+    Test various functionalities of the DmaBufMR class.
+    """
+    def test_dmabuf_reg_mr(self):
+        """
+        Test ibv_reg_dmabuf_mr()
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        pass
+
+    def test_dmabuf_dereg_mr(self):
+        """
+        Test ibv_dereg_mr() with DmaBufMR
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        mr.close()
+
+    def test_dmabuf_dereg_mr_twice(self):
+        """
+        Verify that explicit call to DmaBufMR's close() doesn't fail
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        # Pyverbs supports multiple destruction of objects,
+                        # we are not expecting an exception here.
+                        mr.close()
+                        mr.close()
+
+    def test_dmabuf_reg_mr_bad_flags(self):
+        """
+        Verify that illegal flags combination fails as expected
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(5):
+                    flags = random.sample([e.IBV_ACCESS_REMOTE_WRITE,
+                                           e.IBV_ACCESS_REMOTE_ATOMIC],
+                                          random.randint(1, 2))
+                    mr_flags = 0
+                    for i in flags:
+                        mr_flags += i.value
+                    try:
+                        DmaBufMR(pd, u.get_mr_length(), mr_flags)
+                    except PyverbsRDMAError as err:
+                        assert 'Failed to register a dma-buf MR' in err.args[0]
+                    else:
+                        raise PyverbsRDMAError('Registered a dma-buf MR with illegal falgs')
+
+    def test_dmabuf_write(self):
+        """
+        Test writing to DmaBufMR's buffer
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(10):
+                    mr_len = u.get_mr_length()
+                    mr_off = random.randint(0, mr_len//2)
+                    flags = u.get_dmabuf_access_flags(ctx)
+                    for f in flags:
+                        with DmaBufMR(pd, mr_len, f, offset=mr_off) as mr:
+                            write_len = min(random.randint(1, MAX_IO_LEN),
+                                            mr_len)
+                            mr.write('a' * write_len, write_len)
+
+    def test_dmabuf_read(self):
+        """
+        Test reading from DmaBufMR's buffer
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(10):
+                    mr_len = u.get_mr_length()
+                    mr_off = random.randint(0, mr_len//2)
+                    flags = u.get_dmabuf_access_flags(ctx)
+                    for f in flags:
+                        with DmaBufMR(pd, mr_len, f, offset=mr_off) as mr:
+                            write_len = min(random.randint(1, MAX_IO_LEN),
+                                            mr_len)
+                            write_str = 'a' * write_len
+                            mr.write(write_str, write_len)
+                            read_len = random.randint(1, write_len)
+                            offset = random.randint(0, write_len-read_len)
+                            read_str = mr.read(read_len, offset).decode()
+                            assert read_str in write_str
+
+    def test_dmabuf_lkey(self):
+        """
+        Test reading lkey property
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                length = u.get_mr_length()
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    with DmaBufMR(pd, length, f) as mr:
+                        mr.lkey
+
+    def test_dmabuf_rkey(self):
+        """
+        Test reading rkey property
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                length = u.get_mr_length()
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    with DmaBufMR(pd, length, f) as mr:
+                        mr.rkey
diff --git a/tests/utils.py b/tests/utils.py
index 7039f41..0ad7110 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -94,6 +94,31 @@ def get_access_flags(ctx):
     return arr
 
 
+def get_dmabuf_access_flags(ctx):
+    """
+    Similar to get_access_flags, except that dma-buf MR only support
+    a subset of the flags.
+    :param ctx: Device Context to check capabilities
+    :return: A random legal value for MR flags
+    """
+    attr = ctx.query_device()
+    vals = [e.IBV_ACCESS_LOCAL_WRITE, e.IBV_ACCESS_REMOTE_WRITE,
+            e.IBV_ACCESS_REMOTE_READ, e.IBV_ACCESS_REMOTE_ATOMIC,
+            e.IBV_ACCESS_RELAXED_ORDERING]
+    if not attr.atomic_caps & e.IBV_ATOMIC_HCA:
+        vals.remove(e.IBV_ACCESS_REMOTE_ATOMIC)
+    arr = []
+    for i in range(1, len(vals)):
+        tmp = list(com(vals, i))
+        tmp = filter(filter_illegal_access_flags, tmp)
+        for t in tmp:  # Iterate legal combinations and bitwise OR them
+            val = 0
+            for flag in t:
+                val += flag.value
+            arr.append(val)
+    return arr
+
+
 def get_dm_attrs(dm_len):
     """
     Initializes an AllocDmAttr member with the given length and random
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 4/5] tests: Add tests for dma-buf based memory regions
@ 2020-11-23 17:53   ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

Define a full set of tests similar to regular MR tests. Add a utility
function to generate access flags for dma-buf based MRs because the
set of supported flags is smaller.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 tests/test_mr.py | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 tests/utils.py   |  25 +++++++++++
 2 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/tests/test_mr.py b/tests/test_mr.py
index adc649c..8d7f002 100644
--- a/tests/test_mr.py
+++ b/tests/test_mr.py
@@ -9,9 +9,10 @@ import errno
 
 from tests.base import PyverbsAPITestCase, RCResources, RDMATestCase
 from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError
-from pyverbs.mr import MR, MW, DMMR, MWBindInfo, MWBind
+from pyverbs.mr import MR, MW, DMMR, DmaBufMR, MWBindInfo, MWBind
 from pyverbs.qp import QPCap, QPInitAttr, QPAttr, QP
 from pyverbs.wr import SendWR
+from pyverbs.dmabuf import DmaBuf
 import pyverbs.device as d
 from pyverbs.pd import PD
 import pyverbs.enums as e
@@ -366,3 +367,130 @@ class DMMRTest(PyverbsAPITestCase):
                         dm_mr = DMMR(pd, dm_mr_len, e.IBV_ACCESS_ZERO_BASED,
                                      dm=dm, offset=dm_mr_offset)
                         dm_mr.close()
+
+class DmaBufMRTest(PyverbsAPITestCase):
+    """
+    Test various functionalities of the DmaBufMR class.
+    """
+    def test_dmabuf_reg_mr(self):
+        """
+        Test ibv_reg_dmabuf_mr()
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        pass
+
+    def test_dmabuf_dereg_mr(self):
+        """
+        Test ibv_dereg_mr() with DmaBufMR
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        mr.close()
+
+    def test_dmabuf_dereg_mr_twice(self):
+        """
+        Verify that explicit call to DmaBufMR's close() doesn't fail
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    len = u.get_mr_length()
+                    off = random.randint(0, len//2)
+                    with DmaBufMR(pd, len, f, offset=off) as mr:
+                        # Pyverbs supports multiple destruction of objects,
+                        # we are not expecting an exception here.
+                        mr.close()
+                        mr.close()
+
+    def test_dmabuf_reg_mr_bad_flags(self):
+        """
+        Verify that illegal flags combination fails as expected
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(5):
+                    flags = random.sample([e.IBV_ACCESS_REMOTE_WRITE,
+                                           e.IBV_ACCESS_REMOTE_ATOMIC],
+                                          random.randint(1, 2))
+                    mr_flags = 0
+                    for i in flags:
+                        mr_flags += i.value
+                    try:
+                        DmaBufMR(pd, u.get_mr_length(), mr_flags)
+                    except PyverbsRDMAError as err:
+                        assert 'Failed to register a dma-buf MR' in err.args[0]
+                    else:
+                        raise PyverbsRDMAError('Registered a dma-buf MR with illegal falgs')
+
+    def test_dmabuf_write(self):
+        """
+        Test writing to DmaBufMR's buffer
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(10):
+                    mr_len = u.get_mr_length()
+                    mr_off = random.randint(0, mr_len//2)
+                    flags = u.get_dmabuf_access_flags(ctx)
+                    for f in flags:
+                        with DmaBufMR(pd, mr_len, f, offset=mr_off) as mr:
+                            write_len = min(random.randint(1, MAX_IO_LEN),
+                                            mr_len)
+                            mr.write('a' * write_len, write_len)
+
+    def test_dmabuf_read(self):
+        """
+        Test reading from DmaBufMR's buffer
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                for i in range(10):
+                    mr_len = u.get_mr_length()
+                    mr_off = random.randint(0, mr_len//2)
+                    flags = u.get_dmabuf_access_flags(ctx)
+                    for f in flags:
+                        with DmaBufMR(pd, mr_len, f, offset=mr_off) as mr:
+                            write_len = min(random.randint(1, MAX_IO_LEN),
+                                            mr_len)
+                            write_str = 'a' * write_len
+                            mr.write(write_str, write_len)
+                            read_len = random.randint(1, write_len)
+                            offset = random.randint(0, write_len-read_len)
+                            read_str = mr.read(read_len, offset).decode()
+                            assert read_str in write_str
+
+    def test_dmabuf_lkey(self):
+        """
+        Test reading lkey property
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                length = u.get_mr_length()
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    with DmaBufMR(pd, length, f) as mr:
+                        mr.lkey
+
+    def test_dmabuf_rkey(self):
+        """
+        Test reading rkey property
+        """
+        for ctx, attr, attr_ex in self.devices:
+            with PD(ctx) as pd:
+                length = u.get_mr_length()
+                flags = u.get_dmabuf_access_flags(ctx)
+                for f in flags:
+                    with DmaBufMR(pd, length, f) as mr:
+                        mr.rkey
diff --git a/tests/utils.py b/tests/utils.py
index 7039f41..0ad7110 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -94,6 +94,31 @@ def get_access_flags(ctx):
     return arr
 
 
+def get_dmabuf_access_flags(ctx):
+    """
+    Similar to get_access_flags, except that dma-buf MR only support
+    a subset of the flags.
+    :param ctx: Device Context to check capabilities
+    :return: A random legal value for MR flags
+    """
+    attr = ctx.query_device()
+    vals = [e.IBV_ACCESS_LOCAL_WRITE, e.IBV_ACCESS_REMOTE_WRITE,
+            e.IBV_ACCESS_REMOTE_READ, e.IBV_ACCESS_REMOTE_ATOMIC,
+            e.IBV_ACCESS_RELAXED_ORDERING]
+    if not attr.atomic_caps & e.IBV_ATOMIC_HCA:
+        vals.remove(e.IBV_ACCESS_REMOTE_ATOMIC)
+    arr = []
+    for i in range(1, len(vals)):
+        tmp = list(com(vals, i))
+        tmp = filter(filter_illegal_access_flags, tmp)
+        for t in tmp:  # Iterate legal combinations and bitwise OR them
+            val = 0
+            for flag in t:
+                val += flag.value
+            arr.append(val)
+    return arr
+
+
 def get_dm_attrs(dm_len):
     """
     Initializes an AllocDmAttr member with the given length and random
-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
  2020-11-23 17:52 ` Jianxin Xiong
@ 2020-11-23 17:53   ` Jianxin Xiong
  -1 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Jianxin Xiong, Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

The filter defintion is wrong and causes get_access_flags() always
returning empty list. As the result the MR tests using this function
are effectively skipped (but report success).

Also fix a typo in the comments.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 tests/utils.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/utils.py b/tests/utils.py
index 0ad7110..eee44b4 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
     :param element: A list of access flags to check
     :return: True if this list is legal, else False
     """
-    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
-        if e.IBV_ACCESS_LOCAL_WRITE:
+    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
+        if not e.IBV_ACCESS_LOCAL_WRITE in element:
             return False
     return True
 
@@ -69,7 +69,7 @@ def get_access_flags(ctx):
     added as well.
     After verifying that the flags selection is legal, it is appended to an
     array, assuming it wasn't previously appended.
-    :param ctx: Device Context to check capabilities
+    :param ctx: Device Coyyntext to check capabilities
     :param num: Size of initial collection
     :return: A random legal value for MR flags
     """
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
@ 2020-11-23 17:53   ` Jianxin Xiong
  0 siblings, 0 replies; 48+ messages in thread
From: Jianxin Xiong @ 2020-11-23 17:53 UTC (permalink / raw)
  To: linux-rdma, dri-devel
  Cc: Leon Romanovsky, Jason Gunthorpe, Doug Ledford, Daniel Vetter,
	Christian Koenig, Jianxin Xiong

The filter defintion is wrong and causes get_access_flags() always
returning empty list. As the result the MR tests using this function
are effectively skipped (but report success).

Also fix a typo in the comments.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 tests/utils.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/utils.py b/tests/utils.py
index 0ad7110..eee44b4 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
     :param element: A list of access flags to check
     :return: True if this list is legal, else False
     """
-    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
-        if e.IBV_ACCESS_LOCAL_WRITE:
+    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
+        if not e.IBV_ACCESS_LOCAL_WRITE in element:
             return False
     return True
 
@@ -69,7 +69,7 @@ def get_access_flags(ctx):
     added as well.
     After verifying that the flags selection is legal, it is appended to an
     array, assuming it wasn't previously appended.
-    :param ctx: Device Context to check capabilities
+    :param ctx: Device Coyyntext to check capabilities
     :param num: Size of initial collection
     :return: A random legal value for MR flags
     """
-- 
1.8.3.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
  2020-11-23 17:53   ` Jianxin Xiong
@ 2020-11-23 18:01     ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:01 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: linux-rdma, dri-devel, Doug Ledford, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

On Mon, Nov 23, 2020 at 09:53:00AM -0800, Jianxin Xiong wrote:
> Add new API function and new provider method for registering dma-buf
> based memory region. Update the man page and bump the API version.
> 
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
>  kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
>  libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
>  libibverbs/driver.h                      |  7 ++++++
>  libibverbs/dummy_ops.c                   | 11 +++++++++
>  libibverbs/libibverbs.map.in             |  6 +++++
>  libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
>  libibverbs/verbs.c                       | 19 ++++++++++++++++
>  libibverbs/verbs.h                       | 10 +++++++++
>  8 files changed, 124 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> index 7968a18..dafc7eb 100644
> +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> @@ -1,5 +1,6 @@
>  /*
>   * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> + * Copyright (c) 2020, Intel Corporation. All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
>  	UVERBS_METHOD_MR_DESTROY,
>  	UVERBS_METHOD_ADVISE_MR,
>  	UVERBS_METHOD_QUERY_MR,
> +	UVERBS_METHOD_REG_DMABUF_MR,
>  };
>  
>  enum uverbs_attrs_mr_destroy_ids {
> @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
>  	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
>  };
>  
> +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> +};
> +
>  enum uverbs_attrs_create_counters_cmd_attr_ids {
>  	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
>  };

Please put changes to kernel-headers/ into their own patch

There is a script to make that commit in kernel-headers/update

> diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
> index 42dbe42..91ce2ef 100644
> +++ b/libibverbs/cmd_mr.c
> @@ -1,5 +1,6 @@
>  /*
>   * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
>  	return 0;
>  }
>  
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr)
> +{
> +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> +			       UVERBS_METHOD_REG_DMABUF_MR,
> +			       9);
> +	struct ib_uverbs_attr *handle;
> +	uint32_t lkey, rkey;
> +	int ret;
> +
> +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> +
> +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
> +
> +	ret = execute_ioctl(pd->context, cmdb);
> +	if (ret)
> +		return errno;
> +
> +	vmr->ibv_mr.handle =
> +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> +	vmr->ibv_mr.context = pd->context;
> +	vmr->ibv_mr.lkey    = lkey;
> +	vmr->ibv_mr.rkey    = rkey;
> +	vmr->ibv_mr.pd	    = pd;
> +	vmr->ibv_mr.addr    = (void *)offset;
> +	vmr->ibv_mr.length  = length;
> +	vmr->mr_type        = IBV_MR_TYPE_MR;

Remove the extra spaces around the = please

> diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
> index b5ccaca..f67e1ef 100644
> +++ b/libibverbs/libibverbs.map.in
> @@ -148,6 +148,11 @@ IBVERBS_1.11 {
>  		_ibv_query_gid_table;
>  } IBVERBS_1.10;
>  
> +IBVERBS_1.12 {
> +	global:
> +		ibv_reg_dmabuf_mr;
> +} IBVERBS_1.11;

There are a few things missing for this, the github CI should throw
some errors, please check it and fix everything

After this you need to change libibverbs/CMakeLists.txt and update:

  1 1.11.${PACKAGE_VERSION}

To 
  1 1.12.${PACKAGE_VERSION}

You also need to update debian/libibverbs1.symbols, check the git
history for examples
  
> +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> +		   struct ibv_mr *,
> +		   struct ibv_pd *pd, uint64_t offset,
> +		   size_t length, int fd, int access)

Do not use LATEST_SYMVER_FUNC, it is only for special cases where we
have two implementations of the same function. A normal function and
the map file update is all that is needed

> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> index ee57e05..3961b1e 100644
> +++ b/libibverbs/verbs.h
> @@ -3,6 +3,7 @@
>   * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
>   * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
>   * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -2133,6 +2134,9 @@ struct verbs_context {
>  	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
>  					     struct ibv_xrcd_init_attr *xrcd_init_attr);
>  	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +						 size_t length, uint64_t iova,
> +						 int fd, int access);
>  	uint64_t _ABI_placeholder3;
>  	size_t   sz;			/* Must be immediately before struct ibv_context */
>  	struct ibv_context context;	/* Must be last field in the struct */

No, delete this whole hunk, it is not needed

Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
@ 2020-11-23 18:01     ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:01 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford,
	Daniel Vetter, Christian Koenig

On Mon, Nov 23, 2020 at 09:53:00AM -0800, Jianxin Xiong wrote:
> Add new API function and new provider method for registering dma-buf
> based memory region. Update the man page and bump the API version.
> 
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
>  kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
>  libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
>  libibverbs/driver.h                      |  7 ++++++
>  libibverbs/dummy_ops.c                   | 11 +++++++++
>  libibverbs/libibverbs.map.in             |  6 +++++
>  libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
>  libibverbs/verbs.c                       | 19 ++++++++++++++++
>  libibverbs/verbs.h                       | 10 +++++++++
>  8 files changed, 124 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> index 7968a18..dafc7eb 100644
> +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> @@ -1,5 +1,6 @@
>  /*
>   * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> + * Copyright (c) 2020, Intel Corporation. All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
>  	UVERBS_METHOD_MR_DESTROY,
>  	UVERBS_METHOD_ADVISE_MR,
>  	UVERBS_METHOD_QUERY_MR,
> +	UVERBS_METHOD_REG_DMABUF_MR,
>  };
>  
>  enum uverbs_attrs_mr_destroy_ids {
> @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
>  	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
>  };
>  
> +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> +};
> +
>  enum uverbs_attrs_create_counters_cmd_attr_ids {
>  	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
>  };

Please put changes to kernel-headers/ into their own patch

There is a script to make that commit in kernel-headers/update

> diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
> index 42dbe42..91ce2ef 100644
> +++ b/libibverbs/cmd_mr.c
> @@ -1,5 +1,6 @@
>  /*
>   * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
>  	return 0;
>  }
>  
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr)
> +{
> +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> +			       UVERBS_METHOD_REG_DMABUF_MR,
> +			       9);
> +	struct ib_uverbs_attr *handle;
> +	uint32_t lkey, rkey;
> +	int ret;
> +
> +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> +
> +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
> +
> +	ret = execute_ioctl(pd->context, cmdb);
> +	if (ret)
> +		return errno;
> +
> +	vmr->ibv_mr.handle =
> +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> +	vmr->ibv_mr.context = pd->context;
> +	vmr->ibv_mr.lkey    = lkey;
> +	vmr->ibv_mr.rkey    = rkey;
> +	vmr->ibv_mr.pd	    = pd;
> +	vmr->ibv_mr.addr    = (void *)offset;
> +	vmr->ibv_mr.length  = length;
> +	vmr->mr_type        = IBV_MR_TYPE_MR;

Remove the extra spaces around the = please

> diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
> index b5ccaca..f67e1ef 100644
> +++ b/libibverbs/libibverbs.map.in
> @@ -148,6 +148,11 @@ IBVERBS_1.11 {
>  		_ibv_query_gid_table;
>  } IBVERBS_1.10;
>  
> +IBVERBS_1.12 {
> +	global:
> +		ibv_reg_dmabuf_mr;
> +} IBVERBS_1.11;

There are a few things missing for this, the github CI should throw
some errors, please check it and fix everything

After this you need to change libibverbs/CMakeLists.txt and update:

  1 1.11.${PACKAGE_VERSION}

To 
  1 1.12.${PACKAGE_VERSION}

You also need to update debian/libibverbs1.symbols, check the git
history for examples
  
> +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> +		   struct ibv_mr *,
> +		   struct ibv_pd *pd, uint64_t offset,
> +		   size_t length, int fd, int access)

Do not use LATEST_SYMVER_FUNC, it is only for special cases where we
have two implementations of the same function. A normal function and
the map file update is all that is needed

> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> index ee57e05..3961b1e 100644
> +++ b/libibverbs/verbs.h
> @@ -3,6 +3,7 @@
>   * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
>   * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
>   * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -2133,6 +2134,9 @@ struct verbs_context {
>  	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
>  					     struct ibv_xrcd_init_attr *xrcd_init_attr);
>  	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +						 size_t length, uint64_t iova,
> +						 int fd, int access);
>  	uint64_t _ABI_placeholder3;
>  	size_t   sz;			/* Must be immediately before struct ibv_context */
>  	struct ibv_context context;	/* Must be last field in the struct */

No, delete this whole hunk, it is not needed

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
  2020-11-23 17:53   ` Jianxin Xiong
@ 2020-11-23 18:01     ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:01 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: linux-rdma, dri-devel, Doug Ledford, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

On Mon, Nov 23, 2020 at 09:53:01AM -0800, Jianxin Xiong wrote:

> +struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +				  uint64_t iova, int fd, int acc)
> +{
> +	struct mlx5_mr *mr;
> +	int ret;
> +	enum ibv_access_flags access = (enum ibv_access_flags)acc;

Why?

Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
@ 2020-11-23 18:01     ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:01 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford,
	Daniel Vetter, Christian Koenig

On Mon, Nov 23, 2020 at 09:53:01AM -0800, Jianxin Xiong wrote:

> +struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +				  uint64_t iova, int fd, int acc)
> +{
> +	struct mlx5_mr *mr;
> +	int ret;
> +	enum ibv_access_flags access = (enum ibv_access_flags)acc;

Why?

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-23 17:53   ` Jianxin Xiong
@ 2020-11-23 18:05     ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:05 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: linux-rdma, dri-devel, Doug Ledford, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Daniel Vetter

On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:

> +cdef class DmaBuf:
> +    def __init__(self, size, unit=0):
> +        """
> +        Allocate DmaBuf object from a GPU device. This is done through the
> +        DRI device interface (/dev/dri/card*). Usually this requires the
> +        effective user id being root or being a member of the 'video' group.
> +        :param size: The size (in number of bytes) of the buffer.
> +        :param unit: The unit number of the GPU to allocate the buffer from.
> +        :return: The newly created DmaBuf object on success.
> +        """
> +        self.dmabuf_mrs = weakref.WeakSet()
> +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> +
> +        args = bytearray(32)
> +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> +
> +        args = bytearray(12)
> +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> +        a, b, self.fd = unpack('=iii', args)
> +
> +        args = bytearray(16)
> +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> +        a, b, self.map_offset = unpack('=iiq', args);

Wow, OK

Is it worth using ctypes here instead? Can you at least add a comment
before each pack specifying the 'struct XXX' this is following?

Does this work with normal Intel GPUs, like in a Laptop? AMD too?

Christian, I would be very happy to hear from you that this entire
work is good for AMD as well

Edward should look through this, but I'm glad to see something like
this

Thanks,
Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-23 18:05     ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-23 18:05 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford,
	Daniel Vetter, Christian Koenig

On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:

> +cdef class DmaBuf:
> +    def __init__(self, size, unit=0):
> +        """
> +        Allocate DmaBuf object from a GPU device. This is done through the
> +        DRI device interface (/dev/dri/card*). Usually this requires the
> +        effective user id being root or being a member of the 'video' group.
> +        :param size: The size (in number of bytes) of the buffer.
> +        :param unit: The unit number of the GPU to allocate the buffer from.
> +        :return: The newly created DmaBuf object on success.
> +        """
> +        self.dmabuf_mrs = weakref.WeakSet()
> +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> +
> +        args = bytearray(32)
> +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> +
> +        args = bytearray(12)
> +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> +        a, b, self.fd = unpack('=iii', args)
> +
> +        args = bytearray(16)
> +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> +        a, b, self.map_offset = unpack('=iiq', args);

Wow, OK

Is it worth using ctypes here instead? Can you at least add a comment
before each pack specifying the 'struct XXX' this is following?

Does this work with normal Intel GPUs, like in a Laptop? AMD too?

Christian, I would be very happy to hear from you that this entire
work is good for AMD as well

Edward should look through this, but I'm glad to see something like
this

Thanks,
Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
  2020-11-23 18:01     ` Jason Gunthorpe
@ 2020-11-23 19:40       ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-23 19:40 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma, dri-devel, Doug Ledford, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Vetter, Daniel

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, November 23, 2020 10:02 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
> 
> On Mon, Nov 23, 2020 at 09:53:01AM -0800, Jianxin Xiong wrote:
> 
> > +struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +				  uint64_t iova, int fd, int acc) {
> > +	struct mlx5_mr *mr;
> > +	int ret;
> > +	enum ibv_access_flags access = (enum ibv_access_flags)acc;
> 
> Why?

It's copied from mlx5_reg_mr(). Didn't pay attention to this but looks unnecessary now.

> 
> Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
@ 2020-11-23 19:40       ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-23 19:40 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, November 23, 2020 10:02 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 2/5] mlx5: Support dma-buf based memory region
> 
> On Mon, Nov 23, 2020 at 09:53:01AM -0800, Jianxin Xiong wrote:
> 
> > +struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +				  uint64_t iova, int fd, int acc) {
> > +	struct mlx5_mr *mr;
> > +	int ret;
> > +	enum ibv_access_flags access = (enum ibv_access_flags)acc;
> 
> Why?

It's copied from mlx5_reg_mr(). Didn't pay attention to this but looks unnecessary now.

> 
> Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-23 18:05     ` Jason Gunthorpe
@ 2020-11-23 19:48       ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-23 19:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma, dri-devel, Doug Ledford, Leon Romanovsky,
	Sumit Semwal, Christian Koenig, Vetter, Daniel

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, November 23, 2020 10:05 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the
> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > + args)
> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment before each pack specifying the 'struct XXX' this is following?
> 

The ioctl call only accept a bytearray, not sure how to use ctypes here. I will add 
comments with the actual layout of the parameter structure.

> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 

Yes, the interface is generic and works with most GPUs. Works with AMD, too.

> Christian, I would be very happy to hear from you that this entire work is good for AMD as well
> 
> Edward should look through this, but I'm glad to see something like this
> 
> Thanks,
> Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-23 19:48       ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-23 19:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, November 23, 2020 10:05 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the
> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > + args)
> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment before each pack specifying the 'struct XXX' this is following?
> 

The ioctl call only accept a bytearray, not sure how to use ctypes here. I will add 
comments with the actual layout of the parameter structure.

> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 

Yes, the interface is generic and works with most GPUs. Works with AMD, too.

> Christian, I would be very happy to hear from you that this entire work is good for AMD as well
> 
> Edward should look through this, but I'm glad to see something like this
> 
> Thanks,
> Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
  2020-11-23 17:53   ` Jianxin Xiong
@ 2020-11-24 10:31     ` Yishai Hadas
  -1 siblings, 0 replies; 48+ messages in thread
From: Yishai Hadas @ 2020-11-24 10:31 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: linux-rdma, dri-devel, Doug Ledford, Jason Gunthorpe,
	Leon Romanovsky, Sumit Semwal, Christian Koenig, Daniel Vetter,
	Yishai Hadas

On 11/23/2020 7:53 PM, Jianxin Xiong wrote:
> Add new API function and new provider method for registering dma-buf
> based memory region. Update the man page and bump the API version.
>
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> ---
>   kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
>   libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
>   libibverbs/driver.h                      |  7 ++++++
>   libibverbs/dummy_ops.c                   | 11 +++++++++
>   libibverbs/libibverbs.map.in             |  6 +++++
>   libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
>   libibverbs/verbs.c                       | 19 ++++++++++++++++
>   libibverbs/verbs.h                       | 10 +++++++++
>   8 files changed, 124 insertions(+), 2 deletions(-)
>
> diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> index 7968a18..dafc7eb 100644
> --- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> + * Copyright (c) 2020, Intel Corporation. All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
>   	UVERBS_METHOD_MR_DESTROY,
>   	UVERBS_METHOD_ADVISE_MR,
>   	UVERBS_METHOD_QUERY_MR,
> +	UVERBS_METHOD_REG_DMABUF_MR,
>   };
>   
>   enum uverbs_attrs_mr_destroy_ids {
> @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
>   	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
>   };
>   
> +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> +};
> +
>   enum uverbs_attrs_create_counters_cmd_attr_ids {
>   	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
>   };
> diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
> index 42dbe42..91ce2ef 100644
> --- a/libibverbs/cmd_mr.c
> +++ b/libibverbs/cmd_mr.c
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
>   	return 0;
>   }
>   
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr)
> +{
> +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> +			       UVERBS_METHOD_REG_DMABUF_MR,
> +			       9);
> +	struct ib_uverbs_attr *handle;
> +	uint32_t lkey, rkey;
> +	int ret;
> +
> +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> +
> +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
> +
> +	ret = execute_ioctl(pd->context, cmdb);
> +	if (ret)
> +		return errno;
> +
> +	vmr->ibv_mr.handle =
> +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> +	vmr->ibv_mr.context = pd->context;
> +	vmr->ibv_mr.lkey    = lkey;
> +	vmr->ibv_mr.rkey    = rkey;
> +	vmr->ibv_mr.pd	    = pd;
> +	vmr->ibv_mr.addr    = (void *)offset;
> +	vmr->ibv_mr.length  = length;
> +	vmr->mr_type        = IBV_MR_TYPE_MR;
> +	return 0;
> +}
> diff --git a/libibverbs/driver.h b/libibverbs/driver.h
> index ab80f4b..d6a9d0a 100644
> --- a/libibverbs/driver.h
> +++ b/libibverbs/driver.h
> @@ -2,6 +2,7 @@
>    * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
>    * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
>    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation. All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -373,6 +374,9 @@ struct verbs_context_ops {
>   	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
>   				    uint64_t dm_offset, size_t length,
>   				    unsigned int access);
> +	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +					size_t length, uint64_t iova,
> +					int fd, int access);
>   	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
>   				 uint64_t hca_va, int access);
>   	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only);
> @@ -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
>   		      uint32_t flags,
>   		      struct ibv_sge *sg_list,
>   		      uint32_t num_sge);
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr);
>   int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
>   		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
>   		     size_t cmd_size,
> diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
> index e5af9e4..dca96d2 100644
> --- a/libibverbs/dummy_ops.c
> +++ b/libibverbs/dummy_ops.c
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
>   	return NULL;
>   }
>   
> +static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
> +				    size_t length, uint64_t iova,
> +				    int fd, int access)
> +{
> +	errno = ENOSYS;
> +	return NULL;
> +}
> +
>   static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
>   {
>   	return EOPNOTSUPP;
> @@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
>   	query_srq,
>   	read_counters,
>   	reg_dm_mr,
> +	reg_dmabuf_mr,
>   	reg_mr,
>   	req_notify_cq,
>   	rereg_mr,
> @@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
>   	SET_PRIV_OP_IC(vctx, set_ece);
>   	SET_PRIV_OP_IC(vctx, unimport_mr);
>   	SET_PRIV_OP_IC(vctx, unimport_pd);
> +	SET_OP(vctx, reg_dmabuf_mr);
>   
>   #undef SET_OP
>   #undef SET_OP2
> diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
> index b5ccaca..f67e1ef 100644
> --- a/libibverbs/libibverbs.map.in
> +++ b/libibverbs/libibverbs.map.in
> @@ -148,6 +148,11 @@ IBVERBS_1.11 {
>   		_ibv_query_gid_table;
>   } IBVERBS_1.10;
>   
> +IBVERBS_1.12 {
> +	global:
> +		ibv_reg_dmabuf_mr;
> +} IBVERBS_1.11;
> +
>   /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
>      version. See the top level CMakeLists.txt for this setting. */
>   
> @@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
>   		ibv_cmd_query_srq;
>   		ibv_cmd_read_counters;
>   		ibv_cmd_reg_dm_mr;
> +		ibv_cmd_reg_dmabuf_mr;
>   		ibv_cmd_reg_mr;
>   		ibv_cmd_req_notify_cq;
>   		ibv_cmd_rereg_mr;
> diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
> index 2bfc955..4975c79 100644
> --- a/libibverbs/man/ibv_reg_mr.3
> +++ b/libibverbs/man/ibv_reg_mr.3
> @@ -3,7 +3,7 @@
>   .\"
>   .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
>   .SH "NAME"
> -ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory region (MR)
> +ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \- register or deregister a memory region (MR)
>   .SH "SYNOPSIS"
>   .nf
>   .B #include <infiniband/verbs.h>
> @@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
>   .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
>   .BI "                               int " "access" );
>   .sp
> +.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
> +.BI "                                 size_t " "length" ", int " "access" );
> +.sp


This misses the 'fd' parameter.

>   .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
>   .fi
>   .SH "DESCRIPTION"
> @@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
>   (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
>   IBV_ACCESS_ZERO_BASED.
>   .PP
> +.B ibv_reg_dmabuf_mr()
> +registers a dma-buf based memory region (MR) associated with the protection domain
> +.I pd\fR.
> +The MR starts at
> +.I offset
> +of the dma-buf and its size is
> +.I length\fR.
> +The dma-buf is identified by the file descriptor
> +.I fd\fR.
> +The argument
> +.I access
> +describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are supported:
> +.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC, IBV_ACCESS_RELAXED_ORDERING.
> +.PP
>   .B ibv_dereg_mr()
>   deregisters the MR
>   .I mr\fR.
>   .SH "RETURN VALUE"
> -.B ibv_reg_mr() / ibv_reg_mr_iova()
> +.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
>   returns a pointer to the registered MR, or NULL if the request fails.
>   The local key (\fBL_Key\fR) field
>   .B lkey
> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
> index 2b0ede8..84ddac7 100644
> --- a/libibverbs/verbs.c
> +++ b/libibverbs/verbs.c
> @@ -1,6 +1,7 @@
>   /*
>    * Copyright (c) 2005 Topspin Communications.  All rights reserved.
>    * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corperation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
>   	get_ops(mr->context)->unimport_mr(mr);
>   }
>   
> +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> +		   struct ibv_mr *,
> +		   struct ibv_pd *pd, uint64_t offset,
> +		   size_t length, int fd, int access)
> +{


  ibv_dereg_mr() -> ibv_dofork_range() doesn't seem to be applicable in 
this case, right ? if offset / addr won't be 0 it may be activated on 
the address range.

Do we rely on applications to *not* turn on fork support in libibverbs ? 
(e.g. call ibv_fork_init()), what if some other registrations may need 
being fork safe ?

> +	struct ibv_mr *mr;
> +
> +	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
> +						 fd, access);
> +	if (mr) {
> +		mr->context = pd->context;
> +		mr->pd      = pd;
> +		mr->addr    = (void *)offset;
> +		mr->length  = length;
> +	}
> +	return mr;
> +}
> +
>   LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
>   		   int,
>   		   struct ibv_mr *mr, int flags,
> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> index ee57e05..3961b1e 100644
> --- a/libibverbs/verbs.h
> +++ b/libibverbs/verbs.h
> @@ -3,6 +3,7 @@
>    * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
>    * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
>    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -2133,6 +2134,9 @@ struct verbs_context {
>   	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
>   					     struct ibv_xrcd_init_attr *xrcd_init_attr);
>   	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +						 size_t length, uint64_t iova,
> +						 int fd, int access);
>   	uint64_t _ABI_placeholder3;
>   	size_t   sz;			/* Must be immediately before struct ibv_context */
>   	struct ibv_context context;	/* Must be last field in the struct */
> @@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
>   			  __builtin_constant_p(                                \
>   				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
>   
> +/**
> + * ibv_reg_dmabuf_mr - Register a dambuf-based memory region
> + */
> +struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +				 int fd, int access);
> +
>   enum ibv_rereg_mr_err_code {
>   	/* Old MR is valid, invalid input */
>   	IBV_REREG_MR_ERR_INPUT = -1,



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
@ 2020-11-24 10:31     ` Yishai Hadas
  0 siblings, 0 replies; 48+ messages in thread
From: Yishai Hadas @ 2020-11-24 10:31 UTC (permalink / raw)
  To: Jianxin Xiong
  Cc: Yishai Hadas, Leon Romanovsky, linux-rdma, dri-devel,
	Christian Koenig, Jason Gunthorpe, Doug Ledford, Daniel Vetter

On 11/23/2020 7:53 PM, Jianxin Xiong wrote:
> Add new API function and new provider method for registering dma-buf
> based memory region. Update the man page and bump the API version.
>
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> ---
>   kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
>   libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
>   libibverbs/driver.h                      |  7 ++++++
>   libibverbs/dummy_ops.c                   | 11 +++++++++
>   libibverbs/libibverbs.map.in             |  6 +++++
>   libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
>   libibverbs/verbs.c                       | 19 ++++++++++++++++
>   libibverbs/verbs.h                       | 10 +++++++++
>   8 files changed, 124 insertions(+), 2 deletions(-)
>
> diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> index 7968a18..dafc7eb 100644
> --- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> + * Copyright (c) 2020, Intel Corporation. All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
>   	UVERBS_METHOD_MR_DESTROY,
>   	UVERBS_METHOD_ADVISE_MR,
>   	UVERBS_METHOD_QUERY_MR,
> +	UVERBS_METHOD_REG_DMABUF_MR,
>   };
>   
>   enum uverbs_attrs_mr_destroy_ids {
> @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
>   	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
>   };
>   
> +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> +};
> +
>   enum uverbs_attrs_create_counters_cmd_attr_ids {
>   	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
>   };
> diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c
> index 42dbe42..91ce2ef 100644
> --- a/libibverbs/cmd_mr.c
> +++ b/libibverbs/cmd_mr.c
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
>   	return 0;
>   }
>   
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr)
> +{
> +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> +			       UVERBS_METHOD_REG_DMABUF_MR,
> +			       9);
> +	struct ib_uverbs_attr *handle;
> +	uint32_t lkey, rkey;
> +	int ret;
> +
> +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> +
> +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access);
> +
> +	ret = execute_ioctl(pd->context, cmdb);
> +	if (ret)
> +		return errno;
> +
> +	vmr->ibv_mr.handle =
> +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> +	vmr->ibv_mr.context = pd->context;
> +	vmr->ibv_mr.lkey    = lkey;
> +	vmr->ibv_mr.rkey    = rkey;
> +	vmr->ibv_mr.pd	    = pd;
> +	vmr->ibv_mr.addr    = (void *)offset;
> +	vmr->ibv_mr.length  = length;
> +	vmr->mr_type        = IBV_MR_TYPE_MR;
> +	return 0;
> +}
> diff --git a/libibverbs/driver.h b/libibverbs/driver.h
> index ab80f4b..d6a9d0a 100644
> --- a/libibverbs/driver.h
> +++ b/libibverbs/driver.h
> @@ -2,6 +2,7 @@
>    * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
>    * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
>    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation. All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -373,6 +374,9 @@ struct verbs_context_ops {
>   	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
>   				    uint64_t dm_offset, size_t length,
>   				    unsigned int access);
> +	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +					size_t length, uint64_t iova,
> +					int fd, int access);
>   	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
>   				 uint64_t hca_va, int access);
>   	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only);
> @@ -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
>   		      uint32_t flags,
>   		      struct ibv_sge *sg_list,
>   		      uint32_t num_sge);
> +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +			  uint64_t iova, int fd, int access,
> +			  struct verbs_mr *vmr);
>   int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
>   		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
>   		     size_t cmd_size,
> diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
> index e5af9e4..dca96d2 100644
> --- a/libibverbs/dummy_ops.c
> +++ b/libibverbs/dummy_ops.c
> @@ -1,5 +1,6 @@
>   /*
>    * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
>   	return NULL;
>   }
>   
> +static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
> +				    size_t length, uint64_t iova,
> +				    int fd, int access)
> +{
> +	errno = ENOSYS;
> +	return NULL;
> +}
> +
>   static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
>   {
>   	return EOPNOTSUPP;
> @@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
>   	query_srq,
>   	read_counters,
>   	reg_dm_mr,
> +	reg_dmabuf_mr,
>   	reg_mr,
>   	req_notify_cq,
>   	rereg_mr,
> @@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
>   	SET_PRIV_OP_IC(vctx, set_ece);
>   	SET_PRIV_OP_IC(vctx, unimport_mr);
>   	SET_PRIV_OP_IC(vctx, unimport_pd);
> +	SET_OP(vctx, reg_dmabuf_mr);
>   
>   #undef SET_OP
>   #undef SET_OP2
> diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
> index b5ccaca..f67e1ef 100644
> --- a/libibverbs/libibverbs.map.in
> +++ b/libibverbs/libibverbs.map.in
> @@ -148,6 +148,11 @@ IBVERBS_1.11 {
>   		_ibv_query_gid_table;
>   } IBVERBS_1.10;
>   
> +IBVERBS_1.12 {
> +	global:
> +		ibv_reg_dmabuf_mr;
> +} IBVERBS_1.11;
> +
>   /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
>      version. See the top level CMakeLists.txt for this setting. */
>   
> @@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
>   		ibv_cmd_query_srq;
>   		ibv_cmd_read_counters;
>   		ibv_cmd_reg_dm_mr;
> +		ibv_cmd_reg_dmabuf_mr;
>   		ibv_cmd_reg_mr;
>   		ibv_cmd_req_notify_cq;
>   		ibv_cmd_rereg_mr;
> diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
> index 2bfc955..4975c79 100644
> --- a/libibverbs/man/ibv_reg_mr.3
> +++ b/libibverbs/man/ibv_reg_mr.3
> @@ -3,7 +3,7 @@
>   .\"
>   .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
>   .SH "NAME"
> -ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory region (MR)
> +ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \- register or deregister a memory region (MR)
>   .SH "SYNOPSIS"
>   .nf
>   .B #include <infiniband/verbs.h>
> @@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
>   .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
>   .BI "                               int " "access" );
>   .sp
> +.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
> +.BI "                                 size_t " "length" ", int " "access" );
> +.sp


This misses the 'fd' parameter.

>   .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
>   .fi
>   .SH "DESCRIPTION"
> @@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
>   (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
>   IBV_ACCESS_ZERO_BASED.
>   .PP
> +.B ibv_reg_dmabuf_mr()
> +registers a dma-buf based memory region (MR) associated with the protection domain
> +.I pd\fR.
> +The MR starts at
> +.I offset
> +of the dma-buf and its size is
> +.I length\fR.
> +The dma-buf is identified by the file descriptor
> +.I fd\fR.
> +The argument
> +.I access
> +describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are supported:
> +.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC, IBV_ACCESS_RELAXED_ORDERING.
> +.PP
>   .B ibv_dereg_mr()
>   deregisters the MR
>   .I mr\fR.
>   .SH "RETURN VALUE"
> -.B ibv_reg_mr() / ibv_reg_mr_iova()
> +.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
>   returns a pointer to the registered MR, or NULL if the request fails.
>   The local key (\fBL_Key\fR) field
>   .B lkey
> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
> index 2b0ede8..84ddac7 100644
> --- a/libibverbs/verbs.c
> +++ b/libibverbs/verbs.c
> @@ -1,6 +1,7 @@
>   /*
>    * Copyright (c) 2005 Topspin Communications.  All rights reserved.
>    * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corperation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
>   	get_ops(mr->context)->unimport_mr(mr);
>   }
>   
> +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> +		   struct ibv_mr *,
> +		   struct ibv_pd *pd, uint64_t offset,
> +		   size_t length, int fd, int access)
> +{


  ibv_dereg_mr() -> ibv_dofork_range() doesn't seem to be applicable in 
this case, right ? if offset / addr won't be 0 it may be activated on 
the address range.

Do we rely on applications to *not* turn on fork support in libibverbs ? 
(e.g. call ibv_fork_init()), what if some other registrations may need 
being fork safe ?

> +	struct ibv_mr *mr;
> +
> +	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
> +						 fd, access);
> +	if (mr) {
> +		mr->context = pd->context;
> +		mr->pd      = pd;
> +		mr->addr    = (void *)offset;
> +		mr->length  = length;
> +	}
> +	return mr;
> +}
> +
>   LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
>   		   int,
>   		   struct ibv_mr *mr, int flags,
> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> index ee57e05..3961b1e 100644
> --- a/libibverbs/verbs.h
> +++ b/libibverbs/verbs.h
> @@ -3,6 +3,7 @@
>    * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
>    * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
>    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
>    *
>    * This software is available to you under a choice of one of two
>    * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -2133,6 +2134,9 @@ struct verbs_context {
>   	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
>   					     struct ibv_xrcd_init_attr *xrcd_init_attr);
>   	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> +						 size_t length, uint64_t iova,
> +						 int fd, int access);
>   	uint64_t _ABI_placeholder3;
>   	size_t   sz;			/* Must be immediately before struct ibv_context */
>   	struct ibv_context context;	/* Must be last field in the struct */
> @@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
>   			  __builtin_constant_p(                                \
>   				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
>   
> +/**
> + * ibv_reg_dmabuf_mr - Register a dambuf-based memory region
> + */
> +struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> +				 int fd, int access);
> +
>   enum ibv_rereg_mr_err_code {
>   	/* Old MR is valid, invalid input */
>   	IBV_REREG_MR_ERR_INPUT = -1,


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-23 18:05     ` Jason Gunthorpe
@ 2020-11-24 15:16       ` Daniel Vetter
  -1 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2020-11-24 15:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jianxin Xiong, Leon Romanovsky, linux-rdma, dri-devel,
	Doug Ledford, Daniel Vetter, Christian Koenig

On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the

Please use /dev/dri/renderD* instead. That's the interface meant for
unpriviledged rendering access. card* is the legacy interface with
backwards compat galore, don't use.

Specifically if you do this on a gpu which also has display (maybe some
testing on a local developer machine, no idea ...) then you mess with
compositors and stuff.

Also wherever you copied this from, please also educate those teams that
using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)

Yeah no, don't allocate render buffers with create_dumb. Every time this
comes up I'm wondering whether we should just completely disable dma-buf
operations on these. Dumb buffers are explicitly only for software
rendering for display purposes when the gpu userspace stack isn't fully
running yet, aka boot splash.

And yes I know there's endless amounts of abuse of that stuff floating
around, especially on arm-soc/android systems.

> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment
> before each pack specifying the 'struct XXX' this is following?
> 
> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 
> Christian, I would be very happy to hear from you that this entire
> work is good for AMD as well

I think the smallest generic interface for allocating gpu buffers which
are more useful than the stuff you get from CREATE_DUMB is gbm. That's
used by compositors to get bare metal opengl going on linux. Ofc Android
has gralloc for the same purpose, and cros has minigbm (which isn't the
same as gbm at all). So not so cool.

The other generic option is using vulkan, which works directly on bare
metal (without a compositor or anything running), and is cross vendor. So
cool, except not used for compute, which is generally the thing you want
if you have an rdma card.

Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back,
properly.

Compute is the worst, because opencl is widely considered a mistake (maybe
opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
have xe (intel-only).

It's pretty glorious :-/

Also I think we discussed this already, but for actual p2p the intel
patches aren't in upstream yet. We have some internally, but with very
broken locking (in the process of getting fixed up, but it's taking time).

Cheers, Daniel

> Edward should look through this, but I'm glad to see something like
> this
> 
> Thanks,
> Jason
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-24 15:16       ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2020-11-24 15:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford,
	Daniel Vetter, Christian Koenig, Jianxin Xiong

On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the

Please use /dev/dri/renderD* instead. That's the interface meant for
unpriviledged rendering access. card* is the legacy interface with
backwards compat galore, don't use.

Specifically if you do this on a gpu which also has display (maybe some
testing on a local developer machine, no idea ...) then you mess with
compositors and stuff.

Also wherever you copied this from, please also educate those teams that
using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)

Yeah no, don't allocate render buffers with create_dumb. Every time this
comes up I'm wondering whether we should just completely disable dma-buf
operations on these. Dumb buffers are explicitly only for software
rendering for display purposes when the gpu userspace stack isn't fully
running yet, aka boot splash.

And yes I know there's endless amounts of abuse of that stuff floating
around, especially on arm-soc/android systems.

> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment
> before each pack specifying the 'struct XXX' this is following?
> 
> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 
> Christian, I would be very happy to hear from you that this entire
> work is good for AMD as well

I think the smallest generic interface for allocating gpu buffers which
are more useful than the stuff you get from CREATE_DUMB is gbm. That's
used by compositors to get bare metal opengl going on linux. Ofc Android
has gralloc for the same purpose, and cros has minigbm (which isn't the
same as gbm at all). So not so cool.

The other generic option is using vulkan, which works directly on bare
metal (without a compositor or anything running), and is cross vendor. So
cool, except not used for compute, which is generally the thing you want
if you have an rdma card.

Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back,
properly.

Compute is the worst, because opencl is widely considered a mistake (maybe
opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
have xe (intel-only).

It's pretty glorious :-/

Also I think we discussed this already, but for actual p2p the intel
patches aren't in upstream yet. We have some internally, but with very
broken locking (in the process of getting fixed up, but it's taking time).

Cheers, Daniel

> Edward should look through this, but I'm glad to see something like
> this
> 
> Thanks,
> Jason
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-24 15:16       ` Daniel Vetter
@ 2020-11-24 15:36         ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-24 15:36 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jianxin Xiong, Leon Romanovsky, linux-rdma, dri-devel,
	Doug Ledford, Daniel Vetter, Christian Koenig

On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:

> Compute is the worst, because opencl is widely considered a mistake (maybe
> opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
> cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
> have xe (intel-only).

> It's pretty glorious :-/

I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third
API" ;)

Hopefuly xe compute won't leave a lot of half finished abandoned
kernel code like Xeon Phi did :(

> Also I think we discussed this already, but for actual p2p the intel
> patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).

Someone needs to say this test works on a real system with an
unpatched upstream driver.

I thought AMD had the needed parts merged?

Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-24 15:36         ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-24 15:36 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford,
	Daniel Vetter, Christian Koenig, Jianxin Xiong

On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:

> Compute is the worst, because opencl is widely considered a mistake (maybe
> opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
> cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
> have xe (intel-only).

> It's pretty glorious :-/

I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third
API" ;)

Hopefuly xe compute won't leave a lot of half finished abandoned
kernel code like Xeon Phi did :(

> Also I think we discussed this already, but for actual p2p the intel
> patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).

Someone needs to say this test works on a real system with an
unpatched upstream driver.

I thought AMD had the needed parts merged?

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-24 15:36         ` Jason Gunthorpe
@ 2020-11-24 16:21           ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 16:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Tuesday, November 24, 2020 7:36 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:
> 
> > Compute is the worst, because opencl is widely considered a mistake
> > (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually
> > used stuff is cuda (nvidia-only), rocm (amd-only) and now with intel
> > also playing we have xe (intel-only).
> 
> > It's pretty glorious :-/
> 
> I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third API" ;)
> 
> Hopefuly xe compute won't leave a lot of half finished abandoned kernel code like Xeon Phi did :(
> 
> > Also I think we discussed this already, but for actual p2p the intel
> > patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> 
> Someone needs to say this test works on a real system with an unpatched upstream driver.
> 
> I thought AMD had the needed parts merged?

Yes, I have tested these with AMD GPU.

> 
> Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-24 16:21           ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 16:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Tuesday, November 24, 2020 7:36 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:
> 
> > Compute is the worst, because opencl is widely considered a mistake
> > (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually
> > used stuff is cuda (nvidia-only), rocm (amd-only) and now with intel
> > also playing we have xe (intel-only).
> 
> > It's pretty glorious :-/
> 
> I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third API" ;)
> 
> Hopefuly xe compute won't leave a lot of half finished abandoned kernel code like Xeon Phi did :(
> 
> > Also I think we discussed this already, but for actual p2p the intel
> > patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> 
> Someone needs to say this test works on a real system with an unpatched upstream driver.
> 
> I thought AMD had the needed parts merged?

Yes, I have tested these with AMD GPU.

> 
> Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
  2020-11-24 10:31     ` Yishai Hadas
@ 2020-11-24 18:00       ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 18:00 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: linux-rdma, dri-devel, Doug Ledford, Jason Gunthorpe,
	Leon Romanovsky, Sumit Semwal, Christian Koenig, Vetter, Daniel

> -----Original Message-----
> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Tuesday, November 24, 2020 2:32 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Jason Gunthorpe <jgg@ziepe.ca>;
> Leon Romanovsky <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter,
> Daniel <daniel.vetter@intel.com>; Yishai Hadas <yishaih@nvidia.com>
> Subject: Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
> 
> On 11/23/2020 7:53 PM, Jianxin Xiong wrote:
> > Add new API function and new provider method for registering dma-buf
> > based memory region. Update the man page and bump the API version.
> >
> > Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> > ---
> >   kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
> >   libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
> >   libibverbs/driver.h                      |  7 ++++++
> >   libibverbs/dummy_ops.c                   | 11 +++++++++
> >   libibverbs/libibverbs.map.in             |  6 +++++
> >   libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
> >   libibverbs/verbs.c                       | 19 ++++++++++++++++
> >   libibverbs/verbs.h                       | 10 +++++++++
> >   8 files changed, 124 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > index 7968a18..dafc7eb 100644
> > --- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> > + * Copyright (c) 2020, Intel Corporation. All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
> >   	UVERBS_METHOD_MR_DESTROY,
> >   	UVERBS_METHOD_ADVISE_MR,
> >   	UVERBS_METHOD_QUERY_MR,
> > +	UVERBS_METHOD_REG_DMABUF_MR,
> >   };
> >
> >   enum uverbs_attrs_mr_destroy_ids {
> > @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
> >   	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
> >   };
> >
> > +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> > +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> > +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> > +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> > +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> > +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> > +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> > +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> > +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> > +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> > +};
> > +
> >   enum uverbs_attrs_create_counters_cmd_attr_ids {
> >   	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
> >   };
> > diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c index
> > 42dbe42..91ce2ef 100644
> > --- a/libibverbs/cmd_mr.c
> > +++ b/libibverbs/cmd_mr.c
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
> >   	return 0;
> >   }
> >
> > +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +			  uint64_t iova, int fd, int access,
> > +			  struct verbs_mr *vmr)
> > +{
> > +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> > +			       UVERBS_METHOD_REG_DMABUF_MR,
> > +			       9);
> > +	struct ib_uverbs_attr *handle;
> > +	uint32_t lkey, rkey;
> > +	int ret;
> > +
> > +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> > +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> > +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> > +
> > +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> > +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> > +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> > +access);
> > +
> > +	ret = execute_ioctl(pd->context, cmdb);
> > +	if (ret)
> > +		return errno;
> > +
> > +	vmr->ibv_mr.handle =
> > +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> > +	vmr->ibv_mr.context = pd->context;
> > +	vmr->ibv_mr.lkey    = lkey;
> > +	vmr->ibv_mr.rkey    = rkey;
> > +	vmr->ibv_mr.pd	    = pd;
> > +	vmr->ibv_mr.addr    = (void *)offset;
> > +	vmr->ibv_mr.length  = length;
> > +	vmr->mr_type        = IBV_MR_TYPE_MR;
> > +	return 0;
> > +}
> > diff --git a/libibverbs/driver.h b/libibverbs/driver.h index
> > ab80f4b..d6a9d0a 100644
> > --- a/libibverbs/driver.h
> > +++ b/libibverbs/driver.h
> > @@ -2,6 +2,7 @@
> >    * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
> >    * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
> >    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation. All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -373,6 +374,9 @@ struct verbs_context_ops {
> >   	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
> >   				    uint64_t dm_offset, size_t length,
> >   				    unsigned int access);
> > +	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> > +					size_t length, uint64_t iova,
> > +					int fd, int access);
> >   	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
> >   				 uint64_t hca_va, int access);
> >   	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only); @@
> > -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
> >   		      uint32_t flags,
> >   		      struct ibv_sge *sg_list,
> >   		      uint32_t num_sge);
> > +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +			  uint64_t iova, int fd, int access,
> > +			  struct verbs_mr *vmr);
> >   int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
> >   		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
> >   		     size_t cmd_size,
> > diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c index
> > e5af9e4..dca96d2 100644
> > --- a/libibverbs/dummy_ops.c
> > +++ b/libibverbs/dummy_ops.c
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
> >   	return NULL;
> >   }
> >
> > +static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
> > +				    size_t length, uint64_t iova,
> > +				    int fd, int access)
> > +{
> > +	errno = ENOSYS;
> > +	return NULL;
> > +}
> > +
> >   static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
> >   {
> >   	return EOPNOTSUPP;
> > @@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
> >   	query_srq,
> >   	read_counters,
> >   	reg_dm_mr,
> > +	reg_dmabuf_mr,
> >   	reg_mr,
> >   	req_notify_cq,
> >   	rereg_mr,
> > @@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
> >   	SET_PRIV_OP_IC(vctx, set_ece);
> >   	SET_PRIV_OP_IC(vctx, unimport_mr);
> >   	SET_PRIV_OP_IC(vctx, unimport_pd);
> > +	SET_OP(vctx, reg_dmabuf_mr);
> >
> >   #undef SET_OP
> >   #undef SET_OP2
> > diff --git a/libibverbs/libibverbs.map.in
> > b/libibverbs/libibverbs.map.in index b5ccaca..f67e1ef 100644
> > --- a/libibverbs/libibverbs.map.in
> > +++ b/libibverbs/libibverbs.map.in
> > @@ -148,6 +148,11 @@ IBVERBS_1.11 {
> >   		_ibv_query_gid_table;
> >   } IBVERBS_1.10;
> >
> > +IBVERBS_1.12 {
> > +	global:
> > +		ibv_reg_dmabuf_mr;
> > +} IBVERBS_1.11;
> > +
> >   /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
> >      version. See the top level CMakeLists.txt for this setting. */
> >
> > @@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
> >   		ibv_cmd_query_srq;
> >   		ibv_cmd_read_counters;
> >   		ibv_cmd_reg_dm_mr;
> > +		ibv_cmd_reg_dmabuf_mr;
> >   		ibv_cmd_reg_mr;
> >   		ibv_cmd_req_notify_cq;
> >   		ibv_cmd_rereg_mr;
> > diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
> > index 2bfc955..4975c79 100644
> > --- a/libibverbs/man/ibv_reg_mr.3
> > +++ b/libibverbs/man/ibv_reg_mr.3
> > @@ -3,7 +3,7 @@
> >   .\"
> >   .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
> >   .SH "NAME"
> > -ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a
> > memory region (MR)
> > +ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \-
> > +register or deregister a memory region (MR)
> >   .SH "SYNOPSIS"
> >   .nf
> >   .B #include <infiniband/verbs.h>
> > @@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
> >   .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
> >   .BI "                               int " "access" );
> >   .sp
> > +.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
> > +.BI "                                 size_t " "length" ", int " "access" );
> > +.sp
> 
> 
> This misses the 'fd' parameter.

Thanks. This has been fixed in the new version I am working on (and the github PR).

> 
> >   .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
> >   .fi
> >   .SH "DESCRIPTION"
> > @@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
> >   (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
> >   IBV_ACCESS_ZERO_BASED.
> >   .PP
> > +.B ibv_reg_dmabuf_mr()
> > +registers a dma-buf based memory region (MR) associated with the
> > +protection domain .I pd\fR.
> > +The MR starts at
> > +.I offset
> > +of the dma-buf and its size is
> > +.I length\fR.
> > +The dma-buf is identified by the file descriptor .I fd\fR.
> > +The argument
> > +.I access
> > +describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are
> supported:
> > +.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC,
> IBV_ACCESS_RELAXED_ORDERING.
> > +.PP
> >   .B ibv_dereg_mr()
> >   deregisters the MR
> >   .I mr\fR.
> >   .SH "RETURN VALUE"
> > -.B ibv_reg_mr() / ibv_reg_mr_iova()
> > +.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
> >   returns a pointer to the registered MR, or NULL if the request fails.
> >   The local key (\fBL_Key\fR) field
> >   .B lkey
> > diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c index
> > 2b0ede8..84ddac7 100644
> > --- a/libibverbs/verbs.c
> > +++ b/libibverbs/verbs.c
> > @@ -1,6 +1,7 @@
> >   /*
> >    * Copyright (c) 2005 Topspin Communications.  All rights reserved.
> >    * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corperation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
> >   	get_ops(mr->context)->unimport_mr(mr);
> >   }
> >
> > +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> > +		   struct ibv_mr *,
> > +		   struct ibv_pd *pd, uint64_t offset,
> > +		   size_t length, int fd, int access) {
> 
> 
>   ibv_dereg_mr() -> ibv_dofork_range() doesn't seem to be applicable in this case, right ? if offset / addr won't be 0 it may be activated on
> the address range.

I missed that one. Should add condition to ibv_dereg_mr() to not call ibv_dofork_range()
on dmabuf mr.  

> 
> Do we rely on applications to *not* turn on fork support in libibverbs ?
> (e.g. call ibv_fork_init()), what if some other registrations may need being fork safe ?
> 

If the backing storage is device memory that doesn't have a mapping in the virtual
address space, it's not affected by fork. If the backing storage is shared/system memory
that has user accessible virtual address (e.g. iGPU), the exporter is responsible for handling
fork() -- and in that case, regular memory registration can be used instead, unless for the
purpose of testing dma-buf functionality.  

> > +	struct ibv_mr *mr;
> > +
> > +	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
> > +						 fd, access);
> > +	if (mr) {
> > +		mr->context = pd->context;
> > +		mr->pd      = pd;
> > +		mr->addr    = (void *)offset;
> > +		mr->length  = length;
> > +	}
> > +	return mr;
> > +}
> > +
> >   LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
> >   		   int,
> >   		   struct ibv_mr *mr, int flags,
> > diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h index
> > ee57e05..3961b1e 100644
> > --- a/libibverbs/verbs.h
> > +++ b/libibverbs/verbs.h
> > @@ -3,6 +3,7 @@
> >    * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
> >    * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> >    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -2133,6 +2134,9 @@ struct verbs_context {
> >   	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
> >   					     struct ibv_xrcd_init_attr *xrcd_init_attr);
> >   	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> > +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> > +						 size_t length, uint64_t iova,
> > +						 int fd, int access);
> >   	uint64_t _ABI_placeholder3;
> >   	size_t   sz;			/* Must be immediately before struct ibv_context */
> >   	struct ibv_context context;	/* Must be last field in the struct */
> > @@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
> >   			  __builtin_constant_p(                                \
> >   				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
> >
> > +/**
> > + * ibv_reg_dmabuf_mr - Register a dambuf-based memory region  */
> > +struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +				 int fd, int access);
> > +
> >   enum ibv_rereg_mr_err_code {
> >   	/* Old MR is valid, invalid input */
> >   	IBV_REREG_MR_ERR_INPUT = -1,
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
@ 2020-11-24 18:00       ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 18:00 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Christian Koenig,
	Jason Gunthorpe, Doug Ledford, Vetter, Daniel

> -----Original Message-----
> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Tuesday, November 24, 2020 2:32 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Jason Gunthorpe <jgg@ziepe.ca>;
> Leon Romanovsky <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter,
> Daniel <daniel.vetter@intel.com>; Yishai Hadas <yishaih@nvidia.com>
> Subject: Re: [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region
> 
> On 11/23/2020 7:53 PM, Jianxin Xiong wrote:
> > Add new API function and new provider method for registering dma-buf
> > based memory region. Update the man page and bump the API version.
> >
> > Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> > ---
> >   kernel-headers/rdma/ib_user_ioctl_cmds.h | 14 ++++++++++++
> >   libibverbs/cmd_mr.c                      | 38 ++++++++++++++++++++++++++++++++
> >   libibverbs/driver.h                      |  7 ++++++
> >   libibverbs/dummy_ops.c                   | 11 +++++++++
> >   libibverbs/libibverbs.map.in             |  6 +++++
> >   libibverbs/man/ibv_reg_mr.3              | 21 ++++++++++++++++--
> >   libibverbs/verbs.c                       | 19 ++++++++++++++++
> >   libibverbs/verbs.h                       | 10 +++++++++
> >   8 files changed, 124 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > index 7968a18..dafc7eb 100644
> > --- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > +++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
> > + * Copyright (c) 2020, Intel Corporation. All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -251,6 +252,7 @@ enum uverbs_methods_mr {
> >   	UVERBS_METHOD_MR_DESTROY,
> >   	UVERBS_METHOD_ADVISE_MR,
> >   	UVERBS_METHOD_QUERY_MR,
> > +	UVERBS_METHOD_REG_DMABUF_MR,
> >   };
> >
> >   enum uverbs_attrs_mr_destroy_ids {
> > @@ -272,6 +274,18 @@ enum uverbs_attrs_query_mr_cmd_attr_ids {
> >   	UVERBS_ATTR_QUERY_MR_RESP_IOVA,
> >   };
> >
> > +enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids {
> > +	UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
> > +	UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
> > +	UVERBS_ATTR_REG_DMABUF_MR_OFFSET,
> > +	UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
> > +	UVERBS_ATTR_REG_DMABUF_MR_IOVA,
> > +	UVERBS_ATTR_REG_DMABUF_MR_FD,
> > +	UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> > +	UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
> > +	UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
> > +};
> > +
> >   enum uverbs_attrs_create_counters_cmd_attr_ids {
> >   	UVERBS_ATTR_CREATE_COUNTERS_HANDLE,
> >   };
> > diff --git a/libibverbs/cmd_mr.c b/libibverbs/cmd_mr.c index
> > 42dbe42..91ce2ef 100644
> > --- a/libibverbs/cmd_mr.c
> > +++ b/libibverbs/cmd_mr.c
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2018 Mellanox Technologies, Ltd.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -116,3 +117,40 @@ int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr,
> >   	return 0;
> >   }
> >
> > +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +			  uint64_t iova, int fd, int access,
> > +			  struct verbs_mr *vmr)
> > +{
> > +	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_MR,
> > +			       UVERBS_METHOD_REG_DMABUF_MR,
> > +			       9);
> > +	struct ib_uverbs_attr *handle;
> > +	uint32_t lkey, rkey;
> > +	int ret;
> > +
> > +	handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
> > +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey);
> > +	fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey);
> > +
> > +	fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length);
> > +	fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova);
> > +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd);
> > +	fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
> > +access);
> > +
> > +	ret = execute_ioctl(pd->context, cmdb);
> > +	if (ret)
> > +		return errno;
> > +
> > +	vmr->ibv_mr.handle =
> > +		read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle);
> > +	vmr->ibv_mr.context = pd->context;
> > +	vmr->ibv_mr.lkey    = lkey;
> > +	vmr->ibv_mr.rkey    = rkey;
> > +	vmr->ibv_mr.pd	    = pd;
> > +	vmr->ibv_mr.addr    = (void *)offset;
> > +	vmr->ibv_mr.length  = length;
> > +	vmr->mr_type        = IBV_MR_TYPE_MR;
> > +	return 0;
> > +}
> > diff --git a/libibverbs/driver.h b/libibverbs/driver.h index
> > ab80f4b..d6a9d0a 100644
> > --- a/libibverbs/driver.h
> > +++ b/libibverbs/driver.h
> > @@ -2,6 +2,7 @@
> >    * Copyright (c) 2004, 2005 Topspin Communications.  All rights reserved.
> >    * Copyright (c) 2005, 2006 Cisco Systems, Inc.  All rights reserved.
> >    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation. All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -373,6 +374,9 @@ struct verbs_context_ops {
> >   	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
> >   				    uint64_t dm_offset, size_t length,
> >   				    unsigned int access);
> > +	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> > +					size_t length, uint64_t iova,
> > +					int fd, int access);
> >   	struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length,
> >   				 uint64_t hca_va, int access);
> >   	int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only); @@
> > -498,6 +502,9 @@ int ibv_cmd_advise_mr(struct ibv_pd *pd,
> >   		      uint32_t flags,
> >   		      struct ibv_sge *sg_list,
> >   		      uint32_t num_sge);
> > +int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +			  uint64_t iova, int fd, int access,
> > +			  struct verbs_mr *vmr);
> >   int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type,
> >   		     struct ibv_mw *mw, struct ibv_alloc_mw *cmd,
> >   		     size_t cmd_size,
> > diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c index
> > e5af9e4..dca96d2 100644
> > --- a/libibverbs/dummy_ops.c
> > +++ b/libibverbs/dummy_ops.c
> > @@ -1,5 +1,6 @@
> >   /*
> >    * Copyright (c) 2017 Mellanox Technologies, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -452,6 +453,14 @@ static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
> >   	return NULL;
> >   }
> >
> > +static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
> > +				    size_t length, uint64_t iova,
> > +				    int fd, int access)
> > +{
> > +	errno = ENOSYS;
> > +	return NULL;
> > +}
> > +
> >   static int req_notify_cq(struct ibv_cq *cq, int solicited_only)
> >   {
> >   	return EOPNOTSUPP;
> > @@ -560,6 +569,7 @@ const struct verbs_context_ops verbs_dummy_ops = {
> >   	query_srq,
> >   	read_counters,
> >   	reg_dm_mr,
> > +	reg_dmabuf_mr,
> >   	reg_mr,
> >   	req_notify_cq,
> >   	rereg_mr,
> > @@ -689,6 +699,7 @@ void verbs_set_ops(struct verbs_context *vctx,
> >   	SET_PRIV_OP_IC(vctx, set_ece);
> >   	SET_PRIV_OP_IC(vctx, unimport_mr);
> >   	SET_PRIV_OP_IC(vctx, unimport_pd);
> > +	SET_OP(vctx, reg_dmabuf_mr);
> >
> >   #undef SET_OP
> >   #undef SET_OP2
> > diff --git a/libibverbs/libibverbs.map.in
> > b/libibverbs/libibverbs.map.in index b5ccaca..f67e1ef 100644
> > --- a/libibverbs/libibverbs.map.in
> > +++ b/libibverbs/libibverbs.map.in
> > @@ -148,6 +148,11 @@ IBVERBS_1.11 {
> >   		_ibv_query_gid_table;
> >   } IBVERBS_1.10;
> >
> > +IBVERBS_1.12 {
> > +	global:
> > +		ibv_reg_dmabuf_mr;
> > +} IBVERBS_1.11;
> > +
> >   /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
> >      version. See the top level CMakeLists.txt for this setting. */
> >
> > @@ -211,6 +216,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
> >   		ibv_cmd_query_srq;
> >   		ibv_cmd_read_counters;
> >   		ibv_cmd_reg_dm_mr;
> > +		ibv_cmd_reg_dmabuf_mr;
> >   		ibv_cmd_reg_mr;
> >   		ibv_cmd_req_notify_cq;
> >   		ibv_cmd_rereg_mr;
> > diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3
> > index 2bfc955..4975c79 100644
> > --- a/libibverbs/man/ibv_reg_mr.3
> > +++ b/libibverbs/man/ibv_reg_mr.3
> > @@ -3,7 +3,7 @@
> >   .\"
> >   .TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
> >   .SH "NAME"
> > -ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a
> > memory region (MR)
> > +ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \-
> > +register or deregister a memory region (MR)
> >   .SH "SYNOPSIS"
> >   .nf
> >   .B #include <infiniband/verbs.h>
> > @@ -15,6 +15,9 @@ ibv_reg_mr, ibv_reg_mr_iova, ibv_dereg_mr \- register or deregister a memory reg
> >   .BI "                               size_t " "length" ", uint64_t " "hca_va" ,
> >   .BI "                               int " "access" );
> >   .sp
> > +.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
> > +.BI "                                 size_t " "length" ", int " "access" );
> > +.sp
> 
> 
> This misses the 'fd' parameter.

Thanks. This has been fixed in the new version I am working on (and the github PR).

> 
> >   .BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
> >   .fi
> >   .SH "DESCRIPTION"
> > @@ -71,11 +74,25 @@ a lkey or rkey. The offset in the memory region is computed as 'addr +
> >   (iova - hca_va)'. Specifying 0 for hca_va has the same effect as
> >   IBV_ACCESS_ZERO_BASED.
> >   .PP
> > +.B ibv_reg_dmabuf_mr()
> > +registers a dma-buf based memory region (MR) associated with the
> > +protection domain .I pd\fR.
> > +The MR starts at
> > +.I offset
> > +of the dma-buf and its size is
> > +.I length\fR.
> > +The dma-buf is identified by the file descriptor .I fd\fR.
> > +The argument
> > +.I access
> > +describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are
> supported:
> > +.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC,
> IBV_ACCESS_RELAXED_ORDERING.
> > +.PP
> >   .B ibv_dereg_mr()
> >   deregisters the MR
> >   .I mr\fR.
> >   .SH "RETURN VALUE"
> > -.B ibv_reg_mr() / ibv_reg_mr_iova()
> > +.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
> >   returns a pointer to the registered MR, or NULL if the request fails.
> >   The local key (\fBL_Key\fR) field
> >   .B lkey
> > diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c index
> > 2b0ede8..84ddac7 100644
> > --- a/libibverbs/verbs.c
> > +++ b/libibverbs/verbs.c
> > @@ -1,6 +1,7 @@
> >   /*
> >    * Copyright (c) 2005 Topspin Communications.  All rights reserved.
> >    * Copyright (c) 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corperation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -367,6 +368,24 @@ void ibv_unimport_mr(struct ibv_mr *mr)
> >   	get_ops(mr->context)->unimport_mr(mr);
> >   }
> >
> > +LATEST_SYMVER_FUNC(ibv_reg_dmabuf_mr, 1_12, "IBVERBS_1.12",
> > +		   struct ibv_mr *,
> > +		   struct ibv_pd *pd, uint64_t offset,
> > +		   size_t length, int fd, int access) {
> 
> 
>   ibv_dereg_mr() -> ibv_dofork_range() doesn't seem to be applicable in this case, right ? if offset / addr won't be 0 it may be activated on
> the address range.

I missed that one. Should add condition to ibv_dereg_mr() to not call ibv_dofork_range()
on dmabuf mr.  

> 
> Do we rely on applications to *not* turn on fork support in libibverbs ?
> (e.g. call ibv_fork_init()), what if some other registrations may need being fork safe ?
> 

If the backing storage is device memory that doesn't have a mapping in the virtual
address space, it's not affected by fork. If the backing storage is shared/system memory
that has user accessible virtual address (e.g. iGPU), the exporter is responsible for handling
fork() -- and in that case, regular memory registration can be used instead, unless for the
purpose of testing dma-buf functionality.  

> > +	struct ibv_mr *mr;
> > +
> > +	mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, offset,
> > +						 fd, access);
> > +	if (mr) {
> > +		mr->context = pd->context;
> > +		mr->pd      = pd;
> > +		mr->addr    = (void *)offset;
> > +		mr->length  = length;
> > +	}
> > +	return mr;
> > +}
> > +
> >   LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1",
> >   		   int,
> >   		   struct ibv_mr *mr, int flags,
> > diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h index
> > ee57e05..3961b1e 100644
> > --- a/libibverbs/verbs.h
> > +++ b/libibverbs/verbs.h
> > @@ -3,6 +3,7 @@
> >    * Copyright (c) 2004, 2011-2012 Intel Corporation.  All rights reserved.
> >    * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc.  All rights reserved.
> >    * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
> > + * Copyright (c) 2020 Intel Corporation.  All rights reserved.
> >    *
> >    * This software is available to you under a choice of one of two
> >    * licenses.  You may choose to be licensed under the terms of the
> > GNU @@ -2133,6 +2134,9 @@ struct verbs_context {
> >   	struct ibv_xrcd *	(*open_xrcd)(struct ibv_context *context,
> >   					     struct ibv_xrcd_init_attr *xrcd_init_attr);
> >   	int			(*close_xrcd)(struct ibv_xrcd *xrcd);
> > +	struct ibv_mr *		(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset,
> > +						 size_t length, uint64_t iova,
> > +						 int fd, int access);
> >   	uint64_t _ABI_placeholder3;
> >   	size_t   sz;			/* Must be immediately before struct ibv_context */
> >   	struct ibv_context context;	/* Must be last field in the struct */
> > @@ -2535,6 +2539,12 @@ __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova,
> >   			  __builtin_constant_p(                                \
> >   				  ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0))
> >
> > +/**
> > + * ibv_reg_dmabuf_mr - Register a dambuf-based memory region  */
> > +struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length,
> > +				 int fd, int access);
> > +
> >   enum ibv_rereg_mr_err_code {
> >   	/* Old MR is valid, invalid input */
> >   	IBV_REREG_MR_ERR_INPUT = -1,
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-24 15:16       ` Daniel Vetter
@ 2020-11-24 18:45         ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 18:45 UTC (permalink / raw)
  To: Daniel Vetter, Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Tuesday, November 24, 2020 7:17 AM
> To: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> >
> > > +cdef class DmaBuf:
> > > +    def __init__(self, size, unit=0):
> > > +        """
> > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > +        DRI device interface (/dev/dri/card*). Usually this
> > > +requires the
> 
> Please use /dev/dri/renderD* instead. That's the interface meant for unpriviledged rendering access. card* is the legacy interface with
> backwards compat galore, don't use.
> 
> Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> compositors and stuff.
> 
> Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

/dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
mode setting commands (including dumb_buf). The original intention here is to
have something to support the new tests added, not for general compute. 

> 
> > > +        effective user id being root or being a member of the 'video' group.
> > > +        :param size: The size (in number of bytes) of the buffer.
> > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > +        :return: The newly created DmaBuf object on success.
> > > +        """
> > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > +
> > > +        args = bytearray(32)
> > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > > + args)
> 
> Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> stack isn't fully running yet, aka boot splash.
> 
> And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.

One alternative is to use the GEM_CREATE method which can be done via the renderD*
device, but the command is vendor specific, so the logic is a little bit more complex. 

> 
> > > +
> > > +        args = bytearray(12)
> > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > +        a, b, self.fd = unpack('=iii', args)
> > > +
> > > +        args = bytearray(16)
> > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > +        a, b, self.map_offset = unpack('=iiq', args);
> >
> > Wow, OK
> >
> > Is it worth using ctypes here instead? Can you at least add a comment
> > before each pack specifying the 'struct XXX' this is following?
> >
> > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> >
> > Christian, I would be very happy to hear from you that this entire
> > work is good for AMD as well
> 
> I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> (which isn't the same as gbm at all). So not so cool.

Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative? 
That would be much simpler than going with gbm and less dependency in setting up
the testing evrionment.

> 
> The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> 
> Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> 
> Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> 
> It's pretty glorious :-/
> 
> Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).
> 
> Cheers, Daniel
> 
> > Edward should look through this, but I'm glad to see something like
> > this
> >
> > Thanks,
> > Jason
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-24 18:45         ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 18:45 UTC (permalink / raw)
  To: Daniel Vetter, Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Tuesday, November 24, 2020 7:17 AM
> To: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> >
> > > +cdef class DmaBuf:
> > > +    def __init__(self, size, unit=0):
> > > +        """
> > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > +        DRI device interface (/dev/dri/card*). Usually this
> > > +requires the
> 
> Please use /dev/dri/renderD* instead. That's the interface meant for unpriviledged rendering access. card* is the legacy interface with
> backwards compat galore, don't use.
> 
> Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> compositors and stuff.
> 
> Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

/dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
mode setting commands (including dumb_buf). The original intention here is to
have something to support the new tests added, not for general compute. 

> 
> > > +        effective user id being root or being a member of the 'video' group.
> > > +        :param size: The size (in number of bytes) of the buffer.
> > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > +        :return: The newly created DmaBuf object on success.
> > > +        """
> > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > +
> > > +        args = bytearray(32)
> > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > > + args)
> 
> Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> stack isn't fully running yet, aka boot splash.
> 
> And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.

One alternative is to use the GEM_CREATE method which can be done via the renderD*
device, but the command is vendor specific, so the logic is a little bit more complex. 

> 
> > > +
> > > +        args = bytearray(12)
> > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > +        a, b, self.fd = unpack('=iii', args)
> > > +
> > > +        args = bytearray(16)
> > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > +        a, b, self.map_offset = unpack('=iiq', args);
> >
> > Wow, OK
> >
> > Is it worth using ctypes here instead? Can you at least add a comment
> > before each pack specifying the 'struct XXX' this is following?
> >
> > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> >
> > Christian, I would be very happy to hear from you that this entire
> > work is good for AMD as well
> 
> I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> (which isn't the same as gbm at all). So not so cool.

Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative? 
That would be much simpler than going with gbm and less dependency in setting up
the testing evrionment.

> 
> The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> 
> Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> 
> Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> 
> It's pretty glorious :-/
> 
> Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).
> 
> Cheers, Daniel
> 
> > Edward should look through this, but I'm glad to see something like
> > this
> >
> > Thanks,
> > Jason
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
  2020-11-23 17:53   ` Jianxin Xiong
@ 2020-11-24 20:26     ` John Hubbard
  -1 siblings, 0 replies; 48+ messages in thread
From: John Hubbard @ 2020-11-24 20:26 UTC (permalink / raw)
  To: Jianxin Xiong, linux-rdma, dri-devel
  Cc: Doug Ledford, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
	Christian Koenig, Daniel Vetter

Just some silly nits I stumbled across while trying to understand the tests.

On 11/23/20 9:53 AM, Jianxin Xiong wrote:
> The filter defintion is wrong and causes get_access_flags() always

              definition

> returning empty list. As the result the MR tests using this function
> are effectively skipped (but report success).
> 
> Also fix a typo in the comments.

Was there another typo somewhere? All I see is an *added* typo...

> 
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> ---
>   tests/utils.py | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/utils.py b/tests/utils.py
> index 0ad7110..eee44b4 100644
> --- a/tests/utils.py
> +++ b/tests/utils.py
> @@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
>       :param element: A list of access flags to check
>       :return: True if this list is legal, else False
>       """
> -    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
> -        if e.IBV_ACCESS_LOCAL_WRITE:
> +    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
> +        if not e.IBV_ACCESS_LOCAL_WRITE in element:
>               return False
>       return True
>   
> @@ -69,7 +69,7 @@ def get_access_flags(ctx):
>       added as well.
>       After verifying that the flags selection is legal, it is appended to an
>       array, assuming it wasn't previously appended.
> -    :param ctx: Device Context to check capabilities
> +    :param ctx: Device Coyyntext to check capabilities

I liked the old spelling. "Coyyntext" just doesn't sound as good. :)

>       :param num: Size of initial collection
>       :return: A random legal value for MR flags
>       """
> 

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
@ 2020-11-24 20:26     ` John Hubbard
  0 siblings, 0 replies; 48+ messages in thread
From: John Hubbard @ 2020-11-24 20:26 UTC (permalink / raw)
  To: Jianxin Xiong, linux-rdma, dri-devel
  Cc: Leon Romanovsky, Christian Koenig, Jason Gunthorpe, Doug Ledford,
	Daniel Vetter

Just some silly nits I stumbled across while trying to understand the tests.

On 11/23/20 9:53 AM, Jianxin Xiong wrote:
> The filter defintion is wrong and causes get_access_flags() always

              definition

> returning empty list. As the result the MR tests using this function
> are effectively skipped (but report success).
> 
> Also fix a typo in the comments.

Was there another typo somewhere? All I see is an *added* typo...

> 
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> ---
>   tests/utils.py | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/utils.py b/tests/utils.py
> index 0ad7110..eee44b4 100644
> --- a/tests/utils.py
> +++ b/tests/utils.py
> @@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
>       :param element: A list of access flags to check
>       :return: True if this list is legal, else False
>       """
> -    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
> -        if e.IBV_ACCESS_LOCAL_WRITE:
> +    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
> +        if not e.IBV_ACCESS_LOCAL_WRITE in element:
>               return False
>       return True
>   
> @@ -69,7 +69,7 @@ def get_access_flags(ctx):
>       added as well.
>       After verifying that the flags selection is legal, it is appended to an
>       array, assuming it wasn't previously appended.
> -    :param ctx: Device Context to check capabilities
> +    :param ctx: Device Coyyntext to check capabilities

I liked the old spelling. "Coyyntext" just doesn't sound as good. :)

>       :param num: Size of initial collection
>       :return: A random legal value for MR flags
>       """
> 

thanks,
-- 
John Hubbard
NVIDIA
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
  2020-11-24 20:26     ` John Hubbard
@ 2020-11-24 20:43       ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 20:43 UTC (permalink / raw)
  To: John Hubbard, linux-rdma, dri-devel
  Cc: Doug Ledford, Jason Gunthorpe, Leon Romanovsky, Sumit Semwal,
	Christian Koenig, Vetter, Daniel

> -----Original Message-----
> From: John Hubbard <jhubbard@nvidia.com>
> Sent: Tuesday, November 24, 2020 12:27 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>; linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org
> Cc: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>; Sumit Semwal
> <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
> 
> Just some silly nits I stumbled across while trying to understand the tests.
> 
> On 11/23/20 9:53 AM, Jianxin Xiong wrote:
> > The filter defintion is wrong and causes get_access_flags() always
> 
>               definition

Thanks.

> 
> > returning empty list. As the result the MR tests using this function
> > are effectively skipped (but report success).
> >
> > Also fix a typo in the comments.
> 
> Was there another typo somewhere? All I see is an *added* typo...
> 
> >
> > Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> > ---
> >   tests/utils.py | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/utils.py b/tests/utils.py index 0ad7110..eee44b4
> > 100644
> > --- a/tests/utils.py
> > +++ b/tests/utils.py
> > @@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
> >       :param element: A list of access flags to check
> >       :return: True if this list is legal, else False
> >       """
> > -    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
> > -        if e.IBV_ACCESS_LOCAL_WRITE:
> > +    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
> > +        if not e.IBV_ACCESS_LOCAL_WRITE in element:
> >               return False
> >       return True
> >
> > @@ -69,7 +69,7 @@ def get_access_flags(ctx):
> >       added as well.
> >       After verifying that the flags selection is legal, it is appended to an
> >       array, assuming it wasn't previously appended.
> > -    :param ctx: Device Context to check capabilities
> > +    :param ctx: Device Coyyntext to check capabilities
> 
> I liked the old spelling. "Coyyntext" just doesn't sound as good. :)

Hmm, I don't know what happened 😊 I was seeing the other way around.

> 
> >       :param num: Size of initial collection
> >       :return: A random legal value for MR flags
> >       """
> >
> 
> thanks,
> --
> John Hubbard
> NVIDIA

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
@ 2020-11-24 20:43       ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-24 20:43 UTC (permalink / raw)
  To: John Hubbard, linux-rdma, dri-devel
  Cc: Leon Romanovsky, Christian Koenig, Jason Gunthorpe, Doug Ledford,
	Vetter, Daniel

> -----Original Message-----
> From: John Hubbard <jhubbard@nvidia.com>
> Sent: Tuesday, November 24, 2020 12:27 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>; linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org
> Cc: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>; Sumit Semwal
> <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags()
> 
> Just some silly nits I stumbled across while trying to understand the tests.
> 
> On 11/23/20 9:53 AM, Jianxin Xiong wrote:
> > The filter defintion is wrong and causes get_access_flags() always
> 
>               definition

Thanks.

> 
> > returning empty list. As the result the MR tests using this function
> > are effectively skipped (but report success).
> >
> > Also fix a typo in the comments.
> 
> Was there another typo somewhere? All I see is an *added* typo...
> 
> >
> > Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> > ---
> >   tests/utils.py | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/utils.py b/tests/utils.py index 0ad7110..eee44b4
> > 100644
> > --- a/tests/utils.py
> > +++ b/tests/utils.py
> > @@ -55,8 +55,8 @@ def filter_illegal_access_flags(element):
> >       :param element: A list of access flags to check
> >       :return: True if this list is legal, else False
> >       """
> > -    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE:
> > -        if e.IBV_ACCESS_LOCAL_WRITE:
> > +    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
> > +        if not e.IBV_ACCESS_LOCAL_WRITE in element:
> >               return False
> >       return True
> >
> > @@ -69,7 +69,7 @@ def get_access_flags(ctx):
> >       added as well.
> >       After verifying that the flags selection is legal, it is appended to an
> >       array, assuming it wasn't previously appended.
> > -    :param ctx: Device Context to check capabilities
> > +    :param ctx: Device Coyyntext to check capabilities
> 
> I liked the old spelling. "Coyyntext" just doesn't sound as good. :)

Hmm, I don't know what happened 😊 I was seeing the other way around.

> 
> >       :param num: Size of initial collection
> >       :return: A random legal value for MR flags
> >       """
> >
> 
> thanks,
> --
> John Hubbard
> NVIDIA
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-24 18:45         ` Xiong, Jianxin
@ 2020-11-25 10:50           ` Daniel Vetter
  -1 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2020-11-25 10:50 UTC (permalink / raw)
  To: Xiong, Jianxin
  Cc: Daniel Vetter, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	dri-devel, Doug Ledford, Vetter, Daniel, Christian Koenig

On Tue, Nov 24, 2020 at 06:45:06PM +0000, Xiong, Jianxin wrote:
> > -----Original Message-----
> > From: Daniel Vetter <daniel@ffwll.ch>
> > Sent: Tuesday, November 24, 2020 7:17 AM
> > To: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> > >
> > > > +cdef class DmaBuf:
> > > > +    def __init__(self, size, unit=0):
> > > > +        """
> > > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > > +        DRI device interface (/dev/dri/card*). Usually this
> > > > +requires the
> > 
> > Please use /dev/dri/renderD* instead. That's the interface meant for unpriviledged rendering access. card* is the legacy interface with
> > backwards compat galore, don't use.
> > 
> > Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> > compositors and stuff.
> > 
> > Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)
> 
> /dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
> mode setting commands (including dumb_buf). The original intention here is to
> have something to support the new tests added, not for general compute. 

Not having dumb_buf available is a feature. So even more reasons to use
that.

Also note that amdgpu has killed card* access pretty much, it's for
modesetting only.

> > > > +        effective user id being root or being a member of the 'video' group.
> > > > +        :param size: The size (in number of bytes) of the buffer.
> > > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > > +        :return: The newly created DmaBuf object on success.
> > > > +        """
> > > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > > +
> > > > +        args = bytearray(32)
> > > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > > > + args)
> > 
> > Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> > disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> > stack isn't fully running yet, aka boot splash.
> > 
> > And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.
> 
> One alternative is to use the GEM_CREATE method which can be done via the renderD*
> device, but the command is vendor specific, so the logic is a little bit more complex. 

Yup. I guess the most minimal thing is to have a per-vendor (you can ask
drm for the driver name to match the right one) callback here to allocate
buffers correctly. Might be less churn than trying to pull in vulkan or
something like that.

It's at least what we're doing in igt for testing drm drivers (although
most of the generic igt tests for display, so dumb_buffer fallback is
available).

DRM_IOCTL_VERSION is the thing you'd need here, struct drm_version.name
has the field for figuring out which driver it is.

Also drivers without render node support won't ever be in the same system
as an rdma card and actually useful (because well they're either very old,
or display-only). So not an issue I think.

> > > > +
> > > > +        args = bytearray(12)
> > > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > > +        a, b, self.fd = unpack('=iii', args)
> > > > +
> > > > +        args = bytearray(16)
> > > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > > +        a, b, self.map_offset = unpack('=iiq', args);
> > >
> > > Wow, OK
> > >
> > > Is it worth using ctypes here instead? Can you at least add a comment
> > > before each pack specifying the 'struct XXX' this is following?
> > >
> > > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> > >
> > > Christian, I would be very happy to hear from you that this entire
> > > work is good for AMD as well
> > 
> > I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> > That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> > (which isn't the same as gbm at all). So not so cool.
> 
> Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative? 
> That would be much simpler than going with gbm and less dependency in setting up
> the testing evrionment.

Yeah imo makes sense. It's a bunch more code for you to make it work on
i915 and amd, but it's not terrible. And avoids the dependencies, and also
avoids the abuse of card* and dumb buffers. Plus not really more complex,
you just need a table or something to match from the drm driver name to
the driver-specific buffer create function. Everything else stays the
same.

Also this opens up the door to force-test stuff like p2p in the future,
since at least on i915 you'll be able to ensure that a buffer is in vram
only.

Would be good if we also have a trick for amdgpu to make sure the buffer
stays in vram. I think there's some flags you can pass to the amdgpu
buffer create function. So maybe you want 2 testcases here, one allocates
the buffer in system memory, the other in vram for testing p2p
functionality. That kind of stuff isn't possible with dumb buffers.
-Daniel




> > 
> > The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> > So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> > 
> > Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> > 
> > Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> > stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> > 
> > It's pretty glorious :-/
> > 
> > Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> > 
> > Cheers, Daniel
> > 
> > > Edward should look through this, but I'm glad to see something like
> > > this
> > >
> > > Thanks,
> > > Jason
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-25 10:50           ` Daniel Vetter
  0 siblings, 0 replies; 48+ messages in thread
From: Daniel Vetter @ 2020-11-25 10:50 UTC (permalink / raw)
  To: Xiong, Jianxin
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Jason Gunthorpe,
	Doug Ledford, Vetter, Daniel, Christian Koenig

On Tue, Nov 24, 2020 at 06:45:06PM +0000, Xiong, Jianxin wrote:
> > -----Original Message-----
> > From: Daniel Vetter <daniel@ffwll.ch>
> > Sent: Tuesday, November 24, 2020 7:17 AM
> > To: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> > >
> > > > +cdef class DmaBuf:
> > > > +    def __init__(self, size, unit=0):
> > > > +        """
> > > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > > +        DRI device interface (/dev/dri/card*). Usually this
> > > > +requires the
> > 
> > Please use /dev/dri/renderD* instead. That's the interface meant for unpriviledged rendering access. card* is the legacy interface with
> > backwards compat galore, don't use.
> > 
> > Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> > compositors and stuff.
> > 
> > Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)
> 
> /dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
> mode setting commands (including dumb_buf). The original intention here is to
> have something to support the new tests added, not for general compute. 

Not having dumb_buf available is a feature. So even more reasons to use
that.

Also note that amdgpu has killed card* access pretty much, it's for
modesetting only.

> > > > +        effective user id being root or being a member of the 'video' group.
> > > > +        :param size: The size (in number of bytes) of the buffer.
> > > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > > +        :return: The newly created DmaBuf object on success.
> > > > +        """
> > > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > > +
> > > > +        args = bytearray(32)
> > > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq',
> > > > + args)
> > 
> > Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> > disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> > stack isn't fully running yet, aka boot splash.
> > 
> > And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.
> 
> One alternative is to use the GEM_CREATE method which can be done via the renderD*
> device, but the command is vendor specific, so the logic is a little bit more complex. 

Yup. I guess the most minimal thing is to have a per-vendor (you can ask
drm for the driver name to match the right one) callback here to allocate
buffers correctly. Might be less churn than trying to pull in vulkan or
something like that.

It's at least what we're doing in igt for testing drm drivers (although
most of the generic igt tests for display, so dumb_buffer fallback is
available).

DRM_IOCTL_VERSION is the thing you'd need here, struct drm_version.name
has the field for figuring out which driver it is.

Also drivers without render node support won't ever be in the same system
as an rdma card and actually useful (because well they're either very old,
or display-only). So not an issue I think.

> > > > +
> > > > +        args = bytearray(12)
> > > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > > +        a, b, self.fd = unpack('=iii', args)
> > > > +
> > > > +        args = bytearray(16)
> > > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > > +        a, b, self.map_offset = unpack('=iiq', args);
> > >
> > > Wow, OK
> > >
> > > Is it worth using ctypes here instead? Can you at least add a comment
> > > before each pack specifying the 'struct XXX' this is following?
> > >
> > > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> > >
> > > Christian, I would be very happy to hear from you that this entire
> > > work is good for AMD as well
> > 
> > I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> > That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> > (which isn't the same as gbm at all). So not so cool.
> 
> Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative? 
> That would be much simpler than going with gbm and less dependency in setting up
> the testing evrionment.

Yeah imo makes sense. It's a bunch more code for you to make it work on
i915 and amd, but it's not terrible. And avoids the dependencies, and also
avoids the abuse of card* and dumb buffers. Plus not really more complex,
you just need a table or something to match from the drm driver name to
the driver-specific buffer create function. Everything else stays the
same.

Also this opens up the door to force-test stuff like p2p in the future,
since at least on i915 you'll be able to ensure that a buffer is in vram
only.

Would be good if we also have a trick for amdgpu to make sure the buffer
stays in vram. I think there's some flags you can pass to the amdgpu
buffer create function. So maybe you want 2 testcases here, one allocates
the buffer in system memory, the other in vram for testing p2p
functionality. That kind of stuff isn't possible with dumb buffers.
-Daniel




> > 
> > The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> > So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> > 
> > Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> > 
> > Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> > stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> > 
> > It's pretty glorious :-/
> > 
> > Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> > 
> > Cheers, Daniel
> > 
> > > Edward should look through this, but I'm glad to see something like
> > > this
> > >
> > > Thanks,
> > > Jason
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-25 10:50           ` Daniel Vetter
@ 2020-11-25 12:14             ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-25 12:14 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Xiong, Jianxin, Leon Romanovsky, linux-rdma, dri-devel,
	Doug Ledford, Vetter, Daniel, Christian Koenig

On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:

> Yeah imo makes sense. It's a bunch more code for you to make it work on
> i915 and amd, but it's not terrible. And avoids the dependencies, and also
> avoids the abuse of card* and dumb buffers. Plus not really more complex,
> you just need a table or something to match from the drm driver name to
> the driver-specific buffer create function. Everything else stays the
> same.

If it is going to get more complicated please write it in C then. We
haven't done it yet, but you can link a C function through cython to
the python test script

If you struggle here I can probably work out the build system bits,
but it should not be too terrible

Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-25 12:14             ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-25 12:14 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig, Xiong, Jianxin

On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:

> Yeah imo makes sense. It's a bunch more code for you to make it work on
> i915 and amd, but it's not terrible. And avoids the dependencies, and also
> avoids the abuse of card* and dumb buffers. Plus not really more complex,
> you just need a table or something to match from the drm driver name to
> the driver-specific buffer create function. Everything else stays the
> same.

If it is going to get more complicated please write it in C then. We
haven't done it yet, but you can link a C function through cython to
the python test script

If you struggle here I can probably work out the build system bits,
but it should not be too terrible

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-25 12:14             ` Jason Gunthorpe
@ 2020-11-25 19:27               ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-25 19:27 UTC (permalink / raw)
  To: Jason Gunthorpe, Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:15 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> 
> > Yeah imo makes sense. It's a bunch more code for you to make it work
> > on
> > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > also avoids the abuse of card* and dumb buffers. Plus not really more
> > complex, you just need a table or something to match from the drm
> > driver name to the driver-specific buffer create function. Everything
> > else stays the same.
> 
> If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> python test script
> 
> If you struggle here I can probably work out the build system bits, but it should not be too terrible

Thanks Daniel and Jason. I have started working in this direction. There should be no
technical obstacle here. 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-25 19:27               ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-25 19:27 UTC (permalink / raw)
  To: Jason Gunthorpe, Daniel Vetter
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:15 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> 
> > Yeah imo makes sense. It's a bunch more code for you to make it work
> > on
> > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > also avoids the abuse of card* and dumb buffers. Plus not really more
> > complex, you just need a table or something to match from the drm
> > driver name to the driver-specific buffer create function. Everything
> > else stays the same.
> 
> If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> python test script
> 
> If you struggle here I can probably work out the build system bits, but it should not be too terrible

Thanks Daniel and Jason. I have started working in this direction. There should be no
technical obstacle here. 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-25 19:27               ` Xiong, Jianxin
@ 2020-11-26  0:00                 ` Jason Gunthorpe
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-26  0:00 UTC (permalink / raw)
  To: Xiong, Jianxin
  Cc: Daniel Vetter, Leon Romanovsky, linux-rdma, dri-devel,
	Doug Ledford, Vetter, Daniel, Christian Koenig

On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > From: Jason Gunthorpe <jgg@ziepe.ca>
> > Sent: Wednesday, November 25, 2020 4:15 AM
> > To: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > 
> > > Yeah imo makes sense. It's a bunch more code for you to make it work
> > > on
> > > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > > also avoids the abuse of card* and dumb buffers. Plus not really more
> > > complex, you just need a table or something to match from the drm
> > > driver name to the driver-specific buffer create function. Everything
> > > else stays the same.
> > 
> > If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> > python test script
> > 
> > If you struggle here I can probably work out the build system bits, but it should not be too terrible
> 
> Thanks Daniel and Jason. I have started working in this direction. There should be no
> technical obstacle here. 

Just to be clear I mean write some 'get dma buf fd' function in C, not
the whole test

Jason

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-26  0:00                 ` Jason Gunthorpe
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Gunthorpe @ 2020-11-26  0:00 UTC (permalink / raw)
  To: Xiong, Jianxin
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > From: Jason Gunthorpe <jgg@ziepe.ca>
> > Sent: Wednesday, November 25, 2020 4:15 AM
> > To: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > 
> > > Yeah imo makes sense. It's a bunch more code for you to make it work
> > > on
> > > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > > also avoids the abuse of card* and dumb buffers. Plus not really more
> > > complex, you just need a table or something to match from the drm
> > > driver name to the driver-specific buffer create function. Everything
> > > else stays the same.
> > 
> > If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> > python test script
> > 
> > If you struggle here I can probably work out the build system bits, but it should not be too terrible
> 
> Thanks Daniel and Jason. I have started working in this direction. There should be no
> technical obstacle here. 

Just to be clear I mean write some 'get dma buf fd' function in C, not
the whole test

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
  2020-11-26  0:00                 ` Jason Gunthorpe
@ 2020-11-26  0:43                   ` Xiong, Jianxin
  -1 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-26  0:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Daniel Vetter, Leon Romanovsky, linux-rdma, dri-devel,
	Doug Ledford, Vetter, Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:00 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org;
> Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > Sent: Wednesday, November 25, 2020 4:15 AM
> > > To: Daniel Vetter <daniel@ffwll.ch>
> > > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky
> > > <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>;
> > > Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > > <christian.koenig@amd.com>
> > > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR
> > > support
> > >
> > > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > >
> > > > Yeah imo makes sense. It's a bunch more code for you to make it
> > > > work on
> > > > i915 and amd, but it's not terrible. And avoids the dependencies,
> > > > and also avoids the abuse of card* and dumb buffers. Plus not
> > > > really more complex, you just need a table or something to match
> > > > from the drm driver name to the driver-specific buffer create
> > > > function. Everything else stays the same.
> > >
> > > If it is going to get more complicated please write it in C then. We
> > > haven't done it yet, but you can link a C function through cython to
> > > the python test script
> > >
> > > If you struggle here I can probably work out the build system bits,
> > > but it should not be too terrible
> >
> > Thanks Daniel and Jason. I have started working in this direction.
> > There should be no technical obstacle here.
> 
> Just to be clear I mean write some 'get dma buf fd' function in C, not the whole test
> 
Yes, that's my understanding.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
@ 2020-11-26  0:43                   ` Xiong, Jianxin
  0 siblings, 0 replies; 48+ messages in thread
From: Xiong, Jianxin @ 2020-11-26  0:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, dri-devel, Doug Ledford, Vetter,
	Daniel, Christian Koenig

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:00 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org;
> Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > Sent: Wednesday, November 25, 2020 4:15 AM
> > > To: Daniel Vetter <daniel@ffwll.ch>
> > > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky
> > > <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>;
> > > Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > > <christian.koenig@amd.com>
> > > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR
> > > support
> > >
> > > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > >
> > > > Yeah imo makes sense. It's a bunch more code for you to make it
> > > > work on
> > > > i915 and amd, but it's not terrible. And avoids the dependencies,
> > > > and also avoids the abuse of card* and dumb buffers. Plus not
> > > > really more complex, you just need a table or something to match
> > > > from the drm driver name to the driver-specific buffer create
> > > > function. Everything else stays the same.
> > >
> > > If it is going to get more complicated please write it in C then. We
> > > haven't done it yet, but you can link a C function through cython to
> > > the python test script
> > >
> > > If you struggle here I can probably work out the build system bits,
> > > but it should not be too terrible
> >
> > Thanks Daniel and Jason. I have started working in this direction.
> > There should be no technical obstacle here.
> 
> Just to be clear I mean write some 'get dma buf fd' function in C, not the whole test
> 
Yes, that's my understanding.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2020-11-26  8:24 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-23 17:52 [PATCH rdma-core 0/5] Add user space dma-buf support Jianxin Xiong
2020-11-23 17:52 ` Jianxin Xiong
2020-11-23 17:53 ` [PATCH rdma-core 1/5] verbs: Support dma-buf based memory region Jianxin Xiong
2020-11-23 17:53   ` Jianxin Xiong
2020-11-23 18:01   ` Jason Gunthorpe
2020-11-23 18:01     ` Jason Gunthorpe
2020-11-24 10:31   ` Yishai Hadas
2020-11-24 10:31     ` Yishai Hadas
2020-11-24 18:00     ` Xiong, Jianxin
2020-11-24 18:00       ` Xiong, Jianxin
2020-11-23 17:53 ` [PATCH rdma-core 2/5] mlx5: " Jianxin Xiong
2020-11-23 17:53   ` Jianxin Xiong
2020-11-23 18:01   ` Jason Gunthorpe
2020-11-23 18:01     ` Jason Gunthorpe
2020-11-23 19:40     ` Xiong, Jianxin
2020-11-23 19:40       ` Xiong, Jianxin
2020-11-23 17:53 ` [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support Jianxin Xiong
2020-11-23 17:53   ` Jianxin Xiong
2020-11-23 18:05   ` Jason Gunthorpe
2020-11-23 18:05     ` Jason Gunthorpe
2020-11-23 19:48     ` Xiong, Jianxin
2020-11-23 19:48       ` Xiong, Jianxin
2020-11-24 15:16     ` Daniel Vetter
2020-11-24 15:16       ` Daniel Vetter
2020-11-24 15:36       ` Jason Gunthorpe
2020-11-24 15:36         ` Jason Gunthorpe
2020-11-24 16:21         ` Xiong, Jianxin
2020-11-24 16:21           ` Xiong, Jianxin
2020-11-24 18:45       ` Xiong, Jianxin
2020-11-24 18:45         ` Xiong, Jianxin
2020-11-25 10:50         ` Daniel Vetter
2020-11-25 10:50           ` Daniel Vetter
2020-11-25 12:14           ` Jason Gunthorpe
2020-11-25 12:14             ` Jason Gunthorpe
2020-11-25 19:27             ` Xiong, Jianxin
2020-11-25 19:27               ` Xiong, Jianxin
2020-11-26  0:00               ` Jason Gunthorpe
2020-11-26  0:00                 ` Jason Gunthorpe
2020-11-26  0:43                 ` Xiong, Jianxin
2020-11-26  0:43                   ` Xiong, Jianxin
2020-11-23 17:53 ` [PATCH rdma-core 4/5] tests: Add tests for dma-buf based memory regions Jianxin Xiong
2020-11-23 17:53   ` Jianxin Xiong
2020-11-23 17:53 ` [PATCH rdma-core 5/5] tests: Bug fix for get_access_flags() Jianxin Xiong
2020-11-23 17:53   ` Jianxin Xiong
2020-11-24 20:26   ` John Hubbard
2020-11-24 20:26     ` John Hubbard
2020-11-24 20:43     ` Xiong, Jianxin
2020-11-24 20:43       ` Xiong, Jianxin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.