linux-rdma.vger.kernel.org archive mirror
* [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support
@ 2021-09-08  6:16 Shunsuke Mie
  2021-09-08  6:16 ` [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device Shunsuke Mie
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  6:16 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Shunsuke Mie, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, dhobsong, taki, etom

This patch series adds dma-buf support to the rxe driver.

Dma-buf based memory registration has been introduced to use memory
regions that lack associated page structures (e.g. device memory and CMA
managed memory) [1]. However, to use dma-buf based memory, each RDMA
device driver needs to add its own implementation. The rxe driver does
not support it yet.

[1] https://www.spinics.net/lists/linux-rdma/msg98592.html

To enable the use of such memory with the rxe RDMA device, this patch
series adds the necessary changes and implementation.

This series consists of three patches. The first patch changes the IB core
to support RDMA drivers that have no real DMA device. The second patch
extracts the memory mapping process of rxe into a common function so that
the dma-buf support can use it. The third patch adds the dma-buf support
to the rxe driver.

Related user space RDMA library changes are provided as a separate
patch.

Shunsuke Mie (3):
  RDMA/umem: Change for rdma devices has not dma device
  RDMA/rxe: Extract a mapping process into a function
  RDMA/rxe: Support dma-buf as memory region

 drivers/infiniband/core/umem_dmabuf.c |   2 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |   3 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 186 +++++++++++++++++++++-----
 drivers/infiniband/sw/rxe/rxe_verbs.c |  36 +++++
 4 files changed, 193 insertions(+), 34 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  6:16 [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Shunsuke Mie
@ 2021-09-08  6:16 ` Shunsuke Mie
  2021-09-08  6:26   ` Christoph Hellwig
  2021-09-08  6:16 ` [RFC PATCH 2/3] RDMA/rxe: Extract a mapping process into a function Shunsuke Mie
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  6:16 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Shunsuke Mie, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, dhobsong, taki, etom

To share memory space using dma-buf, the dma-buf API requires a DMA
device, but devices such as rxe do not have one. For those cases, pass
the struct device embedded in struct ib_device instead of the DMA device.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/infiniband/core/umem_dmabuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index c6e875619fac..58d6ac9cb51a 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -146,7 +146,7 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device,
 
 	umem_dmabuf->attach = dma_buf_dynamic_attach(
 					dmabuf,
-					device->dma_device,
+					device->dma_device ? device->dma_device : &device->dev,
 					ops,
 					umem_dmabuf);
 	if (IS_ERR(umem_dmabuf->attach)) {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC PATCH 2/3] RDMA/rxe: Extract a mapping process into a function
  2021-09-08  6:16 [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Shunsuke Mie
  2021-09-08  6:16 ` [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device Shunsuke Mie
@ 2021-09-08  6:16 ` Shunsuke Mie
  2021-09-08  6:16 ` [RFC PATCH 3/3] RDMA/rxe: Support dma-buf as memory region Shunsuke Mie
  2021-09-09  5:45 ` [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Zhu Yanjun
  3 siblings, 0 replies; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  6:16 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Shunsuke Mie, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, dhobsong, taki, etom

Move the process that generates the memory map into a function. The
function generates maps from scatterlists to a list of page-aligned
addresses.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 89 ++++++++++++++++++------------
 1 file changed, 54 insertions(+), 35 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index be4bcb420fab..8b08705ed62a 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -103,15 +103,59 @@ void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr)
 	mr->type = RXE_MR_TYPE_DMA;
 }
 
+/* generate a rxe memory map data structure from ib_umem */
+static int rxe_mr_gen_map(struct rxe_mr *mr, struct ib_umem *umem)
+{
+	int err;
+	int num_buf;
+	struct rxe_map **map;
+	struct rxe_phys_buf *buf = NULL;
+	struct sg_page_iter sg_iter;
+	void *vaddr;
+
+	num_buf = 0;
+	map = mr->map;
+	if (mr->length > 0) {
+		buf = map[0]->buf;
+
+		for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
+			if (num_buf >= RXE_BUF_PER_MAP) {
+				map++;
+				buf = map[0]->buf;
+				num_buf = 0;
+			}
+
+			vaddr = page_address(sg_page_iter_page(&sg_iter));
+			if (!vaddr) {
+				pr_warn("%s: Unable to get virtual address\n", __func__);
+				err = -ENOMEM;
+				goto err1;
+			}
+
+			buf->addr = (uintptr_t)vaddr;
+			buf->size = PAGE_SIZE;
+			num_buf++;
+			buf++;
+		}
+	}
+
+	mr->umem = umem;
+	mr->offset = ib_umem_offset(umem);
+	mr->state = RXE_MR_STATE_VALID;
+	mr->type = RXE_MR_TYPE_MR;
+
+	return 0;
+
+err1:
+	ib_umem_release(umem);
+	return err;
+}
+
 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 		     int access, struct rxe_mr *mr)
 {
-	struct rxe_map		**map;
-	struct rxe_phys_buf	*buf = NULL;
 	struct ib_umem		*umem;
-	struct sg_page_iter	sg_iter;
-	int			num_buf;
-	void			*vaddr;
+	int num_buf;
 	int err;
 	int i;
 
@@ -138,43 +182,18 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	mr->page_shift = PAGE_SHIFT;
 	mr->page_mask = PAGE_SIZE - 1;
 
-	num_buf			= 0;
-	map = mr->map;
-	if (length > 0) {
-		buf = map[0]->buf;
-
-		for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
-			if (num_buf >= RXE_BUF_PER_MAP) {
-				map++;
-				buf = map[0]->buf;
-				num_buf = 0;
-			}
-
-			vaddr = page_address(sg_page_iter_page(&sg_iter));
-			if (!vaddr) {
-				pr_warn("%s: Unable to get virtual address\n",
-						__func__);
-				err = -ENOMEM;
-				goto err_cleanup_map;
-			}
-
-			buf->addr = (uintptr_t)vaddr;
-			buf->size = PAGE_SIZE;
-			num_buf++;
-			buf++;
+	mr->length = length;
 
-		}
+	err = rxe_mr_gen_map(mr, umem);
+	if (err) {
+		pr_warn("%s: Failed to map pages\n", __func__);
+		goto err_cleanup_map;
 	}
 
 	mr->ibmr.pd = &pd->ibpd;
-	mr->umem = umem;
 	mr->access = access;
-	mr->length = length;
 	mr->iova = iova;
 	mr->va = start;
-	mr->offset = ib_umem_offset(umem);
-	mr->state = RXE_MR_STATE_VALID;
-	mr->type = RXE_MR_TYPE_MR;
 
 	return 0;
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread
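
The two-level layout that rxe_mr_gen_map() fills in patch 2/3 can be
modelled in user space. The sketch below is not the driver code: the
model_* names, the deliberately tiny MODEL_BUF_PER_MAP value, and the flat
pages[] array standing in for the scatterlist iterator are all assumptions
made for illustration. It only shows how entries roll over from one chunk
to the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define MODEL_PAGE_SIZE   4096u
#define MODEL_BUF_PER_MAP 4u	/* small on purpose so the rollover is visible */

struct model_phys_buf {
	uintptr_t addr;
	size_t size;
};

struct model_map {
	struct model_phys_buf buf[MODEL_BUF_PER_MAP];
};

/*
 * Fill a pre-allocated two-level table from a flat array of page addresses.
 * The loop has the same shape as the one in the patch: when the current
 * chunk is full, advance to the next chunk and restart the per-chunk count.
 * Returns the number of entries written, or -1 if a page has no address
 * (the model's analogue of the -ENOMEM path).
 */
static int model_gen_map(struct model_map **map, size_t num_map,
			 void *const *pages, size_t num_pages)
{
	struct model_phys_buf *buf;
	size_t num_buf = 0;
	size_t i;

	if (num_pages == 0)
		return 0;

	/* The caller must have allocated enough chunks for every page. */
	assert(num_pages <= num_map * MODEL_BUF_PER_MAP);

	buf = map[0]->buf;
	for (i = 0; i < num_pages; i++) {
		if (num_buf >= MODEL_BUF_PER_MAP) {
			map++;
			buf = map[0]->buf;
			num_buf = 0;
		}

		if (!pages[i])
			return -1;

		buf->addr = (uintptr_t)pages[i];
		buf->size = MODEL_PAGE_SIZE;
		num_buf++;
		buf++;
	}

	return (int)num_pages;
}
```

With MODEL_BUF_PER_MAP set to 4, six pages fill all four entries of the
first chunk and the first two entries of the second.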

* [RFC PATCH 3/3] RDMA/rxe: Support dma-buf as memory region
  2021-09-08  6:16 [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Shunsuke Mie
  2021-09-08  6:16 ` [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device Shunsuke Mie
  2021-09-08  6:16 ` [RFC PATCH 2/3] RDMA/rxe: Extract a mapping process into a function Shunsuke Mie
@ 2021-09-08  6:16 ` Shunsuke Mie
  2021-09-09  5:45 ` [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Zhu Yanjun
  3 siblings, 0 replies; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  6:16 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Shunsuke Mie, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, dhobsong, taki, etom

Implement the ib device operation 'reg_user_mr_dmabuf'. Import the dma-buf
using the IB core API and map the memory area linked to the dma-buf.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/infiniband/sw/rxe/rxe_loc.h   |   3 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 101 ++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_verbs.c |  36 +++++++++
 3 files changed, 140 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 1ddb20855dee..206d9d5f8bbf 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -75,6 +75,9 @@ u8 rxe_get_next_key(u32 last_key);
 void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr);
 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 		     int access, struct rxe_mr *mr);
+int rxe_mr_dmabuf_init_user(struct rxe_pd *pd, int fd, u64 start, u64 length,
+			    u64 iova, int access, struct rxe_mr *mr);
+
 int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr);
 int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 		enum rxe_mr_copy_dir dir, u32 *crcp);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 8b08705ed62a..846f52aad0de 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -4,6 +4,8 @@
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
  */
 
+#include <linux/dma-buf.h>
+
 #include "rxe.h"
 #include "rxe_loc.h"
 
@@ -207,6 +209,105 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	return err;
 }
 
+static int rxe_map_dmabuf_mr(struct rxe_mr *mr)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
+	struct ib_umem *umem = mr->umem;
+	int err;
+
+	err = ib_umem_dmabuf_map_pages(umem_dmabuf);
+	if (err)
+		goto err1;
+
+	err = rxe_mr_gen_map(mr, umem);
+	if (err)
+		goto err2;
+
+	return ib_umem_num_pages(umem);
+
+err2:
+	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+err1:
+	return err;
+}
+
+/* A function called from the dma-buf exporter when the mapped pages
+ * become invalid.
+ */
+static void rxe_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
+{
+	int err;
+	struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
+	struct rxe_mr *mr = umem_dmabuf->private;
+
+	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
+
+	/* all of memory region is immediately mapped again */
+	err = rxe_map_dmabuf_mr(mr);
+	if (err)
+		pr_err("%s: failed to map the dma-buf region\n", __func__);
+}
+
+static struct dma_buf_attach_ops rxe_ib_dmabuf_attach_ops = {
+	.move_notify = rxe_ib_dmabuf_invalidate_cb,
+};
+
+/* initialize a umem and map all the areas of dma-buf. */
+int rxe_mr_dmabuf_init_user(struct rxe_pd *pd, int fd, u64 start, u64 length,
+			    u64 iova, int access, struct rxe_mr *mr)
+{
+	struct ib_umem_dmabuf *umem_dmabuf;
+	int num_buf;
+	int err, i;
+
+	umem_dmabuf = ib_umem_dmabuf_get(pd->ibpd.device, start, length, fd,
+					 access, &rxe_ib_dmabuf_attach_ops);
+	if (IS_ERR(umem_dmabuf)) {
+		err = PTR_ERR(umem_dmabuf);
+		pr_err("%s: failed to get umem_dmabuf (%d)\n", __func__, err);
+		goto err1;
+	}
+
+	umem_dmabuf->private = mr;
+
+	mr->umem = &umem_dmabuf->umem;
+	mr->umem->iova = iova;
+	num_buf = ib_umem_num_pages(mr->umem);
+
+	rxe_mr_init(access, mr);
+
+	err = rxe_mr_alloc(mr, num_buf);
+	if (err)
+		goto err1;
+
+	mr->page_shift = PAGE_SHIFT;
+	mr->page_mask = PAGE_SIZE - 1;
+
+	mr->ibmr.pd = &pd->ibpd;
+	mr->access = access;
+	mr->length = length;
+	mr->iova = iova;
+	mr->va = start;
+	mr->offset = ib_umem_offset(mr->umem);
+	mr->state = RXE_MR_STATE_VALID;
+	mr->type = RXE_MR_TYPE_MR;
+
+	err = rxe_map_dmabuf_mr(mr);
+	if (err) {
+		pr_err("%s: failed to map the dma-buf region\n", __func__);
+		goto err2;
+	}
+
+	return 0;
+
+err2:
+	for (i = 0; i < mr->num_map; i++)
+		kfree(mr->map[i]);
+	kfree(mr->map);
+err1:
+	return err;
+}
+
 int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 {
 	int err;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index c223959ac174..4a38c20730b3 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -959,6 +959,39 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
 	return ERR_PTR(err);
 }
 
+static struct ib_mr *rxe_reg_user_mr_dmabuf(struct ib_pd *ibpd, u64 start,
+					    u64 length, u64 iova, int fd,
+					    int access, struct ib_udata *udata)
+{
+	int err;
+	struct rxe_dev *rxe = to_rdev(ibpd->device);
+	struct rxe_pd *pd = to_rpd(ibpd);
+	struct rxe_mr *mr;
+
+	mr = rxe_alloc(&rxe->mr_pool);
+	if (!mr) {
+		err = -ENOMEM;
+		goto err1;
+	}
+
+	rxe_add_index(mr);
+
+	rxe_add_ref(pd);
+
+	err = rxe_mr_dmabuf_init_user(pd, fd, start, length, iova, access, mr);
+	if (err)
+		goto err2;
+
+	return &mr->ibmr;
+
+err2:
+	rxe_drop_ref(pd);
+	rxe_drop_index(mr);
+	rxe_drop_ref(mr);
+err1:
+	return ERR_PTR(err);
+}
+
 static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 				  u32 max_num_sg)
 {
@@ -1139,6 +1172,7 @@ static const struct ib_device_ops rxe_dev_ops = {
 	.query_qp = rxe_query_qp,
 	.query_srq = rxe_query_srq,
 	.reg_user_mr = rxe_reg_user_mr,
+	.reg_user_mr_dmabuf = rxe_reg_user_mr_dmabuf,
 	.req_notify_cq = rxe_req_notify_cq,
 	.resize_cq = rxe_resize_cq,
 
@@ -1181,6 +1215,8 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	}
 	rxe->tfm = tfm;
 
+	dma_coerce_mask_and_coherent(&dev->dev, DMA_BIT_MASK(64));
+
 	err = ib_register_device(dev, ibdev_name, NULL);
 	if (err)
 		pr_warn("%s failed with error %d\n", __func__, err);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread
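
As a rough illustration of the sizing step in rxe_mr_dmabuf_init_user()
from patch 3/3 (a page count is taken from the umem, then map chunks are
allocated for it), the arithmetic can be sketched in user space. The
helper names and constants here are hypothetical. The page count is
assumed to follow the usual align-down/align-up rule over the iova range,
and the chunk count is assumed to be a ceiling division by the per-chunk
capacity.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values: 4 KiB pages, and an assumed 256 entries per chunk. */
#define MODEL_PAGE_SIZE   4096ull
#define MODEL_BUF_PER_MAP 256ull

/* Number of pages touched by the byte range [iova, iova + length). */
static uint64_t model_num_pages(uint64_t iova, uint64_t length)
{
	/* Round the start down and the end up to a page boundary. */
	uint64_t first = iova & ~(MODEL_PAGE_SIZE - 1);
	uint64_t last = (iova + length + MODEL_PAGE_SIZE - 1) &
			~(MODEL_PAGE_SIZE - 1);

	return (last - first) / MODEL_PAGE_SIZE;
}

/* Number of fixed-size chunks needed to hold num_buf entries. */
static uint64_t model_num_maps(uint64_t num_buf)
{
	return (num_buf + MODEL_BUF_PER_MAP - 1) / MODEL_BUF_PER_MAP;
}
```

Note that a region of exactly one page in length still spans two pages
when its iova is not page aligned, which is why the count is derived from
the rounded start and end rather than from the length alone.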

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  6:16 ` [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device Shunsuke Mie
@ 2021-09-08  6:26   ` Christoph Hellwig
  2021-09-08  7:01     ` Shunsuke Mie
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-09-08  6:26 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Zhu Yanjun, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, dhobsong, taki, etom

On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> To share memory space using dma-buf, a API of the dma-buf requires dma
> device, but devices such as rxe do not have a dma device. For those case,
> change to specify a device of struct ib instead of the dma device.

So if dma-buf doesn't actually need a device to dma map why do we ever
pass the dma_device here?  Something does not add up.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  6:26   ` Christoph Hellwig
@ 2021-09-08  7:01     ` Shunsuke Mie
  2021-09-08  7:19       ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  7:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Zhu Yanjun, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, Damian Hobson-Garcia, Takanari Hayama,
	Tomohito Esaki

Thank you for your comment.
>
> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > To share memory space using dma-buf, a API of the dma-buf requires dma
> > device, but devices such as rxe do not have a dma device. For those case,
> > change to specify a device of struct ib instead of the dma device.
>
> So if dma-buf doesn't actually need a device to dma map why do we ever
> pass the dma_device here?  Something does not add up.
As described in the dma-buf API guide [1], the dma_device is used by the
dma-buf exporter to learn the device buffer constraints of the importer.
[1] https://lwn.net/Articles/489703/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  7:01     ` Shunsuke Mie
@ 2021-09-08  7:19       ` Christoph Hellwig
  2021-09-08  8:41         ` Shunsuke Mie
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-09-08  7:19 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Christoph Hellwig, Zhu Yanjun, Christian König, Alex Deucher,
	Daniel Vetter, Doug Ledford, Jason Gunthorpe, Jianxin Xiong,
	Leon Romanovsky, linux-kernel, linux-rdma, Damian Hobson-Garcia,
	Takanari Hayama, Tomohito Esaki

On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> Thank you for your comment.
> >
> > On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > To share memory space using dma-buf, a API of the dma-buf requires dma
> > > device, but devices such as rxe do not have a dma device. For those case,
> > > change to specify a device of struct ib instead of the dma device.
> >
> > So if dma-buf doesn't actually need a device to dma map why do we ever
> > pass the dma_device here?  Something does not add up.
> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> exporter to know the device buffer constraints of importer.
> [1] https://lwn.net/Articles/489703/

Which means for rxe you'd also have to pass the one for the underlying
net device.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  7:19       ` Christoph Hellwig
@ 2021-09-08  8:41         ` Shunsuke Mie
  2021-09-08 11:18           ` Jason Gunthorpe
  0 siblings, 1 reply; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-08  8:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Zhu Yanjun, Christian König, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, Damian Hobson-Garcia, Takanari Hayama,
	Tomohito Esaki

On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > Thank you for your comment.
> > >
> > > On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > > To share memory space using dma-buf, a API of the dma-buf requires dma
> > > > device, but devices such as rxe do not have a dma device. For those case,
> > > > change to specify a device of struct ib instead of the dma device.
> > >
> > > So if dma-buf doesn't actually need a device to dma map why do we ever
> > > pass the dma_device here?  Something does not add up.
> > As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> > exporter to know the device buffer constraints of importer.
> > [1] https://lwn.net/Articles/489703/
>
> Which means for rxe you'd also have to pass the one for the underlying
> net device.
I considered that approach too. In that case, the memory region is
constrained by the net device, but the rxe driver copies data using the
CPU. To avoid those constraints, I decided to use the ib device.

Thanks,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08  8:41         ` Shunsuke Mie
@ 2021-09-08 11:18           ` Jason Gunthorpe
  2021-09-08 13:33             ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2021-09-08 11:18 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Christoph Hellwig, Zhu Yanjun, Christian König, Alex Deucher,
	Daniel Vetter, Doug Ledford, Jianxin Xiong, Leon Romanovsky,
	linux-kernel, linux-rdma, Damian Hobson-Garcia, Takanari Hayama,
	Tomohito Esaki

On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > > Thank you for your comment.
> > > >
> > > > On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > > > To share memory space using dma-buf, a API of the dma-buf requires dma
> > > > > device, but devices such as rxe do not have a dma device. For those case,
> > > > > change to specify a device of struct ib instead of the dma device.
> > > >
> > > > So if dma-buf doesn't actually need a device to dma map why do we ever
> > > > pass the dma_device here?  Something does not add up.
> > > As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> > > exporter to know the device buffer constraints of importer.
> > > [1] https://lwn.net/Articles/489703/
> >
> > Which means for rxe you'd also have to pass the one for the underlying
> > net device.
> I thought of that way too. In that case, the memory region is constrained by the
> net device, but rxe driver copies data using CPU. To avoid the constraints, I
> decided to use the ib device.

Well, that is the whole problem.

We can't mix the dmabuf stuff people are doing that doesn't fill in
the CPU pages in the SGL with RXE - it is simply impossible as things
currently are for RXE to access this non-struct page memory.

Jason

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08 11:18           ` Jason Gunthorpe
@ 2021-09-08 13:33             ` Christian König
  2021-09-08 19:22               ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-09-08 13:33 UTC (permalink / raw)
  To: Jason Gunthorpe, Shunsuke Mie
  Cc: Christoph Hellwig, Zhu Yanjun, Alex Deucher, Daniel Vetter,
	Doug Ledford, Jianxin Xiong, Leon Romanovsky, linux-kernel,
	linux-rdma, Damian Hobson-Garcia, Takanari Hayama,
	Tomohito Esaki

On 08.09.21 at 13:18, Jason Gunthorpe wrote:
> On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
>> On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@infradead.org> wrote:
>>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
>>>> Thank you for your comment.
>>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
>>>>>> To share memory space using dma-buf, a API of the dma-buf requires dma
>>>>>> device, but devices such as rxe do not have a dma device. For those case,
>>>>>> change to specify a device of struct ib instead of the dma device.
>>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
>>>>> pass the dma_device here?  Something does not add up.
>>>> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
>>>> exporter to know the device buffer constraints of importer.
>>>> [1] https://lwn.net/Articles/489703/
>>> Which means for rxe you'd also have to pass the one for the underlying
>>> net device.
>> I thought of that way too. In that case, the memory region is constrained by the
>> net device, but rxe driver copies data using CPU. To avoid the constraints, I
>> decided to use the ib device.
> Well, that is the whole problem.
>
> We can't mix the dmabuf stuff people are doing that doesn't fill in
> the CPU pages in the SGL with RXE - it is simply impossible as things
> currently are for RXE to acess this non-struct page memory.

Yeah, agree that doesn't make much sense.

When you want to access the data with the CPU then why do you want to 
use DMA-buf in the first place?

Please keep in mind that there is work ongoing to replace the sg table 
with a DMA address array and so make the underlying struct page
inaccessible for importers.

Regards,
Christian.

>
> Jason


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08 13:33             ` Christian König
@ 2021-09-08 19:22               ` Daniel Vetter
  2021-09-08 23:33                 ` Jason Gunthorpe
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-09-08 19:22 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Gunthorpe, Shunsuke Mie, Christoph Hellwig, Zhu Yanjun,
	Alex Deucher, Doug Ledford, Jianxin Xiong, Leon Romanovsky,
	Linux Kernel Mailing List, linux-rdma, Damian Hobson-Garcia,
	Takanari Hayama, Tomohito Esaki

On Wed, Sep 8, 2021 at 3:33 PM Christian König <christian.koenig@amd.com> wrote:
> On 08.09.21 at 13:18, Jason Gunthorpe wrote:
> > On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> >> On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@infradead.org> wrote:
> >>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> >>>> Thank you for your comment.
> >>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> >>>>>> To share memory space using dma-buf, a API of the dma-buf requires dma
> >>>>>> device, but devices such as rxe do not have a dma device. For those case,
> >>>>>> change to specify a device of struct ib instead of the dma device.
> >>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
> >>>>> pass the dma_device here?  Something does not add up.
> >>>> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> >>>> exporter to know the device buffer constraints of importer.
> >>>> [1] https://lwn.net/Articles/489703/
> >>> Which means for rxe you'd also have to pass the one for the underlying
> >>> net device.
> >> I thought of that way too. In that case, the memory region is constrained by the
> >> net device, but rxe driver copies data using CPU. To avoid the constraints, I
> >> decided to use the ib device.
> > Well, that is the whole problem.
> >
> > We can't mix the dmabuf stuff people are doing that doesn't fill in
> > the CPU pages in the SGL with RXE - it is simply impossible as things
> > currently are for RXE to acess this non-struct page memory.
>
> Yeah, agree that doesn't make much sense.
>
> When you want to access the data with the CPU then why do you want to
> use DMA-buf in the first place?
>
> Please keep in mind that there is work ongoing to replace the sg table
> with an DMA address array and so make the underlying struct page
> inaccessible for importers.

Also if you do have a dma-buf, you can just dma_buf_vmap() the buffer
for cpu access. Which intentionally does not require any device. No
idea why there's a dma_buf_attach involved. Now not all exporters
support this, but that's fixable, and you must call
dma_buf_begin/end_cpu_access for cache management if the allocation
isn't cpu coherent. But it's all there, no need to apply hacks of
allowing a wrong device or other fun things.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08 19:22               ` Daniel Vetter
@ 2021-09-08 23:33                 ` Jason Gunthorpe
  2021-09-09  9:26                   ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2021-09-08 23:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Shunsuke Mie, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Wed, Sep 08, 2021 at 09:22:37PM +0200, Daniel Vetter wrote:
> On Wed, Sep 8, 2021 at 3:33 PM Christian König <christian.koenig@amd.com> wrote:
> > On 08.09.21 at 13:18, Jason Gunthorpe wrote:
> > > On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> > >> On Wed, Sep 8, 2021 at 16:20 Christoph Hellwig <hch@infradead.org> wrote:
> > >>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > >>>> Thank you for your comment.
> > >>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > >>>>>> To share memory space using dma-buf, a API of the dma-buf requires dma
> > >>>>>> device, but devices such as rxe do not have a dma device. For those case,
> > >>>>>> change to specify a device of struct ib instead of the dma device.
> > >>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
> > >>>>> pass the dma_device here?  Something does not add up.
> > >>>> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> > >>>> exporter to know the device buffer constraints of importer.
> > >>>> [1] https://lwn.net/Articles/489703/
> > >>> Which means for rxe you'd also have to pass the one for the underlying
> > >>> net device.
> > >> I thought of that way too. In that case, the memory region is constrained by the
> > >> net device, but rxe driver copies data using CPU. To avoid the constraints, I
> > >> decided to use the ib device.
> > > Well, that is the whole problem.
> > >
> > > We can't mix the dmabuf stuff people are doing that doesn't fill in
> > > the CPU pages in the SGL with RXE - it is simply impossible as things
> > > currently are for RXE to acess this non-struct page memory.
> >
> > Yeah, agree that doesn't make much sense.
> >
> > When you want to access the data with the CPU then why do you want to
> > use DMA-buf in the first place?
> >
> > Please keep in mind that there is work ongoing to replace the sg table
> > with an DMA address array and so make the underlying struct page
> > inaccessible for importers.
> 
> Also if you do have a dma-buf, you can just dma_buf_vmap() the buffer
> for cpu access. Which intentionally does not require any device. No
> idea why there's a dma_buf_attach involved. Now not all exporters
> support this, but that's fixable, and you must call
> dma_buf_begin/end_cpu_access for cache management if the allocation
> isn't cpu coherent. But it's all there, no need to apply hacks of
> allowing a wrong device or other fun things.

Can rxe leave the vmap in place potentially forever?

Jason

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support
  2021-09-08  6:16 [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Shunsuke Mie
                   ` (2 preceding siblings ...)
  2021-09-08  6:16 ` [RFC PATCH 3/3] RDMA/rxe: Support dma-buf as memory region Shunsuke Mie
@ 2021-09-09  5:45 ` Zhu Yanjun
  2021-09-10  2:00   ` Shunsuke Mie
  3 siblings, 1 reply; 20+ messages in thread
From: Zhu Yanjun @ 2021-09-09  5:45 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Christian König, Alex Deucher, Daniel Vetter, Doug Ledford,
	Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky, LKML,
	RDMA mailing list, dhobsong, taki, etom

On Wed, Sep 8, 2021 at 2:16 PM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> This patch series add a support for rxe driver.

After applying the patches, please run rdma-core tests with the patched kernel.
Then fix all the problems in rdma-core.

Thanks
Zhu Yanjun


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-08 23:33                 ` Jason Gunthorpe
@ 2021-09-09  9:26                   ` Daniel Vetter
  2021-09-10  1:46                     ` Shunsuke Mie
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-09-09  9:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christian König, Shunsuke Mie, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Thu, Sep 9, 2021 at 1:33 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Wed, Sep 08, 2021 at 09:22:37PM +0200, Daniel Vetter wrote:
> > On Wed, Sep 8, 2021 at 3:33 PM Christian König <christian.koenig@amd.com> wrote:
> > > On 08.09.21 at 13:18, Jason Gunthorpe wrote:
> > > > On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> > > >> On Wed, Sep 8, 2021 at 16:20, Christoph Hellwig <hch@infradead.org> wrote:
> > > >>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > > >>>> Thank you for your comment.
> > > >>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > >>>>>> To share memory using dma-buf, the dma-buf API requires a dma device,
> > > >>>>>> but devices such as rxe do not have a dma device. For those cases,
> > > >>>>>> change to specify a device of struct ib instead of the dma device.
> > > >>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
> > > >>>>> pass the dma_device here?  Something does not add up.
> > > >>>> As described in the dma-buf API guide [1], the dma-buf exporter uses the
> > > >>>> dma_device to learn the importer's device buffer constraints.
> > > >>>> [1] https://lwn.net/Articles/489703/
> > > >>> Which means for rxe you'd also have to pass the one for the underlying
> > > >>> net device.
> > > >> I considered that approach too. In that case, the memory region is constrained
> > > >> by the net device, but the rxe driver copies data using the CPU. To avoid those
> > > >> constraints, I decided to use the ib device.
> > > > Well, that is the whole problem.
> > > >
> > > > We can't mix the dmabuf stuff people are doing that doesn't fill in
> > > > the CPU pages in the SGL with RXE - it is simply impossible as things
> > > > currently are for RXE to access this non-struct page memory.
> > >
> > > Yeah, agree that doesn't make much sense.
> > >
> > > When you want to access the data with the CPU then why do you want to
> > > use DMA-buf in the first place?
> > >
> > > Please keep in mind that there is work ongoing to replace the sg table
> > > with a DMA address array and so make the underlying struct page
> > > inaccessible for importers.
> >
> > Also if you do have a dma-buf, you can just dma_buf_vmap() the buffer
> > for cpu access. Which intentionally does not require any device. No
> > idea why there's a dma_buf_attach involved. Now not all exporters
> > support this, but that's fixable, and you must call
> > dma_buf_begin/end_cpu_access for cache management if the allocation
> > isn't cpu coherent. But it's all there, no need to apply hacks of
> > allowing a wrong device or other fun things.
>
> Can rxe leave the vmap in place potentially forever?

Yeah, it's like perma-pinning the buffer into system memory for
non-p2p dma-buf sharing. We just squint and pretend that can't be
abused too badly :-) On 32bit you'll run out of vmap space rather
quickly, but that's not something anyone cares about here either. We
have a bunch more sw modesetting drivers in drm which use
dma_buf_vmap() like this, so it's all fine.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-09  9:26                   ` Daniel Vetter
@ 2021-09-10  1:46                     ` Shunsuke Mie
  2021-09-13 19:22                       ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-10  1:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jason Gunthorpe, Christian König, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Thu, Sep 9, 2021 at 18:26, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Thu, Sep 9, 2021 at 1:33 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > Can rxe leave the vmap in place potentially forever?
>
> Yeah, it's like perma-pinning the buffer into system memory for
> non-p2p dma-buf sharing. We just squint and pretend that can't be
> abused too badly :-) On 32bit you'll run out of vmap space rather
> quickly, but that's not something anyone cares about here either. We
> have a bunch of more sw modesetting drivers in drm which use
> dma_buf_vmap() like this, so it's all fine.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Thanks for your comments.

To begin with, a CMA region cannot be used for RDMA because the region
has no struct page. In addition, some GPU drivers use CMA and share the
region as a dma-buf. As a result, RDMA cannot transfer data to or from
such a region. I thought that adding dma-buf support to rxe would be the
better way to solve this problem.

I'll reconsider the design and rework the rxe dma-buf support to use
dma_buf_vmap() instead of dma_buf_dynamic_attach().

Regards,
Shunsuke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support
  2021-09-09  5:45 ` [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Zhu Yanjun
@ 2021-09-10  2:00   ` Shunsuke Mie
  0 siblings, 0 replies; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-10  2:00 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Christian König, Alex Deucher, Daniel Vetter, Doug Ledford,
	Jason Gunthorpe, Jianxin Xiong, Leon Romanovsky, LKML,
	RDMA mailing list, Damian Hobson-Garcia, taki, etom

On Thu, Sep 9, 2021 at 14:45, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> After applying the patches, please run rdma-core tests with the patched kernel.
> Then fix all the problems in rdma-core.

Understood. I'll run the tests and fix any problems before posting the
next version of the patches.

Regards,
Shunsuke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-10  1:46                     ` Shunsuke Mie
@ 2021-09-13 19:22                       ` Daniel Vetter
  2021-09-14  7:11                         ` Shunsuke Mie
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-09-13 19:22 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Jason Gunthorpe, Christian König, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Fri, Sep 10, 2021 at 3:46 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> Thanks for your comments.
>
> In the first place, the CMA region cannot be used for RDMA because the
> region has no struct page. In addition, some GPU drivers use CMA and share
> the region as dma-buf. As a result, RDMA cannot transfer for the region. To
> solve this problem, rxe dma-buf support is better I thought.
>
> I'll consider and redesign the rxe dma-buf support using the dma_buf_vmap()
> instead of the dma_buf_dynamic_attach().

btw for the next version please cc dri-devel. get_maintainer.pl should
pick it up for these patches.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-13 19:22                       ` Daniel Vetter
@ 2021-09-14  7:11                         ` Shunsuke Mie
  2021-09-14  9:38                           ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-14  7:11 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jason Gunthorpe, Christian König, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Tue, Sep 14, 2021 at 4:23, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> btw for next version please cc dri-devel. get_maintainer.pl should
> pick it up for these patches.

The CC list for these patches was generated by get_maintainer.pl, but it
didn't pick up dri-devel. Should I add dri-devel to the CC manually?

Regards,
Shunsuke

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-14  7:11                         ` Shunsuke Mie
@ 2021-09-14  9:38                           ` Daniel Vetter
  2021-09-14 10:13                             ` Shunsuke Mie
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-09-14  9:38 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Jason Gunthorpe, Christian König, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Tue, Sep 14, 2021 at 9:11 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> On Tue, Sep 14, 2021 at 4:23, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > btw for next version please cc dri-devel. get_maintainer.pl should
> > pick it up for these patches.
>
> The CC list for these patches was generated by get_maintainer.pl, but it
> didn't pick up dri-devel. Should I add dri-devel to the CC manually?

Hm yes, on rechecking, the regex doesn't match since you're not
touching any dma-buf code directly. Or at least not directly enough for
get_maintainer.pl to pick it up.

DMA BUFFER SHARING FRAMEWORK
M:    Sumit Semwal <sumit.semwal@linaro.org>
M:    Christian König <christian.koenig@amd.com>
L:    linux-media@vger.kernel.org
L:    dri-devel@lists.freedesktop.org
L:    linaro-mm-sig@lists.linaro.org (moderated for non-subscribers)
S:    Maintained
T:    git git://anongit.freedesktop.org/drm/drm-misc
F:    Documentation/driver-api/dma-buf.rst
F:    drivers/dma-buf/
F:    include/linux/*fence.h
F:    include/linux/dma-buf*
F:    include/linux/dma-resv.h
K:    \bdma_(?:buf|fence|resv)\b

Above is the MAINTAINERS entry; it's always good to cc these people and
lists for anything related to dma_buf/fence/resv and similar things.
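
A rough sketch of why the K: line above can miss a patch like this one.
The pattern ends in \b, and '_' counts as a word character, so it only
matches the bare tokens dma_buf, dma_fence and dma_resv, not longer
identifiers such as dma_buf_dynamic_attach. The snippet applies the same
pattern with Python's re module as a stand-in for the Perl matching done
by get_maintainer.pl. The helper name keyword_hits and the sample patch
lines are assumptions made up for illustration; they are not taken from
the series.

```python
import re

# K: pattern copied verbatim from the MAINTAINERS entry quoted above.
K_PATTERN = re.compile(r"\bdma_(?:buf|fence|resv)\b")


def keyword_hits(patch_text):
    """Return the lines of patch_text on which the K: pattern matches."""
    return [line for line in patch_text.splitlines() if K_PATTERN.search(line)]


# Made-up patch lines for illustration; they are not taken from the series.
sample = "\n".join([
    "+\tumem_dmabuf = ib_umem_dmabuf_get(ibdev, start, length, fd, access, &ops);",
    "+\tattach = dma_buf_dynamic_attach(dmabuf, dev, ops, umem_dmabuf);",
    "+\tstruct dma_buf *dmabuf;",
])

for line in keyword_hits(sample):
    print(line)
```

Only the "struct dma_buf *dmabuf;" line is reported. A patch that merely
calls dma_buf_*() helpers or uses the ib_umem_dmabuf wrappers would not
trigger the keyword match, which fits the observation that dri-devel was
not picked up automatically.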
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device
  2021-09-14  9:38                           ` Daniel Vetter
@ 2021-09-14 10:13                             ` Shunsuke Mie
  0 siblings, 0 replies; 20+ messages in thread
From: Shunsuke Mie @ 2021-09-14 10:13 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jason Gunthorpe, Christian König, Christoph Hellwig,
	Zhu Yanjun, Alex Deucher, Doug Ledford, Jianxin Xiong,
	Leon Romanovsky, Linux Kernel Mailing List, linux-rdma,
	Damian Hobson-Garcia, Takanari Hayama, Tomohito Esaki

On Tue, Sep 14, 2021 at 18:38, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Tue, Sep 14, 2021 at 9:11 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> >
> > On Tue, Sep 14, 2021 at 4:23, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > On Fri, Sep 10, 2021 at 3:46 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > > >
> > > > 2021年9月9日(木) 18:26 Daniel Vetter <daniel.vetter@ffwll.ch>:
> > > > >
> > > > > On Thu, Sep 9, 2021 at 1:33 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > > > On Wed, Sep 08, 2021 at 09:22:37PM +0200, Daniel Vetter wrote:
> > > > > > > On Wed, Sep 8, 2021 at 3:33 PM Christian König <christian.koenig@amd.com> wrote:
> > > > > > > > Am 08.09.21 um 13:18 schrieb Jason Gunthorpe:
> > > > > > > > > On Wed, Sep 08, 2021 at 05:41:39PM +0900, Shunsuke Mie wrote:
> > > > > > > > >> 2021年9月8日(水) 16:20 Christoph Hellwig <hch@infradead.org>:
> > > > > > > > >>> On Wed, Sep 08, 2021 at 04:01:14PM +0900, Shunsuke Mie wrote:
> > > > > > > > >>>> Thank you for your comment.
> > > > > > > > >>>>> On Wed, Sep 08, 2021 at 03:16:09PM +0900, Shunsuke Mie wrote:
> > > > > > > > >>>>>> To share memory space using dma-buf, a API of the dma-buf requires dma
> > > > > > > > >>>>>> device, but devices such as rxe do not have a dma device. For those case,
> > > > > > > > >>>>>> change to specify a device of struct ib instead of the dma device.
> > > > > > > > >>>>> So if dma-buf doesn't actually need a device to dma map why do we ever
> > > > > > > > >>>>> pass the dma_device here?  Something does not add up.
> > > > > > > > >>>> As described in the dma-buf api guide [1], the dma_device is used by dma-buf
> > > > > > > > >>>> exporter to know the device buffer constraints of importer.
> > > > > > > > >>>> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flwn.net%2FArticles%2F489703%2F&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C4d18470a94df4ed24c8108d972ba5591%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637666967356417448%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&amp;sdata=ARwQyo%2BCjMohaNbyREofToHIj2bndL5L0HaU9cOrYq4%3D&amp;reserved=0
> > > > > > > > >>> Which means for rxe you'd also have to pass the one for the underlying
> > > > > > > > >>> net device.
> > > > > > > > >> I thought of that way too. In that case, the memory region is constrained by the
> > > > > > > > >> net device, but rxe driver copies data using CPU. To avoid the constraints, I
> > > > > > > > >> decided to use the ib device.
> > > > > > > > > Well, that is the whole problem.
> > > > > > > > >
> > > > > > > > > We can't mix the dmabuf stuff people are doing that doesn't fill in
> > > > > > > > > the CPU pages in the SGL with RXE - it is simply impossible as things
> > > > > > > > > currently are for RXE to acess this non-struct page memory.
> > > > > > > >
> > > > > > > > Yeah, agree that doesn't make much sense.
> > > > > > > >
> > > > > > > > When you want to access the data with the CPU then why do you want to
> > > > > > > > use DMA-buf in the first place?
> > > > > > > >
> > > > > > > > Please keep in mind that there is work ongoing to replace the sg table
> > > > > > > > with an DMA address array and so make the underlying struct page
> > > > > > > > inaccessible for importers.
> > > > > > >
> > > > > > > Also if you do have a dma-buf, you can just dma_buf_vmap() the buffer
> > > > > > > for cpu access. Which intentionally does not require any device. No
> > > > > > > idea why there's a dma_buf_attach involved. Now not all exporters
> > > > > > > support this, but that's fixable, and you must call
> > > > > > > dma_buf_begin/end_cpu_access for cache management if the allocation
> > > > > > > isn't cpu coherent. But it's all there, no need to apply hacks of
> > > > > > > allowing a wrong device or other fun things.
> > > > > >
> > > > > > Can rxe leave the vmap in place potentially forever?
> > > > >
> > > > > Yeah, it's like perma-pinning the buffer into system memory for
> > > > > non-p2p dma-buf sharing. We just squint and pretend that can't be
> > > > > abused too badly :-) On 32bit you'll run out of vmap space rather
> > > > > quickly, but that's not something anyone cares about here either. We
> > > > > have a bunch of more sw modesetting drivers in drm which use
> > > > > dma_buf_vmap() like this, so it's all fine.
> > > > > -Daniel
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > > Thanks for your comments.
> > > >
> > > > In the first place, the CMA region cannot be used for RDMA because the
> > > > region has no struct page. In addition, some GPU drivers use CMA and share
> > > > the region as dma-buf. As a result, RDMA cannot transfer data to or from
> > > > that region. I thought adding dma-buf support to rxe was the better way
> > > > to solve this problem.
> > > >
> > > > I'll reconsider and redesign the rxe dma-buf support to use dma_buf_vmap()
> > > > instead of dma_buf_dynamic_attach().
> > >
> > > btw for next version please cc dri-devel. get_maintainer.pl should
> > > pick it up for these patches.
> > The CC list for these patches was generated by get_maintainer.pl, but it
> > didn't pick up dri-devel. Should I add dri-devel to the cc
> > manually?
>
> Hm yes, on rechecking the regex doesn't match since you're not
> touching any dma-buf code directly. Or not directly enough for
> get_maintainer.pl to pick it up.
>
> DMA BUFFER SHARING FRAMEWORK
> M:    Sumit Semwal <sumit.semwal@linaro.org>
> M:    Christian König <christian.koenig@amd.com>
> L:    linux-media@vger.kernel.org
> L:    dri-devel@lists.freedesktop.org
> L:    linaro-mm-sig@lists.linaro.org (moderated for non-subscribers)
> S:    Maintained
> T:    git git://anongit.freedesktop.org/drm/drm-misc
> F:    Documentation/driver-api/dma-buf.rst
> F:    drivers/dma-buf/
> F:    include/linux/*fence.h
> F:    include/linux/dma-buf*
> F:    include/linux/dma-resv.h
> K:    \bdma_(?:buf|fence|resv)\b
>
> Above is the MAINTAINERS entry that's always good to cc for anything
> related to dma_buf/fence/resv and any of these related things.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Yes, my changes did not touch the dma-buf code directly, but they are
related to dma-buf. So I'll add the dma-buf related mailing lists and
maintainers to cc using
`./scripts/get_maintainer.pl -f drivers/infiniband/core/umem_dmabuf.c`.
I think it is enough to list the email addresses.
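
For reference, here is a minimal sketch of how the K: pattern from the
MAINTAINERS entry quoted above behaves on a few strings. It uses Python's
`re` module as an approximation of the Perl regex; the sample strings and
the helper name `keyword_hits` are only illustrative assumptions, not
output from get_maintainer.pl itself.

```python
import re

# K: pattern copied from the MAINTAINERS entry quoted above.
DMA_BUF_KEYWORD = re.compile(r"\bdma_(?:buf|fence|resv)\b")

# Hypothetical sample strings, chosen only to show where the word
# boundaries (\b) allow or prevent a match.
samples = [
    "drivers/infiniband/core/umem_dmabuf.c",
    "RDMA/rxe: Support dma-buf as memory region",
    "dma_buf_dynamic_attach(dmabuf, dev, ops, umem_dmabuf);",
    "struct dma_buf *dmabuf;",
]


def keyword_hits(lines):
    """Return the lines in which the K: pattern finds a match."""
    return [line for line in lines if DMA_BUF_KEYWORD.search(line)]


hits = keyword_hits(samples)
for line in samples:
    print("match   " if line in hits else "no match", "|", line)
```

Only a standalone `dma_buf`, `dma_fence` or `dma_resv` word satisfies the
pattern. Names such as `umem_dmabuf` or `dma_buf_dynamic_attach` do not,
because the character following `dma_buf` is still a word character.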

Thank you for letting me know that.

Regards,
Shunsuke


end of thread, other threads:[~2021-09-14 10:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-08  6:16 [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Shunsuke Mie
2021-09-08  6:16 ` [RFC PATCH 1/3] RDMA/umem: Change for rdma devices has not dma device Shunsuke Mie
2021-09-08  6:26   ` Christoph Hellwig
2021-09-08  7:01     ` Shunsuke Mie
2021-09-08  7:19       ` Christoph Hellwig
2021-09-08  8:41         ` Shunsuke Mie
2021-09-08 11:18           ` Jason Gunthorpe
2021-09-08 13:33             ` Christian König
2021-09-08 19:22               ` Daniel Vetter
2021-09-08 23:33                 ` Jason Gunthorpe
2021-09-09  9:26                   ` Daniel Vetter
2021-09-10  1:46                     ` Shunsuke Mie
2021-09-13 19:22                       ` Daniel Vetter
2021-09-14  7:11                         ` Shunsuke Mie
2021-09-14  9:38                           ` Daniel Vetter
2021-09-14 10:13                             ` Shunsuke Mie
2021-09-08  6:16 ` [RFC PATCH 2/3] RDMA/rxe: Extract a mapping process into a function Shunsuke Mie
2021-09-08  6:16 ` [RFC PATCH 3/3] RDMA/rxe: Support dma-buf as memory region Shunsuke Mie
2021-09-09  5:45 ` [RFC PATCH 0/3] RDMA/rxe: Add dma-buf support Zhu Yanjun
2021-09-10  2:00   ` Shunsuke Mie
