From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jianxin Xiong <jianxin.xiong@intel.com>
Cc: linux-rdma@vger.kernel.org, Doug Ledford <dledford@redhat.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Leon Romanovsky <leon@kernel.org>
Subject: Re: [RFC PATCH 1/3] RDMA/umem: Support importing dma-buf as user memory region
Date: Thu, 16 Apr 2020 15:01:51 -0300	[thread overview]
Message-ID: <20200416180151.GU5100@ziepe.ca> (raw)
In-Reply-To: <1587056973-101760-2-git-send-email-jianxin.xiong@intel.com>

On Thu, Apr 16, 2020 at 10:09:31AM -0700, Jianxin Xiong wrote:
> Dma-buf, a standard cross-driver buffer sharing mechanism, is chosen to
> be the basis of a non-proprietary approach for supporting RDMA to/from
> buffers allocated from device local memory (e.g. GPU VRAM).
> 
> Dma-buf is supported by mainstream GPU drivers. By using ioctl calls
> via the devices under /dev/dri/, user space applications can allocate
> and export GPU buffers as dma-buf objects with associated file
> descriptors.
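
For readers not familiar with that export path, a minimal userspace sketch
(the dumb-buffer ioctls and the /dev/dri/card0 node are used purely for
illustration here; a real VRAM allocation would normally go through the
driver-specific GEM ioctls instead):

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>    /* kernel uapi; may be <libdrm/drm.h> on some systems */

    /* allocate a dumb buffer and export it as a dma-buf file descriptor */
    static int export_dmabuf_example(void)
    {
            int drm_fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
            struct drm_mode_create_dumb create = {
                    .width = 1024, .height = 1024, .bpp = 32,
            };
            struct drm_prime_handle prime = { .flags = DRM_CLOEXEC };

            if (drm_fd < 0)
                    return -1;
            if (ioctl(drm_fd, DRM_IOCTL_MODE_CREATE_DUMB, &create))
                    return -1;
            prime.handle = create.handle;
            if (ioctl(drm_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &prime))
                    return -1;
            return prime.fd;        /* this fd is what gets registered on the RDMA side */
    }
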
> 
> In order to use the exported GPU buffers for RDMA operations, the RDMA
> driver needs to be able to import dma-buf objects. This happens at the
> time of memory registration. A GPU buffer is registered as a special
> type of user space memory region with the dma-buf file descriptor as
> an extra parameter. The uverbs API needs to be extended to allow the
> extra parameter to be passed from user space to the kernel.
> 
> Implements the common code for pinning and mapping dma-buf pages and
> adds a config option for RDMA driver dma-buf support. The common code
> is utilized by the new uverbs commands introduced by follow-up patches.
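
The core of that common code is the standard dma-buf importer sequence. A
condensed sketch of it (illustrative rather than a copy of the patch; the
patch also selects DMA_FROM_DEVICE vs DMA_BIDIRECTIONAL from the access
flags rather than hard-coding the direction):

    #include <linux/dma-buf.h>
    #include <linux/dma-direction.h>
    #include <linux/err.h>

    /* illustrative only: take a dma-buf fd, attach to it and map it for DMA */
    static struct sg_table *import_dmabuf_sketch(struct device *dma_device,
                                                 int dmabuf_fd,
                                                 struct dma_buf **dmabuf,
                                                 struct dma_buf_attachment **attach)
    {
            struct sg_table *sgt;

            *dmabuf = dma_buf_get(dmabuf_fd);       /* takes a reference on the dma_buf */
            if (IS_ERR(*dmabuf))
                    return ERR_CAST(*dmabuf);

            *attach = dma_buf_attach(*dmabuf, dma_device);  /* register this importer */
            if (IS_ERR(*attach)) {
                    dma_buf_put(*dmabuf);
                    return ERR_CAST(*attach);
            }

            /* map the backing storage into the device's DMA address space */
            sgt = dma_buf_map_attachment(*attach, DMA_BIDIRECTIONAL);
            if (IS_ERR(sgt)) {
                    dma_buf_detach(*dmabuf, *attach);
                    dma_buf_put(*dmabuf);
            }
            return sgt;
    }

Teardown mirrors it: dma_buf_unmap_attachment(), dma_buf_detach() and
dma_buf_put(), as in ib_umem_dmabuf_release() further down.
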
> 
> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
> Reviewed-by: Sean Hefty <sean.hefty@intel.com>
> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
>  drivers/infiniband/Kconfig            |  10 ++++
>  drivers/infiniband/core/Makefile      |   1 +
>  drivers/infiniband/core/umem.c        |   3 +
>  drivers/infiniband/core/umem_dmabuf.c | 100 ++++++++++++++++++++++++++++++++++
>  include/rdma/ib_umem.h                |   2 +
>  include/rdma/ib_umem_dmabuf.h         |  50 +++++++++++++++++
>  6 files changed, 166 insertions(+)
>  create mode 100644 drivers/infiniband/core/umem_dmabuf.c
>  create mode 100644 include/rdma/ib_umem_dmabuf.h
> 
> diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
> index ade8638..1dcfc59 100644
> +++ b/drivers/infiniband/Kconfig
> @@ -63,6 +63,16 @@ config INFINIBAND_ON_DEMAND_PAGING
>  	  memory regions without pinning their pages, fetching the
>  	  pages on demand instead.
>  
> +config INFINIBAND_DMABUF
> +	bool "InfiniBand dma-buf support"
> +	depends on INFINIBAND_USER_MEM

select ..some kind of DMABUF symbol...
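
i.e. something along these lines, assuming DMA_SHARED_BUFFER is the symbol
meant (illustrative only):

    config INFINIBAND_DMABUF
            bool "InfiniBand dma-buf support"
            depends on INFINIBAND_USER_MEM
            select DMA_SHARED_BUFFER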

> +	default n
> +	help
> +	  Support for dma-buf based user memory.
> +	  This allows userspace processes to register memory regions
> +	  backed by device memory exported as dma-buf, and thus
> +	  enables RDMA operations using device memory.

See remarks on the cover letter

> +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> +				   unsigned long addr, size_t size,
> +				   int dmabuf_fd, int access)
> +{
> +	struct ib_umem_dmabuf *umem_dmabuf;
> +	struct sg_table *sgt;
> +	enum dma_data_direction dir;
> +	long ret;
> +
> +	if (((addr + size) < addr) ||
> +	    PAGE_ALIGN(addr + size) < (addr + size))
> +		return ERR_PTR(-EINVAL);

This math validating the user parameters can overflow in various bad
ways
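
A more defensive way to write that validation, sketched with
check_add_overflow() from <linux/overflow.h> (illustrative only; the helper
name here is made up):

    #include <linux/mm.h>           /* PAGE_ALIGN */
    #include <linux/overflow.h>

    /* validate a user-supplied (addr, size) range without wrapping */
    static bool umem_range_valid(unsigned long addr, size_t size)
    {
            unsigned long end;

            return size && !check_add_overflow(addr, size, &end) &&
                   PAGE_ALIGN(end) >= end;
    }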

> +	if (!can_do_mlock())
> +		return ERR_PTR(-EPERM);

Why?

> +	umem_dmabuf->umem.ibdev = device;
> +	umem_dmabuf->umem.length = size;
> +	umem_dmabuf->umem.address = addr;
> +	umem_dmabuf->umem.writable = ib_access_writable(access);
> +	umem_dmabuf->umem.is_dmabuf = 1;
> +	umem_dmabuf->umem.owning_mm = current->mm;

Why does this need to store owning_mm?

> +	umem_dmabuf->fd = dmabuf_fd;

Doesn't need to store fd

> +void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf)
> +{
> +	enum dma_data_direction dir;
> +
> +	dir = umem_dmabuf->umem.writable ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
> +
> +	/*
> +	 * Only use the original sgt returned from dma_buf_map_attachment(),
> +	 * otherwise the scatterlist may be freed twice due to the map caching
> +	 * mechanism.
> +	 */
> +	dma_buf_unmap_attachment(umem_dmabuf->attach, umem_dmabuf->sgt, dir);
> +	dma_buf_detach(umem_dmabuf->dmabuf, umem_dmabuf->attach);
> +	dma_buf_put(umem_dmabuf->dmabuf);
> +	mmdrop(umem_dmabuf->umem.owning_mm);
> +	kfree(umem_dmabuf);
> +}
> +EXPORT_SYMBOL(ib_umem_dmabuf_release);

Why is this an EXPORT_SYMBOL?

> diff --git a/include/rdma/ib_umem_dmabuf.h b/include/rdma/ib_umem_dmabuf.h
> new file mode 100644
> index 0000000..e82b205
> +++ b/include/rdma/ib_umem_dmabuf.h

This should not be a public header

> @@ -0,0 +1,50 @@
> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> +/*
> + * Copyright (c) 2020 Intel Corporation. All rights reserved.
> + */
> +
> +#ifndef IB_UMEM_DMABUF_H
> +#define IB_UMEM_DMABUF_H
> +
> +#include <linux/dma-buf.h>
> +#include <rdma/ib_umem.h>
> +#include <rdma/ib_verbs.h>
> +
> +struct ib_umem_dmabuf {
> +	struct ib_umem	umem;
> +	int		fd;
> +	struct dma_buf	*dmabuf;
> +	struct dma_buf_attachment *attach;
> +	struct sg_table *sgt;

Probably the ib_umem should be changed to hold a struct sg_table. Not
clear to me why dma_buf wants to allocate this as a pointer..

Also this can be in umem_dmabuf.c
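
A rough sketch of that direction, i.e. keeping the struct private to
umem_dmabuf.c and letting the mapped scatterlist live in the ib_umem itself
rather than behind a separate pointer (field names illustrative):

    /* in drivers/infiniband/core/umem_dmabuf.c, not a public header */
    struct ib_umem_dmabuf {
            struct ib_umem umem;            /* umem holds the mapped sg_table */
            struct dma_buf *dmabuf;
            struct dma_buf_attachment *attach;
    };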

> +static inline struct ib_umem_dmabuf *to_ib_umem_dmabuf(struct ib_umem *umem)
> +{
> +	return container_of(umem, struct ib_umem_dmabuf, umem);
> +}

Put in umem_dmabuf.c

> +
> +#ifdef CONFIG_INFINIBAND_DMABUF
> +
> +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> +				   unsigned long addr, size_t size,
> +				   int dmabuf_fd, int access);
> +
> +void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf);
> +
> +#else /* CONFIG_INFINIBAND_DMABUF */
> +
> +static inline struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> +						 unsigned long addr,
> +						 size_t size, int dmabuf_fd,
> +						 int access)
> +{
> +	return ERR_PTR(-EINVAL);
> +}

This should be in the existing ib_umem.h

> +
> +static inline void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf)
> +{
> +}

In uverbs_priv.h

Jason

Thread overview: 21+ messages
2020-04-16 17:09 [RFC PATCH 0/3] RDMA: Add dma-buf support Jianxin Xiong
2020-04-16 17:09 ` [RFC PATCH 1/3] RDMA/umem: Support importing dma-buf as user memory region Jianxin Xiong
2020-04-16 18:01   ` Jason Gunthorpe [this message]
2020-04-16 19:41     ` Xiong, Jianxin
2020-04-17 12:31       ` Jason Gunthorpe
2020-04-17 15:49         ` Xiong, Jianxin
2020-04-16 18:02   ` Leon Romanovsky
2020-04-16 18:04     ` Jason Gunthorpe
2020-04-16 18:30       ` Leon Romanovsky
2020-04-16 21:08     ` Xiong, Jianxin
2020-04-16 17:09 ` [RFC PATCH 2/3] RDMA/uverbs: Add uverbs commands for fd-based MR registration Jianxin Xiong
2020-04-16 17:47   ` Jason Gunthorpe
2020-04-16 18:32     ` Xiong, Jianxin
2020-04-17 12:35       ` Jason Gunthorpe
2020-04-16 17:09 ` [RFC PATCH 3/3] RDMA/mlx5: Support new uverbs commands for registering fd-based MR Jianxin Xiong
2020-04-16 17:54 ` [RFC PATCH 0/3] RDMA: Add dma-buf support Jason Gunthorpe
2020-04-16 19:08   ` Xiong, Jianxin
2020-04-16 19:34     ` Jason Gunthorpe
2020-04-16 21:02       ` Xiong, Jianxin
2020-04-17 12:35         ` Jason Gunthorpe
2020-04-17 16:49           ` Xiong, Jianxin
