Linux-RDMA Archive on lore.kernel.org
 help / color / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jianxin Xiong <jianxin.xiong@intel.com>
Cc: linux-rdma@vger.kernel.org, Doug Ledford <dledford@redhat.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Leon Romanovsky <leon@kernel.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [RFC PATCH v2 0/3] RDMA: add dma-buf support
Date: Mon, 29 Jun 2020 15:51:52 -0300
Message-ID: <20200629185152.GD25301@ziepe.ca> (raw)
In-Reply-To: <1593451903-30959-1-git-send-email-jianxin.xiong@intel.com>

On Mon, Jun 29, 2020 at 10:31:40AM -0700, Jianxin Xiong wrote:

> ZONE_DEVICE is a new zone for device memory in the memory management
> subsystem. It allows pages from device memory being described with
> specialized page structures. As the result, calls like get_user_pages()
> can succeed, but what can be done with these page structures may be

get_user_pages() does not succeed with ZONE_DEVICE_PAGEs

> Heterogeneous Memory Management (HMM) utilizes mmu_interval_notifier
> and ZONE_DEVICE to support shared virtual address space and page
> migration between system memory and device memory. HMM doesn't support
> pinning device memory because pages located on device must be able to
> migrate to system memory when accessed by CPU. Peer-to-peer access
> is possible if the peer can handle page fault. For RDMA, that means
> the NIC must support on-demand paging.

peer-peer access is currently not possible with hmm_range_fault().

> This patch series adds dma-buf importer role to the RDMA driver in
> attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
> chosen for a few reasons: first, the API is relatively simple and allows
> a lot of flexibility in implementing the buffer manipulation ops.
> Second, it doesn't require page structure. Third, dma-buf is already
> supported in many GPU drivers. However, we are aware that existing GPU
> drivers don't allow pinning device memory via the dma-buf interface.

So.. this patch doesn't really do anything new? We could just make a
MR against the DMA buf mmap and get to the same place?

> Pinning and mapping a dma-buf would cause the backing storage to migrate
> to system RAM. This is due to the lack of knowledge about whether the
> importer can perform peer-to-peer access and the lack of resource limit
> control measure for GPU. For the first part, the latest dma-buf driver
> has a peer-to-peer flag for the importer, but the flag is currently tied
> to dynamic mapping support, which requires on-demand paging support from
> the NIC to work.

ODP for DMA buf?

> There are a few possible ways to address these issues, such as
> decoupling peer-to-peer flag from dynamic mapping, allowing more
> leeway for individual drivers to make the pinning decision and
> adding GPU resource limit control via cgroup. We would like to get
> comments on this patch series with the assumption that device memory
> pinning via dma-buf is supported by some GPU drivers, and at the
> same time welcome open discussions on how to address the
> aforementioned issues as well as GPU-NIC peer-to-peer access
> solutions in general.

These seem like DMA buf problems, not RDMA problems, why are you
asking these questions with a RDMA patch set? The usual DMA buf people
are not even Cc'd here.

> This is the second version of the patch series. Here are the changes
> from the previous version:
> * Instead of adding new device method for dma-buf specific registration,
> existing method is extended to accept an extra parameter.

I think the comment was the extra parameter should have been a umem or
maybe a new umem_description struct, not blindly adding a fd as a
parameter and a wack of EOPNOTSUPPS

> This series is organized as follows. The first patch adds the common
> code for importing dma-buf from a file descriptor and pinning and
> mapping the dma-buf pages. Patch 2 extends the reg_user_mr() method
> of the ib_device structure to accept dma-buf file descriptor as an extra
> parameter. Vendor drivers are updated with the change. Patch 3 adds a
> new uverbs command for registering dma-buf based memory region.

The ioctl stuff seems OK, but this doesn't seem to bring any new
functionality?

Jason

  parent reply index

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-29 17:31 Jianxin Xiong
2020-06-29 17:31 ` [RFC PATCH v2 1/3] RDMA/umem: Support importing dma-buf as user memory region Jianxin Xiong
2020-06-30 19:04   ` Xiong, Jianxin
2020-06-29 17:31 ` [RFC PATCH v2 2/3] RDMA/core: Expand the driver method 'reg_user_mr' to support dma-buf Jianxin Xiong
2020-06-30 19:04   ` Xiong, Jianxin
2020-06-29 17:31 ` [RFC PATCH v2 3/3] RDMA/uverbs: Add uverbs command for dma-buf based MR registration Jianxin Xiong
2020-06-30 19:05   ` Xiong, Jianxin
2020-06-29 18:51 ` Jason Gunthorpe [this message]
2020-06-30 17:21   ` [RFC PATCH v2 0/3] RDMA: add dma-buf support Xiong, Jianxin
2020-06-30 17:34     ` Jason Gunthorpe
2020-06-30 18:46       ` Xiong, Jianxin
2020-06-30 19:17         ` Jason Gunthorpe
2020-06-30 20:08           ` Xiong, Jianxin
2020-07-02 12:27             ` Jason Gunthorpe
2020-07-01  9:03         ` Christian König
2020-07-01 12:07           ` Daniel Vetter
2020-07-01 12:14             ` Daniel Vetter
2020-07-01 12:39           ` Jason Gunthorpe
2020-07-01 12:55             ` Christian König
2020-07-01 15:42               ` Daniel Vetter
2020-07-01 17:15                 ` Jason Gunthorpe
2020-07-02 13:10                   ` Daniel Vetter
2020-07-02 13:29                     ` Jason Gunthorpe
2020-07-02 14:50                       ` Christian König
2020-07-02 18:15                         ` Daniel Vetter
2020-07-03 12:03                           ` Jason Gunthorpe
2020-07-03 12:52                             ` Daniel Vetter
2020-07-03 13:14                               ` Jason Gunthorpe
2020-07-03 13:21                                 ` Christian König
2020-07-07 21:58                                   ` Xiong, Jianxin
2020-07-08  9:38                                     ` Christian König
2020-07-08  9:49                                       ` Daniel Vetter
2020-07-08 14:20                                         ` Christian König
2020-07-08 14:33                                           ` Alex Deucher
2020-06-30 18:56 ` Xiong, Jianxin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200629185152.GD25301@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=daniel.vetter@intel.com \
    --cc=dledford@redhat.com \
    --cc=jianxin.xiong@intel.com \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=sumit.semwal@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-RDMA Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rdma/0 linux-rdma/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rdma linux-rdma/ https://lore.kernel.org/linux-rdma \
		linux-rdma@vger.kernel.org
	public-inbox-index linux-rdma

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rdma


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git