From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1162074AbdAEXXP (ORCPT );
	Thu, 5 Jan 2017 18:23:15 -0500
Received: from quartz.orcorp.ca ([184.70.90.242]:50240 "EHLO quartz.orcorp.ca"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1034556AbdAEXXF (ORCPT );
	Thu, 5 Jan 2017 18:23:05 -0500
Date: Thu, 5 Jan 2017 15:42:15 -0700
From: Jason Gunthorpe
To: Jerome Glisse
Cc: Jerome Glisse, "Deucher, Alexander",
	"'linux-kernel@vger.kernel.org'",
	"'linux-rdma@vger.kernel.org'",
	"'linux-nvdimm@lists.01.org'",
	"'Linux-media@vger.kernel.org'",
	"'dri-devel@lists.freedesktop.org'",
	"'linux-pci@vger.kernel.org'",
	"Kuehling, Felix", "Sagalovitch, Serguei",
	"Blinzer, Paul", "Koenig, Christian",
	"Suthikulpanit, Suravee", "Sander, Ben",
	hch@infradead.org, david1.zhou@amd.com, qiang.yu@amd.com
Subject: Re: Enabling peer to peer device transactions for PCIe devices
Message-ID: <20170105224215.GA3855@obsidianresearch.com>
References: <20170105183927.GA5324@gmail.com>
	<20170105190113.GA12587@obsidianresearch.com>
	<20170105195424.GB2166@redhat.com>
	<20170105200719.GB31047@obsidianresearch.com>
	<20170105201935.GC2166@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170105201935.GC2166@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Broken-Reverse-DNS: no host name found for IP address 10.0.0.156
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 05, 2017 at 03:19:36PM -0500, Jerome Glisse wrote:

> > Always having a VMA changes the discussion - the question is how to
> > create a VMA that represents IO device memory, and how do DMA
> > consumers extract the correct information from that VMA to pass to
> > the kernel DMA API so it can set up peer-to-peer DMA.
>
> Well my point is that it can't be. In the HMM case inside a single
> VMA you [..]

> In the GPUDirect case the idea is that you have a specific device
> vma that you map for peer to peer. [..]

I still don't understand what you are driving at - you've said in
both cases a user VMA exists.

From my perspective in RDMA, all I want is a core kernel flow to
convert a '__user *' into a scatter list of DMA addresses, one that
works no matter what is backing that VMA, be it HMM, a 'hidden' GPU
object, or struct page memory.

A '__user *' pointer is the only way to set up an RDMA MR, and I see
no reason to have another API at this time.

The details of how to translate to a scatter list are an MM subject,
and the MM folks need to work that out.

I just don't care if that routine works at a page level, or a whole
VMA level, or some combination of both, that is up to the MM team to
figure out :)

> a page level. Expectation here is that the GPU userspace exposes a
> special API to allow RDMA to happen directly on GPU objects
> allocated through a GPU-specific API (ie it is not regular memory
> and it is not accessible by the CPU).

So, how do you identify these GPU objects? How do you expect RDMA to
convert them to scatter lists? How will ODP work?

> > We have MMU notifiers to handle this today in RDMA. Async RDMA MR
> > Invalidate like you see in the above out of tree patches is totally
> > crazy and shouldn't be in mainline. Use ODP capable RDMA hardware.
>
> Well there is still a large base of hardware that does not have such
> a feature, and some people would like to be able to keep using it.

Hopefully someone will figure out how to do that without the crazy
async MR invalidation.
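To make the '__user *' to scatter-list conversion asked for above
concrete, here is a rough sketch of what that flow already looks like
for ordinary struct-page-backed memory: pin with GUP, build an
sg_table, then dma_map_sg() it. Everything named demo_* is made up
for illustration, the other helpers are stock kernel calls whose
signatures follow a newer tree than the one this thread was written
against, and this is exactly the path that does not work for HMM or
hidden GPU objects:

/*
 * Illustrative only: demo_map_user_buf() is a made-up name, not an
 * existing kernel or RDMA core helper.  It shows the struct-page path
 * (GUP -> sg_table -> dma_map_sg) using current kernel APIs; the GUP
 * and allocation helpers have different signatures in older trees.
 */
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>

static int demo_map_user_buf(struct device *dev, void __user *uaddr,
			     size_t len, struct sg_table *sgt)
{
	unsigned long start = (unsigned long)uaddr & PAGE_MASK;
	unsigned long offset = (unsigned long)uaddr & ~PAGE_MASK;
	int npages = DIV_ROUND_UP(offset + len, PAGE_SIZE);
	struct page **pages;
	int pinned, ret;

	pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* Only succeeds for struct-page-backed VMAs; what to do when
	 * the VMA is HMM or device memory is the open question here. */
	pinned = get_user_pages_fast(start, npages, FOLL_WRITE, pages);
	if (pinned < 0) {
		ret = pinned;
		goto out_free;
	}
	if (pinned != npages) {
		ret = -EFAULT;
		goto out_put;
	}

	ret = sg_alloc_table_from_pages(sgt, pages, npages, offset, len,
					GFP_KERNEL);
	if (ret)
		goto out_put;

	/* Produce the bus addresses the HCA actually DMAs to/from */
	if (!dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL)) {
		sg_free_table(sgt);
		ret = -EIO;
		goto out_put;
	}

	kvfree(pages);	/* page references now held via the sg_table */
	return 0;

out_put:
	while (pinned > 0)
		put_page(pages[--pinned]);
out_free:
	kvfree(pages);
	return ret;
}

Teardown would be the reverse: dma_unmap_sg(), put_page() on every
page in the table, then sg_free_table(). The question in this thread
is what replaces the get_user_pages_fast() step when the VMA is not
backed by struct pages.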
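On the ODP / MMU-notifier point, the in-kernel alternative to the
async MR invalidate has roughly the shape sketched below: the driver
watches the MR owner's address space and tears down or re-faults its
own DMA mappings when the CPU page tables change. struct demo_mr and
the demo_* functions are invented for illustration, and the
mmu_notifier callback signatures shown follow recent kernels (they
have changed several times since this mail was written):

/*
 * Illustrative only: struct demo_mr and the demo_* functions are
 * invented; mmu_notifier callback signatures follow recent kernels
 * and have changed several times since 2017.
 */
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/mmu_notifier.h>

struct demo_mr {
	struct mmu_notifier mn;
	unsigned long umem_start;
	unsigned long umem_end;
};

static int demo_invalidate_start(struct mmu_notifier *mn,
				 const struct mmu_notifier_range *range)
{
	struct demo_mr *mr = container_of(mn, struct demo_mr, mn);

	if (range->end <= mr->umem_start || range->start >= mr->umem_end)
		return 0;	/* invalidation does not touch this MR */

	/*
	 * Here the driver would quiesce the HCA, unmap the affected
	 * DMA addresses and let ODP re-fault them later - with no
	 * userspace-visible async MR invalidate involved.
	 */
	return 0;
}

static const struct mmu_notifier_ops demo_mn_ops = {
	.invalidate_range_start = demo_invalidate_start,
};

/* Attach the watcher to the MR owner's address space */
static int demo_mr_watch(struct demo_mr *mr)
{
	mr->mn.ops = &demo_mn_ops;
	return mmu_notifier_register(&mr->mn, current->mm);
}

ODP-capable hardware can re-fault the affected pages afterwards; the
'large base' of hardware without that capability is exactly the case
that still needs some other teardown strategy.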
Jason