From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jason Gunthorpe
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
Date: Tue, 18 Apr 2017 13:48:45 -0600
Message-ID: <20170418194845.GA22895@obsidianresearch.com>
References: <1492381396.25766.43.camel@kernel.crashing.org>
 <20170418164557.GA7181@obsidianresearch.com>
 <20170418190138.GH7181@obsidianresearch.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: Logan Gunthorpe
Cc: Jens Axboe, "James E.J. Bottomley", "Martin K. Petersen",
 linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 Benjamin Herrenschmidt, Steve Wise,
 "linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org",
 linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
 Keith Busch, Jerome Glisse, Bjorn Helgaas,
 linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-nvdimm, Max Gurtovoy, linux-scsi, Christoph Hellwig
List-Id: linux-nvdimm@lists.01.org

On Tue, Apr 18, 2017 at 01:35:32PM -0600, Logan Gunthorpe wrote:
> > Ultimately every dma_ops will need special code to support P2P with
> > the special hardware that ops is controlling, so it makes some sense
> > to start by pushing the check down there in the first place. This
> > advice is partially motivated by how dma_map_sg is just a small
> > wrapper around the function pointer call...
>
> Yes, I noticed this problem too and that makes sense. It just means
> every dma_ops will probably need to be modified to either support p2p
> pages or fail on them. Though, the only real difficulty there is that
> it will be a lot of work.

I think this is why progress on this keeps getting stuck - every
solution is a lot of work.
> > Where p2p_same_segment_map_page checks if the two devices are on
> > the 'same switch' and if so returns the address translated to match
> > the bus address programmed into the BAR or fails. We know this case
> > is required to work by the PCI spec, so it makes sense to use it as
> > the first canned helper.
>
> I've also suggested that this check should probably be done (or
> perhaps duplicated) before we even get to the map stage.

Since the mechanics of the check are essentially unique to every
dma_ops I would not hoist it out of the map function without a really
good reason.

> In the case of nvme-fabrics we'd probably want to let the user know
> when they try to configure it or at least fall back to allocating
> regular memory instead.

You could try to do a dummy mapping / create an MR early on to detect
this.

FWIW, I wonder if from an RDMA perspective we have another problem..
Should we allow P2P memory to be used with the local DMA lkey? There
are potential designs around virtualization that would not allow that.
Should we mandate that P2P memory be in its own MR?

Jason