From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams <dan.j.williams@intel.com>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
Date: Tue, 18 Apr 2017 16:02:40 -0700
In-Reply-To: <462e318b-bcb8-7031-5b25-2c245086e077@deltatee.com>
References: <1492381396.25766.43.camel@kernel.crashing.org>
 <20170418164557.GA7181@obsidianresearch.com>
 <20170418190138.GH7181@obsidianresearch.com>
 <20170418210339.GA24257@obsidianresearch.com>
 <20170418212258.GA26838@obsidianresearch.com>
 <96198489-1af5-abcf-f23f-9a7e41aa17f7@deltatee.com>
 <5e68102d-e165-6ef3-8678-9bdb4f78382b@deltatee.com>
 <462e318b-bcb8-7031-5b25-2c245086e077@deltatee.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Logan Gunthorpe
Cc: Jens Axboe, Keith Busch, "James E.J. Bottomley", "Martin K. Petersen",
 linux-rdma@vger.kernel.org, Benjamin Herrenschmidt, Steve Wise,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 Jason Gunthorpe, Jerome Glisse, Bjorn Helgaas, linux-pci@vger.kernel.org,
 linux-nvdimm, Max Gurtovoy, linux-scsi, Christoph Hellwig
List-Id: linux-nvdimm@lists.01.org

On Tue, Apr 18, 2017 at 3:56 PM, Logan Gunthorpe wrote:
>
>
> On 18/04/17 04:50 PM, Dan Williams wrote:
>> On Tue, Apr 18, 2017 at 3:48 PM, Logan Gunthorpe wrote:
>>>
>>>
>>> On 18/04/17 04:28 PM, Dan Williams wrote:
>>>> Unlike the pci bus address offset case which I think is fundamental to
>>>> support since shipping archs do this today, I think it is ok to say
>>>> p2p is restricted to a single sgl that gets to talk to host memory or
>>>> a single device. That said, what's wrong with a p2p aware map_sg
>>>> implementation calling up to the host memory map_sg implementation on
>>>> a per sgl basis?
>>>
>>> I think Ben said they need mixed sgls and that is where this gets messy.
>>> I think I'd prefer this too given trying to enforce all sgs in a list to
>>> be one type or another could be quite difficult given the state of the
>>> scatterlist code.
>>>
>>>>> Also, what happens if p2p pages end up getting passed to a device that
>>>>> doesn't have the injected dma_ops?
>>>>
>>>> This goes back to limiting p2p to a single pci host bridge. If the p2p
>>>> capability is coordinated with the bridge rather than between the
>>>> individual devices then we have a central point to catch this case.
>>>
>>> Not really relevant. If these pages get to userspace (as people seem
>>> keen on doing) or a less than careful kernel driver they could easily
>>> get into the dma_map calls of devices that aren't even pci related (via
>>> an O_DIRECT operation on an incorrect file or something). The common
>>> code must reject these and can't rely on an injected dma op.
>>
>> No, we can't do that at get_user_pages() time, it will always need to
>> be up to the device driver to fail dma that it can't perform.
>
> I'm not sure I follow -- are you agreeing with me? The dma_map_* needs
> to fail for any dma it cannot perform. Which means either all dma_ops
> providers need to be p2p aware or this logic has to be in dma_map_*
> itself. My point being: you can't rely on an injected dma_op for some
> devices to handle the fail case globally.

Ah, I see what you're saying now.
Yes, we do need something that guarantees that any dma mapping
implementation which gets a struct page it does not know how to
translate properly fails the request.
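
To make that concrete, a minimal sketch of what such a check could look
like in a dma_map_ops ->map_sg() implementation is below. This is not
existing kernel code: example_map_sg() and the is_p2p_page() predicate
are hypothetical names used only for illustration, assuming we had some
way to recognize p2p struct pages.

	/*
	 * Hypothetical sketch, not an existing API: is_p2p_page() stands in
	 * for whatever test identifies peer-to-peer struct pages.  The point
	 * is that a ->map_sg() implementation that cannot translate such a
	 * page rejects the whole request (returns 0, the dma_map_sg()
	 * failure convention) instead of silently producing a bogus mapping.
	 */
	static int example_map_sg(struct device *dev, struct scatterlist *sgl,
				  int nents, enum dma_data_direction dir,
				  unsigned long attrs)
	{
		struct scatterlist *sg;
		int i;

		for_each_sg(sgl, sg, nents, i) {
			/* Fail pages this implementation cannot translate. */
			if (is_p2p_page(sg_page(sg)))
				return 0;
		}

		/* ...normal host memory mapping path would follow here... */
		return nents;
	}

Whether that check lives in every dma_ops provider or once in the
dma_map_* wrappers themselves is exactly the open question above.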