From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751858AbeCXDtw (ORCPT ); Fri, 23 Mar 2018 23:49:52 -0400
Received: from mail.kernel.org ([198.145.29.99]:48012 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751409AbeCXDtu (ORCPT ); Fri, 23 Mar 2018 23:49:50 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7C8532172B
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=helgaas@kernel.org
Date: Fri, 23 Mar 2018 22:49:47 -0500
From: Bjorn Helgaas
To: Logan Gunthorpe
Cc: Stephen Bates, Sinan Kaya,
        "linux-kernel@vger.kernel.org",
        "linux-pci@vger.kernel.org",
        "linux-nvme@lists.infradead.org",
        "linux-rdma@vger.kernel.org",
        "linux-nvdimm@lists.01.org",
        "linux-block@vger.kernel.org",
        Christoph Hellwig, Jens Axboe, Keith Busch, Sagi Grimberg,
        Bjorn Helgaas, Jason Gunthorpe, Max Gurtovoy, Dan Williams,
        Jérôme Glisse, Benjamin Herrenschmidt, Alex Williamson
Subject: Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory
Message-ID: <20180324034947.GE210003@bhelgaas-glaptop.roam.corp.google.com>
References: <3ea80992-a0fc-08f2-d93d-ae0ec4e3f4ce@codeaurora.org>
        <4eb6850c-df1b-fd44-3ee0-d43a50270b53@deltatee.com>
        <757fca36-dee4-e070-669e-f2788bd78e41@codeaurora.org>
        <4f761f55-4e9a-dccb-d12f-c59d2cd689db@deltatee.com>
        <20180313230850.GA45763@bhelgaas-glaptop.roam.corp.google.com>
        <20180323215046.GC210003@bhelgaas-glaptop.roam.corp.google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 23, 2018 at 03:59:14PM -0600, Logan Gunthorpe wrote:
> On 23/03/18 03:50 PM, Bjorn Helgaas wrote:
> > Popping way up the stack, my original point was that I'm trying to
> > remove restrictions on what devices can participate in
> > peer-to-peer DMA.  I think it's fairly clear that in conventional
> > PCI, any devices in the same PCI hierarchy, i.e., below the same
> > host-to-PCI bridge, should be able to DMA to each other.
>
> Yup, we are working on this.
>
> > The routing behavior of PCIe is supposed to be compatible with
> > conventional PCI, and I would argue that this effectively requires
> > multi-function PCIe devices to have the internal routing required
> > to avoid the route-to-self issue.
>
> That would be very nice but many devices do not support the internal
> route. We've had to work around this in the past and as I mentioned
> earlier that NVMe devices have a flag indicating support. However,
> if a device wants to be involved in P2P it must support it and we
> can exclude devices that don't support it by simply not enabling
> their drivers.

Do you think these devices that don't support internal DMA between
functions are within spec, or should we handle them as exceptions,
e.g., via quirks?

If NVMe defines a flag indicating peer-to-peer support, that would
suggest to me that these devices are within spec.

I looked up the CMBSZ register you mentioned (NVMe 1.3a, sec 3.1.12).
You must be referring to the WDS, RDS, LISTS, CQS, and SQS bits.  If
WDS is set, the controller supports having Write-related data and
metadata in the Controller Memory Buffer.  That would mean the driver
could put certain queues in controller memory instead of in host
memory.  The controller could then read the queue from its own
internal memory rather than issuing a PCIe transaction to read it from
host memory.

That makes sense to me, but I don't see the connection to
peer-to-peer.  There's no multi-function device in this picture, so
it's not about internal DMA between functions.

WDS, etc., tell us about capabilities of the controller.  If WDS is
set, the CPU (or a peer PCIe device) can write things to controller
memory.  If it is clear, neither the CPU nor a peer device can put
things there.  So it doesn't seem to tell us anything about
peer-to-peer specifically.  It looks like information needed by the
NVMe driver, but not by the PCI core.

Bjorn
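
For reference, here is a rough sketch of how a driver might test the
CMBSZ capability bits discussed above.  The bit positions are my
reading of NVMe 1.3a, sec 3.1.12 and should be checked against the
spec.  The macro and function names (CMBSZ_*, cmbsz_has(),
cmb_accepts_writes()) are made up for illustration and are not taken
from any existing driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * CMBSZ capability bits, as I read NVMe 1.3a sec 3.1.12.
 * Names are illustrative only.
 */
#define CMBSZ_SQS   (1u << 0)   /* Submission Queue Support */
#define CMBSZ_CQS   (1u << 1)   /* Completion Queue Support */
#define CMBSZ_LISTS (1u << 2)   /* PRP/SGL List Support */
#define CMBSZ_RDS   (1u << 3)   /* Read Data Support */
#define CMBSZ_WDS   (1u << 4)   /* Write Data Support */

/* Return true if the given capability bit is set in a CMBSZ value. */
static inline bool cmbsz_has(uint32_t cmbsz, uint32_t bit)
{
	return (cmbsz & bit) != 0;
}

/*
 * WDS says whether write-related data may live in the CMB at all.
 * It says nothing about who placed the data there (CPU or peer
 * device), which is why it does not look like a P2P-specific flag.
 */
static inline bool cmb_accepts_writes(uint32_t cmbsz)
{
	return cmbsz_has(cmbsz, CMBSZ_WDS);
}
```

Note that the check takes only the register value; nothing in it
depends on the PCI topology or on which agent is the source of the
data, which is consistent with this being NVMe driver information
rather than something the PCI core needs.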