From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752137AbeCNQSH (ORCPT ); Wed, 14 Mar 2018 12:18:07 -0400 Received: from ale.deltatee.com ([207.54.116.67]:40728 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751755AbeCNQSE (ORCPT ); Wed, 14 Mar 2018 12:18:04 -0400 To: Bjorn Helgaas Cc: Stephen Bates , Sinan Kaya , "linux-kernel@vger.kernel.org" , "linux-pci@vger.kernel.org" , "linux-nvme@lists.infradead.org" , "linux-rdma@vger.kernel.org" , "linux-nvdimm@lists.01.org" , "linux-block@vger.kernel.org" , Christoph Hellwig , Jens Axboe , Keith Busch , Sagi Grimberg , Bjorn Helgaas , Jason Gunthorpe , Max Gurtovoy , Dan Williams , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Benjamin Herrenschmidt , Alex Williamson References: <3e738f95-d73c-4182-2fa1-8664aafb1ab7@deltatee.com> <703aa92c-0c1c-4852-5887-6f6e6ccde0fb@codeaurora.org> <3ea80992-a0fc-08f2-d93d-ae0ec4e3f4ce@codeaurora.org> <4eb6850c-df1b-fd44-3ee0-d43a50270b53@deltatee.com> <757fca36-dee4-e070-669e-f2788bd78e41@codeaurora.org> <4f761f55-4e9a-dccb-d12f-c59d2cd689db@deltatee.com> <20180313230850.GA45763@bhelgaas-glaptop.roam.corp.google.com> <8de5d3dd-a78f-02d5-0eea-4365364143b6@deltatee.com> <20180314025639.GA50067@bhelgaas-glaptop.roam.corp.google.com> From: Logan Gunthorpe Message-ID: <112493af-ccd0-455b-6600-b50764f7ab7e@deltatee.com> Date: Wed, 14 Mar 2018 10:17:34 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180314025639.GA50067@bhelgaas-glaptop.roam.corp.google.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-SA-Exim-Connect-IP: 172.16.1.162 X-SA-Exim-Rcpt-To: alex.williamson@redhat.com, benh@kernel.crashing.org, jglisse@redhat.com, dan.j.williams@intel.com, maxg@mellanox.com, jgg@mellanox.com, bhelgaas@google.com, sagi@grimberg.me, keith.busch@intel.com, axboe@kernel.dk, hch@lst.de, linux-block@vger.kernel.org, linux-nvdimm@lists.01.org, linux-rdma@vger.kernel.org, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, okaya@codeaurora.org, sbates@raithlin.com, helgaas@kernel.org X-SA-Exim-Mail-From: logang@deltatee.com Subject: Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory X-SA-Exim-Version: 4.2.1 (built Tue, 02 Aug 2016 21:08:31 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 13/03/18 08:56 PM, Bjorn Helgaas wrote: > I assume you want to exclude Root Ports because of multi-function > devices and the "route to self" error. I was hoping for a reference > to that so I could learn more about it. I haven't been able to find where in the spec it forbids route to self. But I was told this by developers who work know switches. Hopefully Stephen can find the reference. But it's a bit of a moot point. Devices can DMA to themselves if they are designed to do so. For example, some NVMe cards can read and write their own CMB for certain types of DMA request. There is a register in the spec (CMBSZ) which specifies which types of requests are supported. (See 3.1.12 in NVMe 1.3a). > I agree that peers need to have a common upstream bridge. I think > you're saying peers need to have *two* common upstream bridges. If I > understand correctly, requiring two common bridges is a way to ensure > that peers directly below Root Ports don't try to DMA to each other. No, I don't get where you think we need to have two common upstream bridges. I'm not sure when such a case would ever happen. But you seem to understand based on what you wrote below. > So I guess the first order of business is to nail down whether peers > below a Root Port are prohibited from DMAing to each other. My > assumption, based on 6.12.1.2 and the fact that I haven't yet found > a prohibition, is that they can. If you have a multifunction device designed to DMA to itself below a root port, it can. But determining this is on a device by device basis, just as determining whether a root complex can do peer to peer is on a per device basis. So I'd say we don't want to allow it by default and let someone who has such a device figure out what's necessary if and when one comes along. > You already have upstream_bridges_match(), which takes two pci_devs. > I think it should walk up the PCI hierarchy from the first device, > checking whether the bridge at each level is also a parent of the > second device. Yes, this is what I meant when I said walking the entire tree. I've been kicking the can down the road on implementing this as getting ref counting right and testing it is going to be quite tricky. The single switch approach we implemented now is just a simplification which works for a single switch. But I guess we can look at implementing it this way for v4. Logan