nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: Bjorn Helgaas <helgaas@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Alex Williamson <alex.williamson@redhat.com>
Cc: "Jens Axboe" <axboe@kernel.dk>,
	"Keith Busch" <keith.busch@intel.com>,
	linux-nvdimm@lists.01.org, linux-rdma@vger.kernel.org,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Jason Gunthorpe" <jgg@mellanox.com>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Max Gurtovoy" <maxg@mellanox.com>,
	"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches
Date: Tue, 8 May 2018 09:17:51 +0200	[thread overview]
Message-ID: <e58e8d27-42a5-ca69-3b68-c409a2922cd1@amd.com> (raw)
In-Reply-To: <20180507231306.GG161390@bhelgaas-glaptop.roam.corp.google.com>

Hi Bjorn,

Am 08.05.2018 um 01:13 schrieb Bjorn Helgaas:
> [+to Alex]
>
> Alex,
>
> Are you happy with this strategy of turning off ACS based on
> CONFIG_PCI_P2PDMA?  We only check this at enumeration-time and
> I don't know if there are other places we would care?

thanks for pointing this out, I totally missed this hack.

AMD APUs mandatory need the ACS flag set for the GPU integrated in the 
CPU when IOMMU is enabled or otherwise you will break SVM.

Similar problems arise when you do this for dedicated GPU, but we 
haven't upstreamed the support for this yet.

So that is a clear NAK from my side for the approach.

And what exactly is the problem here? I'm currently testing P2P with 
GPUs in different IOMMU domains and at least with AMD IOMMUs that works 
perfectly fine.

Regards,
Christian.

>
> On Mon, Apr 23, 2018 at 05:30:36PM -0600, Logan Gunthorpe wrote:
>> For peer-to-peer transactions to work the downstream ports in each
>> switch must not have the ACS flags set. At this time there is no way
>> to dynamically change the flags and update the corresponding IOMMU
>> groups so this is done at enumeration time before the groups are
>> assigned.
>>
>> This effectively means that if CONFIG_PCI_P2PDMA is selected then
>> all devices behind any PCIe switch heirarchy will be in the same IOMMU
>> group. Which implies that individual devices behind any switch
>> heirarchy will not be able to be assigned to separate VMs because
>> there is no isolation between them. Additionally, any malicious PCIe
>> devices will be able to DMA to memory exposed by other EPs in the same
>> domain as TLPs will not be checked by the IOMMU.
>>
>> Given that the intended use case of P2P Memory is for users with
>> custom hardware designed for purpose, we do not expect distributors
>> to ever need to enable this option. Users that want to use P2P
>> must have compiled a custom kernel with this configuration option
>> and understand the implications regarding ACS. They will either
>> not require ACS or will have design the system in such a way that
>> devices that require isolation will be separate from those using P2P
>> transactions.
>>
>> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
>> ---
>>   drivers/pci/Kconfig        |  9 +++++++++
>>   drivers/pci/p2pdma.c       | 45 ++++++++++++++++++++++++++++++---------------
>>   drivers/pci/pci.c          |  6 ++++++
>>   include/linux/pci-p2pdma.h |  5 +++++
>>   4 files changed, 50 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>> index b2396c22b53e..b6db41d4b708 100644
>> --- a/drivers/pci/Kconfig
>> +++ b/drivers/pci/Kconfig
>> @@ -139,6 +139,15 @@ config PCI_P2PDMA
>>   	  transations must be between devices behind the same root port.
>>   	  (Typically behind a network of PCIe switches).
>>   
>> +	  Enabling this option will also disable ACS on all ports behind
>> +	  any PCIe switch. This effectively puts all devices behind any
>> +	  switch heirarchy into the same IOMMU group. Which implies that
> s/heirarchy/hierarchy/ (also above in changelog)
>
>> +	  individual devices behind any switch will not be able to be
>> +	  assigned to separate VMs because there is no isolation between
>> +	  them. Additionally, any malicious PCIe devices will be able to
>> +	  DMA to memory exposed by other EPs in the same domain as TLPs
>> +	  will not be checked by the IOMMU.
>> +
>>   	  If unsure, say N.
>>   
>>   config PCI_LABEL
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index ed9dce8552a2..e9f43b43acac 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -240,27 +240,42 @@ static struct pci_dev *find_parent_pci_dev(struct device *dev)
>>   }
>>   
>>   /*
>> - * If a device is behind a switch, we try to find the upstream bridge
>> - * port of the switch. This requires two calls to pci_upstream_bridge():
>> - * one for the upstream port on the switch, one on the upstream port
>> - * for the next level in the hierarchy. Because of this, devices connected
>> - * to the root port will be rejected.
>> + * pci_p2pdma_disable_acs - disable ACS flags for all PCI bridges
>> + * @pdev: device to disable ACS flags for
>> + *
>> + * The ACS flags for P2P Request Redirect and P2P Completion Redirect need
>> + * to be disabled on any PCI bridge in order for the TLPS to not be forwarded
>> + * up to the RC which is not what we want for P2P.
> s/PCI bridge/PCIe switch/ (ACS doesn't apply to conventional PCI)
>
>> + *
>> + * This function is called when the devices are first enumerated and
>> + * will result in all devices behind any bridge to be in the same IOMMU
>> + * group. At this time, there is no way to "hotplug" IOMMU groups so we rely
>> + * on this largish hammer. If you need the devices to be in separate groups
>> + * don't enable CONFIG_PCI_P2PDMA.
>> + *
>> + * Returns 1 if the ACS bits for this device was cleared, otherwise 0.
>>    */
>> -static struct pci_dev *get_upstream_bridge_port(struct pci_dev *pdev)
>> +int pci_p2pdma_disable_acs(struct pci_dev *pdev)
>>   {
>> -	struct pci_dev *up1, *up2;
>> +	int pos;
>> +	u16 ctrl;
>>   
>> -	if (!pdev)
>> -		return NULL;
>> +	if (!pci_is_bridge(pdev))
>> +		return 0;
>>   
>> -	up1 = pci_dev_get(pci_upstream_bridge(pdev));
>> -	if (!up1)
>> -		return NULL;
>> +	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS);
>> +	if (!pos)
>> +		return 0;
>> +
>> +	pci_info(pdev, "disabling ACS flags for peer-to-peer DMA\n");
>> +
>> +	pci_read_config_word(pdev, pos + PCI_ACS_CTRL, &ctrl);
>> +
>> +	ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR);
>>   
>> -	up2 = pci_dev_get(pci_upstream_bridge(up1));
>> -	pci_dev_put(up1);
>> +	pci_write_config_word(pdev, pos + PCI_ACS_CTRL, ctrl);
>>   
>> -	return up2;
>> +	return 1;
>>   }
>>   
>>   /*
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index e597655a5643..7e2f5724ba22 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -16,6 +16,7 @@
>>   #include <linux/of.h>
>>   #include <linux/of_pci.h>
>>   #include <linux/pci.h>
>> +#include <linux/pci-p2pdma.h>
>>   #include <linux/pm.h>
>>   #include <linux/slab.h>
>>   #include <linux/module.h>
>> @@ -2835,6 +2836,11 @@ static void pci_std_enable_acs(struct pci_dev *dev)
>>    */
>>   void pci_enable_acs(struct pci_dev *dev)
>>   {
>> +#ifdef CONFIG_PCI_P2PDMA
>> +	if (pci_p2pdma_disable_acs(dev))
>> +		return;
>> +#endif
>> +
>>   	if (!pci_acs_enable)
>>   		return;
>>   
>> diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
>> index 0cde88341eeb..fcb3437a2f3c 100644
>> --- a/include/linux/pci-p2pdma.h
>> +++ b/include/linux/pci-p2pdma.h
>> @@ -18,6 +18,7 @@ struct block_device;
>>   struct scatterlist;
>>   
>>   #ifdef CONFIG_PCI_P2PDMA
>> +int pci_p2pdma_disable_acs(struct pci_dev *pdev);
>>   int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
>>   		u64 offset);
>>   int pci_p2pdma_add_client(struct list_head *head, struct device *dev);
>> @@ -40,6 +41,10 @@ int pci_p2pdma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
>>   void pci_p2pdma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>>   			 enum dma_data_direction dir);
>>   #else /* CONFIG_PCI_P2PDMA */
>> +static inline int pci_p2pdma_disable_acs(struct pci_dev *pdev)
>> +{
>> +	return 0;
>> +}
>>   static inline int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar,
>>   		size_t size, u64 offset)
>>   {
>> -- 
>> 2.11.0
>>

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

  reply	other threads:[~2018-05-08  7:18 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 23:30 [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory Logan Gunthorpe
2018-05-07 23:00   ` Bjorn Helgaas
2018-05-07 23:09     ` Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 02/14] PCI/P2PDMA: Add sysfs group to display p2pmem stats Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 03/14] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset Logan Gunthorpe
2018-05-07 23:02   ` Bjorn Helgaas
2018-04-23 23:30 ` [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches Logan Gunthorpe
2018-04-24  3:33   ` Randy Dunlap
2018-05-07 23:13   ` Bjorn Helgaas
2018-05-08  7:17     ` Christian König [this message]
2018-05-08 14:25       ` Stephen  Bates
2018-05-08 16:37         ` Christian König
2018-05-08 16:27       ` Logan Gunthorpe
2018-05-08 16:50         ` Christian König
2018-05-08 19:13           ` Logan Gunthorpe
2018-05-08 19:34             ` Alex Williamson
2018-05-08 19:45               ` Logan Gunthorpe
2018-05-08 20:13                 ` Alex Williamson
2018-05-08 20:19                   ` Logan Gunthorpe
2018-05-08 20:43                     ` Alex Williamson
2018-05-08 20:49                       ` Logan Gunthorpe
2018-05-08 21:26                         ` Alex Williamson
2018-05-08 21:42                           ` Stephen  Bates
2018-05-08 22:03                             ` Alex Williamson
2018-05-08 22:10                               ` Logan Gunthorpe
2018-05-08 22:25                                 ` Stephen  Bates
2018-05-08 23:11                                   ` Alex Williamson
2018-05-08 23:31                                     ` Logan Gunthorpe
2018-05-09  0:17                                       ` Alex Williamson
2018-05-08 22:32                                 ` Alex Williamson
2018-05-08 23:00                                   ` Dan Williams
2018-05-08 23:15                                     ` Logan Gunthorpe
2018-05-09 12:38                                       ` Stephen  Bates
2018-05-08 22:21                               ` Don Dutile
2018-05-09 12:44                                 ` Stephen  Bates
2018-05-09 15:58                                   ` Don Dutile
2018-05-08 20:50                     ` Jerome Glisse
2018-05-08 21:35                       ` Stephen  Bates
2018-05-09 13:12                       ` Stephen  Bates
2018-05-09 13:40                         ` Christian König
2018-05-09 15:41                           ` Stephen  Bates
2018-05-09 16:07                             ` Jerome Glisse
2018-05-09 16:30                               ` Stephen  Bates
2018-05-09 17:49                                 ` Jerome Glisse
2018-05-10 14:20                                   ` Stephen  Bates
2018-05-10 14:29                                     ` Christian König
2018-05-10 14:59                                       ` Jerome Glisse
2018-05-10 18:44                                         ` Stephen  Bates
2018-05-09 16:45                           ` Logan Gunthorpe
2018-05-10 12:52                             ` Christian König
2018-05-10 14:16                               ` Stephen  Bates
2018-05-10 14:41                                 ` Jerome Glisse
2018-05-10 18:41                                   ` Stephen  Bates
2018-05-10 18:59                                     ` Logan Gunthorpe
2018-05-10 19:10                                     ` Alex Williamson
2018-05-10 19:24                                       ` Jerome Glisse
2018-05-10 16:32                                 ` Logan Gunthorpe
2018-05-10 17:11                                   ` Stephen  Bates
2018-05-10 17:15                                     ` Logan Gunthorpe
2018-05-11  8:52                                       ` Christian König
2018-05-11 15:48                                         ` Logan Gunthorpe
2018-05-11 21:50                                           ` Stephen  Bates
2018-05-11 22:24                                             ` Stephen  Bates
2018-05-11 22:55                                               ` Logan Gunthorpe
2018-05-08 14:31   ` Dan Williams
2018-05-08 14:44     ` Stephen  Bates
2018-05-08 21:04       ` Don Dutile
2018-05-08 21:27         ` Stephen  Bates
2018-05-08 23:06           ` Don Dutile
2018-05-09  0:01             ` Alex Williamson
2018-05-09 12:35               ` Stephen  Bates
2018-05-09 14:44                 ` Alex Williamson
2018-05-09 15:52                   ` Don Dutile
2018-05-09 15:47               ` Don Dutile
2018-05-09 15:53           ` Don Dutile
2018-04-23 23:30 ` [PATCH v4 05/14] docs-rst: Add a new directory for PCI documentation Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation Logan Gunthorpe
2018-05-07 23:20   ` Bjorn Helgaas
2018-05-22 21:24   ` Randy Dunlap
2018-05-22 21:28     ` Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 07/14] block: Introduce PCI P2P flags for request and request queue Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 08/14] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]() Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 09/14] nvme-pci: Use PCI p2pmem subsystem to manage the CMB Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 10/14] nvme-pci: Add support for P2P memory in requests Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 11/14] nvme-pci: Add a quirk for a pseudo CMB Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 12/14] nvmet: Introduce helper functions to allocate and free request SGLs Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 13/14] nvmet-rdma: Use new SGL alloc/free helper for requests Logan Gunthorpe
2018-04-23 23:30 ` [PATCH v4 14/14] nvmet: Optionally use PCI P2P memory Logan Gunthorpe
2018-05-02 11:51 ` [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory Christian König
2018-05-02 15:56   ` Logan Gunthorpe
2018-05-03  9:05     ` Christian König
2018-05-03 15:59       ` Logan Gunthorpe
2018-05-03 17:29         ` Christian König
2018-05-03 18:43           ` Logan Gunthorpe
2018-05-04 14:27             ` Christian König
2018-05-04 15:52               ` Logan Gunthorpe
2018-05-07 23:23 ` Bjorn Helgaas
2018-05-07 23:34   ` Logan Gunthorpe
2018-05-08 16:57   ` Alex Williamson
2018-05-08 19:14     ` Logan Gunthorpe
2018-05-08 21:25     ` Don Dutile
2018-05-08 21:40       ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e58e8d27-42a5-ca69-3b68-c409a2922cd1@amd.com \
    --to=christian.koenig@amd.com \
    --cc=alex.williamson@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=benh@kernel.crashing.org \
    --cc=bhelgaas@google.com \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=jgg@mellanox.com \
    --cc=jglisse@redhat.com \
    --cc=keith.busch@intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=maxg@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).