From: Bjorn Helgaas <helgaas@kernel.org>
To: Logan Gunthorpe <logang@deltatee.com>,
Alex Williamson <alex.williamson@redhat.com>
Cc: "Jens Axboe" <axboe@kernel.dk>,
"Keith Busch" <keith.busch@intel.com>,
linux-nvdimm@lists.01.org, linux-rdma@vger.kernel.org,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
"Jérôme Glisse" <jglisse@redhat.com>,
"Jason Gunthorpe" <jgg@mellanox.com>,
"Christian König" <christian.koenig@amd.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Max Gurtovoy" <maxg@mellanox.com>,
"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches
Date: Mon, 7 May 2018 18:13:06 -0500
Message-ID: <20180507231306.GG161390@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20180423233046.21476-5-logang@deltatee.com>
[+to Alex]
Alex,
Are you happy with this strategy of turning off ACS based on
CONFIG_PCI_P2PDMA? We only check this at enumeration time, and
I don't know whether there are other places where we would care.
On Mon, Apr 23, 2018 at 05:30:36PM -0600, Logan Gunthorpe wrote:
> For peer-to-peer transactions to work the downstream ports in each
> switch must not have the ACS flags set. At this time there is no way
> to dynamically change the flags and update the corresponding IOMMU
> groups so this is done at enumeration time before the groups are
> assigned.
>
> This effectively means that if CONFIG_PCI_P2PDMA is selected then
> all devices behind any PCIe switch heirarchy will be in the same IOMMU
> group. Which implies that individual devices behind any switch
> heirarchy will not be able to be assigned to separate VMs because
> there is no isolation between them. Additionally, any malicious PCIe
> devices will be able to DMA to memory exposed by other EPs in the same
> domain as TLPs will not be checked by the IOMMU.
>
> Given that the intended use case of P2P Memory is for users with
> custom hardware designed for purpose, we do not expect distributors
> to ever need to enable this option. Users that want to use P2P
> must have compiled a custom kernel with this configuration option
> and understand the implications regarding ACS. They will either
> not require ACS or will have design the system in such a way that
> devices that require isolation will be separate from those using P2P
> transactions.
>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
> drivers/pci/Kconfig | 9 +++++++++
> drivers/pci/p2pdma.c | 45 ++++++++++++++++++++++++++++++---------------
> drivers/pci/pci.c | 6 ++++++
> include/linux/pci-p2pdma.h | 5 +++++
> 4 files changed, 50 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index b2396c22b53e..b6db41d4b708 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -139,6 +139,15 @@ config PCI_P2PDMA
> transations must be between devices behind the same root port.
> (Typically behind a network of PCIe switches).
>
> + Enabling this option will also disable ACS on all ports behind
> + any PCIe switch. This effectively puts all devices behind any
> + switch heirarchy into the same IOMMU group. Which implies that
s/heirarchy/hierarchy/ (also above in changelog)
> + individual devices behind any switch will not be able to be
> + assigned to separate VMs because there is no isolation between
> + them. Additionally, any malicious PCIe devices will be able to
> + DMA to memory exposed by other EPs in the same domain as TLPs
> + will not be checked by the IOMMU.
> +
> If unsure, say N.
>
> config PCI_LABEL
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index ed9dce8552a2..e9f43b43acac 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -240,27 +240,42 @@ static struct pci_dev *find_parent_pci_dev(struct device *dev)
> }
>
> /*
> - * If a device is behind a switch, we try to find the upstream bridge
> - * port of the switch. This requires two calls to pci_upstream_bridge():
> - * one for the upstream port on the switch, one on the upstream port
> - * for the next level in the hierarchy. Because of this, devices connected
> - * to the root port will be rejected.
> + * pci_p2pdma_disable_acs - disable ACS flags for all PCI bridges
> + * @pdev: device to disable ACS flags for
> + *
> + * The ACS flags for P2P Request Redirect and P2P Completion Redirect need
> + * to be disabled on any PCI bridge in order for the TLPS to not be forwarded
> + * up to the RC which is not what we want for P2P.
s/PCI bridge/PCIe switch/ (ACS doesn't apply to conventional PCI)
> + *
> + * This function is called when the devices are first enumerated and
> + * will result in all devices behind any bridge to be in the same IOMMU
> + * group. At this time, there is no way to "hotplug" IOMMU groups so we rely
> + * on this largish hammer. If you need the devices to be in separate groups
> + * don't enable CONFIG_PCI_P2PDMA.
> + *
> + * Returns 1 if the ACS bits for this device was cleared, otherwise 0.
> */
> -static struct pci_dev *get_upstream_bridge_port(struct pci_dev *pdev)
> +int pci_p2pdma_disable_acs(struct pci_dev *pdev)
> {
> - struct pci_dev *up1, *up2;
> + int pos;
> + u16 ctrl;
>
> - if (!pdev)
> - return NULL;
> + if (!pci_is_bridge(pdev))
> + return 0;
>
> - up1 = pci_dev_get(pci_upstream_bridge(pdev));
> - if (!up1)
> - return NULL;
> + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS);
> + if (!pos)
> + return 0;
> +
> + pci_info(pdev, "disabling ACS flags for peer-to-peer DMA\n");
> +
> + pci_read_config_word(pdev, pos + PCI_ACS_CTRL, &ctrl);
> +
> + ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR);
>
> - up2 = pci_dev_get(pci_upstream_bridge(up1));
> - pci_dev_put(up1);
> + pci_write_config_word(pdev, pos + PCI_ACS_CTRL, ctrl);
>
> - return up2;
> + return 1;
> }
>
> /*
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e597655a5643..7e2f5724ba22 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -16,6 +16,7 @@
> #include <linux/of.h>
> #include <linux/of_pci.h>
> #include <linux/pci.h>
> +#include <linux/pci-p2pdma.h>
> #include <linux/pm.h>
> #include <linux/slab.h>
> #include <linux/module.h>
> @@ -2835,6 +2836,11 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> */
> void pci_enable_acs(struct pci_dev *dev)
> {
> +#ifdef CONFIG_PCI_P2PDMA
> + if (pci_p2pdma_disable_acs(dev))
> + return;
> +#endif
> +
> if (!pci_acs_enable)
> return;
>
> diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
> index 0cde88341eeb..fcb3437a2f3c 100644
> --- a/include/linux/pci-p2pdma.h
> +++ b/include/linux/pci-p2pdma.h
> @@ -18,6 +18,7 @@ struct block_device;
> struct scatterlist;
>
> #ifdef CONFIG_PCI_P2PDMA
> +int pci_p2pdma_disable_acs(struct pci_dev *pdev);
> int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
> u64 offset);
> int pci_p2pdma_add_client(struct list_head *head, struct device *dev);
> @@ -40,6 +41,10 @@ int pci_p2pdma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
> void pci_p2pdma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
> enum dma_data_direction dir);
> #else /* CONFIG_PCI_P2PDMA */
> +static inline int pci_p2pdma_disable_acs(struct pci_dev *pdev)
> +{
> + return 0;
> +}
> static inline int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar,
> size_t size, u64 offset)
> {
> --
> 2.11.0
>
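For readers following the thread, here is a minimal userspace sketch of the
control flow the patch above introduces. It is not kernel code: struct
fake_port, fake_p2pdma_disable_acs() and fake_enable_acs() are hypothetical
stand-ins for struct pci_dev, pci_p2pdma_disable_acs() and pci_enable_acs(),
and the config-space accesses are replaced by a plain 16-bit field. The bit
values are the ACS Control bits from include/uapi/linux/pci_regs.h. The
enable path is simplified and assumes the port supports every bit it sets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ACS Control register bits (values from include/uapi/linux/pci_regs.h). */
#define PCI_ACS_SV 0x0001 /* Source Validation */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010 /* Upstream Forwarding */

/* Hypothetical stand-in for the parts of struct pci_dev used here. */
struct fake_port {
	int is_bridge;     /* models pci_is_bridge() */
	int has_acs_cap;   /* models a non-zero pci_find_ext_capability() */
	uint16_t acs_ctrl; /* models the ACS Control register */
};

/*
 * Models pci_p2pdma_disable_acs() from the patch: clear only the two
 * P2P redirect bits, leave the other ACS bits alone, and report whether
 * the register was touched.
 */
static int fake_p2pdma_disable_acs(struct fake_port *port)
{
	assert(port != NULL);

	if (!port->is_bridge)
		return 0;
	if (!port->has_acs_cap)
		return 0;

	port->acs_ctrl &= (uint16_t)~(PCI_ACS_RR | PCI_ACS_CR);
	return 1;
}

/*
 * Models the patched pci_enable_acs(): when the P2PDMA hook clears the
 * bits, the function returns before the normal enable path runs.
 * acs_requested models the pci_acs_enable flag; p2pdma_built models
 * CONFIG_PCI_P2PDMA being set (an assumption made so that both cases
 * can be exercised at run time).
 */
static void fake_enable_acs(struct fake_port *port, int acs_requested,
			    int p2pdma_built)
{
	if (p2pdma_built && fake_p2pdma_disable_acs(port))
		return;

	if (!acs_requested)
		return;

	/* Simplified enable path: assumes every bit is supported. */
	if (port->has_acs_cap)
		port->acs_ctrl |= PCI_ACS_SV | PCI_ACS_RR |
				  PCI_ACS_CR | PCI_ACS_UF;
}
```

In this model, an ACS-capable bridge never reaches the enable path when the
P2PDMA option is built in, regardless of whether ACS was requested, which is
the enumeration-time behaviour being asked about above.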