* [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-05-28 18:40 Alex Williamson 2013-05-28 18:40 ` Alex Williamson ` (4 more replies) 0 siblings, 5 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 18:40 UTC (permalink / raw) To: iommu, dwmw2, joro; +Cc: stephen, linux-pci, ddutile This series tries to address: https://bugzilla.kernel.org/show_bug.cgi?id=44881 Where pci_find_upstream_pcie_bridge() gets lost trying to find the upstream PCIe-to-PCI bridge for a device because the bridge doesn't expose a PCIe capability. To do this, we add a iommu_pci_is_pcie_bridge function which includes a quirk to look to the next upstream device as a sanity check. We can then replace pci_find_upstream_pcie_bridge with a function that's a bit more generic and less tied to intel-iommu eccentricities. v2 uses the same logic as v1, but moves the search and match code to IOMMU-core since PCI-core doesn't want it. v1 has several reports from users that this solves the problem they have in the above bz. Thanks, Alex --- Alex Williamson (2): iommu: Quirked PCIe bridge test and search function intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge drivers/iommu/Kconfig | 5 ++ drivers/iommu/Makefile | 1 drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- drivers/iommu/intel_irq_remapping.c | 15 +++++-- drivers/iommu/pci.c | 69 +++++++++++++++++++++++++++++++ drivers/iommu/pci.h | 23 ++++++++++ 6 files changed, 157 insertions(+), 33 deletions(-) create mode 100644 drivers/iommu/pci.c ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-05-28 18:40 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 18:40 UTC (permalink / raw) To: iommu, dwmw2, joro; +Cc: stephen, linux-pci, ddutile Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec and do not include a PCIe capability. pci_is_pcie() is not useful on these devices, making it difficult to determine where we transition from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked bridge test for fear of confusion that a PCIe capability would be expected, so implement it in IOMMU code. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- drivers/iommu/Kconfig | 3 ++ drivers/iommu/Makefile | 1 + drivers/iommu/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/pci.h | 23 ++++++++++++++++ 4 files changed, 96 insertions(+) create mode 100644 drivers/iommu/pci.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index c332fb9..a591794 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -2,6 +2,9 @@ config IOMMU_API bool +config IOMMU_PCI + bool + menuconfig IOMMU_SUPPORT bool "IOMMU Hardware Support" default y diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index ef0e520..f14d905 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -15,3 +15,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o obj-$(CONFIG_SHMOBILE_IOMMU) += shmobile-iommu.o obj-$(CONFIG_SHMOBILE_IPMMU) += shmobile-ipmmu.o +obj-$(CONFIG_IOMMU_PCI) += pci.o diff --git a/drivers/iommu/pci.c b/drivers/iommu/pci.c new file mode 100644 index 0000000..c3cae8d --- /dev/null +++ b/drivers/iommu/pci.c @@ -0,0 +1,69 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. All rights reserved. + * Author: Alex Williamson <alex.williamson@redhat.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <linux/pci.h> + +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) +{ + if (!pdev->subordinate) + return false; + + if (pci_is_pcie(pdev)) + return true; + +#ifdef CONFIG_PCI_QUIRKS + /* + * If we're not on the root bus, look one device upstream of the + * current device. If that device is PCIe and is not a PCIe-to-PCI + * bridge, then the current device is effectively PCIe as it must + * be the PCIe-to-PCI bridge. This handles several bridges that + * violate the PCIe spec by not exposing a PCIe capability: + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 + */ + if (!pci_is_root_bus(pdev->bus)) { + struct pci_dev *parent = pdev->bus->self; + + if (pci_is_pcie(parent) && + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) + return true; + } +#endif + return false; +} + +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last) +{ + *last = NULL; + + if (match(pdev)) + return pdev; + + *last = pdev; + + while (!pci_is_root_bus(pdev->bus)) { + *last = pdev; + pdev = pdev->bus->self; + + if (match(pdev)) + return pdev; + } + + return NULL; +} diff --git a/drivers/iommu/pci.h b/drivers/iommu/pci.h index 352d80a..cc6d58a 100644 --- a/drivers/iommu/pci.h +++ b/drivers/iommu/pci.h @@ -26,4 +26,27 @@ static inline void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) *from = to; } +/** + * iommu_pci_is_pcie_bridge - Match a PCIe bridge device + * @pdev: device to test + * + * Return true if the given device is a PCIe bridge, false otherwise. + */ +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev); + +/** + * iommu_pci_find_upstream - Generic upstream search function + * @pdev: starting PCI device to search + * @match: match function to call on each device (true = match) + * @last: last device examined prior to returned device + * + * Walk upstream from the given device, calling match() at each + * device, including self. Returns the first device matching match(). + * If the root bus is reached without finding a match, return NULL. + * last returns the N-1 step in the search path. + */ +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last); + #endif /* __IOMMU_PCI_H */ ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-05-28 18:40 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 18:40 UTC (permalink / raw) To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, joro-zLv9SwRftAIdnm+yROfE0A Cc: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, linux-pci-u79uwXL29TY76Z2rM5mHXA Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec and do not include a PCIe capability. pci_is_pcie() is not useful on these devices, making it difficult to determine where we transition from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked bridge test for fear of confusion that a PCIe capability would be expected, so implement it in IOMMU code. Signed-off-by: Alex Williamson <alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- drivers/iommu/Kconfig | 3 ++ drivers/iommu/Makefile | 1 + drivers/iommu/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/pci.h | 23 ++++++++++++++++ 4 files changed, 96 insertions(+) create mode 100644 drivers/iommu/pci.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index c332fb9..a591794 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -2,6 +2,9 @@ config IOMMU_API bool +config IOMMU_PCI + bool + menuconfig IOMMU_SUPPORT bool "IOMMU Hardware Support" default y diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index ef0e520..f14d905 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -15,3 +15,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o obj-$(CONFIG_SHMOBILE_IOMMU) += shmobile-iommu.o obj-$(CONFIG_SHMOBILE_IPMMU) += shmobile-ipmmu.o +obj-$(CONFIG_IOMMU_PCI) += pci.o diff --git a/drivers/iommu/pci.c b/drivers/iommu/pci.c new file mode 100644 index 0000000..c3cae8d --- /dev/null +++ b/drivers/iommu/pci.c @@ -0,0 +1,69 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. All rights reserved. + * Author: Alex Williamson <alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <linux/pci.h> + +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) +{ + if (!pdev->subordinate) + return false; + + if (pci_is_pcie(pdev)) + return true; + +#ifdef CONFIG_PCI_QUIRKS + /* + * If we're not on the root bus, look one device upstream of the + * current device. If that device is PCIe and is not a PCIe-to-PCI + * bridge, then the current device is effectively PCIe as it must + * be the PCIe-to-PCI bridge. This handles several bridges that + * violate the PCIe spec by not exposing a PCIe capability: + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 + */ + if (!pci_is_root_bus(pdev->bus)) { + struct pci_dev *parent = pdev->bus->self; + + if (pci_is_pcie(parent) && + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) + return true; + } +#endif + return false; +} + +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last) +{ + *last = NULL; + + if (match(pdev)) + return pdev; + + *last = pdev; + + while (!pci_is_root_bus(pdev->bus)) { + *last = pdev; + pdev = pdev->bus->self; + + if (match(pdev)) + return pdev; + } + + return NULL; +} diff --git a/drivers/iommu/pci.h b/drivers/iommu/pci.h index 352d80a..cc6d58a 100644 --- a/drivers/iommu/pci.h +++ b/drivers/iommu/pci.h @@ -26,4 +26,27 @@ static inline void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) *from = to; } +/** + * iommu_pci_is_pcie_bridge - Match a PCIe bridge device + * @pdev: device to test + * + * Return true if the given device is a PCIe bridge, false otherwise. + */ +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev); + +/** + * iommu_pci_find_upstream - Generic upstream search function + * @pdev: starting PCI device to search + * @match: match function to call on each device (true = match) + * @last: last device examined prior to returned device + * + * Walk upstream from the given device, calling match() at each + * device, including self. Returns the first device matching match(). + * If the root bus is reached without finding a match, return NULL. + * last returns the N-1 step in the search path. + */ +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last); + #endif /* __IOMMU_PCI_H */ ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-05-28 18:40 ` Alex Williamson (?) @ 2013-05-28 19:38 ` Stephen Hemminger 2013-05-28 19:53 ` Alex Williamson -1 siblings, 1 reply; 34+ messages in thread From: Stephen Hemminger @ 2013-05-28 19:38 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, joro, linux-pci, ddutile On Tue, 28 May 2013 12:40:20 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > and do not include a PCIe capability. pci_is_pcie() is not useful on > these devices, making it difficult to determine where we transition > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > bridge test for fear of confusion that a PCIe capability would be > expected, so implement it in IOMMU code. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> This won't work for me because I run without IOMMU. The overhead of IOMMU setup per-packet significantly impacts network performance. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-05-28 19:38 ` Stephen Hemminger @ 2013-05-28 19:53 ` Alex Williamson 2013-05-28 19:56 ` Stephen Hemminger 0 siblings, 1 reply; 34+ messages in thread From: Alex Williamson @ 2013-05-28 19:53 UTC (permalink / raw) To: Stephen Hemminger; +Cc: iommu, dwmw2, joro, linux-pci, ddutile On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > On Tue, 28 May 2013 12:40:20 -0600 > Alex Williamson <alex.williamson@redhat.com> wrote: > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > these devices, making it difficult to determine where we transition > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > bridge test for fear of confusion that a PCIe capability would be > > expected, so implement it in IOMMU code. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > This won't work for me because I run without IOMMU. > The overhead of IOMMU setup per-packet significantly impacts network performance. You were one of the reporters of this issue: http://www.spinics.net/lists/linux-pci/msg19579.html Does this not work for you or are you saying this is not relevant to you because you no longer attempt to enable the IOMMU? Thanks, Alex ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-05-28 19:53 ` Alex Williamson @ 2013-05-28 19:56 ` Stephen Hemminger 2013-05-28 20:15 ` Alex Williamson 0 siblings, 1 reply; 34+ messages in thread From: Stephen Hemminger @ 2013-05-28 19:56 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, joro, linux-pci, ddutile On Tue, 28 May 2013 13:53:10 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > On Tue, 28 May 2013 12:40:20 -0600 > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > these devices, making it difficult to determine where we transition > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > bridge test for fear of confusion that a PCIe capability would be > > > expected, so implement it in IOMMU code. > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > This won't work for me because I run without IOMMU. > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > You were one of the reporters of this issue: > > http://www.spinics.net/lists/linux-pci/msg19579.html > > Does this not work for you or are you saying this is not relevant to you > because you no longer attempt to enable the IOMMU? Thanks, > > Alex > For most usage, I gave up on using IOMMU/SR-IOV. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-05-28 19:56 ` Stephen Hemminger @ 2013-05-28 20:15 ` Alex Williamson 2013-05-28 20:28 ` Stephen Hemminger 0 siblings, 1 reply; 34+ messages in thread From: Alex Williamson @ 2013-05-28 20:15 UTC (permalink / raw) To: Stephen Hemminger; +Cc: iommu, dwmw2, joro, linux-pci, ddutile On Tue, 2013-05-28 at 12:56 -0700, Stephen Hemminger wrote: > On Tue, 28 May 2013 13:53:10 -0600 > Alex Williamson <alex.williamson@redhat.com> wrote: > > > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > > On Tue, 28 May 2013 12:40:20 -0600 > > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > > these devices, making it difficult to determine where we transition > > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > > bridge test for fear of confusion that a PCIe capability would be > > > > expected, so implement it in IOMMU code. > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > > This won't work for me because I run without IOMMU. > > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > > > You were one of the reporters of this issue: > > > > http://www.spinics.net/lists/linux-pci/msg19579.html > > > > Does this not work for you or are you saying this is not relevant to you > > because you no longer attempt to enable the IOMMU? Thanks, > > > > For most usage, I gave up on using IOMMU/SR-IOV. Ok, let's call that a "don't care" rather than "won't work" for the purposes of these patches. I'll take you off the CC if there's another version. Thanks, Alex ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-05-28 20:15 ` Alex Williamson @ 2013-05-28 20:28 ` Stephen Hemminger 0 siblings, 0 replies; 34+ messages in thread From: Stephen Hemminger @ 2013-05-28 20:28 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, joro, linux-pci, ddutile On Tue, 28 May 2013 14:15:24 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-05-28 at 12:56 -0700, Stephen Hemminger wrote: > > On Tue, 28 May 2013 13:53:10 -0600 > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > > > On Tue, 28 May 2013 12:40:20 -0600 > > > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > > > these devices, making it difficult to determine where we transition > > > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > > > bridge test for fear of confusion that a PCIe capability would be > > > > > expected, so implement it in IOMMU code. > > > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > > > > This won't work for me because I run without IOMMU. > > > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > > > > > You were one of the reporters of this issue: > > > > > > http://www.spinics.net/lists/linux-pci/msg19579.html > > > > > > Does this not work for you or are you saying this is not relevant to you > > > because you no longer attempt to enable the IOMMU? Thanks, > > > > > > > For most usage, I gave up on using IOMMU/SR-IOV. > > Ok, let's call that a "don't care" rather than "won't work" for the > purposes of these patches. I'll take you off the CC if there's another > version. Thanks, > > Alex > Yes, it is a don't care. but leave me on cc, because I may still need it. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-20 13:59 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-06-20 13:59 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, stephen, linux-pci, ddutile On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) > +{ > + if (!pdev->subordinate) > + return false; > + > + if (pci_is_pcie(pdev)) > + return true; > + > +#ifdef CONFIG_PCI_QUIRKS > + /* > + * If we're not on the root bus, look one device upstream of the > + * current device. If that device is PCIe and is not a PCIe-to-PCI > + * bridge, then the current device is effectively PCIe as it must > + * be the PCIe-to-PCI bridge. This handles several bridges that > + * violate the PCIe spec by not exposing a PCIe capability: > + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 > + */ > + if (!pci_is_root_bus(pdev->bus)) { > + struct pci_dev *parent = pdev->bus->self; > + > + if (pci_is_pcie(parent) && > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > + return true; > + } Hmm, that looks a bit dangerous. Do we have a list of PCI vendor/device-ids of these broken bridges to match against instead of some open-coded heuristics? That would probably also help to bring this into the PCI code. Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-20 13:59 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-06-20 13:59 UTC (permalink / raw) To: Alex Williamson Cc: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, linux-pci-u79uwXL29TY76Z2rM5mHXA, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) > +{ > + if (!pdev->subordinate) > + return false; > + > + if (pci_is_pcie(pdev)) > + return true; > + > +#ifdef CONFIG_PCI_QUIRKS > + /* > + * If we're not on the root bus, look one device upstream of the > + * current device. If that device is PCIe and is not a PCIe-to-PCI > + * bridge, then the current device is effectively PCIe as it must > + * be the PCIe-to-PCI bridge. This handles several bridges that > + * violate the PCIe spec by not exposing a PCIe capability: > + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 > + */ > + if (!pci_is_root_bus(pdev->bus)) { > + struct pci_dev *parent = pdev->bus->self; > + > + if (pci_is_pcie(parent) && > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > + return true; > + } Hmm, that looks a bit dangerous. Do we have a list of PCI vendor/device-ids of these broken bridges to match against instead of some open-coded heuristics? That would probably also help to bring this into the PCI code. Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-06-20 13:59 ` Joerg Roedel (?) @ 2013-06-20 15:44 ` Alex Williamson 2013-06-20 16:15 ` Joerg Roedel -1 siblings, 1 reply; 34+ messages in thread From: Alex Williamson @ 2013-06-20 15:44 UTC (permalink / raw) To: Joerg Roedel; +Cc: iommu, dwmw2, stephen, linux-pci, ddutile, Bjorn Helgaas On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > > +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) > > +{ > > + if (!pdev->subordinate) > > + return false; > > + > > + if (pci_is_pcie(pdev)) > > + return true; > > + > > +#ifdef CONFIG_PCI_QUIRKS > > + /* > > + * If we're not on the root bus, look one device upstream of the > > + * current device. If that device is PCIe and is not a PCIe-to-PCI > > + * bridge, then the current device is effectively PCIe as it must > > + * be the PCIe-to-PCI bridge. This handles several bridges that > > + * violate the PCIe spec by not exposing a PCIe capability: > > + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > + */ > > + if (!pci_is_root_bus(pdev->bus)) { > > + struct pci_dev *parent = pdev->bus->self; > > + > > + if (pci_is_pcie(parent) && > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > > + return true; > > + } > > Hmm, that looks a bit dangerous. How so? The algorithm seems pretty simple and logical. > Do we have a list of PCI > vendor/device-ids of these broken bridges to match against instead of > some open-coded heuristics? > > That would probably also help to bring this into the PCI code. Actually, I believe Bjorn rejected the idea of a fixed list because this problem is detectable. He also doesn't want me messing with quirks to pci_is_pcie() in PCI because he wants a 1:1 relation between that and having a PCIe capability. So, I'm stuck and this is where it's ended up. Thanks, Alex ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-20 16:15 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-06-20 16:15 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, stephen, linux-pci, ddutile, Bjorn Helgaas On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > > > + if (!pci_is_root_bus(pdev->bus)) { > > > + struct pci_dev *parent = pdev->bus->self; > > > + > > > + if (pci_is_pcie(parent) && > > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > > > + return true; > > > + } > > > > Hmm, that looks a bit dangerous. > > How so? The algorithm seems pretty simple and logical. It is simple, but it is still a heuristic that may fail at some point, no? > Actually, I believe Bjorn rejected the idea of a fixed list because this > problem is detectable. He also doesn't want me messing with quirks to > pci_is_pcie() in PCI because he wants a 1:1 relation between that and > having a PCIe capability. So, I'm stuck and this is where it's ended > up. Thanks, I think implementing such a list is much safer. Bjorn, why didn't you like that idea? Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-20 16:15 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-06-20 16:15 UTC (permalink / raw) To: Alex Williamson Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Bjorn Helgaas, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > > > + if (!pci_is_root_bus(pdev->bus)) { > > > + struct pci_dev *parent = pdev->bus->self; > > > + > > > + if (pci_is_pcie(parent) && > > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > > > + return true; > > > + } > > > > Hmm, that looks a bit dangerous. > > How so? The algorithm seems pretty simple and logical. It is simple, but it is still a heuristic that may fail at some point, no? > Actually, I believe Bjorn rejected the idea of a fixed list because this > problem is detectable. He also doesn't want me messing with quirks to > pci_is_pcie() in PCI because he wants a 1:1 relation between that and > having a PCIe capability. So, I'm stuck and this is where it's ended > up. Thanks, I think implementing such a list is much safer. Bjorn, why didn't you like that idea? Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-06-20 16:15 ` Joerg Roedel (?) @ 2013-06-26 4:20 ` Bjorn Helgaas 2013-06-26 18:45 ` Alex Williamson -1 siblings, 1 reply; 34+ messages in thread From: Bjorn Helgaas @ 2013-06-26 4:20 UTC (permalink / raw) To: Joerg Roedel Cc: Alex Williamson, open list:INTEL IOMMU (VT-d), David Woodhouse, Stephen Hemminger, linux-pci, Don Dutile On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: >> > > + if (!pci_is_root_bus(pdev->bus)) { >> > > + struct pci_dev *parent = pdev->bus->self; >> > > + >> > > + if (pci_is_pcie(parent) && >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) >> > > + return true; >> > > + } >> > >> > Hmm, that looks a bit dangerous. >> >> How so? The algorithm seems pretty simple and logical. > > It is simple, but it is still a heuristic that may fail at some point, > no? > >> Actually, I believe Bjorn rejected the idea of a fixed list because this >> problem is detectable. He also doesn't want me messing with quirks to >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and >> having a PCIe capability. So, I'm stuck and this is where it's ended >> up. Thanks, > > I think implementing such a list is much safer. > > Bjorn, why didn't you like that idea? Sorry, I can't remember, and I haven't been able to find the discussion where I said that. I think the current patches are all in drivers/iommu, and if a list makes sense there, it's fine with me. Bjorn ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-26 18:45 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-06-26 18:45 UTC (permalink / raw) To: Bjorn Helgaas Cc: Joerg Roedel, open list:INTEL IOMMU (VT-d), David Woodhouse, Stephen Hemminger, linux-pci, Don Dutile On Tue, 2013-06-25 at 22:20 -0600, Bjorn Helgaas wrote: > On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: > > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > >> > > + if (!pci_is_root_bus(pdev->bus)) { > >> > > + struct pci_dev *parent = pdev->bus->self; > >> > > + > >> > > + if (pci_is_pcie(parent) && > >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > >> > > + return true; > >> > > + } > >> > > >> > Hmm, that looks a bit dangerous. > >> > >> How so? The algorithm seems pretty simple and logical. > > > > It is simple, but it is still a heuristic that may fail at some point, > > no? > > > >> Actually, I believe Bjorn rejected the idea of a fixed list because this > >> problem is detectable. He also doesn't want me messing with quirks to > >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and > >> having a PCIe capability. So, I'm stuck and this is where it's ended > >> up. Thanks, > > > > I think implementing such a list is much safer. > > > > Bjorn, why didn't you like that idea? > > Sorry, I can't remember, and I haven't been able to find the > discussion where I said that. I think the current patches are all in > drivers/iommu, and if a list makes sense there, it's fine with me. Here's the comment I remember https://bugzilla.kernel.org/show_bug.cgi?id=44881#c7 Comment #7 From Bjorn Helgaas 2012-08-23 15:58:39 [snip] I doubt the upstream device is at fault. More likely the downstream device is really a PCIe device (a PCIe-to-PCI bridge) but just fails to report a PCIe capability. I think this situation is likely too common to deal with via quirks, so we'll have to figure out a way to just make this work. Thanks, Alex ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function @ 2013-06-26 18:45 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-06-26 18:45 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, Stephen Hemminger, open list:INTEL IOMMU (VT-d), David Woodhouse On Tue, 2013-06-25 at 22:20 -0600, Bjorn Helgaas wrote: > On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote: > > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > >> > > + if (!pci_is_root_bus(pdev->bus)) { > >> > > + struct pci_dev *parent = pdev->bus->self; > >> > > + > >> > > + if (pci_is_pcie(parent) && > >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > >> > > + return true; > >> > > + } > >> > > >> > Hmm, that looks a bit dangerous. > >> > >> How so? The algorithm seems pretty simple and logical. > > > > It is simple, but it is still a heuristic that may fail at some point, > > no? > > > >> Actually, I believe Bjorn rejected the idea of a fixed list because this > >> problem is detectable. He also doesn't want me messing with quirks to > >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and > >> having a PCIe capability. So, I'm stuck and this is where it's ended > >> up. Thanks, > > > > I think implementing such a list is much safer. > > > > Bjorn, why didn't you like that idea? > > Sorry, I can't remember, and I haven't been able to find the > discussion where I said that. I think the current patches are all in > drivers/iommu, and if a list makes sense there, it's fine with me. Here's the comment I remember https://bugzilla.kernel.org/show_bug.cgi?id=44881#c7 Comment #7 From Bjorn Helgaas 2012-08-23 15:58:39 [snip] I doubt the upstream device is at fault. More likely the downstream device is really a PCIe device (a PCIe-to-PCI bridge) but just fails to report a PCIe capability. I think this situation is likely too common to deal with via quirks, so we'll have to figure out a way to just make this work. Thanks, Alex ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function 2013-06-26 18:45 ` Alex Williamson (?) @ 2013-06-26 19:11 ` Bjorn Helgaas -1 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-06-26 19:11 UTC (permalink / raw) To: Alex Williamson Cc: Joerg Roedel, open list:INTEL IOMMU (VT-d), David Woodhouse, Stephen Hemminger, linux-pci, Don Dutile On Wed, Jun 26, 2013 at 12:45 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-06-25 at 22:20 -0600, Bjorn Helgaas wrote: >> On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: >> > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: >> >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: >> >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: >> >> > > + if (!pci_is_root_bus(pdev->bus)) { >> >> > > + struct pci_dev *parent = pdev->bus->self; >> >> > > + >> >> > > + if (pci_is_pcie(parent) && >> >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) >> >> > > + return true; >> >> > > + } >> >> > >> >> > Hmm, that looks a bit dangerous. >> >> >> >> How so? The algorithm seems pretty simple and logical. >> > >> > It is simple, but it is still a heuristic that may fail at some point, >> > no? >> > >> >> Actually, I believe Bjorn rejected the idea of a fixed list because this >> >> problem is detectable. He also doesn't want me messing with quirks to >> >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and >> >> having a PCIe capability. So, I'm stuck and this is where it's ended >> >> up. Thanks, >> > >> > I think implementing such a list is much safer. >> > >> > Bjorn, why didn't you like that idea? >> >> Sorry, I can't remember, and I haven't been able to find the >> discussion where I said that. I think the current patches are all in >> drivers/iommu, and if a list makes sense there, it's fine with me. > > Here's the comment I remember > > https://bugzilla.kernel.org/show_bug.cgi?id=44881#c7 > > Comment #7 From Bjorn Helgaas 2012-08-23 15:58:39 > [snip] > I doubt the upstream device is at fault. More likely the > downstream device is really a PCIe device (a PCIe-to-PCI bridge) > but just fails to report a PCIe capability. I think this > situation is likely too common to deal with via quirks, so we'll > have to figure out a way to just make this work. OK, I remember that now. So the question is whether you want a list or a set of quirks that may be an ongoing maintenance burden, or whether you want an algorithm that may be risky but possibly less maintenance. I preferred the latter. I think a failure in the algorithm will most likely result in a device that just doesn't work (because we derived a DMA source ID that doesn't match what the IOMMU sees), so at least the impact is relatively minor, and no worse than a missing entry in the list of exception devices. Bjorn ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 2/2] intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge 2013-05-28 18:40 [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges Alex Williamson 2013-05-28 18:40 ` Alex Williamson @ 2013-05-28 18:40 ` Alex Williamson 2013-05-28 22:09 ` Bjorn Helgaas ` (2 subsequent siblings) 4 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 18:40 UTC (permalink / raw) To: iommu, dwmw2, joro; +Cc: stephen, linux-pci, ddutile pci_find_upstream_pcie_bridge() attempts to return the PCIe-to-PCI bridge upstream from the provided device. It currently returns NULL if 1) the provided device is PCIe (ie. it does not have such an upstream bridge), 2) the device is on the root bus (a subtle case of its claim to return the parent device when not connected to a PCIe bridge), or 3) when it gets lost walking the bus and finds itself on a PCIe device that is not a PCIe-to-PCI bridge. This latter case not only generates a WARN, but treats the provided device the same as if it were PCIe. We can replace 1) and 2) with explicit tests in the IOMMU code as this is the code that knows the visibility of devices at the root bus. That leaves the bus walk case 3), where the NULL return is actually an error. We hit this error because some PCIe-to-PCI bridge vendors ignore the PCIe spec requirement that all PCIe devices must include a PCIe capability, which is all that pci_is_pcie() actually tests. When PCI quirks are enabled, iommu_pci_is_pcie_bridge() adds one more test by looking at the next upstream device to test whether it is PCIe with a type code of a PCIe-to-PCI/PCI-X bridge. If it does not have this type code, then by deduction, the current bridge must be a PCIe-to-PCI bridge and thus be a PCIe bridge. Therefore, pci_find_upstream_pcie_bridge() is replaced by an explicit call to (pci_is_pcie() || pci_is_root_bus()) plus iommu_pci_find_upstream() with the iommu_pci_is_pcie_bridge() match function. This either returns the first PCIe bridge, which must be a PCIe-to-PCI bridge or NULL if not found. In the NULL case, we know the original device is not on the root bus (already tested for that), so we can safely walk from the provided device to the last device and fall into the legacy PCI bridge code. Fixes https://bugzilla.kernel.org/show_bug.cgi?id=44881 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- drivers/iommu/Kconfig | 2 + drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- drivers/iommu/intel_irq_remapping.c | 15 +++++-- 3 files changed, 61 insertions(+), 33 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index a591794..71ec2d5 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -85,6 +85,7 @@ config INTEL_IOMMU depends on PCI_MSI && ACPI && (X86 || IA64_GENERIC) select IOMMU_API select DMAR_TABLE + select IOMMU_PCI help DMA remapping (DMAR) devices support enables independent address translations for Direct Memory Access (DMA) from devices. @@ -125,6 +126,7 @@ config IRQ_REMAP bool "Support for Interrupt Remapping" depends on X86_64 && X86_IO_APIC && PCI_MSI && ACPI select DMAR_TABLE + select IOMMU_PCI ---help--- Supports Interrupt remapping for IO-APIC and MSI devices. To use x2apic mode in the CPU's which support x2APIC enhancements or diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b4f0e28..3748187 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1689,9 +1689,13 @@ domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev, return ret; /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); - if (!tmp) + if (pci_is_pcie(pdev) || pci_is_root_bus(pdev->bus)) return 0; + + tmp = iommu_pci_find_upstream(pdev, iommu_pci_is_pcie_bridge, &parent); + if (!tmp) + tmp = parent; + /* Secondary interface's bus number and devfn 0 */ parent = pdev->bus->self; while (parent != tmp) { @@ -1703,7 +1707,7 @@ domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev, return ret; parent = parent->bus->self; } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ + if (iommu_pci_is_pcie_bridge(tmp)) /* this is a PCIe-to-PCI bridge */ return domain_context_mapping_one(domain, pci_domain_nr(tmp->subordinate), tmp->subordinate->number, 0, @@ -1731,9 +1735,13 @@ static int domain_context_mapped(struct pci_dev *pdev) if (!ret) return ret; /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); + if (pci_is_pcie(pdev) || pci_is_root_bus(pdev->bus)) + return 0; + + tmp = iommu_pci_find_upstream(pdev, iommu_pci_is_pcie_bridge, &parent); if (!tmp) - return ret; + tmp = parent; + /* Secondary interface's bus number and devfn 0 */ parent = pdev->bus->self; while (parent != tmp) { @@ -1743,7 +1751,7 @@ static int domain_context_mapped(struct pci_dev *pdev) return ret; parent = parent->bus->self; } - if (pci_is_pcie(tmp)) + if (iommu_pci_is_pcie_bridge(tmp)) return device_context_mapped(iommu, tmp->subordinate->number, 0); else @@ -1974,7 +1982,7 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) struct intel_iommu *iommu; struct dmar_drhd_unit *drhd; struct device_domain_info *info, *tmp; - struct pci_dev *dev_tmp; + struct pci_dev *dev_tmp = NULL; unsigned long flags; int bus = 0, devfn = 0; int segment; @@ -1986,9 +1994,16 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) segment = pci_domain_nr(pdev->bus); - dev_tmp = pci_find_upstream_pcie_bridge(pdev); - if (dev_tmp) { - if (pci_is_pcie(dev_tmp)) { + if (!pci_is_pcie(pdev) && !pci_is_root_bus(pdev->bus)) { + struct pci_dev *last; + + dev_tmp = iommu_pci_find_upstream(pdev, + iommu_pci_is_pcie_bridge, + &last); + if (!dev_tmp) + dev_tmp = last; + + if (iommu_pci_is_pcie_bridge(dev_tmp)) { bus = dev_tmp->subordinate->number; devfn = 0; } else { @@ -3754,26 +3769,24 @@ static void iommu_detach_dependent_devices(struct intel_iommu *iommu, { struct pci_dev *tmp, *parent; - if (!iommu || !pdev) + if (!iommu || !pdev || pci_is_pcie(pdev) || pci_is_root_bus(pdev->bus)) return; /* dependent device detach */ - tmp = pci_find_upstream_pcie_bridge(pdev); + tmp = iommu_pci_find_upstream(pdev, iommu_pci_is_pcie_bridge, &parent); + if (!tmp) + tmp = parent; + /* Secondary interface's bus number and devfn 0 */ - if (tmp) { - parent = pdev->bus->self; - while (parent != tmp) { - iommu_detach_dev(iommu, parent->bus->number, - parent->devfn); - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ - iommu_detach_dev(iommu, - tmp->subordinate->number, 0); - else /* this is a legacy PCI bridge */ - iommu_detach_dev(iommu, tmp->bus->number, - tmp->devfn); + parent = pdev->bus->self; + while (parent != tmp) { + iommu_detach_dev(iommu, parent->bus->number, parent->devfn); + parent = parent->bus->self; } + if (iommu_pci_is_pcie_bridge(tmp)) /* this is a PCIe-to-PCI bridge */ + iommu_detach_dev(iommu, tmp->subordinate->number, 0); + else /* this is a legacy PCI bridge */ + iommu_detach_dev(iommu, tmp->bus->number, tmp->devfn); } static void domain_remove_one_dev_info(struct dmar_domain *domain, @@ -4158,7 +4171,7 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain, static int intel_iommu_add_device(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); - struct pci_dev *bridge, *dma_pdev = NULL; + struct pci_dev *dma_pdev = NULL; struct iommu_group *group; int ret; @@ -4166,9 +4179,15 @@ static int intel_iommu_add_device(struct device *dev) pdev->bus->number, pdev->devfn)) return -ENODEV; - bridge = pci_find_upstream_pcie_bridge(pdev); - if (bridge) { - if (pci_is_pcie(bridge)) + if (!pci_is_pcie(pdev) && !pci_is_root_bus(pdev->bus)) { + struct pci_dev *bridge, *last; + + bridge = iommu_pci_find_upstream(pdev,iommu_pci_is_pcie_bridge, + &last); + if (!bridge) + bridge = last; + + if (iommu_pci_is_pcie_bridge(bridge)) dma_pdev = pci_get_domain_bus_and_slot( pci_domain_nr(pdev->bus), bridge->subordinate->number, 0); diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c index 5b19b2d..2a6d27f 100644 --- a/drivers/iommu/intel_irq_remapping.c +++ b/drivers/iommu/intel_irq_remapping.c @@ -16,6 +16,7 @@ #include <asm/msidef.h> #include "irq_remapping.h" +#include "pci.h" struct ioapic_scope { struct intel_iommu *iommu; @@ -373,7 +374,6 @@ static int set_hpet_sid(struct irte *irte, u8 id) static int set_msi_sid(struct irte *irte, struct pci_dev *dev) { - struct pci_dev *bridge; if (!irte || !dev) return -1; @@ -385,9 +385,16 @@ static int set_msi_sid(struct irte *irte, struct pci_dev *dev) return 0; } - bridge = pci_find_upstream_pcie_bridge(dev); - if (bridge) { - if (pci_is_pcie(bridge))/* this is a PCIe-to-PCI/PCIX bridge */ + if (!pci_is_pcie(dev) && !pci_is_root_bus(dev->bus)) { + struct pci_dev *bridge, *last; + + bridge = iommu_pci_find_upstream(dev, iommu_pci_is_pcie_bridge, + &last); + if (!bridge) + bridge = last; + + /* PCIe-to-PCI/PCIX bridge */ + if (iommu_pci_is_pcie_bridge(bridge)) set_irte_sid(irte, SVT_VERIFY_BUS, SQ_ALL_16, (bridge->bus->number << 8) | dev->bus->number); else /* this is a legacy PCI bridge */ ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-05-28 22:09 ` Bjorn Helgaas 0 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-05-28 22:09 UTC (permalink / raw) To: Alex Williamson Cc: open list:INTEL IOMMU (VT-d), David Woodhouse, Joerg Roedel, Stephen Hemminger, linux-pci, Don Dutile On Tue, May 28, 2013 at 12:40 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > This series tries to address: > > https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > Where pci_find_upstream_pcie_bridge() gets lost trying to find the > upstream PCIe-to-PCI bridge for a device because the bridge doesn't > expose a PCIe capability. To do this, we add a iommu_pci_is_pcie_bridge > function which includes a quirk to look to the next upstream device > as a sanity check. We can then replace pci_find_upstream_pcie_bridge > with a function that's a bit more generic and less tied to intel-iommu > eccentricities. > > v2 uses the same logic as v1, but moves the search and match code to > IOMMU-core since PCI-core doesn't want it. v1 has several reports > from users that this solves the problem they have in the above bz. > Thanks, > > Alex > > --- > > Alex Williamson (2): > iommu: Quirked PCIe bridge test and search function > intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge > > > drivers/iommu/Kconfig | 5 ++ > drivers/iommu/Makefile | 1 > drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- > drivers/iommu/intel_irq_remapping.c | 15 +++++-- > drivers/iommu/pci.c | 69 +++++++++++++++++++++++++++++++ > drivers/iommu/pci.h | 23 ++++++++++ > 6 files changed, 157 insertions(+), 33 deletions(-) > create mode 100644 drivers/iommu/pci.c These both look OK to me, for whatever that's worth (since they don't touch drivers/pci any more :)). I assume you probably removed the only users of pci_find_upstream_pcie_bridge(), right? If so, and you want to remove that in this same series, you can add my Acked-by to that. Bjorn ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-05-28 22:09 ` Bjorn Helgaas 0 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-05-28 22:09 UTC (permalink / raw) To: Alex Williamson Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, Stephen Hemminger, open list:INTEL IOMMU (VT-d), David Woodhouse On Tue, May 28, 2013 at 12:40 PM, Alex Williamson <alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > This series tries to address: > > https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > Where pci_find_upstream_pcie_bridge() gets lost trying to find the > upstream PCIe-to-PCI bridge for a device because the bridge doesn't > expose a PCIe capability. To do this, we add a iommu_pci_is_pcie_bridge > function which includes a quirk to look to the next upstream device > as a sanity check. We can then replace pci_find_upstream_pcie_bridge > with a function that's a bit more generic and less tied to intel-iommu > eccentricities. > > v2 uses the same logic as v1, but moves the search and match code to > IOMMU-core since PCI-core doesn't want it. v1 has several reports > from users that this solves the problem they have in the above bz. > Thanks, > > Alex > > --- > > Alex Williamson (2): > iommu: Quirked PCIe bridge test and search function > intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge > > > drivers/iommu/Kconfig | 5 ++ > drivers/iommu/Makefile | 1 > drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- > drivers/iommu/intel_irq_remapping.c | 15 +++++-- > drivers/iommu/pci.c | 69 +++++++++++++++++++++++++++++++ > drivers/iommu/pci.h | 23 ++++++++++ > 6 files changed, 157 insertions(+), 33 deletions(-) > create mode 100644 drivers/iommu/pci.c These both look OK to me, for whatever that's worth (since they don't touch drivers/pci any more :)). I assume you probably removed the only users of pci_find_upstream_pcie_bridge(), right? If so, and you want to remove that in this same series, you can add my Acked-by to that. Bjorn ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 3/2] pci: Remove unused pci_find_upstream_pcie_bridge() @ 2013-05-28 22:53 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 22:53 UTC (permalink / raw) To: bhelgaas, iommu, dwmw2, joro; +Cc: stephen, linux-pci, ddutile This no longer has any users. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> --- I figured I'd follow up with this after the series is accepted, but I won't turn down an pre-emptive ack from Bjorn. drivers/pci/search.c | 35 ----------------------------------- include/linux/pci.h | 11 ----------- 2 files changed, 46 deletions(-) diff --git a/drivers/pci/search.c b/drivers/pci/search.c index d0627fa..da2e82e 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -17,41 +17,6 @@ DECLARE_RWSEM(pci_bus_sem); EXPORT_SYMBOL_GPL(pci_bus_sem); -/* - * find the upstream PCIe-to-PCI bridge of a PCI device - * if the device is PCIE, return NULL - * if the device isn't connected to a PCIe bridge (that is its parent is a - * legacy PCI bridge and the bridge is directly connected to bus 0), return its - * parent - */ -struct pci_dev * -pci_find_upstream_pcie_bridge(struct pci_dev *pdev) -{ - struct pci_dev *tmp = NULL; - - if (pci_is_pcie(pdev)) - return NULL; - while (1) { - if (pci_is_root_bus(pdev->bus)) - break; - pdev = pdev->bus->self; - /* a p2p bridge */ - if (!pci_is_pcie(pdev)) { - tmp = pdev; - continue; - } - /* PCI device should connect to a PCIe bridge */ - if (pci_pcie_type(pdev) != PCI_EXP_TYPE_PCI_BRIDGE) { - /* Busted hardware? */ - WARN_ON_ONCE(1); - return NULL; - } - return pdev; - } - - return tmp; -} - static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr) { struct pci_bus* child; diff --git a/include/linux/pci.h b/include/linux/pci.h index 710067f..890d426 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1851,15 +1851,4 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) } #endif -/** - * pci_find_upstream_pcie_bridge - find upstream PCIe-to-PCI bridge of a device - * @pdev: the PCI device - * - * if the device is PCIE, return NULL - * if the device isn't connected to a PCIe bridge (that is its parent is a - * legacy PCI bridge and the bridge is directly connected to bus 0), return its - * parent - */ -struct pci_dev *pci_find_upstream_pcie_bridge(struct pci_dev *pdev); - #endif /* LINUX_PCI_H */ ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 3/2] pci: Remove unused pci_find_upstream_pcie_bridge() @ 2013-05-28 22:53 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-05-28 22:53 UTC (permalink / raw) To: bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, joro-zLv9SwRftAIdnm+yROfE0A Cc: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, linux-pci-u79uwXL29TY76Z2rM5mHXA This no longer has any users. Signed-off-by: Alex Williamson <alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Acked-by: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- I figured I'd follow up with this after the series is accepted, but I won't turn down an pre-emptive ack from Bjorn. drivers/pci/search.c | 35 ----------------------------------- include/linux/pci.h | 11 ----------- 2 files changed, 46 deletions(-) diff --git a/drivers/pci/search.c b/drivers/pci/search.c index d0627fa..da2e82e 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -17,41 +17,6 @@ DECLARE_RWSEM(pci_bus_sem); EXPORT_SYMBOL_GPL(pci_bus_sem); -/* - * find the upstream PCIe-to-PCI bridge of a PCI device - * if the device is PCIE, return NULL - * if the device isn't connected to a PCIe bridge (that is its parent is a - * legacy PCI bridge and the bridge is directly connected to bus 0), return its - * parent - */ -struct pci_dev * -pci_find_upstream_pcie_bridge(struct pci_dev *pdev) -{ - struct pci_dev *tmp = NULL; - - if (pci_is_pcie(pdev)) - return NULL; - while (1) { - if (pci_is_root_bus(pdev->bus)) - break; - pdev = pdev->bus->self; - /* a p2p bridge */ - if (!pci_is_pcie(pdev)) { - tmp = pdev; - continue; - } - /* PCI device should connect to a PCIe bridge */ - if (pci_pcie_type(pdev) != PCI_EXP_TYPE_PCI_BRIDGE) { - /* Busted hardware? */ - WARN_ON_ONCE(1); - return NULL; - } - return pdev; - } - - return tmp; -} - static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr) { struct pci_bus* child; diff --git a/include/linux/pci.h b/include/linux/pci.h index 710067f..890d426 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1851,15 +1851,4 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev) } #endif -/** - * pci_find_upstream_pcie_bridge - find upstream PCIe-to-PCI bridge of a device - * @pdev: the PCI device - * - * if the device is PCIE, return NULL - * if the device isn't connected to a PCIe bridge (that is its parent is a - * legacy PCI bridge and the bridge is directly connected to bus 0), return its - * parent - */ -struct pci_dev *pci_find_upstream_pcie_bridge(struct pci_dev *pdev); - #endif /* LINUX_PCI_H */ ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 17:07 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-08 17:07 UTC (permalink / raw) To: iommu; +Cc: dwmw2, joro, stephen, linux-pci, ddutile Joerg, Where do we stand on this series? You had a concern that the heuristic used in patch 1/ could be dangerous. The suggestion for detecting the issue was actually from Bjorn who replied with his rationale. Do you want to go in the direction of a fixed whitelist or do you agree that even if the heuristic breaks it provides better behavior than what we have now? Thanks, Alex On Tue, 2013-05-28 at 12:40 -0600, Alex Williamson wrote: > This series tries to address: > > https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > Where pci_find_upstream_pcie_bridge() gets lost trying to find the > upstream PCIe-to-PCI bridge for a device because the bridge doesn't > expose a PCIe capability. To do this, we add a iommu_pci_is_pcie_bridge > function which includes a quirk to look to the next upstream device > as a sanity check. We can then replace pci_find_upstream_pcie_bridge > with a function that's a bit more generic and less tied to intel-iommu > eccentricities. > > v2 uses the same logic as v1, but moves the search and match code to > IOMMU-core since PCI-core doesn't want it. v1 has several reports > from users that this solves the problem they have in the above bz. > Thanks, > > Alex > > --- > > Alex Williamson (2): > iommu: Quirked PCIe bridge test and search function > intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge > > > drivers/iommu/Kconfig | 5 ++ > drivers/iommu/Makefile | 1 > drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- > drivers/iommu/intel_irq_remapping.c | 15 +++++-- > drivers/iommu/pci.c | 69 +++++++++++++++++++++++++++++++ > drivers/iommu/pci.h | 23 ++++++++++ > 6 files changed, 157 insertions(+), 33 deletions(-) > create mode 100644 drivers/iommu/pci.c ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 17:07 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-08 17:07 UTC (permalink / raw) To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Cc: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, linux-pci-u79uwXL29TY76Z2rM5mHXA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ Joerg, Where do we stand on this series? You had a concern that the heuristic used in patch 1/ could be dangerous. The suggestion for detecting the issue was actually from Bjorn who replied with his rationale. Do you want to go in the direction of a fixed whitelist or do you agree that even if the heuristic breaks it provides better behavior than what we have now? Thanks, Alex On Tue, 2013-05-28 at 12:40 -0600, Alex Williamson wrote: > This series tries to address: > > https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > Where pci_find_upstream_pcie_bridge() gets lost trying to find the > upstream PCIe-to-PCI bridge for a device because the bridge doesn't > expose a PCIe capability. To do this, we add a iommu_pci_is_pcie_bridge > function which includes a quirk to look to the next upstream device > as a sanity check. We can then replace pci_find_upstream_pcie_bridge > with a function that's a bit more generic and less tied to intel-iommu > eccentricities. > > v2 uses the same logic as v1, but moves the search and match code to > IOMMU-core since PCI-core doesn't want it. v1 has several reports > from users that this solves the problem they have in the above bz. > Thanks, > > Alex > > --- > > Alex Williamson (2): > iommu: Quirked PCIe bridge test and search function > intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge > > > drivers/iommu/Kconfig | 5 ++ > drivers/iommu/Makefile | 1 > drivers/iommu/intel-iommu.c | 77 ++++++++++++++++++++++------------- > drivers/iommu/intel_irq_remapping.c | 15 +++++-- > drivers/iommu/pci.c | 69 +++++++++++++++++++++++++++++++ > drivers/iommu/pci.h | 23 ++++++++++ > 6 files changed, 157 insertions(+), 33 deletions(-) > create mode 100644 drivers/iommu/pci.c ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 19:34 ` Bjorn Helgaas 0 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-07-08 19:34 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, joro, stephen, linux-pci, ddutile On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > Joerg, > > Where do we stand on this series? You had a concern that the heuristic > used in patch 1/ could be dangerous. The suggestion for detecting the > issue was actually from Bjorn who replied with his rationale. Do you > want to go in the direction of a fixed whitelist or do you agree that > even if the heuristic breaks it provides better behavior than what we > have now? Thanks, I'm trying to take a step back and look at the overall design, not these specific patches. IOMMUs translate addresses based on their source, i.e., a PCIe requester ID. This is made more complicated by the fact that some bridges "take ownership" (change the requester ID as they forward transactions upstream), as well as the fact that conventional PCI has no requester ID at all. And some broken devices apparently generate DMA requests using the requester ID of another device. We currently deal with this using the pci_find_upstream_pcie_bridge() and pci_get_dma_source() interfaces, but I think there's too much assembly required by their users. pci_find_upstream_pcie_bridge() callers normally loop through all the bridges between the "upstream PCIe bridge" and the device, checking for bridges that might take ownership. They probably also ought to use pci_get_dma_source() to account for the broken devices, but most callers don't. Most of this is PCI-specific stuff that should be of interest to all IOMMU drivers, and the overall structure of calls and looping should be the same for all of them, so it would be nice to factor it out somehow. The attached patch is guaranteed not to even compile; it's just to make this idea more concrete. The basic idea is that since the IOMMU driver wants to perform some action for each possible requester ID the IOMMU might see, PCI could provide an iterator ("pci_for_each_requester_id()") to do that. Bjorn commit afad51492c6672b96c2b0735600d5695e30f7180 Author: Bjorn Helgaas <bhelgaas@google.com> Date: Wed Jul 3 16:04:26 2013 -0600 pci-add-for-each-requester-id diff --git a/drivers/pci/search.c b/drivers/pci/search.c index d0627fa..380eb03 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -17,6 +17,89 @@ DECLARE_RWSEM(pci_bus_sem); EXPORT_SYMBOL_GPL(pci_bus_sem); +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) + +static inline bool pci_is_pcix(struct pci_dev *dev) +{ + return !!pci_pcix_cap(dev); /* XXX not implemented */ +} + +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) +{ + /* + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge + * Spec v1.0, sec 2.3. + */ + if (pci_is_pcie(bridge) && + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) + return true; + + /* + * A PCI-X to PCI-X bridge need not take ownership because there + * are requester IDs on the secondary PCI-X bus. However, if a PCI + * device is added on the secondary bus, the bridge must revert to + * being a PCI-X to PCI bridge, and it then *would* take ownership. + * Assuming a PCI-X to PCI-X bridge takes ownership means we can + * tolerate a future hot-add without having to change existing + * IOMMU mappings. + */ + if (pci_is_pcix(bridge)) + return true; + + return false; +} + +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, + struct pci_dev *dev) +{ + struct pci_dev *bridge; + struct pci_bus *child_bus; + u8 secondary, subordinate, busn = dev->bus->number; + + if (dev->bus == bus) + return dev; + + /* + * There may be several devices on "bus". Find the one that is a + * bridge leading to "dev". + */ + list_for_each_entry(bridge, &bus->devices, bus_list) { + child_bus = bridge->subordinate; + if (child_bus) { + secondary = child_bus->busn_res.start; + subordinate = child_bus->busn_res.end; + if (secondary <= busn && busn <= subordinate) + return bridge; + } + } + return NULL; +} + +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, + int (*fn)(struct pci_dev *, void *), + void *data) +{ + int ret; + + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ + + while (bridge != dev) { + if (pci_bridge_may_take_ownership(bridge)) { + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); + if (ret) + return ret; + } + bridge = pci_bridge_to_dev(bridge->subordinate, dev); + } + + if (pci_is_pcie(dev) || pci_is_pcix(dev)) + return fn(dev, PCI_REQUESTER_ID(dev), data); + + /* Conventional PCI has no requester ID */ + return 0; +} + /* * find the upstream PCIe-to-PCI bridge of a PCI device * if the device is PCIE, return NULL diff --git a/include/linux/pci.h b/include/linux/pci.h index 3a24e4f..cbcc82f 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, } struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); int pci_dev_present(const struct pci_device_id *ids); +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, + int (*fn)(struct pci_dev *, void *), void *data); int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 *val); commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e Author: Bjorn Helgaas <bhelgaas@google.com> Date: Wed Jul 3 12:02:15 2013 -0600 intel-iommu-upstream-pcie diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b91564d..bdb6e6a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, return 0; } +struct domain_context_info { + struct dmar_domain *domain; + struct pci_dev *dev; + int translation; + struct intel_iommu *iomumu; + bool mapped; +}; + +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + int segment = pci_domain_nr(data->dev); + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + + return domain_context_mapping_one(data->domain, segment, + bus, devfn, data->translation); +} + static int domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev, int translation) { - int ret; - struct pci_dev *tmp, *parent; + struct domain_context_info data; + struct pci_dev *bridge; - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), - pdev->bus->number, pdev->devfn, - translation); - if (ret) - return ret; + data.domain = domain; + data.dev = pdev; + data.translation = translation; + bridge = xx; + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); +} - /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); - if (!tmp) - return 0; - /* Secondary interface's bus number and devfn 0 */ - parent = pdev->bus->self; - while (parent != tmp) { - ret = domain_context_mapping_one(domain, - pci_domain_nr(parent->bus), - parent->bus->number, - parent->devfn, translation); - if (ret) - return ret; - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ - return domain_context_mapping_one(domain, - pci_domain_nr(tmp->subordinate), - tmp->subordinate->number, 0, - translation); - else /* this is a legacy PCI bridge */ - return domain_context_mapping_one(domain, - pci_domain_nr(tmp->bus), - tmp->bus->number, - tmp->devfn, - translation); +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + bool ret; + + ret = device_context_mapped(dev_info->iommu, bus, devfn); + if (!ret) + dev_info->mapped = false; + + return 0; } static bool domain_context_mapped(struct pci_dev *pdev) { - bool ret; - struct pci_dev *tmp, *parent; + struct domain_context_info data; struct intel_iommu *iommu; + struct pci_dev *bridge; iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, pdev->devfn); if (!iommu) return false; - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); - if (!ret) - return false; - /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); - if (!tmp) - return true; - /* Secondary interface's bus number and devfn 0 */ - parent = pdev->bus->self; - while (parent != tmp) { - ret = device_context_mapped(iommu, parent->bus->number, - parent->devfn); - if (!ret) - return false; - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) - return device_context_mapped(iommu, tmp->subordinate->number, - 0); - else - return device_context_mapped(iommu, tmp->bus->number, - tmp->devfn); + data.iommu = iommu; + data.mapped = true; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); + return data.mapped; } /* Returns a number of VTD pages, but aligned to MM page size */ @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) return NULL; } +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + int segment = pci_domain_nr(data->dev); + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + unsigned long flags; + struct device_domain_info *info; + + spin_lock_irqsave(&device_domain_lock, flags); + list_for_each_entry(info, &device_domain_list, global) { + if (info->segment == segment && + info->bus == bus && info->devfn == devfn) { + data->domain = info->domain; + break; + } + } + spin_unlock_irqrestore(&device_domain_lock, flags); + return 0; +} + /* domain is initialized */ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) { @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) struct intel_iommu *iommu; struct dmar_drhd_unit *drhd; struct device_domain_info *info, *tmp; - struct pci_dev *dev_tmp; unsigned long flags; int bus = 0, devfn = 0; int segment; int ret; + struct domain_context_info data; domain = find_domain(pdev); if (domain) @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) segment = pci_domain_nr(pdev->bus); - dev_tmp = pci_find_upstream_pcie_bridge(pdev); - if (dev_tmp) { - if (pci_is_pcie(dev_tmp)) { - bus = dev_tmp->subordinate->number; - devfn = 0; - } else { - bus = dev_tmp->bus->number; - devfn = dev_tmp->devfn; - } - spin_lock_irqsave(&device_domain_lock, flags); - list_for_each_entry(info, &device_domain_list, global) { - if (info->segment == segment && - info->bus == bus && info->devfn == devfn) { - found = info->domain; - break; - } - } - spin_unlock_irqrestore(&device_domain_lock, flags); - /* pcie-pci bridge already has a domain, uses it */ - if (found) { - domain = found; - goto found_domain; - } + data.domain = NULL; + data.dev = pdev; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); + if (data.domain) { + domain = data.domain; + goto found_domain; } domain = alloc_domain(); @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) return 0; } +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + struct intel_iommu *iommu = data->iommu; + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + + iommu_detach_dev(iommu, bus, devfn); + return 0; +} + static void iommu_detach_dependent_devices(struct intel_iommu *iommu, struct pci_dev *pdev) { - struct pci_dev *tmp, *parent; + struct domain_context_info data; if (!iommu || !pdev) return; /* dependent device detach */ - tmp = pci_find_upstream_pcie_bridge(pdev); - /* Secondary interface's bus number and devfn 0 */ - if (tmp) { - parent = pdev->bus->self; - while (parent != tmp) { - iommu_detach_dev(iommu, parent->bus->number, - parent->devfn); - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ - iommu_detach_dev(iommu, - tmp->subordinate->number, 0); - else /* this is a legacy PCI bridge */ - iommu_detach_dev(iommu, tmp->bus->number, - tmp->devfn); - } + data.iommu = iommu; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); } static void domain_remove_one_dev_info(struct dmar_domain *domain, ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 19:34 ` Bjorn Helgaas 0 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-07-08 19:34 UTC (permalink / raw) To: Alex Williamson Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > Joerg, > > Where do we stand on this series? You had a concern that the heuristic > used in patch 1/ could be dangerous. The suggestion for detecting the > issue was actually from Bjorn who replied with his rationale. Do you > want to go in the direction of a fixed whitelist or do you agree that > even if the heuristic breaks it provides better behavior than what we > have now? Thanks, I'm trying to take a step back and look at the overall design, not these specific patches. IOMMUs translate addresses based on their source, i.e., a PCIe requester ID. This is made more complicated by the fact that some bridges "take ownership" (change the requester ID as they forward transactions upstream), as well as the fact that conventional PCI has no requester ID at all. And some broken devices apparently generate DMA requests using the requester ID of another device. We currently deal with this using the pci_find_upstream_pcie_bridge() and pci_get_dma_source() interfaces, but I think there's too much assembly required by their users. pci_find_upstream_pcie_bridge() callers normally loop through all the bridges between the "upstream PCIe bridge" and the device, checking for bridges that might take ownership. They probably also ought to use pci_get_dma_source() to account for the broken devices, but most callers don't. Most of this is PCI-specific stuff that should be of interest to all IOMMU drivers, and the overall structure of calls and looping should be the same for all of them, so it would be nice to factor it out somehow. The attached patch is guaranteed not to even compile; it's just to make this idea more concrete. The basic idea is that since the IOMMU driver wants to perform some action for each possible requester ID the IOMMU might see, PCI could provide an iterator ("pci_for_each_requester_id()") to do that. Bjorn commit afad51492c6672b96c2b0735600d5695e30f7180 Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Date: Wed Jul 3 16:04:26 2013 -0600 pci-add-for-each-requester-id diff --git a/drivers/pci/search.c b/drivers/pci/search.c index d0627fa..380eb03 100644 --- a/drivers/pci/search.c +++ b/drivers/pci/search.c @@ -17,6 +17,89 @@ DECLARE_RWSEM(pci_bus_sem); EXPORT_SYMBOL_GPL(pci_bus_sem); +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) + +static inline bool pci_is_pcix(struct pci_dev *dev) +{ + return !!pci_pcix_cap(dev); /* XXX not implemented */ +} + +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) +{ + /* + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge + * Spec v1.0, sec 2.3. + */ + if (pci_is_pcie(bridge) && + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) + return true; + + /* + * A PCI-X to PCI-X bridge need not take ownership because there + * are requester IDs on the secondary PCI-X bus. However, if a PCI + * device is added on the secondary bus, the bridge must revert to + * being a PCI-X to PCI bridge, and it then *would* take ownership. + * Assuming a PCI-X to PCI-X bridge takes ownership means we can + * tolerate a future hot-add without having to change existing + * IOMMU mappings. + */ + if (pci_is_pcix(bridge)) + return true; + + return false; +} + +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, + struct pci_dev *dev) +{ + struct pci_dev *bridge; + struct pci_bus *child_bus; + u8 secondary, subordinate, busn = dev->bus->number; + + if (dev->bus == bus) + return dev; + + /* + * There may be several devices on "bus". Find the one that is a + * bridge leading to "dev". + */ + list_for_each_entry(bridge, &bus->devices, bus_list) { + child_bus = bridge->subordinate; + if (child_bus) { + secondary = child_bus->busn_res.start; + subordinate = child_bus->busn_res.end; + if (secondary <= busn && busn <= subordinate) + return bridge; + } + } + return NULL; +} + +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, + int (*fn)(struct pci_dev *, void *), + void *data) +{ + int ret; + + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ + + while (bridge != dev) { + if (pci_bridge_may_take_ownership(bridge)) { + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); + if (ret) + return ret; + } + bridge = pci_bridge_to_dev(bridge->subordinate, dev); + } + + if (pci_is_pcie(dev) || pci_is_pcix(dev)) + return fn(dev, PCI_REQUESTER_ID(dev), data); + + /* Conventional PCI has no requester ID */ + return 0; +} + /* * find the upstream PCIe-to-PCI bridge of a PCI device * if the device is PCIE, return NULL diff --git a/include/linux/pci.h b/include/linux/pci.h index 3a24e4f..cbcc82f 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, } struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); int pci_dev_present(const struct pci_device_id *ids); +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, + int (*fn)(struct pci_dev *, void *), void *data); int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 *val); commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Date: Wed Jul 3 12:02:15 2013 -0600 intel-iommu-upstream-pcie diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index b91564d..bdb6e6a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, return 0; } +struct domain_context_info { + struct dmar_domain *domain; + struct pci_dev *dev; + int translation; + struct intel_iommu *iomumu; + bool mapped; +}; + +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + int segment = pci_domain_nr(data->dev); + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + + return domain_context_mapping_one(data->domain, segment, + bus, devfn, data->translation); +} + static int domain_context_mapping(struct dmar_domain *domain, struct pci_dev *pdev, int translation) { - int ret; - struct pci_dev *tmp, *parent; + struct domain_context_info data; + struct pci_dev *bridge; - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), - pdev->bus->number, pdev->devfn, - translation); - if (ret) - return ret; + data.domain = domain; + data.dev = pdev; + data.translation = translation; + bridge = xx; + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); +} - /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); - if (!tmp) - return 0; - /* Secondary interface's bus number and devfn 0 */ - parent = pdev->bus->self; - while (parent != tmp) { - ret = domain_context_mapping_one(domain, - pci_domain_nr(parent->bus), - parent->bus->number, - parent->devfn, translation); - if (ret) - return ret; - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ - return domain_context_mapping_one(domain, - pci_domain_nr(tmp->subordinate), - tmp->subordinate->number, 0, - translation); - else /* this is a legacy PCI bridge */ - return domain_context_mapping_one(domain, - pci_domain_nr(tmp->bus), - tmp->bus->number, - tmp->devfn, - translation); +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + bool ret; + + ret = device_context_mapped(dev_info->iommu, bus, devfn); + if (!ret) + dev_info->mapped = false; + + return 0; } static bool domain_context_mapped(struct pci_dev *pdev) { - bool ret; - struct pci_dev *tmp, *parent; + struct domain_context_info data; struct intel_iommu *iommu; + struct pci_dev *bridge; iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, pdev->devfn); if (!iommu) return false; - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); - if (!ret) - return false; - /* dependent device mapping */ - tmp = pci_find_upstream_pcie_bridge(pdev); - if (!tmp) - return true; - /* Secondary interface's bus number and devfn 0 */ - parent = pdev->bus->self; - while (parent != tmp) { - ret = device_context_mapped(iommu, parent->bus->number, - parent->devfn); - if (!ret) - return false; - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) - return device_context_mapped(iommu, tmp->subordinate->number, - 0); - else - return device_context_mapped(iommu, tmp->bus->number, - tmp->devfn); + data.iommu = iommu; + data.mapped = true; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); + return data.mapped; } /* Returns a number of VTD pages, but aligned to MM page size */ @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) return NULL; } +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + int segment = pci_domain_nr(data->dev); + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + unsigned long flags; + struct device_domain_info *info; + + spin_lock_irqsave(&device_domain_lock, flags); + list_for_each_entry(info, &device_domain_list, global) { + if (info->segment == segment && + info->bus == bus && info->devfn == devfn) { + data->domain = info->domain; + break; + } + } + spin_unlock_irqrestore(&device_domain_lock, flags); + return 0; +} + /* domain is initialized */ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) { @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) struct intel_iommu *iommu; struct dmar_drhd_unit *drhd; struct device_domain_info *info, *tmp; - struct pci_dev *dev_tmp; unsigned long flags; int bus = 0, devfn = 0; int segment; int ret; + struct domain_context_info data; domain = find_domain(pdev); if (domain) @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) segment = pci_domain_nr(pdev->bus); - dev_tmp = pci_find_upstream_pcie_bridge(pdev); - if (dev_tmp) { - if (pci_is_pcie(dev_tmp)) { - bus = dev_tmp->subordinate->number; - devfn = 0; - } else { - bus = dev_tmp->bus->number; - devfn = dev_tmp->devfn; - } - spin_lock_irqsave(&device_domain_lock, flags); - list_for_each_entry(info, &device_domain_list, global) { - if (info->segment == segment && - info->bus == bus && info->devfn == devfn) { - found = info->domain; - break; - } - } - spin_unlock_irqrestore(&device_domain_lock, flags); - /* pcie-pci bridge already has a domain, uses it */ - if (found) { - domain = found; - goto found_domain; - } + data.domain = NULL; + data.dev = pdev; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); + if (data.domain) { + domain = data.domain; + goto found_domain; } domain = alloc_domain(); @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) return 0; } +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) +{ + struct domain_context_info *dev_info = data; + struct intel_iommu *iommu = data->iommu; + u8 bus = data->requester_id >> 8; + u8 devfn = data->requester_id & 0xff; + + iommu_detach_dev(iommu, bus, devfn); + return 0; +} + static void iommu_detach_dependent_devices(struct intel_iommu *iommu, struct pci_dev *pdev) { - struct pci_dev *tmp, *parent; + struct domain_context_info data; if (!iommu || !pdev) return; /* dependent device detach */ - tmp = pci_find_upstream_pcie_bridge(pdev); - /* Secondary interface's bus number and devfn 0 */ - if (tmp) { - parent = pdev->bus->self; - while (parent != tmp) { - iommu_detach_dev(iommu, parent->bus->number, - parent->devfn); - parent = parent->bus->self; - } - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ - iommu_detach_dev(iommu, - tmp->subordinate->number, 0); - else /* this is a legacy PCI bridge */ - iommu_detach_dev(iommu, tmp->bus->number, - tmp->devfn); - } + data.iommu = iommu; + bridge = xx; + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); } static void domain_remove_one_dev_info(struct dmar_domain *domain, ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 20:49 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-08 20:49 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: iommu, dwmw2, joro, stephen, linux-pci, ddutile On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > > Joerg, > > > > Where do we stand on this series? You had a concern that the heuristic > > used in patch 1/ could be dangerous. The suggestion for detecting the > > issue was actually from Bjorn who replied with his rationale. Do you > > want to go in the direction of a fixed whitelist or do you agree that > > even if the heuristic breaks it provides better behavior than what we > > have now? Thanks, > > I'm trying to take a step back and look at the overall design, not > these specific patches. > > IOMMUs translate addresses based on their source, i.e., a PCIe > requester ID. This is made more complicated by the fact that some > bridges "take ownership" (change the requester ID as they forward > transactions upstream), as well as the fact that conventional PCI has > no requester ID at all. And some broken devices apparently generate > DMA requests using the requester ID of another device. > > We currently deal with this using the pci_find_upstream_pcie_bridge() > and pci_get_dma_source() interfaces, but I think there's too much > assembly required by their users. pci_find_upstream_pcie_bridge() > callers normally loop through all the bridges between the "upstream > PCIe bridge" and the device, checking for bridges that might take > ownership. They probably also ought to use pci_get_dma_source() to > account for the broken devices, but most callers don't. > > Most of this is PCI-specific stuff that should be of interest to all > IOMMU drivers, and the overall structure of calls and looping should > be the same for all of them, so it would be nice to factor it out > somehow. > > The attached patch is guaranteed not to even compile; it's just to > make this idea more concrete. The basic idea is that since the IOMMU > driver wants to perform some action for each possible requester ID the > IOMMU might see, PCI could provide an iterator > ("pci_for_each_requester_id()") to do that. > > Bjorn > > > commit afad51492c6672b96c2b0735600d5695e30f7180 > Author: Bjorn Helgaas <bhelgaas@google.com> > Date: Wed Jul 3 16:04:26 2013 -0600 > > pci-add-for-each-requester-id > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c > index d0627fa..380eb03 100644 > --- a/drivers/pci/search.c > +++ b/drivers/pci/search.c > @@ -17,6 +17,89 @@ > DECLARE_RWSEM(pci_bus_sem); > EXPORT_SYMBOL_GPL(pci_bus_sem); > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) > + > +static inline bool pci_is_pcix(struct pci_dev *dev) > +{ > + return !!pci_pcix_cap(dev); /* XXX not implemented */ > +} > + > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) > +{ > + /* > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge > + * Spec v1.0, sec 2.3. > + */ > + if (pci_is_pcie(bridge) && > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) > + return true; Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe capability are exactly what causes the current bug. > + > + /* > + * A PCI-X to PCI-X bridge need not take ownership because there > + * are requester IDs on the secondary PCI-X bus. However, if a PCI > + * device is added on the secondary bus, the bridge must revert to > + * being a PCI-X to PCI bridge, and it then *would* take ownership. > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can > + * tolerate a future hot-add without having to change existing > + * IOMMU mappings. > + */ > + if (pci_is_pcix(bridge)) > + return true; > + > + return false; > +} > + > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, > + struct pci_dev *dev) > +{ > + struct pci_dev *bridge; > + struct pci_bus *child_bus; > + u8 secondary, subordinate, busn = dev->bus->number; > + > + if (dev->bus == bus) > + return dev; > + > + /* > + * There may be several devices on "bus". Find the one that is a > + * bridge leading to "dev". > + */ > + list_for_each_entry(bridge, &bus->devices, bus_list) { > + child_bus = bridge->subordinate; > + if (child_bus) { > + secondary = child_bus->busn_res.start; > + subordinate = child_bus->busn_res.end; > + if (secondary <= busn && busn <= subordinate) > + return bridge; > + } > + } > + return NULL; > +} > + > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > + int (*fn)(struct pci_dev *, void *), u16 requester_id > + void *data) > +{ > + int ret; > + > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ It's unfortunate that our dma quirk still seems to only get called for the end device where we could have bridges that need it too. > + > + while (bridge != dev) { I'm having a hard time understanding this function since bridge is never defined in any of the below sample code. Would bridge be the root port above a device? And what about devices connected to the RC? Seems like we'd need to traverse up the bus first, so there's a bit of missing magic. > + if (pci_bridge_may_take_ownership(bridge)) { > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); > + if (ret) > + return ret; > + } > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); > + } > + > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) > + return fn(dev, PCI_REQUESTER_ID(dev), data); > + > + /* Conventional PCI has no requester ID */ > + return 0; > +} > + > /* > * find the upstream PCIe-to-PCI bridge of a PCI device > * if the device is PCIE, return NULL > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 3a24e4f..cbcc82f 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, > } > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); > int pci_dev_present(const struct pci_device_id *ids); > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > + int (*fn)(struct pci_dev *, void *), void *data); > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, > int where, u8 *val); > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e > Author: Bjorn Helgaas <bhelgaas@google.com> > Date: Wed Jul 3 12:02:15 2013 -0600 > > intel-iommu-upstream-pcie I assume this one would is also going to apply dma quirks to dma_ops, which I'm all for. > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index b91564d..bdb6e6a 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, > return 0; > } > > +struct domain_context_info { > + struct dmar_domain *domain; > + struct pci_dev *dev; > + int translation; > + struct intel_iommu *iomumu; s/iomumu/iommu/ > + bool mapped; > +}; > + > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + int segment = pci_domain_nr(data->dev); dev_info->dev > + u8 bus = data->requester_id >> 8; s/data->// > + u8 devfn = data->requester_id & 0xff; s/data->// > + > + return domain_context_mapping_one(data->domain, segment, > + bus, devfn, data->translation); > +} > + > static int domain_context_mapping(struct dmar_domain *domain, > struct pci_dev *pdev, int translation) > { > - int ret; > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > + struct pci_dev *bridge; > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), > - pdev->bus->number, pdev->devfn, > - translation); > - if (ret) > - return ret; > + data.domain = domain; > + data.dev = pdev; > + data.translation = translation; > + bridge = xx; Darn, I was hoping you were going to reveal where bridge comes from. > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); > +} > > - /* dependent device mapping */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - if (!tmp) > - return 0; > - /* Secondary interface's bus number and devfn 0 */ > - parent = pdev->bus->self; > - while (parent != tmp) { > - ret = domain_context_mapping_one(domain, > - pci_domain_nr(parent->bus), > - parent->bus->number, > - parent->devfn, translation); > - if (ret) > - return ret; > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > - return domain_context_mapping_one(domain, > - pci_domain_nr(tmp->subordinate), > - tmp->subordinate->number, 0, > - translation); > - else /* this is a legacy PCI bridge */ > - return domain_context_mapping_one(domain, > - pci_domain_nr(tmp->bus), > - tmp->bus->number, > - tmp->devfn, > - translation); > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + bool ret; > + > + ret = device_context_mapped(dev_info->iommu, bus, devfn); > + if (!ret) > + dev_info->mapped = false; > + > + return 0; > } > > static bool domain_context_mapped(struct pci_dev *pdev) > { > - bool ret; > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > struct intel_iommu *iommu; > + struct pci_dev *bridge; > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, > pdev->devfn); > if (!iommu) > return false; > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); > - if (!ret) > - return false; > - /* dependent device mapping */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - if (!tmp) > - return true; > - /* Secondary interface's bus number and devfn 0 */ > - parent = pdev->bus->self; > - while (parent != tmp) { > - ret = device_context_mapped(iommu, parent->bus->number, > - parent->devfn); > - if (!ret) > - return false; > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) > - return device_context_mapped(iommu, tmp->subordinate->number, > - 0); > - else > - return device_context_mapped(iommu, tmp->bus->number, > - tmp->devfn); > + data.iommu = iommu; > + data.mapped = true; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); > + return data.mapped; > } > > /* Returns a number of VTD pages, but aligned to MM page size */ > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) > return NULL; > } > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + int segment = pci_domain_nr(data->dev); > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + unsigned long flags; > + struct device_domain_info *info; > + > + spin_lock_irqsave(&device_domain_lock, flags); > + list_for_each_entry(info, &device_domain_list, global) { > + if (info->segment == segment && > + info->bus == bus && info->devfn == devfn) { > + data->domain = info->domain; > + break; > + } > + } > + spin_unlock_irqrestore(&device_domain_lock, flags); > + return 0; > +} > + > /* domain is initialized */ > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > { > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > struct intel_iommu *iommu; > struct dmar_drhd_unit *drhd; > struct device_domain_info *info, *tmp; > - struct pci_dev *dev_tmp; > unsigned long flags; > int bus = 0, devfn = 0; > int segment; > int ret; > + struct domain_context_info data; > > domain = find_domain(pdev); > if (domain) > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > segment = pci_domain_nr(pdev->bus); > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); > - if (dev_tmp) { > - if (pci_is_pcie(dev_tmp)) { > - bus = dev_tmp->subordinate->number; > - devfn = 0; > - } else { > - bus = dev_tmp->bus->number; > - devfn = dev_tmp->devfn; > - } > - spin_lock_irqsave(&device_domain_lock, flags); > - list_for_each_entry(info, &device_domain_list, global) { > - if (info->segment == segment && > - info->bus == bus && info->devfn == devfn) { > - found = info->domain; > - break; > - } > - } > - spin_unlock_irqrestore(&device_domain_lock, flags); > - /* pcie-pci bridge already has a domain, uses it */ > - if (found) { > - domain = found; > - goto found_domain; > - } > + data.domain = NULL; > + data.dev = pdev; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); > + if (data.domain) { > + domain = data.domain; > + goto found_domain; > } > > domain = alloc_domain(); > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) > return 0; > } > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + struct intel_iommu *iommu = data->iommu; > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + > + iommu_detach_dev(iommu, bus, devfn); > + return 0; > +} > + > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, > struct pci_dev *pdev) > { > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > > if (!iommu || !pdev) > return; > > /* dependent device detach */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - /* Secondary interface's bus number and devfn 0 */ > - if (tmp) { > - parent = pdev->bus->self; > - while (parent != tmp) { > - iommu_detach_dev(iommu, parent->bus->number, > - parent->devfn); > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > - iommu_detach_dev(iommu, > - tmp->subordinate->number, 0); > - else /* this is a legacy PCI bridge */ > - iommu_detach_dev(iommu, tmp->bus->number, > - tmp->devfn); > - } > + data.iommu = iommu; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); > } > > static void domain_remove_one_dev_info(struct dmar_domain *domain, > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-08 20:49 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-08 20:49 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > > Joerg, > > > > Where do we stand on this series? You had a concern that the heuristic > > used in patch 1/ could be dangerous. The suggestion for detecting the > > issue was actually from Bjorn who replied with his rationale. Do you > > want to go in the direction of a fixed whitelist or do you agree that > > even if the heuristic breaks it provides better behavior than what we > > have now? Thanks, > > I'm trying to take a step back and look at the overall design, not > these specific patches. > > IOMMUs translate addresses based on their source, i.e., a PCIe > requester ID. This is made more complicated by the fact that some > bridges "take ownership" (change the requester ID as they forward > transactions upstream), as well as the fact that conventional PCI has > no requester ID at all. And some broken devices apparently generate > DMA requests using the requester ID of another device. > > We currently deal with this using the pci_find_upstream_pcie_bridge() > and pci_get_dma_source() interfaces, but I think there's too much > assembly required by their users. pci_find_upstream_pcie_bridge() > callers normally loop through all the bridges between the "upstream > PCIe bridge" and the device, checking for bridges that might take > ownership. They probably also ought to use pci_get_dma_source() to > account for the broken devices, but most callers don't. > > Most of this is PCI-specific stuff that should be of interest to all > IOMMU drivers, and the overall structure of calls and looping should > be the same for all of them, so it would be nice to factor it out > somehow. > > The attached patch is guaranteed not to even compile; it's just to > make this idea more concrete. The basic idea is that since the IOMMU > driver wants to perform some action for each possible requester ID the > IOMMU might see, PCI could provide an iterator > ("pci_for_each_requester_id()") to do that. > > Bjorn > > > commit afad51492c6672b96c2b0735600d5695e30f7180 > Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Date: Wed Jul 3 16:04:26 2013 -0600 > > pci-add-for-each-requester-id > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c > index d0627fa..380eb03 100644 > --- a/drivers/pci/search.c > +++ b/drivers/pci/search.c > @@ -17,6 +17,89 @@ > DECLARE_RWSEM(pci_bus_sem); > EXPORT_SYMBOL_GPL(pci_bus_sem); > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) > + > +static inline bool pci_is_pcix(struct pci_dev *dev) > +{ > + return !!pci_pcix_cap(dev); /* XXX not implemented */ > +} > + > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) > +{ > + /* > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge > + * Spec v1.0, sec 2.3. > + */ > + if (pci_is_pcie(bridge) && > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) > + return true; Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe capability are exactly what causes the current bug. > + > + /* > + * A PCI-X to PCI-X bridge need not take ownership because there > + * are requester IDs on the secondary PCI-X bus. However, if a PCI > + * device is added on the secondary bus, the bridge must revert to > + * being a PCI-X to PCI bridge, and it then *would* take ownership. > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can > + * tolerate a future hot-add without having to change existing > + * IOMMU mappings. > + */ > + if (pci_is_pcix(bridge)) > + return true; > + > + return false; > +} > + > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, > + struct pci_dev *dev) > +{ > + struct pci_dev *bridge; > + struct pci_bus *child_bus; > + u8 secondary, subordinate, busn = dev->bus->number; > + > + if (dev->bus == bus) > + return dev; > + > + /* > + * There may be several devices on "bus". Find the one that is a > + * bridge leading to "dev". > + */ > + list_for_each_entry(bridge, &bus->devices, bus_list) { > + child_bus = bridge->subordinate; > + if (child_bus) { > + secondary = child_bus->busn_res.start; > + subordinate = child_bus->busn_res.end; > + if (secondary <= busn && busn <= subordinate) > + return bridge; > + } > + } > + return NULL; > +} > + > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > + int (*fn)(struct pci_dev *, void *), u16 requester_id > + void *data) > +{ > + int ret; > + > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ It's unfortunate that our dma quirk still seems to only get called for the end device where we could have bridges that need it too. > + > + while (bridge != dev) { I'm having a hard time understanding this function since bridge is never defined in any of the below sample code. Would bridge be the root port above a device? And what about devices connected to the RC? Seems like we'd need to traverse up the bus first, so there's a bit of missing magic. > + if (pci_bridge_may_take_ownership(bridge)) { > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); > + if (ret) > + return ret; > + } > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); > + } > + > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) > + return fn(dev, PCI_REQUESTER_ID(dev), data); > + > + /* Conventional PCI has no requester ID */ > + return 0; > +} > + > /* > * find the upstream PCIe-to-PCI bridge of a PCI device > * if the device is PCIE, return NULL > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 3a24e4f..cbcc82f 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, > } > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); > int pci_dev_present(const struct pci_device_id *ids); > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > + int (*fn)(struct pci_dev *, void *), void *data); > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, > int where, u8 *val); > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e > Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > Date: Wed Jul 3 12:02:15 2013 -0600 > > intel-iommu-upstream-pcie I assume this one would is also going to apply dma quirks to dma_ops, which I'm all for. > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index b91564d..bdb6e6a 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, > return 0; > } > > +struct domain_context_info { > + struct dmar_domain *domain; > + struct pci_dev *dev; > + int translation; > + struct intel_iommu *iomumu; s/iomumu/iommu/ > + bool mapped; > +}; > + > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + int segment = pci_domain_nr(data->dev); dev_info->dev > + u8 bus = data->requester_id >> 8; s/data->// > + u8 devfn = data->requester_id & 0xff; s/data->// > + > + return domain_context_mapping_one(data->domain, segment, > + bus, devfn, data->translation); > +} > + > static int domain_context_mapping(struct dmar_domain *domain, > struct pci_dev *pdev, int translation) > { > - int ret; > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > + struct pci_dev *bridge; > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), > - pdev->bus->number, pdev->devfn, > - translation); > - if (ret) > - return ret; > + data.domain = domain; > + data.dev = pdev; > + data.translation = translation; > + bridge = xx; Darn, I was hoping you were going to reveal where bridge comes from. > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); > +} > > - /* dependent device mapping */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - if (!tmp) > - return 0; > - /* Secondary interface's bus number and devfn 0 */ > - parent = pdev->bus->self; > - while (parent != tmp) { > - ret = domain_context_mapping_one(domain, > - pci_domain_nr(parent->bus), > - parent->bus->number, > - parent->devfn, translation); > - if (ret) > - return ret; > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > - return domain_context_mapping_one(domain, > - pci_domain_nr(tmp->subordinate), > - tmp->subordinate->number, 0, > - translation); > - else /* this is a legacy PCI bridge */ > - return domain_context_mapping_one(domain, > - pci_domain_nr(tmp->bus), > - tmp->bus->number, > - tmp->devfn, > - translation); > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + bool ret; > + > + ret = device_context_mapped(dev_info->iommu, bus, devfn); > + if (!ret) > + dev_info->mapped = false; > + > + return 0; > } > > static bool domain_context_mapped(struct pci_dev *pdev) > { > - bool ret; > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > struct intel_iommu *iommu; > + struct pci_dev *bridge; > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, > pdev->devfn); > if (!iommu) > return false; > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); > - if (!ret) > - return false; > - /* dependent device mapping */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - if (!tmp) > - return true; > - /* Secondary interface's bus number and devfn 0 */ > - parent = pdev->bus->self; > - while (parent != tmp) { > - ret = device_context_mapped(iommu, parent->bus->number, > - parent->devfn); > - if (!ret) > - return false; > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) > - return device_context_mapped(iommu, tmp->subordinate->number, > - 0); > - else > - return device_context_mapped(iommu, tmp->bus->number, > - tmp->devfn); > + data.iommu = iommu; > + data.mapped = true; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); > + return data.mapped; > } > > /* Returns a number of VTD pages, but aligned to MM page size */ > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) > return NULL; > } > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + int segment = pci_domain_nr(data->dev); > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + unsigned long flags; > + struct device_domain_info *info; > + > + spin_lock_irqsave(&device_domain_lock, flags); > + list_for_each_entry(info, &device_domain_list, global) { > + if (info->segment == segment && > + info->bus == bus && info->devfn == devfn) { > + data->domain = info->domain; > + break; > + } > + } > + spin_unlock_irqrestore(&device_domain_lock, flags); > + return 0; > +} > + > /* domain is initialized */ > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > { > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > struct intel_iommu *iommu; > struct dmar_drhd_unit *drhd; > struct device_domain_info *info, *tmp; > - struct pci_dev *dev_tmp; > unsigned long flags; > int bus = 0, devfn = 0; > int segment; > int ret; > + struct domain_context_info data; > > domain = find_domain(pdev); > if (domain) > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > segment = pci_domain_nr(pdev->bus); > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); > - if (dev_tmp) { > - if (pci_is_pcie(dev_tmp)) { > - bus = dev_tmp->subordinate->number; > - devfn = 0; > - } else { > - bus = dev_tmp->bus->number; > - devfn = dev_tmp->devfn; > - } > - spin_lock_irqsave(&device_domain_lock, flags); > - list_for_each_entry(info, &device_domain_list, global) { > - if (info->segment == segment && > - info->bus == bus && info->devfn == devfn) { > - found = info->domain; > - break; > - } > - } > - spin_unlock_irqrestore(&device_domain_lock, flags); > - /* pcie-pci bridge already has a domain, uses it */ > - if (found) { > - domain = found; > - goto found_domain; > - } > + data.domain = NULL; > + data.dev = pdev; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); > + if (data.domain) { > + domain = data.domain; > + goto found_domain; > } > > domain = alloc_domain(); > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) > return 0; > } > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) > +{ > + struct domain_context_info *dev_info = data; > + struct intel_iommu *iommu = data->iommu; > + u8 bus = data->requester_id >> 8; > + u8 devfn = data->requester_id & 0xff; > + > + iommu_detach_dev(iommu, bus, devfn); > + return 0; > +} > + > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, > struct pci_dev *pdev) > { > - struct pci_dev *tmp, *parent; > + struct domain_context_info data; > > if (!iommu || !pdev) > return; > > /* dependent device detach */ > - tmp = pci_find_upstream_pcie_bridge(pdev); > - /* Secondary interface's bus number and devfn 0 */ > - if (tmp) { > - parent = pdev->bus->self; > - while (parent != tmp) { > - iommu_detach_dev(iommu, parent->bus->number, > - parent->devfn); > - parent = parent->bus->self; > - } > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > - iommu_detach_dev(iommu, > - tmp->subordinate->number, 0); > - else /* this is a legacy PCI bridge */ > - iommu_detach_dev(iommu, tmp->bus->number, > - tmp->devfn); > - } > + data.iommu = iommu; > + bridge = xx; > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); > } > > static void domain_remove_one_dev_info(struct dmar_domain *domain, > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges 2013-07-08 20:49 ` Alex Williamson (?) @ 2013-07-08 21:51 ` Bjorn Helgaas 2013-07-09 18:27 ` Alex Williamson -1 siblings, 1 reply; 34+ messages in thread From: Bjorn Helgaas @ 2013-07-08 21:51 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, joro, stephen, linux-pci, ddutile On Mon, Jul 08, 2013 at 02:49:16PM -0600, Alex Williamson wrote: > On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: > > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > > > Joerg, > > > > > > Where do we stand on this series? You had a concern that the heuristic > > > used in patch 1/ could be dangerous. The suggestion for detecting the > > > issue was actually from Bjorn who replied with his rationale. Do you > > > want to go in the direction of a fixed whitelist or do you agree that > > > even if the heuristic breaks it provides better behavior than what we > > > have now? Thanks, > > > > I'm trying to take a step back and look at the overall design, not > > these specific patches. > > > > IOMMUs translate addresses based on their source, i.e., a PCIe > > requester ID. This is made more complicated by the fact that some > > bridges "take ownership" (change the requester ID as they forward > > transactions upstream), as well as the fact that conventional PCI has > > no requester ID at all. And some broken devices apparently generate > > DMA requests using the requester ID of another device. > > > > We currently deal with this using the pci_find_upstream_pcie_bridge() > > and pci_get_dma_source() interfaces, but I think there's too much > > assembly required by their users. pci_find_upstream_pcie_bridge() > > callers normally loop through all the bridges between the "upstream > > PCIe bridge" and the device, checking for bridges that might take > > ownership. They probably also ought to use pci_get_dma_source() to > > account for the broken devices, but most callers don't. > > > > Most of this is PCI-specific stuff that should be of interest to all > > IOMMU drivers, and the overall structure of calls and looping should > > be the same for all of them, so it would be nice to factor it out > > somehow. > > > > The attached patch is guaranteed not to even compile; it's just to > > make this idea more concrete. The basic idea is that since the IOMMU > > driver wants to perform some action for each possible requester ID the > > IOMMU might see, PCI could provide an iterator > > ("pci_for_each_requester_id()") to do that. > > > > Bjorn > > > > > > commit afad51492c6672b96c2b0735600d5695e30f7180 > > Author: Bjorn Helgaas <bhelgaas@google.com> > > Date: Wed Jul 3 16:04:26 2013 -0600 > > > > pci-add-for-each-requester-id > > > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c > > index d0627fa..380eb03 100644 > > --- a/drivers/pci/search.c > > +++ b/drivers/pci/search.c > > @@ -17,6 +17,89 @@ > > DECLARE_RWSEM(pci_bus_sem); > > EXPORT_SYMBOL_GPL(pci_bus_sem); > > > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) > > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) > > + > > +static inline bool pci_is_pcix(struct pci_dev *dev) > > +{ > > + return !!pci_pcix_cap(dev); /* XXX not implemented */ > > +} > > + > > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) > > +{ > > + /* > > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge > > + * Spec v1.0, sec 2.3. > > + */ > > + if (pci_is_pcie(bridge) && > > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) > > + return true; > > Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe > capability are exactly what causes the current bug. Likely so. We might be able to identify that as "secondary bus is not PCIe" or something. > > + > > + /* > > + * A PCI-X to PCI-X bridge need not take ownership because there > > + * are requester IDs on the secondary PCI-X bus. However, if a PCI > > + * device is added on the secondary bus, the bridge must revert to > > + * being a PCI-X to PCI bridge, and it then *would* take ownership. > > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can > > + * tolerate a future hot-add without having to change existing > > + * IOMMU mappings. > > + */ > > + if (pci_is_pcix(bridge)) > > + return true; > > + > > + return false; > > +} > > + > > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, > > + struct pci_dev *dev) > > +{ > > + struct pci_dev *bridge; > > + struct pci_bus *child_bus; > > + u8 secondary, subordinate, busn = dev->bus->number; > > + > > + if (dev->bus == bus) > > + return dev; > > + > > + /* > > + * There may be several devices on "bus". Find the one that is a > > + * bridge leading to "dev". > > + */ > > + list_for_each_entry(bridge, &bus->devices, bus_list) { > > + child_bus = bridge->subordinate; > > + if (child_bus) { > > + secondary = child_bus->busn_res.start; > > + subordinate = child_bus->busn_res.end; > > + if (secondary <= busn && busn <= subordinate) > > + return bridge; > > + } > > + } > > + return NULL; > > +} > > + > > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > + int (*fn)(struct pci_dev *, void *), > > u16 requester_id > > > + void *data) > > +{ > > + int ret; > > + > > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ > > It's unfortunate that our dma quirk still seems to only get called for > the end device where we could have bridges that need it too. Sure, we can do that if necessary. I'm not trying to present this as any kind of finished solution. The point is to figure out if we can make an interface that's a little more abstract so we can pull the PCI topology issues out of the IOMMU drivers, and I just thought something that looked like code would be more concrete and easier to discuss. > > + > > + while (bridge != dev) { > > > I'm having a hard time understanding this function since bridge is never > defined in any of the below sample code. Would bridge be the root port > above a device? And what about devices connected to the RC? Seems like > we'd need to traverse up the bus first, so there's a bit of missing > magic. I was assuming that an IOMMU could be either at the root of a PCI hierarchy or buried farther down. If that's true, the IOMMU driver might need to supply the top of the PCI hierarchy for which it translates. That's what I'm trying to get at with the "bridge" argument. If the IOMMU is buried, I assumed there was no point in iterating over bridges upstream of the IOMMU because it wouldn't see requester IDs of those upstream bridges. PCI wouldn't have any way to figure out how far up to go, hence the argument. > > + if (pci_bridge_may_take_ownership(bridge)) { > > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); > > + if (ret) > > + return ret; > > + } > > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); > > + } > > + > > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) > > + return fn(dev, PCI_REQUESTER_ID(dev), data); > > + > > + /* Conventional PCI has no requester ID */ > > + return 0; > > +} > > + > > /* > > * find the upstream PCIe-to-PCI bridge of a PCI device > > * if the device is PCIE, return NULL > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > index 3a24e4f..cbcc82f 100644 > > --- a/include/linux/pci.h > > +++ b/include/linux/pci.h > > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, > > } > > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); > > int pci_dev_present(const struct pci_device_id *ids); > > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > + int (*fn)(struct pci_dev *, void *), void *data); > > > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, > > int where, u8 *val); > > > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e > > Author: Bjorn Helgaas <bhelgaas@google.com> > > Date: Wed Jul 3 12:02:15 2013 -0600 > > > > intel-iommu-upstream-pcie > > I assume this one would is also going to apply dma quirks to dma_ops, > which I'm all for. > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > index b91564d..bdb6e6a 100644 > > --- a/drivers/iommu/intel-iommu.c > > +++ b/drivers/iommu/intel-iommu.c > > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, > > return 0; > > } > > > > +struct domain_context_info { > > + struct dmar_domain *domain; > > + struct pci_dev *dev; > > + int translation; > > + struct intel_iommu *iomumu; > > s/iomumu/iommu/ > > > + bool mapped; > > +}; > > + > > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) > > +{ > > + struct domain_context_info *dev_info = data; > > + int segment = pci_domain_nr(data->dev); > > dev_info->dev > > > + u8 bus = data->requester_id >> 8; > > s/data->// > > > + u8 devfn = data->requester_id & 0xff; > > s/data->// I told you it wouldn't even compile :) I'm only trying to present a possibility, not all the nitty-gritty details. > > + > > + return domain_context_mapping_one(data->domain, segment, > > + bus, devfn, data->translation); > > +} > > + > > static int domain_context_mapping(struct dmar_domain *domain, > > struct pci_dev *pdev, int translation) > > { > > - int ret; > > - struct pci_dev *tmp, *parent; > > + struct domain_context_info data; > > + struct pci_dev *bridge; > > > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), > > - pdev->bus->number, pdev->devfn, > > - translation); > > - if (ret) > > - return ret; > > + data.domain = domain; > > + data.dev = pdev; > > + data.translation = translation; > > + bridge = xx; > > Darn, I was hoping you were going to reveal where bridge comes from. > > > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); > > +} > > > > - /* dependent device mapping */ > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > - if (!tmp) > > - return 0; > > - /* Secondary interface's bus number and devfn 0 */ > > - parent = pdev->bus->self; > > - while (parent != tmp) { > > - ret = domain_context_mapping_one(domain, > > - pci_domain_nr(parent->bus), > > - parent->bus->number, > > - parent->devfn, translation); > > - if (ret) > > - return ret; > > - parent = parent->bus->self; > > - } > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > - return domain_context_mapping_one(domain, > > - pci_domain_nr(tmp->subordinate), > > - tmp->subordinate->number, 0, > > - translation); > > - else /* this is a legacy PCI bridge */ > > - return domain_context_mapping_one(domain, > > - pci_domain_nr(tmp->bus), > > - tmp->bus->number, > > - tmp->devfn, > > - translation); > > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) > > +{ > > + struct domain_context_info *dev_info = data; > > + u8 bus = data->requester_id >> 8; > > + u8 devfn = data->requester_id & 0xff; > > + bool ret; > > + > > + ret = device_context_mapped(dev_info->iommu, bus, devfn); > > + if (!ret) > > + dev_info->mapped = false; > > + > > + return 0; > > } > > > > static bool domain_context_mapped(struct pci_dev *pdev) > > { > > - bool ret; > > - struct pci_dev *tmp, *parent; > > + struct domain_context_info data; > > struct intel_iommu *iommu; > > + struct pci_dev *bridge; > > > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, > > pdev->devfn); > > if (!iommu) > > return false; > > > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); > > - if (!ret) > > - return false; > > - /* dependent device mapping */ > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > - if (!tmp) > > - return true; > > - /* Secondary interface's bus number and devfn 0 */ > > - parent = pdev->bus->self; > > - while (parent != tmp) { > > - ret = device_context_mapped(iommu, parent->bus->number, > > - parent->devfn); > > - if (!ret) > > - return false; > > - parent = parent->bus->self; > > - } > > - if (pci_is_pcie(tmp)) > > - return device_context_mapped(iommu, tmp->subordinate->number, > > - 0); > > - else > > - return device_context_mapped(iommu, tmp->bus->number, > > - tmp->devfn); > > + data.iommu = iommu; > > + data.mapped = true; > > + bridge = xx; > > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); > > + return data.mapped; > > } > > > > /* Returns a number of VTD pages, but aligned to MM page size */ > > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) > > return NULL; > > } > > > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) > > +{ > > + struct domain_context_info *dev_info = data; > > + int segment = pci_domain_nr(data->dev); > > + u8 bus = data->requester_id >> 8; > > + u8 devfn = data->requester_id & 0xff; > > + unsigned long flags; > > + struct device_domain_info *info; > > + > > + spin_lock_irqsave(&device_domain_lock, flags); > > + list_for_each_entry(info, &device_domain_list, global) { > > + if (info->segment == segment && > > + info->bus == bus && info->devfn == devfn) { > > + data->domain = info->domain; > > + break; > > + } > > + } > > + spin_unlock_irqrestore(&device_domain_lock, flags); > > + return 0; > > +} > > + > > /* domain is initialized */ > > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > { > > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > struct intel_iommu *iommu; > > struct dmar_drhd_unit *drhd; > > struct device_domain_info *info, *tmp; > > - struct pci_dev *dev_tmp; > > unsigned long flags; > > int bus = 0, devfn = 0; > > int segment; > > int ret; > > + struct domain_context_info data; > > > > domain = find_domain(pdev); > > if (domain) > > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > > segment = pci_domain_nr(pdev->bus); > > > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); > > - if (dev_tmp) { > > - if (pci_is_pcie(dev_tmp)) { > > - bus = dev_tmp->subordinate->number; > > - devfn = 0; > > - } else { > > - bus = dev_tmp->bus->number; > > - devfn = dev_tmp->devfn; > > - } > > - spin_lock_irqsave(&device_domain_lock, flags); > > - list_for_each_entry(info, &device_domain_list, global) { > > - if (info->segment == segment && > > - info->bus == bus && info->devfn == devfn) { > > - found = info->domain; > > - break; > > - } > > - } > > - spin_unlock_irqrestore(&device_domain_lock, flags); > > - /* pcie-pci bridge already has a domain, uses it */ > > - if (found) { > > - domain = found; > > - goto found_domain; > > - } > > + data.domain = NULL; > > + data.dev = pdev; > > + bridge = xx; > > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); > > + if (data.domain) { > > + domain = data.domain; > > + goto found_domain; > > } > > > > domain = alloc_domain(); > > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) > > return 0; > > } > > > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) > > +{ > > + struct domain_context_info *dev_info = data; > > + struct intel_iommu *iommu = data->iommu; > > + u8 bus = data->requester_id >> 8; > > + u8 devfn = data->requester_id & 0xff; > > + > > + iommu_detach_dev(iommu, bus, devfn); > > + return 0; > > +} > > + > > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, > > struct pci_dev *pdev) > > { > > - struct pci_dev *tmp, *parent; > > + struct domain_context_info data; > > > > if (!iommu || !pdev) > > return; > > > > /* dependent device detach */ > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > - /* Secondary interface's bus number and devfn 0 */ > > - if (tmp) { > > - parent = pdev->bus->self; > > - while (parent != tmp) { > > - iommu_detach_dev(iommu, parent->bus->number, > > - parent->devfn); > > - parent = parent->bus->self; > > - } > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > - iommu_detach_dev(iommu, > > - tmp->subordinate->number, 0); > > - else /* this is a legacy PCI bridge */ > > - iommu_detach_dev(iommu, tmp->bus->number, > > - tmp->devfn); > > - } > > + data.iommu = iommu; > > + bridge = xx; > > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); > > } > > > > static void domain_remove_one_dev_info(struct dmar_domain *domain, > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-09 18:27 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-09 18:27 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: iommu, dwmw2, joro, stephen, linux-pci, ddutile On Mon, 2013-07-08 at 15:51 -0600, Bjorn Helgaas wrote: > On Mon, Jul 08, 2013 at 02:49:16PM -0600, Alex Williamson wrote: > > On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: > > > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > > > > Joerg, > > > > > > > > Where do we stand on this series? You had a concern that the heuristic > > > > used in patch 1/ could be dangerous. The suggestion for detecting the > > > > issue was actually from Bjorn who replied with his rationale. Do you > > > > want to go in the direction of a fixed whitelist or do you agree that > > > > even if the heuristic breaks it provides better behavior than what we > > > > have now? Thanks, > > > > > > I'm trying to take a step back and look at the overall design, not > > > these specific patches. > > > > > > IOMMUs translate addresses based on their source, i.e., a PCIe > > > requester ID. This is made more complicated by the fact that some > > > bridges "take ownership" (change the requester ID as they forward > > > transactions upstream), as well as the fact that conventional PCI has > > > no requester ID at all. And some broken devices apparently generate > > > DMA requests using the requester ID of another device. > > > > > > We currently deal with this using the pci_find_upstream_pcie_bridge() > > > and pci_get_dma_source() interfaces, but I think there's too much > > > assembly required by their users. pci_find_upstream_pcie_bridge() > > > callers normally loop through all the bridges between the "upstream > > > PCIe bridge" and the device, checking for bridges that might take > > > ownership. They probably also ought to use pci_get_dma_source() to > > > account for the broken devices, but most callers don't. > > > > > > Most of this is PCI-specific stuff that should be of interest to all > > > IOMMU drivers, and the overall structure of calls and looping should > > > be the same for all of them, so it would be nice to factor it out > > > somehow. > > > > > > The attached patch is guaranteed not to even compile; it's just to > > > make this idea more concrete. The basic idea is that since the IOMMU > > > driver wants to perform some action for each possible requester ID the > > > IOMMU might see, PCI could provide an iterator > > > ("pci_for_each_requester_id()") to do that. > > > > > > Bjorn > > > > > > > > > commit afad51492c6672b96c2b0735600d5695e30f7180 > > > Author: Bjorn Helgaas <bhelgaas@google.com> > > > Date: Wed Jul 3 16:04:26 2013 -0600 > > > > > > pci-add-for-each-requester-id > > > > > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c > > > index d0627fa..380eb03 100644 > > > --- a/drivers/pci/search.c > > > +++ b/drivers/pci/search.c > > > @@ -17,6 +17,89 @@ > > > DECLARE_RWSEM(pci_bus_sem); > > > EXPORT_SYMBOL_GPL(pci_bus_sem); > > > > > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) > > > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) > > > + > > > +static inline bool pci_is_pcix(struct pci_dev *dev) > > > +{ > > > + return !!pci_pcix_cap(dev); /* XXX not implemented */ > > > +} > > > + > > > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) > > > +{ > > > + /* > > > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge > > > + * Spec v1.0, sec 2.3. > > > + */ > > > + if (pci_is_pcie(bridge) && > > > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) > > > + return true; > > > > Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe > > capability are exactly what causes the current bug. > > Likely so. We might be able to identify that as "secondary bus is not > PCIe" or something. > > > > + > > > + /* > > > + * A PCI-X to PCI-X bridge need not take ownership because there > > > + * are requester IDs on the secondary PCI-X bus. However, if a PCI > > > + * device is added on the secondary bus, the bridge must revert to > > > + * being a PCI-X to PCI bridge, and it then *would* take ownership. > > > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can > > > + * tolerate a future hot-add without having to change existing > > > + * IOMMU mappings. > > > + */ > > > + if (pci_is_pcix(bridge)) > > > + return true; > > > + > > > + return false; > > > +} > > > + > > > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, > > > + struct pci_dev *dev) > > > +{ > > > + struct pci_dev *bridge; > > > + struct pci_bus *child_bus; > > > + u8 secondary, subordinate, busn = dev->bus->number; > > > + > > > + if (dev->bus == bus) > > > + return dev; > > > + > > > + /* > > > + * There may be several devices on "bus". Find the one that is a > > > + * bridge leading to "dev". > > > + */ > > > + list_for_each_entry(bridge, &bus->devices, bus_list) { > > > + child_bus = bridge->subordinate; > > > + if (child_bus) { > > > + secondary = child_bus->busn_res.start; > > > + subordinate = child_bus->busn_res.end; > > > + if (secondary <= busn && busn <= subordinate) > > > + return bridge; > > > + } > > > + } > > > + return NULL; > > > +} > > > + > > > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > > + int (*fn)(struct pci_dev *, void *), > > > > u16 requester_id > > > > > + void *data) > > > +{ > > > + int ret; > > > + > > > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ > > > > It's unfortunate that our dma quirk still seems to only get called for > > the end device where we could have bridges that need it too. > > Sure, we can do that if necessary. > > I'm not trying to present this as any kind of finished solution. The > point is to figure out if we can make an interface that's a little > more abstract so we can pull the PCI topology issues out of the IOMMU > drivers, and I just thought something that looked like code would be > more concrete and easier to discuss. > > > > + > > > + while (bridge != dev) { > > > > > > I'm having a hard time understanding this function since bridge is never > > defined in any of the below sample code. Would bridge be the root port > > above a device? And what about devices connected to the RC? Seems like > > we'd need to traverse up the bus first, so there's a bit of missing > > magic. > > I was assuming that an IOMMU could be either at the root of a PCI > hierarchy or buried farther down. If that's true, the IOMMU driver > might need to supply the top of the PCI hierarchy for which it > translates. That's what I'm trying to get at with the "bridge" > argument. > > If the IOMMU is buried, I assumed there was no point in iterating over > bridges upstream of the IOMMU because it wouldn't see requester IDs of > those upstream bridges. PCI wouldn't have any way to figure out how > far up to go, hence the argument. I don't think that getting the bridge is going to be quite so simple. VT-d can have multiple IOMMUs, each handling a specific device or hierarchy. One IOMMU per domain can be default IOMMU for any device not otherwise specified. The IOMMU is not a PCI enumerable device. AMD-Vi does have a PCI visible IOMMU, but it's an endpoint on the root complex, not the top of a hierarchy. I'm warming up to your approach, but I think I'd rather see something like: struct pci_dev *pci_get_requester_id(struct pci_dev *pdev) { do { /* Apply DMA quirks at every step */ pdev = pci_get_dma_source(pdev); /* Done for PCIe devices or broken bridges, PCI-X TBD */ if (pci_is_pcie(pdev) || /* broken pcie bridge test */) break; } while (!pci_is_root_bus(pdev->bus) && (pdev = pdev->bus->self)); return pdev; } /* Start with requester ID pdev and pdev->bus, this way we handle both peer * and parent requester IDs */ int pci_for_each_requester_id(struct pci_dev *pdev, struct pci_bus *bus, int (*fn)(struct pci_dev *, void *), void *data) { struct pci_dev *tmp; int ret; list_for_each_entry(tmp, &bus->devices, bus_list) { if (pci_get_requester_id(tmp) == pdev) { ret = fn(tmp, data); if (ret) ... if (tmp->subordinate) { ret = pci_for_each_requester_id(pdev, tmp->subordinate, fn); if (ret) ... } } } return ret; } Seems like that would be sufficient to leverage the intel-iommu changes you've drafted. Thanks, Alex > > > + if (pci_bridge_may_take_ownership(bridge)) { > > > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); > > > + if (ret) > > > + return ret; > > > + } > > > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); > > > + } > > > + > > > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) > > > + return fn(dev, PCI_REQUESTER_ID(dev), data); > > > + > > > + /* Conventional PCI has no requester ID */ > > > + return 0; > > > +} > > > + > > > /* > > > * find the upstream PCIe-to-PCI bridge of a PCI device > > > * if the device is PCIE, return NULL > > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > > index 3a24e4f..cbcc82f 100644 > > > --- a/include/linux/pci.h > > > +++ b/include/linux/pci.h > > > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, > > > } > > > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); > > > int pci_dev_present(const struct pci_device_id *ids); > > > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > > + int (*fn)(struct pci_dev *, void *), void *data); > > > > > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, > > > int where, u8 *val); > > > > > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e > > > Author: Bjorn Helgaas <bhelgaas@google.com> > > > Date: Wed Jul 3 12:02:15 2013 -0600 > > > > > > intel-iommu-upstream-pcie > > > > I assume this one would is also going to apply dma quirks to dma_ops, > > which I'm all for. > > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > > index b91564d..bdb6e6a 100644 > > > --- a/drivers/iommu/intel-iommu.c > > > +++ b/drivers/iommu/intel-iommu.c > > > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, > > > return 0; > > > } > > > > > > +struct domain_context_info { > > > + struct dmar_domain *domain; > > > + struct pci_dev *dev; > > > + int translation; > > > + struct intel_iommu *iomumu; > > > > s/iomumu/iommu/ > > > > > + bool mapped; > > > +}; > > > + > > > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + int segment = pci_domain_nr(data->dev); > > > > dev_info->dev > > > > > + u8 bus = data->requester_id >> 8; > > > > s/data->// > > > > > + u8 devfn = data->requester_id & 0xff; > > > > s/data->// > > I told you it wouldn't even compile :) I'm only trying to present a > possibility, not all the nitty-gritty details. > > > > + > > > + return domain_context_mapping_one(data->domain, segment, > > > + bus, devfn, data->translation); > > > +} > > > + > > > static int domain_context_mapping(struct dmar_domain *domain, > > > struct pci_dev *pdev, int translation) > > > { > > > - int ret; > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > + struct pci_dev *bridge; > > > > > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), > > > - pdev->bus->number, pdev->devfn, > > > - translation); > > > - if (ret) > > > - return ret; > > > + data.domain = domain; > > > + data.dev = pdev; > > > + data.translation = translation; > > > + bridge = xx; > > > > Darn, I was hoping you were going to reveal where bridge comes from. > > > > > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); > > > +} > > > > > > - /* dependent device mapping */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (!tmp) > > > - return 0; > > > - /* Secondary interface's bus number and devfn 0 */ > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - ret = domain_context_mapping_one(domain, > > > - pci_domain_nr(parent->bus), > > > - parent->bus->number, > > > - parent->devfn, translation); > > > - if (ret) > > > - return ret; > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > > - return domain_context_mapping_one(domain, > > > - pci_domain_nr(tmp->subordinate), > > > - tmp->subordinate->number, 0, > > > - translation); > > > - else /* this is a legacy PCI bridge */ > > > - return domain_context_mapping_one(domain, > > > - pci_domain_nr(tmp->bus), > > > - tmp->bus->number, > > > - tmp->devfn, > > > - translation); > > > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + bool ret; > > > + > > > + ret = device_context_mapped(dev_info->iommu, bus, devfn); > > > + if (!ret) > > > + dev_info->mapped = false; > > > + > > > + return 0; > > > } > > > > > > static bool domain_context_mapped(struct pci_dev *pdev) > > > { > > > - bool ret; > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > struct intel_iommu *iommu; > > > + struct pci_dev *bridge; > > > > > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, > > > pdev->devfn); > > > if (!iommu) > > > return false; > > > > > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); > > > - if (!ret) > > > - return false; > > > - /* dependent device mapping */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (!tmp) > > > - return true; > > > - /* Secondary interface's bus number and devfn 0 */ > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - ret = device_context_mapped(iommu, parent->bus->number, > > > - parent->devfn); > > > - if (!ret) > > > - return false; > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) > > > - return device_context_mapped(iommu, tmp->subordinate->number, > > > - 0); > > > - else > > > - return device_context_mapped(iommu, tmp->bus->number, > > > - tmp->devfn); > > > + data.iommu = iommu; > > > + data.mapped = true; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); > > > + return data.mapped; > > > } > > > > > > /* Returns a number of VTD pages, but aligned to MM page size */ > > > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) > > > return NULL; > > > } > > > > > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + int segment = pci_domain_nr(data->dev); > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + unsigned long flags; > > > + struct device_domain_info *info; > > > + > > > + spin_lock_irqsave(&device_domain_lock, flags); > > > + list_for_each_entry(info, &device_domain_list, global) { > > > + if (info->segment == segment && > > > + info->bus == bus && info->devfn == devfn) { > > > + data->domain = info->domain; > > > + break; > > > + } > > > + } > > > + spin_unlock_irqrestore(&device_domain_lock, flags); > > > + return 0; > > > +} > > > + > > > /* domain is initialized */ > > > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > { > > > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > struct intel_iommu *iommu; > > > struct dmar_drhd_unit *drhd; > > > struct device_domain_info *info, *tmp; > > > - struct pci_dev *dev_tmp; > > > unsigned long flags; > > > int bus = 0, devfn = 0; > > > int segment; > > > int ret; > > > + struct domain_context_info data; > > > > > > domain = find_domain(pdev); > > > if (domain) > > > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > > > > segment = pci_domain_nr(pdev->bus); > > > > > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (dev_tmp) { > > > - if (pci_is_pcie(dev_tmp)) { > > > - bus = dev_tmp->subordinate->number; > > > - devfn = 0; > > > - } else { > > > - bus = dev_tmp->bus->number; > > > - devfn = dev_tmp->devfn; > > > - } > > > - spin_lock_irqsave(&device_domain_lock, flags); > > > - list_for_each_entry(info, &device_domain_list, global) { > > > - if (info->segment == segment && > > > - info->bus == bus && info->devfn == devfn) { > > > - found = info->domain; > > > - break; > > > - } > > > - } > > > - spin_unlock_irqrestore(&device_domain_lock, flags); > > > - /* pcie-pci bridge already has a domain, uses it */ > > > - if (found) { > > > - domain = found; > > > - goto found_domain; > > > - } > > > + data.domain = NULL; > > > + data.dev = pdev; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); > > > + if (data.domain) { > > > + domain = data.domain; > > > + goto found_domain; > > > } > > > > > > domain = alloc_domain(); > > > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) > > > return 0; > > > } > > > > > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + struct intel_iommu *iommu = data->iommu; > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + > > > + iommu_detach_dev(iommu, bus, devfn); > > > + return 0; > > > +} > > > + > > > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, > > > struct pci_dev *pdev) > > > { > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > > > > if (!iommu || !pdev) > > > return; > > > > > > /* dependent device detach */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - /* Secondary interface's bus number and devfn 0 */ > > > - if (tmp) { > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - iommu_detach_dev(iommu, parent->bus->number, > > > - parent->devfn); > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > > - iommu_detach_dev(iommu, > > > - tmp->subordinate->number, 0); > > > - else /* this is a legacy PCI bridge */ > > > - iommu_detach_dev(iommu, tmp->bus->number, > > > - tmp->devfn); > > > - } > > > + data.iommu = iommu; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); > > > } > > > > > > static void domain_remove_one_dev_info(struct dmar_domain *domain, > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-09 18:27 ` Alex Williamson 0 siblings, 0 replies; 34+ messages in thread From: Alex Williamson @ 2013-07-09 18:27 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ On Mon, 2013-07-08 at 15:51 -0600, Bjorn Helgaas wrote: > On Mon, Jul 08, 2013 at 02:49:16PM -0600, Alex Williamson wrote: > > On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: > > > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > > > > Joerg, > > > > > > > > Where do we stand on this series? You had a concern that the heuristic > > > > used in patch 1/ could be dangerous. The suggestion for detecting the > > > > issue was actually from Bjorn who replied with his rationale. Do you > > > > want to go in the direction of a fixed whitelist or do you agree that > > > > even if the heuristic breaks it provides better behavior than what we > > > > have now? Thanks, > > > > > > I'm trying to take a step back and look at the overall design, not > > > these specific patches. > > > > > > IOMMUs translate addresses based on their source, i.e., a PCIe > > > requester ID. This is made more complicated by the fact that some > > > bridges "take ownership" (change the requester ID as they forward > > > transactions upstream), as well as the fact that conventional PCI has > > > no requester ID at all. And some broken devices apparently generate > > > DMA requests using the requester ID of another device. > > > > > > We currently deal with this using the pci_find_upstream_pcie_bridge() > > > and pci_get_dma_source() interfaces, but I think there's too much > > > assembly required by their users. pci_find_upstream_pcie_bridge() > > > callers normally loop through all the bridges between the "upstream > > > PCIe bridge" and the device, checking for bridges that might take > > > ownership. They probably also ought to use pci_get_dma_source() to > > > account for the broken devices, but most callers don't. > > > > > > Most of this is PCI-specific stuff that should be of interest to all > > > IOMMU drivers, and the overall structure of calls and looping should > > > be the same for all of them, so it would be nice to factor it out > > > somehow. > > > > > > The attached patch is guaranteed not to even compile; it's just to > > > make this idea more concrete. The basic idea is that since the IOMMU > > > driver wants to perform some action for each possible requester ID the > > > IOMMU might see, PCI could provide an iterator > > > ("pci_for_each_requester_id()") to do that. > > > > > > Bjorn > > > > > > > > > commit afad51492c6672b96c2b0735600d5695e30f7180 > > > Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > > > Date: Wed Jul 3 16:04:26 2013 -0600 > > > > > > pci-add-for-each-requester-id > > > > > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c > > > index d0627fa..380eb03 100644 > > > --- a/drivers/pci/search.c > > > +++ b/drivers/pci/search.c > > > @@ -17,6 +17,89 @@ > > > DECLARE_RWSEM(pci_bus_sem); > > > EXPORT_SYMBOL_GPL(pci_bus_sem); > > > > > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) > > > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) > > > + > > > +static inline bool pci_is_pcix(struct pci_dev *dev) > > > +{ > > > + return !!pci_pcix_cap(dev); /* XXX not implemented */ > > > +} > > > + > > > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) > > > +{ > > > + /* > > > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge > > > + * Spec v1.0, sec 2.3. > > > + */ > > > + if (pci_is_pcie(bridge) && > > > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) > > > + return true; > > > > Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe > > capability are exactly what causes the current bug. > > Likely so. We might be able to identify that as "secondary bus is not > PCIe" or something. > > > > + > > > + /* > > > + * A PCI-X to PCI-X bridge need not take ownership because there > > > + * are requester IDs on the secondary PCI-X bus. However, if a PCI > > > + * device is added on the secondary bus, the bridge must revert to > > > + * being a PCI-X to PCI bridge, and it then *would* take ownership. > > > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can > > > + * tolerate a future hot-add without having to change existing > > > + * IOMMU mappings. > > > + */ > > > + if (pci_is_pcix(bridge)) > > > + return true; > > > + > > > + return false; > > > +} > > > + > > > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, > > > + struct pci_dev *dev) > > > +{ > > > + struct pci_dev *bridge; > > > + struct pci_bus *child_bus; > > > + u8 secondary, subordinate, busn = dev->bus->number; > > > + > > > + if (dev->bus == bus) > > > + return dev; > > > + > > > + /* > > > + * There may be several devices on "bus". Find the one that is a > > > + * bridge leading to "dev". > > > + */ > > > + list_for_each_entry(bridge, &bus->devices, bus_list) { > > > + child_bus = bridge->subordinate; > > > + if (child_bus) { > > > + secondary = child_bus->busn_res.start; > > > + subordinate = child_bus->busn_res.end; > > > + if (secondary <= busn && busn <= subordinate) > > > + return bridge; > > > + } > > > + } > > > + return NULL; > > > +} > > > + > > > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > > + int (*fn)(struct pci_dev *, void *), > > > > u16 requester_id > > > > > + void *data) > > > +{ > > > + int ret; > > > + > > > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ > > > > It's unfortunate that our dma quirk still seems to only get called for > > the end device where we could have bridges that need it too. > > Sure, we can do that if necessary. > > I'm not trying to present this as any kind of finished solution. The > point is to figure out if we can make an interface that's a little > more abstract so we can pull the PCI topology issues out of the IOMMU > drivers, and I just thought something that looked like code would be > more concrete and easier to discuss. > > > > + > > > + while (bridge != dev) { > > > > > > I'm having a hard time understanding this function since bridge is never > > defined in any of the below sample code. Would bridge be the root port > > above a device? And what about devices connected to the RC? Seems like > > we'd need to traverse up the bus first, so there's a bit of missing > > magic. > > I was assuming that an IOMMU could be either at the root of a PCI > hierarchy or buried farther down. If that's true, the IOMMU driver > might need to supply the top of the PCI hierarchy for which it > translates. That's what I'm trying to get at with the "bridge" > argument. > > If the IOMMU is buried, I assumed there was no point in iterating over > bridges upstream of the IOMMU because it wouldn't see requester IDs of > those upstream bridges. PCI wouldn't have any way to figure out how > far up to go, hence the argument. I don't think that getting the bridge is going to be quite so simple. VT-d can have multiple IOMMUs, each handling a specific device or hierarchy. One IOMMU per domain can be default IOMMU for any device not otherwise specified. The IOMMU is not a PCI enumerable device. AMD-Vi does have a PCI visible IOMMU, but it's an endpoint on the root complex, not the top of a hierarchy. I'm warming up to your approach, but I think I'd rather see something like: struct pci_dev *pci_get_requester_id(struct pci_dev *pdev) { do { /* Apply DMA quirks at every step */ pdev = pci_get_dma_source(pdev); /* Done for PCIe devices or broken bridges, PCI-X TBD */ if (pci_is_pcie(pdev) || /* broken pcie bridge test */) break; } while (!pci_is_root_bus(pdev->bus) && (pdev = pdev->bus->self)); return pdev; } /* Start with requester ID pdev and pdev->bus, this way we handle both peer * and parent requester IDs */ int pci_for_each_requester_id(struct pci_dev *pdev, struct pci_bus *bus, int (*fn)(struct pci_dev *, void *), void *data) { struct pci_dev *tmp; int ret; list_for_each_entry(tmp, &bus->devices, bus_list) { if (pci_get_requester_id(tmp) == pdev) { ret = fn(tmp, data); if (ret) ... if (tmp->subordinate) { ret = pci_for_each_requester_id(pdev, tmp->subordinate, fn); if (ret) ... } } } return ret; } Seems like that would be sufficient to leverage the intel-iommu changes you've drafted. Thanks, Alex > > > + if (pci_bridge_may_take_ownership(bridge)) { > > > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); > > > + if (ret) > > > + return ret; > > > + } > > > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); > > > + } > > > + > > > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) > > > + return fn(dev, PCI_REQUESTER_ID(dev), data); > > > + > > > + /* Conventional PCI has no requester ID */ > > > + return 0; > > > +} > > > + > > > /* > > > * find the upstream PCIe-to-PCI bridge of a PCI device > > > * if the device is PCIE, return NULL > > > diff --git a/include/linux/pci.h b/include/linux/pci.h > > > index 3a24e4f..cbcc82f 100644 > > > --- a/include/linux/pci.h > > > +++ b/include/linux/pci.h > > > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, > > > } > > > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); > > > int pci_dev_present(const struct pci_device_id *ids); > > > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, > > > + int (*fn)(struct pci_dev *, void *), void *data); > > > > > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, > > > int where, u8 *val); > > > > > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e > > > Author: Bjorn Helgaas <bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > > > Date: Wed Jul 3 12:02:15 2013 -0600 > > > > > > intel-iommu-upstream-pcie > > > > I assume this one would is also going to apply dma quirks to dma_ops, > > which I'm all for. > > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > > index b91564d..bdb6e6a 100644 > > > --- a/drivers/iommu/intel-iommu.c > > > +++ b/drivers/iommu/intel-iommu.c > > > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, > > > return 0; > > > } > > > > > > +struct domain_context_info { > > > + struct dmar_domain *domain; > > > + struct pci_dev *dev; > > > + int translation; > > > + struct intel_iommu *iomumu; > > > > s/iomumu/iommu/ > > > > > + bool mapped; > > > +}; > > > + > > > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + int segment = pci_domain_nr(data->dev); > > > > dev_info->dev > > > > > + u8 bus = data->requester_id >> 8; > > > > s/data->// > > > > > + u8 devfn = data->requester_id & 0xff; > > > > s/data->// > > I told you it wouldn't even compile :) I'm only trying to present a > possibility, not all the nitty-gritty details. > > > > + > > > + return domain_context_mapping_one(data->domain, segment, > > > + bus, devfn, data->translation); > > > +} > > > + > > > static int domain_context_mapping(struct dmar_domain *domain, > > > struct pci_dev *pdev, int translation) > > > { > > > - int ret; > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > + struct pci_dev *bridge; > > > > > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), > > > - pdev->bus->number, pdev->devfn, > > > - translation); > > > - if (ret) > > > - return ret; > > > + data.domain = domain; > > > + data.dev = pdev; > > > + data.translation = translation; > > > + bridge = xx; > > > > Darn, I was hoping you were going to reveal where bridge comes from. > > > > > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); > > > +} > > > > > > - /* dependent device mapping */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (!tmp) > > > - return 0; > > > - /* Secondary interface's bus number and devfn 0 */ > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - ret = domain_context_mapping_one(domain, > > > - pci_domain_nr(parent->bus), > > > - parent->bus->number, > > > - parent->devfn, translation); > > > - if (ret) > > > - return ret; > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > > - return domain_context_mapping_one(domain, > > > - pci_domain_nr(tmp->subordinate), > > > - tmp->subordinate->number, 0, > > > - translation); > > > - else /* this is a legacy PCI bridge */ > > > - return domain_context_mapping_one(domain, > > > - pci_domain_nr(tmp->bus), > > > - tmp->bus->number, > > > - tmp->devfn, > > > - translation); > > > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + bool ret; > > > + > > > + ret = device_context_mapped(dev_info->iommu, bus, devfn); > > > + if (!ret) > > > + dev_info->mapped = false; > > > + > > > + return 0; > > > } > > > > > > static bool domain_context_mapped(struct pci_dev *pdev) > > > { > > > - bool ret; > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > struct intel_iommu *iommu; > > > + struct pci_dev *bridge; > > > > > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, > > > pdev->devfn); > > > if (!iommu) > > > return false; > > > > > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); > > > - if (!ret) > > > - return false; > > > - /* dependent device mapping */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (!tmp) > > > - return true; > > > - /* Secondary interface's bus number and devfn 0 */ > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - ret = device_context_mapped(iommu, parent->bus->number, > > > - parent->devfn); > > > - if (!ret) > > > - return false; > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) > > > - return device_context_mapped(iommu, tmp->subordinate->number, > > > - 0); > > > - else > > > - return device_context_mapped(iommu, tmp->bus->number, > > > - tmp->devfn); > > > + data.iommu = iommu; > > > + data.mapped = true; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); > > > + return data.mapped; > > > } > > > > > > /* Returns a number of VTD pages, but aligned to MM page size */ > > > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) > > > return NULL; > > > } > > > > > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + int segment = pci_domain_nr(data->dev); > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + unsigned long flags; > > > + struct device_domain_info *info; > > > + > > > + spin_lock_irqsave(&device_domain_lock, flags); > > > + list_for_each_entry(info, &device_domain_list, global) { > > > + if (info->segment == segment && > > > + info->bus == bus && info->devfn == devfn) { > > > + data->domain = info->domain; > > > + break; > > > + } > > > + } > > > + spin_unlock_irqrestore(&device_domain_lock, flags); > > > + return 0; > > > +} > > > + > > > /* domain is initialized */ > > > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > { > > > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > struct intel_iommu *iommu; > > > struct dmar_drhd_unit *drhd; > > > struct device_domain_info *info, *tmp; > > > - struct pci_dev *dev_tmp; > > > unsigned long flags; > > > int bus = 0, devfn = 0; > > > int segment; > > > int ret; > > > + struct domain_context_info data; > > > > > > domain = find_domain(pdev); > > > if (domain) > > > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) > > > > > > segment = pci_domain_nr(pdev->bus); > > > > > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); > > > - if (dev_tmp) { > > > - if (pci_is_pcie(dev_tmp)) { > > > - bus = dev_tmp->subordinate->number; > > > - devfn = 0; > > > - } else { > > > - bus = dev_tmp->bus->number; > > > - devfn = dev_tmp->devfn; > > > - } > > > - spin_lock_irqsave(&device_domain_lock, flags); > > > - list_for_each_entry(info, &device_domain_list, global) { > > > - if (info->segment == segment && > > > - info->bus == bus && info->devfn == devfn) { > > > - found = info->domain; > > > - break; > > > - } > > > - } > > > - spin_unlock_irqrestore(&device_domain_lock, flags); > > > - /* pcie-pci bridge already has a domain, uses it */ > > > - if (found) { > > > - domain = found; > > > - goto found_domain; > > > - } > > > + data.domain = NULL; > > > + data.dev = pdev; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); > > > + if (data.domain) { > > > + domain = data.domain; > > > + goto found_domain; > > > } > > > > > > domain = alloc_domain(); > > > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) > > > return 0; > > > } > > > > > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) > > > +{ > > > + struct domain_context_info *dev_info = data; > > > + struct intel_iommu *iommu = data->iommu; > > > + u8 bus = data->requester_id >> 8; > > > + u8 devfn = data->requester_id & 0xff; > > > + > > > + iommu_detach_dev(iommu, bus, devfn); > > > + return 0; > > > +} > > > + > > > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, > > > struct pci_dev *pdev) > > > { > > > - struct pci_dev *tmp, *parent; > > > + struct domain_context_info data; > > > > > > if (!iommu || !pdev) > > > return; > > > > > > /* dependent device detach */ > > > - tmp = pci_find_upstream_pcie_bridge(pdev); > > > - /* Secondary interface's bus number and devfn 0 */ > > > - if (tmp) { > > > - parent = pdev->bus->self; > > > - while (parent != tmp) { > > > - iommu_detach_dev(iommu, parent->bus->number, > > > - parent->devfn); > > > - parent = parent->bus->self; > > > - } > > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ > > > - iommu_detach_dev(iommu, > > > - tmp->subordinate->number, 0); > > > - else /* this is a legacy PCI bridge */ > > > - iommu_detach_dev(iommu, tmp->bus->number, > > > - tmp->devfn); > > > - } > > > + data.iommu = iommu; > > > + bridge = xx; > > > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); > > > } > > > > > > static void domain_remove_one_dev_info(struct dmar_domain *domain, > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges 2013-07-09 18:27 ` Alex Williamson (?) @ 2013-07-09 20:10 ` Bjorn Helgaas -1 siblings, 0 replies; 34+ messages in thread From: Bjorn Helgaas @ 2013-07-09 20:10 UTC (permalink / raw) To: Alex Williamson Cc: open list:INTEL IOMMU (VT-d), David Woodhouse, Joerg Roedel, Stephen Hemminger, linux-pci, Don Dutile On Tue, Jul 9, 2013 at 12:27 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Mon, 2013-07-08 at 15:51 -0600, Bjorn Helgaas wrote: >> On Mon, Jul 08, 2013 at 02:49:16PM -0600, Alex Williamson wrote: >> > On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote: >> > > On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: >> > > > Joerg, >> > > > >> > > > Where do we stand on this series? You had a concern that the heuristic >> > > > used in patch 1/ could be dangerous. The suggestion for detecting the >> > > > issue was actually from Bjorn who replied with his rationale. Do you >> > > > want to go in the direction of a fixed whitelist or do you agree that >> > > > even if the heuristic breaks it provides better behavior than what we >> > > > have now? Thanks, >> > > >> > > I'm trying to take a step back and look at the overall design, not >> > > these specific patches. >> > > >> > > IOMMUs translate addresses based on their source, i.e., a PCIe >> > > requester ID. This is made more complicated by the fact that some >> > > bridges "take ownership" (change the requester ID as they forward >> > > transactions upstream), as well as the fact that conventional PCI has >> > > no requester ID at all. And some broken devices apparently generate >> > > DMA requests using the requester ID of another device. >> > > >> > > We currently deal with this using the pci_find_upstream_pcie_bridge() >> > > and pci_get_dma_source() interfaces, but I think there's too much >> > > assembly required by their users. pci_find_upstream_pcie_bridge() >> > > callers normally loop through all the bridges between the "upstream >> > > PCIe bridge" and the device, checking for bridges that might take >> > > ownership. They probably also ought to use pci_get_dma_source() to >> > > account for the broken devices, but most callers don't. >> > > >> > > Most of this is PCI-specific stuff that should be of interest to all >> > > IOMMU drivers, and the overall structure of calls and looping should >> > > be the same for all of them, so it would be nice to factor it out >> > > somehow. >> > > >> > > The attached patch is guaranteed not to even compile; it's just to >> > > make this idea more concrete. The basic idea is that since the IOMMU >> > > driver wants to perform some action for each possible requester ID the >> > > IOMMU might see, PCI could provide an iterator >> > > ("pci_for_each_requester_id()") to do that. >> > > >> > > Bjorn >> > > >> > > >> > > commit afad51492c6672b96c2b0735600d5695e30f7180 >> > > Author: Bjorn Helgaas <bhelgaas@google.com> >> > > Date: Wed Jul 3 16:04:26 2013 -0600 >> > > >> > > pci-add-for-each-requester-id >> > > >> > > diff --git a/drivers/pci/search.c b/drivers/pci/search.c >> > > index d0627fa..380eb03 100644 >> > > --- a/drivers/pci/search.c >> > > +++ b/drivers/pci/search.c >> > > @@ -17,6 +17,89 @@ >> > > DECLARE_RWSEM(pci_bus_sem); >> > > EXPORT_SYMBOL_GPL(pci_bus_sem); >> > > >> > > +#define PCI_REQUESTER_ID(dev) (((dev)->bus->number << 8) | (dev)->devfn) >> > > +#define PCI_BRIDGE_REQUESTER_ID(bridge) ((bridge)->subordinate->number << 8) >> > > + >> > > +static inline bool pci_is_pcix(struct pci_dev *dev) >> > > +{ >> > > + return !!pci_pcix_cap(dev); /* XXX not implemented */ >> > > +} >> > > + >> > > +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge) >> > > +{ >> > > + /* >> > > + * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge >> > > + * Spec v1.0, sec 2.3. >> > > + */ >> > > + if (pci_is_pcie(bridge) && >> > > + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) >> > > + return true; >> > >> > Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe >> > capability are exactly what causes the current bug. >> >> Likely so. We might be able to identify that as "secondary bus is not >> PCIe" or something. >> >> > > + >> > > + /* >> > > + * A PCI-X to PCI-X bridge need not take ownership because there >> > > + * are requester IDs on the secondary PCI-X bus. However, if a PCI >> > > + * device is added on the secondary bus, the bridge must revert to >> > > + * being a PCI-X to PCI bridge, and it then *would* take ownership. >> > > + * Assuming a PCI-X to PCI-X bridge takes ownership means we can >> > > + * tolerate a future hot-add without having to change existing >> > > + * IOMMU mappings. >> > > + */ >> > > + if (pci_is_pcix(bridge)) >> > > + return true; >> > > + >> > > + return false; >> > > +} >> > > + >> > > +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus, >> > > + struct pci_dev *dev) >> > > +{ >> > > + struct pci_dev *bridge; >> > > + struct pci_bus *child_bus; >> > > + u8 secondary, subordinate, busn = dev->bus->number; >> > > + >> > > + if (dev->bus == bus) >> > > + return dev; >> > > + >> > > + /* >> > > + * There may be several devices on "bus". Find the one that is a >> > > + * bridge leading to "dev". >> > > + */ >> > > + list_for_each_entry(bridge, &bus->devices, bus_list) { >> > > + child_bus = bridge->subordinate; >> > > + if (child_bus) { >> > > + secondary = child_bus->busn_res.start; >> > > + subordinate = child_bus->busn_res.end; >> > > + if (secondary <= busn && busn <= subordinate) >> > > + return bridge; >> > > + } >> > > + } >> > > + return NULL; >> > > +} >> > > + >> > > +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, >> > > + int (*fn)(struct pci_dev *, void *), >> > >> > u16 requester_id >> > >> > > + void *data) >> > > +{ >> > > + int ret; >> > > + >> > > + dev = pci_get_dma_source(dev); /* XXX ref count screwup */ >> > >> > It's unfortunate that our dma quirk still seems to only get called for >> > the end device where we could have bridges that need it too. >> >> Sure, we can do that if necessary. >> >> I'm not trying to present this as any kind of finished solution. The >> point is to figure out if we can make an interface that's a little >> more abstract so we can pull the PCI topology issues out of the IOMMU >> drivers, and I just thought something that looked like code would be >> more concrete and easier to discuss. >> >> > > + >> > > + while (bridge != dev) { >> > >> > >> > I'm having a hard time understanding this function since bridge is never >> > defined in any of the below sample code. Would bridge be the root port >> > above a device? And what about devices connected to the RC? Seems like >> > we'd need to traverse up the bus first, so there's a bit of missing >> > magic. >> >> I was assuming that an IOMMU could be either at the root of a PCI >> hierarchy or buried farther down. If that's true, the IOMMU driver >> might need to supply the top of the PCI hierarchy for which it >> translates. That's what I'm trying to get at with the "bridge" >> argument. >> >> If the IOMMU is buried, I assumed there was no point in iterating over >> bridges upstream of the IOMMU because it wouldn't see requester IDs of >> those upstream bridges. PCI wouldn't have any way to figure out how >> far up to go, hence the argument. > > I don't think that getting the bridge is going to be quite so simple. > VT-d can have multiple IOMMUs, each handling a specific device or > hierarchy. One IOMMU per domain can be default IOMMU for any device not > otherwise specified. The IOMMU is not a PCI enumerable device. AMD-Vi > does have a PCI visible IOMMU, but it's an endpoint on the root complex, > not the top of a hierarchy. If the only possible topology is that the IOMMU sees transactions coming out of a device on a root bus, then a bridge parameter is unnecessary, and we can just look at every bridge between the root bus and the device. I just didn't know whether that was the only possible topology. If an IOMMU can be placed such that it's like a PCIe switch that happens to translate DMA addresses before forwarding upstream, then we might want a bridge parameter that identifies the "downstream port" of the IOMMU or the upstream port of a switch below the IOMMU. Otherwise we would generate requester IDs unnecessarily for bridges above the IOMMU. > I'm warming up to your approach, but I think I'd rather see something > like: > > struct pci_dev *pci_get_requester_id(struct pci_dev *pdev) This returns a "pci_dev *", not a requester ID, so for bridges that can take ownership, this depends on there being a pci_dev on the secondary bus with dev/fn == 0. Maybe that's always true; I'm not sure. > { > do { > /* Apply DMA quirks at every step */ > pdev = pci_get_dma_source(pdev); > > /* Done for PCIe devices or broken bridges, PCI-X TBD */ > if (pci_is_pcie(pdev) || /* broken pcie bridge test */) > break; > > } while (!pci_is_root_bus(pdev->bus) && (pdev = pdev->bus->self)); > > return pdev; I'm not sure it's sufficient to stop when we find a PCIe bridge. For example, in this topology: 00:1c.0 PCIe root port 01:00.0 PCIe-to-PCI bridge 02:00.0 PCI-to-PCIe reverse bridge 03:00.0 PCIe switch upstream port 04:00.0 PCIe switch downstream port 05:00.0 PCIe endpoint the IOMMU will only see requester IDs of 02:00.0, even though if we search up from 05:00.0 we find a PCIe device at 04:00.0. Maybe this is too contrived, but reverse bridges are sometimes used on add-in cards so a new part can be used in older systems, e.g., in this case the add-in card would contain 02:00.0 and everything below it. > } > > > /* Start with requester ID pdev and pdev->bus, this way we handle both peer > * and parent requester IDs */ > int pci_for_each_requester_id(struct pci_dev *pdev, struct pci_bus *bus, > int (*fn)(struct pci_dev *, void *), void *data) > { > struct pci_dev *tmp; > int ret; > > list_for_each_entry(tmp, &bus->devices, bus_list) { > if (pci_get_requester_id(tmp) == pdev) { > ret = fn(tmp, data); What does the IOMMU driver supply for "bus"? Presumably it doesn't supply "pdev->bus" because you're traversing downward ("tmp->subordinate" below). > if (ret) > ... > > if (tmp->subordinate) { > ret = pci_for_each_requester_id(pdev, tmp->subordinate, fn); > if (ret) > ... > } > } > } > > return ret; > } > > Seems like that would be sufficient to leverage the intel-iommu changes > you've drafted. Thanks, > > Alex > >> > > + if (pci_bridge_may_take_ownership(bridge)) { >> > > + ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data); >> > > + if (ret) >> > > + return ret; >> > > + } >> > > + bridge = pci_bridge_to_dev(bridge->subordinate, dev); >> > > + } >> > > + >> > > + if (pci_is_pcie(dev) || pci_is_pcix(dev)) >> > > + return fn(dev, PCI_REQUESTER_ID(dev), data); >> > > + >> > > + /* Conventional PCI has no requester ID */ >> > > + return 0; >> > > +} >> > > + >> > > /* >> > > * find the upstream PCIe-to-PCI bridge of a PCI device >> > > * if the device is PCIE, return NULL >> > > diff --git a/include/linux/pci.h b/include/linux/pci.h >> > > index 3a24e4f..cbcc82f 100644 >> > > --- a/include/linux/pci.h >> > > +++ b/include/linux/pci.h >> > > @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, >> > > } >> > > struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from); >> > > int pci_dev_present(const struct pci_device_id *ids); >> > > +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev, >> > > + int (*fn)(struct pci_dev *, void *), void *data); >> > > >> > > int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn, >> > > int where, u8 *val); >> > > >> > > commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e >> > > Author: Bjorn Helgaas <bhelgaas@google.com> >> > > Date: Wed Jul 3 12:02:15 2013 -0600 >> > > >> > > intel-iommu-upstream-pcie >> > >> > I assume this one would is also going to apply dma quirks to dma_ops, >> > which I'm all for. >> > >> > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> > > index b91564d..bdb6e6a 100644 >> > > --- a/drivers/iommu/intel-iommu.c >> > > +++ b/drivers/iommu/intel-iommu.c >> > > @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment, >> > > return 0; >> > > } >> > > >> > > +struct domain_context_info { >> > > + struct dmar_domain *domain; >> > > + struct pci_dev *dev; >> > > + int translation; >> > > + struct intel_iommu *iomumu; >> > >> > s/iomumu/iommu/ >> > >> > > + bool mapped; >> > > +}; >> > > + >> > > +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data) >> > > +{ >> > > + struct domain_context_info *dev_info = data; >> > > + int segment = pci_domain_nr(data->dev); >> > >> > dev_info->dev >> > >> > > + u8 bus = data->requester_id >> 8; >> > >> > s/data->// >> > >> > > + u8 devfn = data->requester_id & 0xff; >> > >> > s/data->// >> >> I told you it wouldn't even compile :) I'm only trying to present a >> possibility, not all the nitty-gritty details. >> >> > > + >> > > + return domain_context_mapping_one(data->domain, segment, >> > > + bus, devfn, data->translation); >> > > +} >> > > + >> > > static int domain_context_mapping(struct dmar_domain *domain, >> > > struct pci_dev *pdev, int translation) >> > > { >> > > - int ret; >> > > - struct pci_dev *tmp, *parent; >> > > + struct domain_context_info data; >> > > + struct pci_dev *bridge; >> > > >> > > - ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus), >> > > - pdev->bus->number, pdev->devfn, >> > > - translation); >> > > - if (ret) >> > > - return ret; >> > > + data.domain = domain; >> > > + data.dev = pdev; >> > > + data.translation = translation; >> > > + bridge = xx; >> > >> > Darn, I was hoping you were going to reveal where bridge comes from. >> > >> > > + return pci_for_each_requester_id(bridge, pdev, context_mapping, &data); >> > > +} >> > > >> > > - /* dependent device mapping */ >> > > - tmp = pci_find_upstream_pcie_bridge(pdev); >> > > - if (!tmp) >> > > - return 0; >> > > - /* Secondary interface's bus number and devfn 0 */ >> > > - parent = pdev->bus->self; >> > > - while (parent != tmp) { >> > > - ret = domain_context_mapping_one(domain, >> > > - pci_domain_nr(parent->bus), >> > > - parent->bus->number, >> > > - parent->devfn, translation); >> > > - if (ret) >> > > - return ret; >> > > - parent = parent->bus->self; >> > > - } >> > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ >> > > - return domain_context_mapping_one(domain, >> > > - pci_domain_nr(tmp->subordinate), >> > > - tmp->subordinate->number, 0, >> > > - translation); >> > > - else /* this is a legacy PCI bridge */ >> > > - return domain_context_mapping_one(domain, >> > > - pci_domain_nr(tmp->bus), >> > > - tmp->bus->number, >> > > - tmp->devfn, >> > > - translation); >> > > +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data) >> > > +{ >> > > + struct domain_context_info *dev_info = data; >> > > + u8 bus = data->requester_id >> 8; >> > > + u8 devfn = data->requester_id & 0xff; >> > > + bool ret; >> > > + >> > > + ret = device_context_mapped(dev_info->iommu, bus, devfn); >> > > + if (!ret) >> > > + dev_info->mapped = false; >> > > + >> > > + return 0; >> > > } >> > > >> > > static bool domain_context_mapped(struct pci_dev *pdev) >> > > { >> > > - bool ret; >> > > - struct pci_dev *tmp, *parent; >> > > + struct domain_context_info data; >> > > struct intel_iommu *iommu; >> > > + struct pci_dev *bridge; >> > > >> > > iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number, >> > > pdev->devfn); >> > > if (!iommu) >> > > return false; >> > > >> > > - ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn); >> > > - if (!ret) >> > > - return false; >> > > - /* dependent device mapping */ >> > > - tmp = pci_find_upstream_pcie_bridge(pdev); >> > > - if (!tmp) >> > > - return true; >> > > - /* Secondary interface's bus number and devfn 0 */ >> > > - parent = pdev->bus->self; >> > > - while (parent != tmp) { >> > > - ret = device_context_mapped(iommu, parent->bus->number, >> > > - parent->devfn); >> > > - if (!ret) >> > > - return false; >> > > - parent = parent->bus->self; >> > > - } >> > > - if (pci_is_pcie(tmp)) >> > > - return device_context_mapped(iommu, tmp->subordinate->number, >> > > - 0); >> > > - else >> > > - return device_context_mapped(iommu, tmp->bus->number, >> > > - tmp->devfn); >> > > + data.iommu = iommu; >> > > + data.mapped = true; >> > > + bridge = xx; >> > > + pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data); >> > > + return data.mapped; >> > > } >> > > >> > > /* Returns a number of VTD pages, but aligned to MM page size */ >> > > @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev) >> > > return NULL; >> > > } >> > > >> > > +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data) >> > > +{ >> > > + struct domain_context_info *dev_info = data; >> > > + int segment = pci_domain_nr(data->dev); >> > > + u8 bus = data->requester_id >> 8; >> > > + u8 devfn = data->requester_id & 0xff; >> > > + unsigned long flags; >> > > + struct device_domain_info *info; >> > > + >> > > + spin_lock_irqsave(&device_domain_lock, flags); >> > > + list_for_each_entry(info, &device_domain_list, global) { >> > > + if (info->segment == segment && >> > > + info->bus == bus && info->devfn == devfn) { >> > > + data->domain = info->domain; >> > > + break; >> > > + } >> > > + } >> > > + spin_unlock_irqrestore(&device_domain_lock, flags); >> > > + return 0; >> > > +} >> > > + >> > > /* domain is initialized */ >> > > static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) >> > > { >> > > @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) >> > > struct intel_iommu *iommu; >> > > struct dmar_drhd_unit *drhd; >> > > struct device_domain_info *info, *tmp; >> > > - struct pci_dev *dev_tmp; >> > > unsigned long flags; >> > > int bus = 0, devfn = 0; >> > > int segment; >> > > int ret; >> > > + struct domain_context_info data; >> > > >> > > domain = find_domain(pdev); >> > > if (domain) >> > > @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw) >> > > >> > > segment = pci_domain_nr(pdev->bus); >> > > >> > > - dev_tmp = pci_find_upstream_pcie_bridge(pdev); >> > > - if (dev_tmp) { >> > > - if (pci_is_pcie(dev_tmp)) { >> > > - bus = dev_tmp->subordinate->number; >> > > - devfn = 0; >> > > - } else { >> > > - bus = dev_tmp->bus->number; >> > > - devfn = dev_tmp->devfn; >> > > - } >> > > - spin_lock_irqsave(&device_domain_lock, flags); >> > > - list_for_each_entry(info, &device_domain_list, global) { >> > > - if (info->segment == segment && >> > > - info->bus == bus && info->devfn == devfn) { >> > > - found = info->domain; >> > > - break; >> > > - } >> > > - } >> > > - spin_unlock_irqrestore(&device_domain_lock, flags); >> > > - /* pcie-pci bridge already has a domain, uses it */ >> > > - if (found) { >> > > - domain = found; >> > > - goto found_domain; >> > > - } >> > > + data.domain = NULL; >> > > + data.dev = pdev; >> > > + bridge = xx; >> > > + pci_for_each_requester_id(bridge, pdev, find_domain2, &data); >> > > + if (data.domain) { >> > > + domain = data.domain; >> > > + goto found_domain; >> > > } >> > > >> > > domain = alloc_domain(); >> > > @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void) >> > > return 0; >> > > } >> > > >> > > +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data) >> > > +{ >> > > + struct domain_context_info *dev_info = data; >> > > + struct intel_iommu *iommu = data->iommu; >> > > + u8 bus = data->requester_id >> 8; >> > > + u8 devfn = data->requester_id & 0xff; >> > > + >> > > + iommu_detach_dev(iommu, bus, devfn); >> > > + return 0; >> > > +} >> > > + >> > > static void iommu_detach_dependent_devices(struct intel_iommu *iommu, >> > > struct pci_dev *pdev) >> > > { >> > > - struct pci_dev *tmp, *parent; >> > > + struct domain_context_info data; >> > > >> > > if (!iommu || !pdev) >> > > return; >> > > >> > > /* dependent device detach */ >> > > - tmp = pci_find_upstream_pcie_bridge(pdev); >> > > - /* Secondary interface's bus number and devfn 0 */ >> > > - if (tmp) { >> > > - parent = pdev->bus->self; >> > > - while (parent != tmp) { >> > > - iommu_detach_dev(iommu, parent->bus->number, >> > > - parent->devfn); >> > > - parent = parent->bus->self; >> > > - } >> > > - if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */ >> > > - iommu_detach_dev(iommu, >> > > - tmp->subordinate->number, 0); >> > > - else /* this is a legacy PCI bridge */ >> > > - iommu_detach_dev(iommu, tmp->bus->number, >> > > - tmp->devfn); >> > > - } >> > > + data.iommu = iommu; >> > > + bridge = xx; >> > > + pci_for_each_requester_id(bridge, pdev, detach_requester, &data); >> > > } >> > > >> > > static void domain_remove_one_dev_info(struct dmar_domain *domain, >> > > -- >> > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in >> > > the body of a message to majordomo@vger.kernel.org >> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> > >> > > > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-30 11:52 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-07-30 11:52 UTC (permalink / raw) To: Alex Williamson; +Cc: iommu, dwmw2, stephen, linux-pci, ddutile (sorry for the delay) On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > Where do we stand on this series? You had a concern that the heuristic > used in patch 1/ could be dangerous. The suggestion for detecting the > issue was actually from Bjorn who replied with his rationale. Do you > want to go in the direction of a fixed whitelist or do you agree that > even if the heuristic breaks it provides better behavior than what we > have now? Thanks, So if this workaround should still live in the IOMMU code (after your discussion with Bjorn) you can either convince me that the heuristic will never fail or you resend a version that uses a list of known to be broken bridges to match against. Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges @ 2013-07-30 11:52 ` Joerg Roedel 0 siblings, 0 replies; 34+ messages in thread From: Joerg Roedel @ 2013-07-30 11:52 UTC (permalink / raw) To: Alex Williamson Cc: stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ, linux-pci-u79uwXL29TY76Z2rM5mHXA, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ (sorry for the delay) On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote: > Where do we stand on this series? You had a concern that the heuristic > used in patch 1/ could be dangerous. The suggestion for detecting the > issue was actually from Bjorn who replied with his rationale. Do you > want to go in the direction of a fixed whitelist or do you agree that > even if the heuristic breaks it provides better behavior than what we > have now? Thanks, So if this workaround should still live in the IOMMU code (after your discussion with Bjorn) you can either convince me that the heuristic will never fail or you resend a version that uses a list of known to be broken bridges to match against. Joerg ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2013-07-30 11:53 UTC | newest] Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-05-28 18:40 [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges Alex Williamson 2013-05-28 18:40 ` [PATCH v2 1/2] iommu: Quirked PCIe bridge test and search function Alex Williamson 2013-05-28 18:40 ` Alex Williamson 2013-05-28 19:38 ` Stephen Hemminger 2013-05-28 19:53 ` Alex Williamson 2013-05-28 19:56 ` Stephen Hemminger 2013-05-28 20:15 ` Alex Williamson 2013-05-28 20:28 ` Stephen Hemminger 2013-06-20 13:59 ` Joerg Roedel 2013-06-20 13:59 ` Joerg Roedel 2013-06-20 15:44 ` Alex Williamson 2013-06-20 16:15 ` Joerg Roedel 2013-06-20 16:15 ` Joerg Roedel 2013-06-26 4:20 ` Bjorn Helgaas 2013-06-26 18:45 ` Alex Williamson 2013-06-26 18:45 ` Alex Williamson 2013-06-26 19:11 ` Bjorn Helgaas 2013-05-28 18:40 ` [PATCH v2 2/2] intel-iommu: Convert to iommu_pci_find_upstream + iommu_pci_is_pcie_bridge Alex Williamson 2013-05-28 22:09 ` [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges Bjorn Helgaas 2013-05-28 22:09 ` Bjorn Helgaas 2013-05-28 22:53 ` [PATCH v2 3/2] pci: Remove unused pci_find_upstream_pcie_bridge() Alex Williamson 2013-05-28 22:53 ` Alex Williamson 2013-07-08 17:07 ` [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges Alex Williamson 2013-07-08 17:07 ` Alex Williamson 2013-07-08 19:34 ` Bjorn Helgaas 2013-07-08 19:34 ` Bjorn Helgaas 2013-07-08 20:49 ` Alex Williamson 2013-07-08 20:49 ` Alex Williamson 2013-07-08 21:51 ` Bjorn Helgaas 2013-07-09 18:27 ` Alex Williamson 2013-07-09 18:27 ` Alex Williamson 2013-07-09 20:10 ` Bjorn Helgaas 2013-07-30 11:52 ` Joerg Roedel 2013-07-30 11:52 ` Joerg Roedel
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.