linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems
@ 2014-05-10 15:02 Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 01/15] PCI: Add DMA alias iterator Alex Williamson
                   ` (16 more replies)
  0 siblings, 17 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

This version can be found in my git tree:

git://github.com/awilliam/linux-vfio.git dma-alias-v3

Please report any issues.

v3:
 - Found several instances where I had PCI_SLOT when I meant
   PCI_FUNC.  Thanks to Andrew for spotting this.  This should
   fix the problem he was having with Ricoh quirks.  We also
   pruned down the func0 quirks to only those that we know are
   needed.  We can always add them back later.

 - Found a case in intel-iommu of using dev_is_pci() where I
   really wanted !dev_is_pci().  Fixed.

v2:
 - Several new Marvell controllers added to quirks.  There's been
   a lot of success reported with this series in
   https://bugzilla.kernel.org/show_bug.cgi?id=42679

 - Add quirk for ASMedia and Tundra PCIe-to-PCI bridges that do
   not expose a PCIe capability.  These have been shown to use
   the standard PCIe-to-PCI bridge requester ID.

 - Fix copy/paste duplicate Ricoh quirk ID

 - Fixed AMD IOMMU for the "ghost" function case where the DMA
   alias is for an absent device.  The iommu rlookup table and
   data fields need to be initializes.

 - Fixed Intel interrupt remapping, I wasn't passing the target
   bus number, only the alias bus number.

These patches are split across PCI and IOMMU, but I've front-loaded
all of the PCI infrastructure so that the first 7 patches can be
applied to PCI-core, the IOMMU maintainers can pickup their patches,
then we can finish with dead code removal.  Bjorn might also be
willing to carry the IOMMU changes if the maintainers want to ack
them.

Original description:

This series attempts to fix a couple issues we've had outstanding in
the PCI/IOMMU code for a while.  The first issue is with devices that
use the wrong requester ID for DMA transactions.  We already have a
sort of half-baked attempt to fix this for several Ricoh devices, but
the fix only helps them be useful through IOMMU groups, not the
general DMA case.  There are also several Marvell devices which use
use a different wrong requester ID and don't even fit into the DMA
source idea.  This series creates a DMA alias iterator that will
step through each possible alias of a device, allowing IOMMUs to
insert mappings for both the device and its aliases.

Hand-in-hand with this is our broken pci_find_upstream_pcie_bridge()
function, which is known to blowup when it finds itself suddenly at
a PCIe device without crossing a PCIe-to-PCI bridge (as identified by
the PCIe capability).  It also likes to make the invalid assumption
that a PCIe device never has its requester ID masked by any usptream
bus.  We can fix this using the above new DMA alias iterator, since
that's effectively what this function was meant to do.

Finally, with all these helpers, it makes sense to consolidate code
for determining IOMMU groups.  The first step in finding the root
of a group is finding the final upstream DMA alias for the device,
then applying additional ACS rules and incorporating device specific
aliases.  As this is all common to PCI, create a single implementation
and remove piles of code from the individual IOMMU drivers.

This series allows devices like the Marvell 88SE9123 to finally work
on Linux with either AMD-Vi or VT-d enabled on the box.  I've
collected device IDs from various bugs to support as many SKUs of
these devices as possible, but I'm sure there are others that I've
missed.

This should also enable motherboards with an onboard ASmedia
ASM1083/1085 PCIe-to-PCI bridge to work with VT-d enabled.  I've
acquired an adapter board with this chip, but it actually exposes
a PCIe capability, unlike most of the onboard controllers.  Therefore
I expect this series will fix the WARN_ON currently hit during boot,
but there's a 50/50 chance whether the device behaves like a PCI
bridge or a PCIe bridge with regard to the requester ID that it uses
to take ownership of the transaction.  If it turns out to use the
PCIe bridge model, I expect we can quirk it using a dev_flags bit
to identify a PCI bridge that takes ownership as if it was a PCIe
bridge.

Please test and provide feedback.  I expect IOMMU group topology
should not change from this series, but if a case is found where it
does, please share.  Also, if there are additional quirks we need
to add, please either file new or add to the existing bugs.  Thanks,

Alex

---

Alex Williamson (15):
      PCI: Add DMA alias iterator
      PCI: quirk pci_for_each_dma_alias()
      PCI: quirk dma_func_alias for Ricoh devices
      PCI: quirk dma_func_alias for Marvell devices
      PCI: Quirk pci_for_each_dma_alias() for bridges
      PCI: Add quirks for ASMedia and Tundra bridges
      PCI: Consolidate isolation domain code
      iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups
      iommu/amd: Update to use PCI DMA aliases
      iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups
      iommu/intel: Update to use PCI DMA aliases
      iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups
      iommu: Remove pci.h
      PCI: Remove pci_find_upstream_pcie_bridge()
      PCI: Remove pci_get_dma_source()


 drivers/iommu/amd_iommu.c           |  195 +++++------------------
 drivers/iommu/amd_iommu_types.h     |    1 
 drivers/iommu/fsl_pamu_domain.c     |   67 --------
 drivers/iommu/intel-iommu.c         |  293 +++++++++++++----------------------
 drivers/iommu/intel_irq_remapping.c |   55 +++++--
 drivers/iommu/pci.h                 |   29 ---
 drivers/pci/quirks.c                |  112 ++++++++-----
 drivers/pci/search.c                |  240 ++++++++++++++++++++++++++---
 include/linux/pci.h                 |   23 +--
 9 files changed, 484 insertions(+), 531 deletions(-)
 delete mode 100644 drivers/iommu/pci.h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 01/15] PCI: Add DMA alias iterator
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 02/15] PCI: quirk pci_for_each_dma_alias() Alex Williamson
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

In a mixed PCI/PCI-X/PCI-e topology, bridges can take ownership of
transactions, replacing the original requester ID with their own.
Sometimes we just want to know the resulting device or resulting
alias, sometimes we want each step in the chain.  This iterator
allows either usage.  When an endpoint is connected via an unbroken
chain of PCIe switches and root ports, it has no alias and its
requester ID is visible to the root bus.  When PCI/X get in the way,
we pick up aliases for bridges.

The reason why we potentially care about each step in the path is
because of PCI-X.  PCI-X has the concept of a requester ID, but
bridges may or may not take ownership of various types of
transactions.  We therefore leave it to the consumer of this function
to prune out what they don't care about rather than attempt to flatten
the alias ourselves.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/search.c |   70 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h  |    4 +++
 2 files changed, 74 insertions(+)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 4a1b972..5601cdb 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -18,6 +18,76 @@ DECLARE_RWSEM(pci_bus_sem);
 EXPORT_SYMBOL_GPL(pci_bus_sem);
 
 /*
+ * pci_for_each_dma_alias - Iterate over DMA aliases for a device
+ * @pdev: starting downstream device
+ * @fn: function to call for each alias
+ * @data: opaque data to pass to @fn
+ *
+ * Starting @pdev, walk up the bus calling @fn for each possible alias
+ * of @pdev at the root bus.
+ */
+int pci_for_each_dma_alias(struct pci_dev *pdev,
+			   int (*fn)(struct pci_dev *pdev,
+				     u16 alias, void *data), void *data)
+{
+	struct pci_bus *bus;
+	int ret;
+
+	ret = fn(pdev, PCI_DEVID(pdev->bus->number, pdev->devfn), data);
+	if (ret)
+		return ret;
+
+	for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) {
+		struct pci_dev *tmp;
+
+		/* Skip virtual buses */
+		if (!bus->self)
+			continue;
+
+		tmp = bus->self;
+
+		/*
+		 * PCIe-to-PCI/X bridges alias transactions from downstream
+		 * devices using the subordinate bus number (PCI Express to
+		 * PCI/PCI-X Bridge Spec, rev 1.0, sec 2.3).  For all cases
+		 * where the upstream bus is PCI/X we alias to the bridge
+		 * (there are various conditions in the previous reference
+		 * where the bridge may take ownership of transactions, even
+		 * when the secondary interface is PCI-X).
+		 */
+		if (pci_is_pcie(tmp)) {
+			switch (pci_pcie_type(tmp)) {
+			case PCI_EXP_TYPE_ROOT_PORT:
+			case PCI_EXP_TYPE_UPSTREAM:
+			case PCI_EXP_TYPE_DOWNSTREAM:
+				continue;
+			case PCI_EXP_TYPE_PCI_BRIDGE:
+				ret = fn(tmp,
+					 PCI_DEVID(tmp->subordinate->number,
+						   PCI_DEVFN(0, 0)), data);
+				if (ret)
+					return ret;
+				continue;
+			case PCI_EXP_TYPE_PCIE_BRIDGE:
+				ret = fn(tmp,
+					 PCI_DEVID(tmp->bus->number,
+						   tmp->devfn), data);
+				if (ret)
+					return ret;
+				continue;
+			}
+		} else {
+			ret = fn(tmp, PCI_DEVID(tmp->bus->number, tmp->devfn),
+				 data);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return ret;
+}
+
+/*
  * find the upstream PCIe-to-PCI bridge of a PCI device
  * if the device is PCIE, return NULL
  * if the device isn't connected to a PCIe bridge (that is its parent is a
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aab57b4..14b074b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1795,6 +1795,10 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 }
 #endif
 
+int pci_for_each_dma_alias(struct pci_dev *pdev,
+			   int (*fn)(struct pci_dev *pdev,
+				     u16 alias, void *data), void *data);
+
 /**
  * pci_find_upstream_pcie_bridge - find upstream PCIe-to-PCI bridge of a device
  * @pdev: the PCI device


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 02/15] PCI: quirk pci_for_each_dma_alias()
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 01/15] PCI: Add DMA alias iterator Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 03/15] PCI: quirk dma_func_alias for Ricoh devices Alex Williamson
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

There are a few broken devices that use the requester ID of a different
function in the slot for their DMA.  To handle these, add a bitmap to
struct pci_dev (using an alignment gap) that quirks can populate.  As
we iterate over the device and bus DMA aliases, also iterate over any
bits in the map.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/search.c |   21 +++++++++++++++++++++
 include/linux/pci.h  |    2 ++
 2 files changed, 23 insertions(+)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 5601cdb..ad698b2 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -37,6 +37,27 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 	if (ret)
 		return ret;
 
+	/*
+	 * dma_func_alias provides a bitmap of other function numbers on
+	 * this same PCI slot to use as DMA aliases.
+	 */
+	if (unlikely(pdev->dma_func_alias)) {
+		u8 map = pdev->dma_func_alias & ~(1 << PCI_FUNC(pdev->devfn));
+		int func;
+
+		for (func = 0; map && func < 8; func++, map >>= 1) {
+			if (!(map & 1))
+				continue;
+
+			ret = fn(pdev,
+				 PCI_DEVID(pdev->bus->number,
+					   PCI_DEVFN(PCI_SLOT(pdev->devfn),
+					   func)), data);
+			if (ret)
+				return ret;
+		}
+	}
+
 	for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) {
 		struct pci_dev *tmp;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 14b074b..b4c97d2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -268,6 +268,8 @@ struct pci_dev {
 	u8		rom_base_reg;	/* which config register controls the ROM */
 	u8		pin;		/* which interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* cached PCIe Capabilities Register */
+	u8		dma_func_alias;	/* bitmap of functions used as DMA
+					   aliases for this device */
 
 	struct pci_driver *driver;	/* which driver has allocated this device */
 	u64		dma_mask;	/* Mask of the bits of bus address this


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 03/15] PCI: quirk dma_func_alias for Ricoh devices
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 01/15] PCI: Add DMA alias iterator Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 02/15] PCI: quirk pci_for_each_dma_alias() Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 04/15] PCI: quirk dma_func_alias for Marvell devices Alex Williamson
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

The existing quirk for these devices doesn't really solve the problem,
re-implement it using the DMA alias iterator.  We'll come back later
and remove the existing quirk and dma_source interface.  Note that
device ID 0xe822 is typically function 0 and 0xe230 has been tested
to not need the quirk and are therefore removed versus the equivalent
dma_source quirk.  If the exist in other configurations we can re-add
them.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e729206..ea2cb9f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3333,6 +3333,20 @@ int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 	return -ENOTTY;
 }
 
+static void quirk_dma_func0_alias(struct pci_dev *dev)
+{
+	if (PCI_FUNC(dev->devfn) != 0)
+		dev->dma_func_alias |= (1 << 0);
+}
+
+/*
+ * https://bugzilla.redhat.com/show_bug.cgi?id=605888
+ *
+ * Some Ricoh devices use function 0 as the PCIe requester ID for DMA.
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe832, quirk_dma_func0_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
+
 static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
 {
 	if (!PCI_FUNC(dev->devfn))


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 04/15] PCI: quirk dma_func_alias for Marvell devices
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (2 preceding siblings ...)
  2014-05-10 15:02 ` [PATCH v3 03/15] PCI: quirk dma_func_alias for Ricoh devices Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 05/15] PCI: Quirk pci_for_each_dma_alias() for bridges Alex Williamson
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

Several Marvell devices and a JMicron device have a similar DMA
requester ID problem to Ricoh, except they use function 1 as the
PCIe requester ID.  Add a quirk for these to populate the DMA
function alias bitmap.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ea2cb9f..c6ae46f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3347,6 +3347,40 @@ static void quirk_dma_func0_alias(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe832, quirk_dma_func0_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_RICOH, 0xe476, quirk_dma_func0_alias);
 
+static void quirk_dma_func1_alias(struct pci_dev *dev)
+{
+	if (PCI_FUNC(dev->devfn) != 1)
+		dev->dma_func_alias |= (1 << 1);
+}
+
+/*
+ * Marvell 88SE9123 uses function 1 as the requester ID for DMA.  In some
+ * SKUs function 1 is present and is a legacy IDE controller, in other
+ * SKUs this function is not present, making this a ghost requester.
+ * https://bugzilla.kernel.org/show_bug.cgi?id=42679
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9123,
+			 quirk_dma_func1_alias);
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c14 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9130,
+			 quirk_dma_func1_alias);
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c47 + c57 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9172,
+			 quirk_dma_func1_alias);
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c59 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x917a,
+			 quirk_dma_func1_alias);
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c46 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x91a0,
+			 quirk_dma_func1_alias);
+/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
+			 quirk_dma_func1_alias);
+/* https://bugs.gentoo.org/show_bug.cgi?id=497630 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
+			 PCI_DEVICE_ID_JMICRON_JMB388_ESD,
+			 quirk_dma_func1_alias);
+
 static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
 {
 	if (!PCI_FUNC(dev->devfn))


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 05/15] PCI: Quirk pci_for_each_dma_alias() for bridges
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (3 preceding siblings ...)
  2014-05-10 15:02 ` [PATCH v3 04/15] PCI: quirk dma_func_alias for Marvell devices Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:02 ` [PATCH v3 06/15] PCI: Add quirks for ASMedia and Tundra bridges Alex Williamson
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

Several PCIe-to-PCI bridges fail to provide a PCIe capability, causing
us to handle them as conventional PCI devices.  In some cases, this
may be correct, in others it's not.  Add a dev_flag bit to identify
devices to be handled as standard PCIe-to-PCI bridges.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/search.c |   10 ++++++++--
 include/linux/pci.h  |    2 ++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index ad698b2..dab4561 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -98,8 +98,14 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 				continue;
 			}
 		} else {
-			ret = fn(tmp, PCI_DEVID(tmp->bus->number, tmp->devfn),
-				 data);
+			if (tmp->dev_flags & PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS)
+				ret = fn(tmp,
+					 PCI_DEVID(tmp->subordinate->number,
+						   PCI_DEVFN(0, 0)), data);
+			else
+				ret = fn(tmp,
+					 PCI_DEVID(tmp->bus->number,
+						   tmp->devfn), data);
 			if (ret)
 				return ret;
 		}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b4c97d2..31d9a90 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -171,6 +171,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
 	/* Flag for quirk use to store if quirk-specific ACS is enabled */
 	PCI_DEV_FLAGS_ACS_ENABLED_QUIRK = (__force pci_dev_flags_t) 8,
+	/* DMA alias the device as if it was a PCIe-to-PCI bridge */
+	PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS = (__force pci_dev_flags_t) 16,
 };
 
 enum pci_irq_reroute_variant {


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 06/15] PCI: Add quirks for ASMedia and Tundra bridges
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (4 preceding siblings ...)
  2014-05-10 15:02 ` [PATCH v3 05/15] PCI: Quirk pci_for_each_dma_alias() for bridges Alex Williamson
@ 2014-05-10 15:02 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 07/15] PCI: Consolidate isolation domain code Alex Williamson
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:02 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

The quirk is intended to be extremely generic, but we only apply it
to known offending devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index c6ae46f..652b49e 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3354,6 +3354,29 @@ static void quirk_dma_func1_alias(struct pci_dev *dev)
 }
 
 /*
+ * A few PCIe-to-PCI bridges fail to expose a PCIe capability, resulting in
+ * using the wrong DMA alias for the device.  Some of these devices can be
+ * used as either forward or reverse bridges, so we need to test whether the
+ * device is operating in the correct mode.  We could probably apply this
+ * quirk to PCI_ANY_ID, but for now we'll just use known offenders.  The test
+ * is for a non-root, non-PCIe bridge where the upstream device is PCIe and
+ * is not a PCIe-to-PCI bridge, then @pdev is actually a PCIe-to-PCI bridge.
+ */
+static void quirk_use_pcie_bridge_dma_alias(struct pci_dev *pdev)
+{
+	if (!pci_is_root_bus(pdev->bus) &&
+	    pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE &&
+	    !pci_is_pcie(pdev) && pci_is_pcie(pdev->bus->self) &&
+	    pci_pcie_type(pdev->bus->self) != PCI_EXP_TYPE_PCI_BRIDGE)
+		pdev->dev_flags |= PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS;
+}
+/* ASM1083/1085, https://bugzilla.kernel.org/show_bug.cgi?id=44881#c46 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ASMEDIA, 0x1080,
+			 quirk_use_pcie_bridge_dma_alias);
+/* Tundra 8113, https://bugzilla.kernel.org/show_bug.cgi?id=44881#c43 */
+DECLARE_PCI_FIXUP_HEADER(0x10e3, 0x8113, quirk_use_pcie_bridge_dma_alias);
+
+/*
  * Marvell 88SE9123 uses function 1 as the requester ID for DMA.  In some
  * SKUs function 1 is present and is a legacy IDE controller, in other
  * SKUs this function is not present, making this a ghost requester.


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 07/15] PCI: Consolidate isolation domain code
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (5 preceding siblings ...)
  2014-05-10 15:02 ` [PATCH v3 06/15] PCI: Add quirks for ASMedia and Tundra bridges Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

Each of the IOMMU drivers supporting IOMMU groups has their own
implementation of an algorithm to find the base device for an IOMMU
group.  This N:1 function takes into account visibility of a PCI
device on the bus using DMA aliases, as well as the isolation of
devices using ACS.  Since these are all generic PCI properties,
provide this helper in PCI code to create a single, standard
implementation.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/search.c |  130 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h  |    1 
 2 files changed, 131 insertions(+)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index dab4561..aea2443 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -115,6 +115,136 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 }
 
 /*
+ * last_dma_alias_dev - Isolation root helper to record the last DMA alias pdev
+ */
+static int last_dma_alias_dev(struct pci_dev *pdev, u16 alias, void *data)
+{
+	*(struct pci_dev **)data = pdev;
+	return 0;
+}
+
+/*
+ * To consider a device isolated, we require ACS to support Source Validation,
+ * Request Redirection, Completer Redirection, and Upstream Forwarding.  This
+ * effectively means that devices cannot spoof their requester ID, requests
+ * and commpletions cannot be redirected, and all transactions are forwarded
+ * upstream, even as it passes through a bridge where the target device is
+ * downstream.
+ */
+#define REQ_ACS_FLAGS   (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
+
+/*
+ * pci_find_dma_isolation_root - Given a PCI device, find the root device for
+ *				 an isolation domain.
+ * @pdev: target device
+ *
+ * This function takes DMA alias quirks, bus topology, and PCI ACS into
+ * account to find the root device for an isolation domain.  The root device
+ * is a consistent and representative device within the isolation domain.
+ * Passing any device within a given isolation domain results in the same
+ * returned root device, allowing this root device to be used for lookup
+ * for structures like IOMMU groups.  Note that the root device is not
+ * necessarily a bridge, in the case of a multifunction device without
+ * DMA isolation between functions, the root device is the lowest function
+ * without isolation.
+ */
+struct pci_dev *pci_find_dma_isolation_root(struct pci_dev *pdev)
+{
+	struct pci_bus *bus;
+
+	/*
+	 * Step 1: Find the upstream DMA alias device
+	 *
+	 * A device can be aliased by bridges or DMA alias quirks.  Being able
+	 * to differentiate devices is a minimum requirement for isolation.
+	 */
+	pci_for_each_dma_alias(pdev, &last_dma_alias_dev, &pdev);
+
+	/*
+	 * Step 2: Find the upstream ACS device
+	 *
+	 * Beyond differentiation, ACS prevents uncontrolled peer-to-peer
+	 * transactions.  Therefore the next step is to find the upstream
+	 * ACS device.
+	 */
+	for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) {
+		if (!bus->self)
+			continue;
+
+		if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
+			break;
+
+		pdev = bus->self;
+	}
+
+	/*
+	 * Step 3: Handle DMA function aliases
+	 *
+	 * PCI functions are sometimes broken and use the wrong requester
+	 * ID for DMA transactions.  The requester ID for this device may
+	 * actually be used by another function in this slot.  If such a
+	 * function exists, use it.
+	 */
+	if (pdev->multifunction) {
+		u8 func, func_mask = 1 << PCI_FUNC(pdev->devfn);
+
+		for (func = 0; func < 8; func++) {
+			struct pci_dev *tmp;
+
+			if (func == PCI_FUNC(pdev->devfn))
+				continue;
+
+			tmp = pci_get_slot(pdev->bus,
+					   PCI_DEVFN(PCI_SLOT(pdev->devfn),
+						     func));
+			if (!tmp)
+				continue;
+
+			pci_dev_put(tmp);
+			/*
+			 * If this device has a DMA alias to us, it becomes
+			 * the base device regardless of ACS.
+			 */
+			if (tmp->dma_func_alias & func_mask) {
+				pdev = tmp;
+				break;
+			}
+		}
+	}
+
+	/*
+	 * Step 4: Handle multifunction ACS
+	 *
+	 * If the resulting device is multifunction and does not itself
+	 * support ACS then we cannot assume isolation between functions.
+	 * The root device needs to be consistent, therefore we return the
+	 * lowest numbered function that also lacks ACS support.
+	 */
+	if (pdev->multifunction && !pci_acs_enabled(pdev, REQ_ACS_FLAGS)) {
+		u8 func;
+
+		for (func = 0; func < PCI_FUNC(pdev->devfn); func++) {
+			struct pci_dev *tmp;
+
+			tmp = pci_get_slot(pdev->bus,
+					   PCI_DEVFN(PCI_SLOT(pdev->devfn),
+						     func));
+			if (!tmp)
+				continue;
+
+			pci_dev_put(tmp);
+
+			if (!pci_acs_enabled(tmp, REQ_ACS_FLAGS)) {
+				pdev = tmp;
+				break;
+			}
+		}
+	}
+
+	return pdev;
+}
+
+/*
  * find the upstream PCIe-to-PCI bridge of a PCI device
  * if the device is PCIE, return NULL
  * if the device isn't connected to a PCIe bridge (that is its parent is a
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 31d9a90..5096031 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1802,6 +1802,7 @@ static inline struct eeh_dev *pci_dev_to_eeh_dev(struct pci_dev *pdev)
 int pci_for_each_dma_alias(struct pci_dev *pdev,
 			   int (*fn)(struct pci_dev *pdev,
 				     u16 alias, void *data), void *data);
+struct pci_dev *pci_find_dma_isolation_root(struct pci_dev *pdev);
 
 /**
  * pci_find_upstream_pcie_bridge - find upstream PCIe-to-PCI bridge of a device


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (6 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 07/15] PCI: Consolidate isolation domain code Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-14 10:34   ` Joerg Roedel
  2014-05-10 15:03 ` [PATCH v3 09/15] iommu/amd: Update to use PCI DMA aliases Alex Williamson
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, Joerg Roedel, acooks, linux-kernel, linux

The IVRS tables provides aliases, but not to the extent now provided
by PCI core with DMA alias support and pci_find_dma_isolation_root().
The expectation is that the kernel and IVRS will produce the same
result for topology based aliases while the kernel will also include
device specific DMA quirks.  We therefore drop the AMD-specific IOMMU
group discovery in favor of the common code.  This also allows us to
drop tracking of the IOMMU group on the alias dev_data structure.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
---
 drivers/iommu/amd_iommu.c       |  158 ++-------------------------------------
 drivers/iommu/amd_iommu_types.h |    1 
 2 files changed, 10 insertions(+), 149 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c949520..3d58de4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -46,7 +46,6 @@
 #include "amd_iommu_proto.h"
 #include "amd_iommu_types.h"
 #include "irq_remapping.h"
-#include "pci.h"
 
 #define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
 
@@ -133,9 +132,6 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 	list_del(&dev_data->dev_data_list);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
-	if (dev_data->group)
-		iommu_group_put(dev_data->group);
-
 	kfree(dev_data);
 }
 
@@ -264,107 +260,10 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
-static struct pci_bus *find_hosted_bus(struct pci_bus *bus)
-{
-	while (!bus->self) {
-		if (!pci_is_root_bus(bus))
-			bus = bus->parent;
-		else
-			return ERR_PTR(-ENODEV);
-	}
-
-	return bus;
-}
-
-#define REQ_ACS_FLAGS	(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
-
-static struct pci_dev *get_isolation_root(struct pci_dev *pdev)
-{
-	struct pci_dev *dma_pdev = pdev;
-
-	/* Account for quirked devices */
-	swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
-
-	/*
-	 * If it's a multifunction device that does not support our
-	 * required ACS flags, add to the same group as lowest numbered
-	 * function that also does not suport the required ACS flags.
-	 */
-	if (dma_pdev->multifunction &&
-	    !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS)) {
-		u8 i, slot = PCI_SLOT(dma_pdev->devfn);
-
-		for (i = 0; i < 8; i++) {
-			struct pci_dev *tmp;
-
-			tmp = pci_get_slot(dma_pdev->bus, PCI_DEVFN(slot, i));
-			if (!tmp)
-				continue;
-
-			if (!pci_acs_enabled(tmp, REQ_ACS_FLAGS)) {
-				swap_pci_ref(&dma_pdev, tmp);
-				break;
-			}
-			pci_dev_put(tmp);
-		}
-	}
-
-	/*
-	 * Devices on the root bus go through the iommu.  If that's not us,
-	 * find the next upstream device and test ACS up to the root bus.
-	 * Finding the next device may require skipping virtual buses.
-	 */
-	while (!pci_is_root_bus(dma_pdev->bus)) {
-		struct pci_bus *bus = find_hosted_bus(dma_pdev->bus);
-		if (IS_ERR(bus))
-			break;
-
-		if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
-			break;
-
-		swap_pci_ref(&dma_pdev, pci_dev_get(bus->self));
-	}
-
-	return dma_pdev;
-}
-
-static int use_pdev_iommu_group(struct pci_dev *pdev, struct device *dev)
-{
-	struct iommu_group *group = iommu_group_get(&pdev->dev);
-	int ret;
-
-	if (!group) {
-		group = iommu_group_alloc();
-		if (IS_ERR(group))
-			return PTR_ERR(group);
-
-		WARN_ON(&pdev->dev != dev);
-	}
-
-	ret = iommu_group_add_device(group, dev);
-	iommu_group_put(group);
-	return ret;
-}
-
-static int use_dev_data_iommu_group(struct iommu_dev_data *dev_data,
-				    struct device *dev)
-{
-	if (!dev_data->group) {
-		struct iommu_group *group = iommu_group_alloc();
-		if (IS_ERR(group))
-			return PTR_ERR(group);
-
-		dev_data->group = group;
-	}
-
-	return iommu_group_add_device(dev_data->group, dev);
-}
-
 static int init_iommu_group(struct device *dev)
 {
-	struct iommu_dev_data *dev_data;
 	struct iommu_group *group;
-	struct pci_dev *dma_pdev;
+	struct pci_dev *pdev;
 	int ret;
 
 	group = iommu_group_get(dev);
@@ -373,58 +272,21 @@ static int init_iommu_group(struct device *dev)
 		return 0;
 	}
 
-	dev_data = find_dev_data(get_device_id(dev));
-	if (!dev_data)
-		return -ENOMEM;
-
-	if (dev_data->alias_data) {
-		u16 alias;
-		struct pci_bus *bus;
+	pdev = pci_find_dma_isolation_root(to_pci_dev(dev));
 
-		if (dev_data->alias_data->group)
-			goto use_group;
-
-		/*
-		 * If the alias device exists, it's effectively just a first
-		 * level quirk for finding the DMA source.
-		 */
-		alias = amd_iommu_alias_table[dev_data->devid];
-		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
-		if (dma_pdev) {
-			dma_pdev = get_isolation_root(dma_pdev);
-			goto use_pdev;
-		}
-
-		/*
-		 * If the alias is virtual, try to find a parent device
-		 * and test whether the IOMMU group is actualy rooted above
-		 * the alias.  Be careful to also test the parent device if
-		 * we think the alias is the root of the group.
-		 */
-		bus = pci_find_bus(0, alias >> 8);
-		if (!bus)
-			goto use_group;
+	group = iommu_group_get(&pdev->dev);
 
-		bus = find_hosted_bus(bus);
-		if (IS_ERR(bus) || !bus->self)
-			goto use_group;
+	if (!group) {
+		group = iommu_group_alloc();
+		if (IS_ERR(group))
+			return PTR_ERR(group);
+	}
 
-		dma_pdev = get_isolation_root(pci_dev_get(bus->self));
-		if (dma_pdev != bus->self || (dma_pdev->multifunction &&
-		    !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS)))
-			goto use_pdev;
+	ret = iommu_group_add_device(group, dev);
 
-		pci_dev_put(dma_pdev);
-		goto use_group;
-	}
+	iommu_group_put(group);
 
-	dma_pdev = get_isolation_root(pci_dev_get(to_pci_dev(dev)));
-use_pdev:
-	ret = use_pdev_iommu_group(dma_pdev, dev);
-	pci_dev_put(dma_pdev);
 	return ret;
-use_group:
-	return use_dev_data_iommu_group(dev_data->alias_data, dev);
 }
 
 static int iommu_init_device(struct device *dev)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index f1a5abf..7277a20 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -432,7 +432,6 @@ struct iommu_dev_data {
 	struct iommu_dev_data *alias_data;/* The alias dev_data */
 	struct protection_domain *domain; /* Domain the device is bound to */
 	atomic_t bind;			  /* Domain attach reference count */
-	struct iommu_group *group;	  /* IOMMU group for virtual aliases */
 	u16 devid;			  /* PCI Device ID */
 	bool iommu_v2;			  /* Device can make use of IOMMUv2 */
 	bool passthrough;		  /* Default for device is pt_domain */


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 09/15] iommu/amd: Update to use PCI DMA aliases
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (7 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 10/15] iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, Joerg Roedel, acooks, linux-kernel, linux

AMD-Vi already has a concept of an alias provided via the IVRS table.
This alias only handles topology based aliases, such as PCIe-to-PCI
bridges.  When such an alias is present, we continue to use it.  When
a platform alias is not present, we can now add a check of the device
dma_func_alias to create our own.  Note that the current code can
only handle a single alias of a device, and we don't attempt to change
that here.  It would only become a factor for the requester ID seen by
the IOMMU if PCI-X were involved anway.

Since the alias is now potentially device specific rather than the
topology based alias provided by the platform, we need to clear it
when the device goes away.

With the DMA alias and isolation infrastructure now in PCI-core, we
could opt to ignore IVRS provided aliases.  We now do this for IOMMU
groups.  However, for this more common use case, the "don't fix what
isn't broken" mantra seems like the safer route.

This should allow AMD-Vi to work with devices like Marvell and Ricoh
with DMA function alias quirks.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
---
 drivers/iommu/amd_iommu.c |   37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 3d58de4..97a2f0b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -294,6 +294,7 @@ static int iommu_init_device(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct iommu_dev_data *dev_data;
 	u16 alias;
+	u8 func_alias;
 	int ret;
 
 	if (dev->archdata.iommu)
@@ -304,6 +305,29 @@ static int iommu_init_device(struct device *dev)
 		return -ENOMEM;
 
 	alias = amd_iommu_alias_table[dev_data->devid];
+
+	/*
+	 * If there is no platform provided alias (topology-based) check for
+	 * a device quirk-based alias.  Note that a non-existent alias (ie.
+	 * ghost alias) will also need their rlookup and dev_table setup.
+	 * Copy these from the original device since they're both functions
+	 * in the same slot.
+	 */
+	func_alias = pdev->dma_func_alias & ~(1 << PCI_FUNC(pdev->devfn));
+	if (func_alias && alias == dev_data->devid) {
+		WARN_ON(hweight8(func_alias) > 1);
+		alias = PCI_DEVID(pdev->bus->number,
+				  PCI_DEVFN(PCI_SLOT(pdev->devfn),
+					    ffs(func_alias) - 1));
+		if (!amd_iommu_rlookup_table[alias]) {
+			amd_iommu_rlookup_table[alias] =
+				amd_iommu_rlookup_table[dev_data->devid];
+			memcpy(amd_iommu_dev_table[alias].data,
+			       amd_iommu_dev_table[dev_data->devid].data,
+			       sizeof(amd_iommu_dev_table[alias].data));
+		}
+	}
+
 	if (alias != dev_data->devid) {
 		struct iommu_dev_data *alias_data;
 
@@ -351,12 +375,19 @@ static void iommu_ignore_device(struct device *dev)
 
 static void iommu_uninit_device(struct device *dev)
 {
+	struct iommu_dev_data *dev_data = search_dev_data(get_device_id(dev));
+
+	if (!dev_data)
+		return;
+
 	iommu_group_remove_device(dev);
 
+	/* Unlink from alias, it may change if another device is re-plugged */
+	dev_data->alias_data = NULL;
+
 	/*
-	 * Nothing to do here - we keep dev_data around for unplugged devices
-	 * and reuse it when the device is re-plugged - not doing so would
-	 * introduce a ton of races.
+	 * We keep dev_data around for unplugged devices and reuse it when the
+	 * device is re-plugged - not doing so would introduce a ton of races.
 	 */
 }
 


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 10/15] iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (8 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 09/15] iommu/amd: Update to use PCI DMA aliases Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 11/15] iommu/intel: Update to use PCI DMA aliases Alex Williamson
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, David Woodhouse, linux, linux-kernel

Drop custom code that attempts to do the exact same thing and use
PCI provided isolation root support.  Existing IOMMU group laytout
should not change.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/iommu/intel-iommu.c |   71 +++++--------------------------------------
 1 file changed, 8 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f256ffc..f0e50f1 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -44,7 +44,6 @@
 #include <asm/iommu.h>
 
 #include "irq_remapping.h"
-#include "pci.h"
 
 #define ROOT_SIZE		VTD_PAGE_SIZE
 #define CONTEXT_SIZE		VTD_PAGE_SIZE
@@ -4359,12 +4358,9 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain,
 	return 0;
 }
 
-#define REQ_ACS_FLAGS	(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
-
 static int intel_iommu_add_device(struct device *dev)
 {
-	struct pci_dev *pdev = to_pci_dev(dev);
-	struct pci_dev *bridge, *dma_pdev = NULL;
+	struct pci_dev *pdev;
 	struct iommu_group *group;
 	int ret;
 	u8 bus, devfn;
@@ -4372,68 +4368,16 @@ static int intel_iommu_add_device(struct device *dev)
 	if (!device_to_iommu(dev, &bus, &devfn))
 		return -ENODEV;
 
-	bridge = pci_find_upstream_pcie_bridge(pdev);
-	if (bridge) {
-		if (pci_is_pcie(bridge))
-			dma_pdev = pci_get_domain_bus_and_slot(
-						pci_domain_nr(pdev->bus),
-						bridge->subordinate->number, 0);
-		if (!dma_pdev)
-			dma_pdev = pci_dev_get(bridge);
-	} else
-		dma_pdev = pci_dev_get(pdev);
-
-	/* Account for quirked devices */
-	swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
-
-	/*
-	 * If it's a multifunction device that does not support our
-	 * required ACS flags, add to the same group as lowest numbered
-	 * function that also does not suport the required ACS flags.
-	 */
-	if (dma_pdev->multifunction &&
-	    !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS)) {
-		u8 i, slot = PCI_SLOT(dma_pdev->devfn);
-
-		for (i = 0; i < 8; i++) {
-			struct pci_dev *tmp;
-
-			tmp = pci_get_slot(dma_pdev->bus, PCI_DEVFN(slot, i));
-			if (!tmp)
-				continue;
-
-			if (!pci_acs_enabled(tmp, REQ_ACS_FLAGS)) {
-				swap_pci_ref(&dma_pdev, tmp);
-				break;
-			}
-			pci_dev_put(tmp);
-		}
+	group = iommu_group_get(dev);
+	if (group) {
+		iommu_group_put(group);
+		return 0;
 	}
 
-	/*
-	 * Devices on the root bus go through the iommu.  If that's not us,
-	 * find the next upstream device and test ACS up to the root bus.
-	 * Finding the next device may require skipping virtual buses.
-	 */
-	while (!pci_is_root_bus(dma_pdev->bus)) {
-		struct pci_bus *bus = dma_pdev->bus;
+	pdev = pci_find_dma_isolation_root(to_pci_dev(dev));
 
-		while (!bus->self) {
-			if (!pci_is_root_bus(bus))
-				bus = bus->parent;
-			else
-				goto root_bus;
-		}
+	group = iommu_group_get(&pdev->dev);
 
-		if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
-			break;
-
-		swap_pci_ref(&dma_pdev, pci_dev_get(bus->self));
-	}
-
-root_bus:
-	group = iommu_group_get(&dma_pdev->dev);
-	pci_dev_put(dma_pdev);
 	if (!group) {
 		group = iommu_group_alloc();
 		if (IS_ERR(group))
@@ -4443,6 +4387,7 @@ root_bus:
 	ret = iommu_group_add_device(group, dev);
 
 	iommu_group_put(group);
+
 	return ret;
 }
 


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 11/15] iommu/intel: Update to use PCI DMA aliases
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (9 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 10/15] iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 12/15] iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, David Woodhouse, linux, linux-kernel

VT-d code currently makes use of pci_find_upstream_pcie_bridge() in
order to find the topology based alias of a device.  This function has
a few problems.  First, it doesn't check the entire alias path of the
device to the root bus, therefore if a PCIe device is masked upstream,
the wrong result is produced.  Also, it's known to get confused and
give up when it crosses a bridge from a conventional PCI bus to a PCIe
bus that lacks a PCIe capability.  The PCI-core provided DMA alias
support solves both of these problems and additionally adds support
for DMA function quirks allowing VT-d to work with devices like
Marvell and Ricoh with known broken requester IDs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/iommu/intel-iommu.c         |  222 ++++++++++++++++-------------------
 drivers/iommu/intel_irq_remapping.c |   55 ++++++---
 2 files changed, 139 insertions(+), 138 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f0e50f1..7dbb1cb 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1840,54 +1840,56 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	return 0;
 }
 
+struct domain_context_mapping_data {
+	struct dmar_domain *domain;
+	struct intel_iommu *iommu;
+	int translation;
+};
+
+static int domain_context_mapping_cb(struct pci_dev *pdev,
+				     u16 alias, void *opaque)
+{
+	struct domain_context_mapping_data *data = opaque;
+
+	return domain_context_mapping_one(data->domain, data->iommu,
+					  PCI_BUS_NUM(alias), alias & 0xff,
+					  data->translation);
+}
+
 static int
 domain_context_mapping(struct dmar_domain *domain, struct device *dev,
 		       int translation)
 {
-	int ret;
-	struct pci_dev *pdev, *tmp, *parent;
 	struct intel_iommu *iommu;
 	u8 bus, devfn;
+	struct domain_context_mapping_data data;
 
 	iommu = device_to_iommu(dev, &bus, &devfn);
 	if (!iommu)
 		return -ENODEV;
 
-	ret = domain_context_mapping_one(domain, iommu, bus, devfn,
-					 translation);
-	if (ret || !dev_is_pci(dev))
-		return ret;
-
-	/* dependent device mapping */
-	pdev = to_pci_dev(dev);
-	tmp = pci_find_upstream_pcie_bridge(pdev);
-	if (!tmp)
-		return 0;
-	/* Secondary interface's bus number and devfn 0 */
-	parent = pdev->bus->self;
-	while (parent != tmp) {
-		ret = domain_context_mapping_one(domain, iommu,
-						 parent->bus->number,
-						 parent->devfn, translation);
-		if (ret)
-			return ret;
-		parent = parent->bus->self;
-	}
-	if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */
-		return domain_context_mapping_one(domain, iommu,
-					tmp->subordinate->number, 0,
-					translation);
-	else /* this is a legacy PCI bridge */
-		return domain_context_mapping_one(domain, iommu,
-						  tmp->bus->number,
-						  tmp->devfn,
+	if (!dev_is_pci(dev))
+		return domain_context_mapping_one(domain, iommu, bus, devfn,
 						  translation);
+
+	data.domain = domain;
+	data.iommu = iommu;
+	data.translation = translation;
+
+	return pci_for_each_dma_alias(to_pci_dev(dev),
+				      &domain_context_mapping_cb, &data);
+}
+
+static int domain_context_mapped_cb(struct pci_dev *pdev,
+				    u16 alias, void *opaque)
+{
+	struct intel_iommu *iommu = opaque;
+
+	return !device_context_mapped(iommu, PCI_BUS_NUM(alias), alias & 0xff);
 }
 
 static int domain_context_mapped(struct device *dev)
 {
-	int ret;
-	struct pci_dev *pdev, *tmp, *parent;
 	struct intel_iommu *iommu;
 	u8 bus, devfn;
 
@@ -1895,30 +1897,11 @@ static int domain_context_mapped(struct device *dev)
 	if (!iommu)
 		return -ENODEV;
 
-	ret = device_context_mapped(iommu, bus, devfn);
-	if (!ret || !dev_is_pci(dev))
-		return ret;
+	if (!dev_is_pci(dev))
+		return device_context_mapped(iommu, bus, devfn);
 
-	/* dependent device mapping */
-	pdev = to_pci_dev(dev);
-	tmp = pci_find_upstream_pcie_bridge(pdev);
-	if (!tmp)
-		return ret;
-	/* Secondary interface's bus number and devfn 0 */
-	parent = pdev->bus->self;
-	while (parent != tmp) {
-		ret = device_context_mapped(iommu, parent->bus->number,
-					    parent->devfn);
-		if (!ret)
-			return ret;
-		parent = parent->bus->self;
-	}
-	if (pci_is_pcie(tmp))
-		return device_context_mapped(iommu, tmp->subordinate->number,
-					     0);
-	else
-		return device_context_mapped(iommu, tmp->bus->number,
-					     tmp->devfn);
+	return !pci_for_each_dma_alias(to_pci_dev(dev),
+				       domain_context_mapped_cb, iommu);
 }
 
 /* Returns a number of VTD pages, but aligned to MM page size */
@@ -2207,79 +2190,86 @@ static struct dmar_domain *dmar_insert_dev_info(struct intel_iommu *iommu,
 	return domain;
 }
 
+static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
+{
+	*(u16 *)opaque = alias;
+	return 0;
+}
+
 /* domain is initialized */
 static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
 {
-	struct dmar_domain *domain, *free = NULL;
-	struct intel_iommu *iommu = NULL;
+	struct dmar_domain *domain, *tmp;
+	struct intel_iommu *iommu;
 	struct device_domain_info *info;
-	struct pci_dev *dev_tmp = NULL;
+	u16 dma_alias;
 	unsigned long flags;
-	u8 bus, devfn, bridge_bus, bridge_devfn;
+	u8 bus, devfn;
 
 	domain = find_domain(dev);
 	if (domain)
 		return domain;
 
+	iommu = device_to_iommu(dev, &bus, &devfn);
+	if (!iommu)
+		return NULL;
+
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
-		u16 segment;
 
-		segment = pci_domain_nr(pdev->bus);
-		dev_tmp = pci_find_upstream_pcie_bridge(pdev);
-		if (dev_tmp) {
-			if (pci_is_pcie(dev_tmp)) {
-				bridge_bus = dev_tmp->subordinate->number;
-				bridge_devfn = 0;
-			} else {
-				bridge_bus = dev_tmp->bus->number;
-				bridge_devfn = dev_tmp->devfn;
-			}
-			spin_lock_irqsave(&device_domain_lock, flags);
-			info = dmar_search_domain_by_dev_info(segment,
-							      bridge_bus,
-							      bridge_devfn);
-			if (info) {
-				iommu = info->iommu;
-				domain = info->domain;
-			}
-			spin_unlock_irqrestore(&device_domain_lock, flags);
-			/* pcie-pci bridge already has a domain, uses it */
-			if (info)
-				goto found_domain;
+		pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
+
+		spin_lock_irqsave(&device_domain_lock, flags);
+		info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
+						      PCI_BUS_NUM(dma_alias),
+						      dma_alias & 0xff);
+		if (info) {
+			iommu = info->iommu;
+			domain = info->domain;
 		}
-	}
+		spin_unlock_irqrestore(&device_domain_lock, flags);
 
-	iommu = device_to_iommu(dev, &bus, &devfn);
-	if (!iommu)
-		goto error;
+		/* DMA alias already has a domain, uses it */
+		if (info)
+			goto found_domain;
+	}
 
 	/* Allocate and initialize new domain for the device */
 	domain = alloc_domain(false);
 	if (!domain)
-		goto error;
+		return NULL;
+
 	if (iommu_attach_domain(domain, iommu)) {
 		free_domain_mem(domain);
-		domain = NULL;
-		goto error;
+		return NULL;
 	}
-	free = domain;
-	if (domain_init(domain, gaw))
-		goto error;
 
-	/* register pcie-to-pci device */
-	if (dev_tmp) {
-		domain = dmar_insert_dev_info(iommu, bridge_bus, bridge_devfn,
-					      NULL, domain);
+	if (domain_init(domain, gaw)) {
+		domain_exit(domain);
+		return NULL;
+	}
+
+	/* register PCI DMA alias device */
+	if (dev_is_pci(dev)) {
+		tmp = dmar_insert_dev_info(iommu, PCI_BUS_NUM(dma_alias),
+					   dma_alias & 0xff, NULL, domain);
+
+		if (!tmp || tmp != domain) {
+			domain_exit(domain);
+			domain = tmp;
+		}
+
 		if (!domain)
-			goto error;
+			return NULL;
 	}
 
 found_domain:
-	domain = dmar_insert_dev_info(iommu, bus, devfn, dev, domain);
-error:
-	if (free != domain)
-		domain_exit(free);
+	tmp = dmar_insert_dev_info(iommu, bus, devfn, dev, domain);
+
+	if (!tmp || tmp != domain) {
+		domain_exit(domain);
+		domain = tmp;
+	}
 
 	return domain;
 }
@@ -4029,33 +4019,21 @@ out_free_dmar:
 	return ret;
 }
 
+static int iommu_detach_dev_cb(struct pci_dev *pdev, u16 alias, void *opaque)
+{
+	struct intel_iommu *iommu = opaque;
+
+	iommu_detach_dev(iommu, PCI_BUS_NUM(alias), alias & 0xff);
+	return 0;
+}
+
 static void iommu_detach_dependent_devices(struct intel_iommu *iommu,
 					   struct device *dev)
 {
-	struct pci_dev *tmp, *parent, *pdev;
-
 	if (!iommu || !dev || !dev_is_pci(dev))
 		return;
 
-	pdev = to_pci_dev(dev);
-
-	/* dependent device detach */
-	tmp = pci_find_upstream_pcie_bridge(pdev);
-	/* Secondary interface's bus number and devfn 0 */
-	if (tmp) {
-		parent = pdev->bus->self;
-		while (parent != tmp) {
-			iommu_detach_dev(iommu, parent->bus->number,
-					 parent->devfn);
-			parent = parent->bus->self;
-		}
-		if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */
-			iommu_detach_dev(iommu,
-				tmp->subordinate->number, 0);
-		else /* this is a legacy PCI bridge */
-			iommu_detach_dev(iommu, tmp->bus->number,
-					 tmp->devfn);
-	}
+	pci_for_each_dma_alias(to_pci_dev(dev), &iommu_detach_dev_cb, iommu);
 }
 
 static void domain_remove_one_dev_info(struct dmar_domain *domain,
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 9b17489..757e0b0 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -369,29 +369,52 @@ static int set_hpet_sid(struct irte *irte, u8 id)
 	return 0;
 }
 
+struct set_msi_sid_data {
+	struct pci_dev *pdev;
+	u16 alias;
+};
+
+static int set_msi_sid_cb(struct pci_dev *pdev, u16 alias, void *opaque)
+{
+	struct set_msi_sid_data *data = opaque;
+
+	data->pdev = pdev;
+	data->alias = alias;
+
+	return 0;
+}
+
 static int set_msi_sid(struct irte *irte, struct pci_dev *dev)
 {
-	struct pci_dev *bridge;
+	struct set_msi_sid_data data;
 
 	if (!irte || !dev)
 		return -1;
 
-	/* PCIe device or Root Complex integrated PCI device */
-	if (pci_is_pcie(dev) || !dev->bus->parent) {
-		set_irte_sid(irte, SVT_VERIFY_SID_SQ, SQ_ALL_16,
-			     (dev->bus->number << 8) | dev->devfn);
-		return 0;
-	}
+	pci_for_each_dma_alias(dev, set_msi_sid_cb, &data);
 
-	bridge = pci_find_upstream_pcie_bridge(dev);
-	if (bridge) {
-		if (pci_is_pcie(bridge))/* this is a PCIe-to-PCI/PCIX bridge */
-			set_irte_sid(irte, SVT_VERIFY_BUS, SQ_ALL_16,
-				(bridge->bus->number << 8) | dev->bus->number);
-		else /* this is a legacy PCI bridge */
-			set_irte_sid(irte, SVT_VERIFY_SID_SQ, SQ_ALL_16,
-				(bridge->bus->number << 8) | bridge->devfn);
-	}
+	/*
+	 * DMA alias provides us with a PCI device and alias.  The only case
+	 * where the it will return an alias on a different bus than the
+	 * device is the case of a PCIe-to-PCI bridge, where the alias is for
+	 * the subordinate bus.  In this case we can only verify the bus.
+	 *
+	 * If the alias device is on a different bus than our source device
+	 * then we have a topology based alias, use it.
+	 *
+	 * Otherwise, the alias is for a device DMA quirk and we cannot
+	 * assume that MSI uses the same requester ID.  Therefore use the
+	 * original device.
+	 */
+	if (PCI_BUS_NUM(data.alias) != data.pdev->bus->number)
+		set_irte_sid(irte, SVT_VERIFY_BUS, SQ_ALL_16,
+			     PCI_DEVID(PCI_BUS_NUM(data.alias),
+				       dev->bus->number));
+	else if (data.pdev->bus->number != dev->bus->number)
+		set_irte_sid(irte, SVT_VERIFY_SID_SQ, SQ_ALL_16, data.alias);
+	else
+		set_irte_sid(irte, SVT_VERIFY_SID_SQ, SQ_ALL_16,
+			     PCI_DEVID(dev->bus->number, dev->devfn));
 
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 12/15] iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (10 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 11/15] iommu/intel: Update to use PCI DMA aliases Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 13/15] iommu: Remove pci.h Alex Williamson
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, Varun Sethi, acooks, linux-kernel, linux

Drop custom code and use PCI provided isolation root support.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Varun Sethi <varun.sethi@freescale.com>
---
 drivers/iommu/fsl_pamu_domain.c |   67 ++-------------------------------------
 1 file changed, 3 insertions(+), 64 deletions(-)

diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 93072ba..83e3e7c 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -38,7 +38,6 @@
 #include <sysdev/fsl_pci.h>
 
 #include "fsl_pamu_domain.h"
-#include "pci.h"
 
 /*
  * Global spinlock that needs to be held while
@@ -892,8 +891,6 @@ static int fsl_pamu_get_domain_attr(struct iommu_domain *domain,
 	return ret;
 }
 
-#define REQ_ACS_FLAGS	(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
-
 static struct iommu_group *get_device_iommu_group(struct device *dev)
 {
 	struct iommu_group *group;
@@ -950,74 +947,16 @@ static struct iommu_group *get_pci_device_group(struct pci_dev *pdev)
 	struct pci_controller *pci_ctl;
 	bool pci_endpt_partioning;
 	struct iommu_group *group = NULL;
-	struct pci_dev *bridge, *dma_pdev = NULL;
+	struct pci_dev *dma_pdev;
 
 	pci_ctl = pci_bus_to_host(pdev->bus);
 	pci_endpt_partioning = check_pci_ctl_endpt_part(pci_ctl);
 	/* We can partition PCIe devices so assign device group to the device */
 	if (pci_endpt_partioning) {
-		bridge = pci_find_upstream_pcie_bridge(pdev);
-		if (bridge) {
-			if (pci_is_pcie(bridge))
-				dma_pdev = pci_get_domain_bus_and_slot(
-						pci_domain_nr(pdev->bus),
-						bridge->subordinate->number, 0);
-			if (!dma_pdev)
-				dma_pdev = pci_dev_get(bridge);
-		} else
-			dma_pdev = pci_dev_get(pdev);
-
-		/* Account for quirked devices */
-		swap_pci_ref(&dma_pdev, pci_get_dma_source(dma_pdev));
-
-		/*
-		 * If it's a multifunction device that does not support our
-		 * required ACS flags, add to the same group as lowest numbered
-		 * function that also does not suport the required ACS flags.
-		 */
-		if (dma_pdev->multifunction &&
-		    !pci_acs_enabled(dma_pdev, REQ_ACS_FLAGS)) {
-			u8 i, slot = PCI_SLOT(dma_pdev->devfn);
+		dma_pdev = pci_find_dma_isolation_root(pdev);
 
-			for (i = 0; i < 8; i++) {
-				struct pci_dev *tmp;
-
-				tmp = pci_get_slot(dma_pdev->bus, PCI_DEVFN(slot, i));
-				if (!tmp)
-					continue;
-
-				if (!pci_acs_enabled(tmp, REQ_ACS_FLAGS)) {
-					swap_pci_ref(&dma_pdev, tmp);
-					break;
-				}
-				pci_dev_put(tmp);
-			}
-		}
-
-		/*
-		 * Devices on the root bus go through the iommu.  If that's not us,
-		 * find the next upstream device and test ACS up to the root bus.
-		 * Finding the next device may require skipping virtual buses.
-		 */
-		while (!pci_is_root_bus(dma_pdev->bus)) {
-			struct pci_bus *bus = dma_pdev->bus;
-
-			while (!bus->self) {
-				if (!pci_is_root_bus(bus))
-					bus = bus->parent;
-				else
-					goto root_bus;
-			}
-
-			if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS))
-				break;
-
-			swap_pci_ref(&dma_pdev, pci_dev_get(bus->self));
-		}
-
-root_bus:
 		group = get_device_iommu_group(&dma_pdev->dev);
-		pci_dev_put(dma_pdev);
+
 		/*
 		 * PCIe controller is not a paritionable entity
 		 * free the controller device iommu_group.


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 13/15] iommu: Remove pci.h
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (11 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 12/15] iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 14/15] PCI: Remove pci_find_upstream_pcie_bridge() Alex Williamson
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, Joerg Roedel, acooks, linux-kernel, linux

The single helper here no longer has any users.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
---
 drivers/iommu/pci.h |   29 -----------------------------
 1 file changed, 29 deletions(-)
 delete mode 100644 drivers/iommu/pci.h

diff --git a/drivers/iommu/pci.h b/drivers/iommu/pci.h
deleted file mode 100644
index 352d80a..0000000
--- a/drivers/iommu/pci.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
- *
- * Copyright (C) 2013 Red Hat, Inc.
- * Copyright (C) 2013 Freescale Semiconductor, Inc.
- *
- */
-#ifndef __IOMMU_PCI_H
-#define __IOMMU_PCI_H
-
-/* Helper function for swapping pci device reference */
-static inline void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
-{
-	pci_dev_put(*from);
-	*from = to;
-}
-
-#endif  /* __IOMMU_PCI_H */


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 14/15] PCI: Remove pci_find_upstream_pcie_bridge()
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (12 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 13/15] iommu: Remove pci.h Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 15:03 ` [PATCH v3 15/15] PCI: Remove pci_get_dma_source() Alex Williamson
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

It's broken and has no users.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/search.c |   35 -----------------------------------
 include/linux/pci.h  |   11 -----------
 2 files changed, 46 deletions(-)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index aea2443..473c6e8 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -244,41 +244,6 @@ struct pci_dev *pci_find_dma_isolation_root(struct pci_dev *pdev)
 	return pdev;
 }
 
-/*
- * find the upstream PCIe-to-PCI bridge of a PCI device
- * if the device is PCIE, return NULL
- * if the device isn't connected to a PCIe bridge (that is its parent is a
- * legacy PCI bridge and the bridge is directly connected to bus 0), return its
- * parent
- */
-struct pci_dev *
-pci_find_upstream_pcie_bridge(struct pci_dev *pdev)
-{
-	struct pci_dev *tmp = NULL;
-
-	if (pci_is_pcie(pdev))
-		return NULL;
-	while (1) {
-		if (pci_is_root_bus(pdev->bus))
-			break;
-		pdev = pdev->bus->self;
-		/* a p2p bridge */
-		if (!pci_is_pcie(pdev)) {
-			tmp = pdev;
-			continue;
-		}
-		/* PCI device should connect to a PCIe bridge */
-		if (pci_pcie_type(pdev) != PCI_EXP_TYPE_PCI_BRIDGE) {
-			/* Busted hardware? */
-			WARN_ON_ONCE(1);
-			return NULL;
-		}
-		return pdev;
-	}
-
-	return tmp;
-}
-
 static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr)
 {
 	struct pci_bus *child;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5096031..64184d5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1804,15 +1804,4 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 				     u16 alias, void *data), void *data);
 struct pci_dev *pci_find_dma_isolation_root(struct pci_dev *pdev);
 
-/**
- * pci_find_upstream_pcie_bridge - find upstream PCIe-to-PCI bridge of a device
- * @pdev: the PCI device
- *
- * if the device is PCIE, return NULL
- * if the device isn't connected to a PCIe bridge (that is its parent is a
- * legacy PCI bridge and the bridge is directly connected to bus 0), return its
- * parent
- */
-struct pci_dev *pci_find_upstream_pcie_bridge(struct pci_dev *pdev);
-
 #endif /* LINUX_PCI_H */


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 15/15] PCI: Remove pci_get_dma_source()
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (13 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 14/15] PCI: Remove pci_find_upstream_pcie_bridge() Alex Williamson
@ 2014-05-10 15:03 ` Alex Williamson
  2014-05-10 16:51 ` [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
  2014-05-13 22:35 ` eddy0596
  16 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-10 15:03 UTC (permalink / raw)
  To: linux-pci, iommu; +Cc: bhelgaas, acooks, linux-kernel, linux

It has no users; replaced by dma_func_alias.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   51 --------------------------------------------------
 include/linux/pci.h  |    5 -----
 2 files changed, 56 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 652b49e..df814a2 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3404,57 +3404,6 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
 			 PCI_DEVICE_ID_JMICRON_JMB388_ESD,
 			 quirk_dma_func1_alias);
 
-static struct pci_dev *pci_func_0_dma_source(struct pci_dev *dev)
-{
-	if (!PCI_FUNC(dev->devfn))
-		return pci_dev_get(dev);
-
-	return pci_get_slot(dev->bus, PCI_DEVFN(PCI_SLOT(dev->devfn), 0));
-}
-
-static const struct pci_dev_dma_source {
-	u16 vendor;
-	u16 device;
-	struct pci_dev *(*dma_source)(struct pci_dev *dev);
-} pci_dev_dma_source[] = {
-	/*
-	 * https://bugzilla.redhat.com/show_bug.cgi?id=605888
-	 *
-	 * Some Ricoh devices use the function 0 source ID for DMA on
-	 * other functions of a multifunction device.  The DMA devices
-	 * is therefore function 0, which will have implications of the
-	 * iommu grouping of these devices.
-	 */
-	{ PCI_VENDOR_ID_RICOH, 0xe822, pci_func_0_dma_source },
-	{ PCI_VENDOR_ID_RICOH, 0xe230, pci_func_0_dma_source },
-	{ PCI_VENDOR_ID_RICOH, 0xe832, pci_func_0_dma_source },
-	{ PCI_VENDOR_ID_RICOH, 0xe476, pci_func_0_dma_source },
-	{ 0 }
-};
-
-/*
- * IOMMUs with isolation capabilities need to be programmed with the
- * correct source ID of a device.  In most cases, the source ID matches
- * the device doing the DMA, but sometimes hardware is broken and will
- * tag the DMA as being sourced from a different device.  This function
- * allows that translation.  Note that the reference count of the
- * returned device is incremented on all paths.
- */
-struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
-{
-	const struct pci_dev_dma_source *i;
-
-	for (i = pci_dev_dma_source; i->dma_source; i++) {
-		if ((i->vendor == dev->vendor ||
-		     i->vendor == (u16)PCI_ANY_ID) &&
-		    (i->device == dev->device ||
-		     i->device == (u16)PCI_ANY_ID))
-			return i->dma_source(dev);
-	}
-
-	return pci_dev_get(dev);
-}
-
 /*
  * AMD has indicated that the devices below do not support peer-to-peer
  * in any system where they are found in the southbridge with an AMD
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 64184d5..c7183be 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1532,16 +1532,11 @@ enum pci_fixup_pass {
 
 #ifdef CONFIG_PCI_QUIRKS
 void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
-struct pci_dev *pci_get_dma_source(struct pci_dev *dev);
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
 void pci_dev_specific_enable_acs(struct pci_dev *dev);
 #else
 static inline void pci_fixup_device(enum pci_fixup_pass pass,
 				    struct pci_dev *dev) { }
-static inline struct pci_dev *pci_get_dma_source(struct pci_dev *dev)
-{
-	return pci_dev_get(dev);
-}
 static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
 					       u16 acs_flags)
 {


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (14 preceding siblings ...)
  2014-05-10 15:03 ` [PATCH v3 15/15] PCI: Remove pci_get_dma_source() Alex Williamson
@ 2014-05-10 16:51 ` George Spelvin
  2014-05-13 22:35 ` eddy0596
  16 siblings, 0 replies; 22+ messages in thread
From: George Spelvin @ 2014-05-10 16:51 UTC (permalink / raw)
  To: alex.williamson, iommu, linux-pci; +Cc: acooks, bhelgaas, linux-kernel, linux

Tested-by: George Spelvin <linux@horizon.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems
  2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
                   ` (15 preceding siblings ...)
  2014-05-10 16:51 ` [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
@ 2014-05-13 22:35 ` eddy0596
  2014-05-14 17:18   ` Alex Williamson
  16 siblings, 1 reply; 22+ messages in thread
From: eddy0596 @ 2014-05-13 22:35 UTC (permalink / raw)
  To: linux-kernel

Hello Alex,

Thanks for working on a fix on this long standing issue. I have applied the
amd portion of the IOMMU patches against the 3.14.3 kernel and found the
followings:
1) The computer would not boot up if it's from a cold start. The kernel log
shows that it hangs at the point the kernel attempt to attach the scsi disk
[sdk] that connects to the LSI-SAS2008 controller at pci 04:00.0. I can use
Ctrl+Alt+Del to reboot the computer. So, I guess the kernel didn't "hang"
and I don't see any oops either.
2) After a warm reboot with Ctrl+Alt+Del, the kernel will boot up fine. And,
the Marvell controller behaves properly (More stress test needed) and so as
the two LSI-SAS2008. A warm reboot after a hard reset at BIOS prompt will
also boot up fine.
3) Removing sdk and perform a cold reboot, the kernel stops after attaching
all the ST3000DM001 harddisks that connects to the LSI-SAS2008 at pci
01:00:0. The kernel stops at "ata12: SATA link down (SStatus 0 SControl
300)".
4) Removing sda and sdl that connects to the Marvell 88SE9172 at pci
09:00.0, the kernel stops after attaching the eight ST3000DM001 that
connects to the LSI-SAS2008 at pci 01:00:0.
5) Cold start with a kernel without the IOMMU patches starts up fine except
a number of kernel oops related to the Marvell controller complaining about
invalid PCI access from the AMD IOMMU.

Attached is the kernel boot log that's obtained with all HDDs attached and
successfully boot up after a warm reboot and some information on my setup.
Let me know if you need more information/log to help with debuging. 


Best Regards,

Edward Cheung

Motherboard: Gigabyte GA-990FXA-UD5 Revision 1.0. Note that the kernel is
using software IO TLB belief due to broken IVRS table. I am still trying to
find a fix for this. If there's any ivrs kernel boot parameter I should use,
please let me know.
Kernel: 3.14.3 with SCST patches applied.
Base Distro and kernel compile environment: Ubuntu 14.04LTS
Following IOMMU patches applied (Marked with *).
      (*)PCI: Add DMA alias iterator 
      (*)PCI: quirk pci_for_each_dma_alias() 
      (*)PCI: quirk dma_func_alias for Ricoh devices 
      (*)PCI: quirk dma_func_alias for Marvell devices 
      (*)PCI: Quirk pci_for_each_dma_alias() for bridges 
      PCI: Add quirks for ASMedia and Tundra bridges 
      (*)PCI: Consolidate isolation domain code 
      (*)iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups 
      (*)iommu/amd: Update to use PCI DMA aliases 
      iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups 
      iommu/intel: Update to use PCI DMA aliases 
      iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups 
      iommu: Remove pci.h 
      PCI: Remove pci_find_upstream_pcie_bridge() 
      PCI: Remove pci_get_dma_source() 

root@SAN-1:/mnt# lsscsi
[1:0:0:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdb
[1:0:1:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdc
[1:0:2:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdd
[1:0:3:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sde
[1:0:4:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdf
[1:0:5:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdg
[1:0:6:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdh
[1:0:7:0]    disk    ATA      ST3000DM001-1CH1 CC24  /dev/sdi
[9:0:0:0]    disk    ATA      WDC WD6400AAKS-0 01.0  /dev/sda
[10:0:0:0]   disk    ATA      WDC WD6400AAKS-0 01.0  /dev/sdl
[14:0:0:0]   disk    Kingston DataTraveler G2  1.00  /dev/sdj
[16:0:0:0]   disk    ATA      ST3750330NS      SN04  /dev/sdk

root@SAN-1:/mnt# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI
bridge (external gfx0 port B) (rev 02)
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory
Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI
bridge (PCI express gpp port B)
00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI
bridge (PCI express gpp port H)
00:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI
bridge (external gfx1 port A)
00:0b.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI
bridge (NB-SB link)
00:0c.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890S PCI
Express bridge for GPP2 port 1
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller
(rev 42)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0
LPC host controller (rev 40)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI
Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900
PCI to PCI bridge (PCIE port 0)
00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900
PCI to PCI bridge (PCIE port 1)
00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI
bridge (PCIE port 2)
00:15.3 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI
bridge (PCIE port 3)
00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor
Function 5
01:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008
PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
02:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller
(rev 01)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s
Controller (rev 11)
04:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008
PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
05:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI
Express HBA (rev 02)
05:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI
Express HBA (rev 02)
06:06.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Device 515a
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
08:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller
(rev 01)
09:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s
Controller (rev 11)
0a:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA
Controller (rev 01)





--
View this message in context: http://linux-kernel.2935.n7.nabble.com/PATCH-v3-00-15-PCI-iommu-Fix-DMA-alias-problems-tp857369p859747.html
Sent from the Linux Kernel mailing list archive at Nabble.com.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups
  2014-05-10 15:03 ` [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
@ 2014-05-14 10:34   ` Joerg Roedel
  2014-05-14 14:27     ` Alex Williamson
  0 siblings, 1 reply; 22+ messages in thread
From: Joerg Roedel @ 2014-05-14 10:34 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-pci, iommu, bhelgaas, acooks, linux-kernel, linux

On Sat, May 10, 2014 at 09:03:11AM -0600, Alex Williamson wrote:
> The expectation is that the kernel and IVRS will produce the same
> result for topology based aliases while the kernel will also include
> device specific DMA quirks.

Is that expectation really true? There are PCIe devices out there that
don't use their own device-id for requests but another one that isn't
even visible as a PCI device (so the kernel has no pci_dev structure for
it). The IVRS table contains such information, but I am not sure whether
the PCI core finds the right requestor-id for those devices.

I've seen this on PCIe cards that where the vendor just used an PCI-X
chip with a PCIe-to-PCI-X bridge on the card. The PCI-X device is
visible for the OS but uses the requestor-id of the invisible bridge.


	Joerg



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups
  2014-05-14 10:34   ` Joerg Roedel
@ 2014-05-14 14:27     ` Alex Williamson
  0 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-14 14:27 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: linux-pci, iommu, bhelgaas, acooks, linux-kernel, linux

On Wed, 2014-05-14 at 12:34 +0200, Joerg Roedel wrote:
> On Sat, May 10, 2014 at 09:03:11AM -0600, Alex Williamson wrote:
> > The expectation is that the kernel and IVRS will produce the same
> > result for topology based aliases while the kernel will also include
> > device specific DMA quirks.
> 
> Is that expectation really true? There are PCIe devices out there that
> don't use their own device-id for requests but another one that isn't
> even visible as a PCI device (so the kernel has no pci_dev structure for
> it). The IVRS table contains such information, but I am not sure whether
> the PCI core finds the right requestor-id for those devices.
> 
> I've seen this on PCIe cards that where the vendor just used an PCI-X
> chip with a PCIe-to-PCI-X bridge on the card. The PCI-X device is
> visible for the OS but uses the requestor-id of the invisible bridge.

If we rely on the IVRS, then the set of devices with quirked aliases is
fixed by the platform vendor.  More often than not, I think that proves
to be ineffective.  I personally have a 990FX box with a single function
Marvell SATA controller that uses function 1 as the requester ID.  It's
running the latest BIOS and the IVRS isn't helping.  With this series,
we have the ability to add quirks to the kernel to fix those kinds of
issues for both AMD-Vi, VT-d, and any other IOMMU.

Here I've chosen that we get more value from using shared code so we
don't process IOMMU groups in a unique way for AMD-Vi.  Patch 09/15 uses
the PCI core alias when the IVRS doesn't provide one, perhaps a
compromise would be to also do the reverse and update dma_func_alias on
the device when the IVRS knows of a device quirk that the kernel
doesn't.  Then we could still use the common IOMMU group code and print
something to dmesg so that we can update the kernel quirks.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems
  2014-05-13 22:35 ` eddy0596
@ 2014-05-14 17:18   ` Alex Williamson
  0 siblings, 0 replies; 22+ messages in thread
From: Alex Williamson @ 2014-05-14 17:18 UTC (permalink / raw)
  To: eddy0596; +Cc: linux-kernel, iommu, Bjorn Helgaas, linux-pci

[+cc original lists]

Hi Edward,

On Tue, 2014-05-13 at 15:35 -0700, eddy0596 wrote:
> Hello Alex,
> 
> Thanks for working on a fix on this long standing issue. I have applied the
> amd portion of the IOMMU patches against the 3.14.3 kernel and found the
> followings:
> 1) The computer would not boot up if it's from a cold start. The kernel log
> shows that it hangs at the point the kernel attempt to attach the scsi disk
> [sdk] that connects to the LSI-SAS2008 controller at pci 04:00.0. I can use
> Ctrl+Alt+Del to reboot the computer. So, I guess the kernel didn't "hang"
> and I don't see any oops either.
> 2) After a warm reboot with Ctrl+Alt+Del, the kernel will boot up fine. And,
> the Marvell controller behaves properly (More stress test needed) and so as
> the two LSI-SAS2008. A warm reboot after a hard reset at BIOS prompt will
> also boot up fine.

Both of these indicate that the hand-off state of the system is
different between a warm an cold reset.  Can you capture the boot
messages (serial console or netconsole) of each case and add the
pci=earlydump option so we can compare the PCI state?

> 3) Removing sdk and perform a cold reboot, the kernel stops after attaching
> all the ST3000DM001 harddisks that connects to the LSI-SAS2008 at pci
> 01:00:0. The kernel stops at "ata12: SATA link down (SStatus 0 SControl
> 300)".
> 4) Removing sda and sdl that connects to the Marvell 88SE9172 at pci
> 09:00.0, the kernel stops after attaching the eight ST3000DM001 that
> connects to the LSI-SAS2008 at pci 01:00:0.

So it's not an issue with those specific disks.  Is it possible to
remove or disable the controller in the BIOS to further isolate?

> 5) Cold start with a kernel without the IOMMU patches starts up fine except
> a number of kernel oops related to the Marvell controller complaining about
> invalid PCI access from the AMD IOMMU.

Is this kernel built from the same source tree as below without the
indicated IOMMU patches applied?

> Attached is the kernel boot log that's obtained with all HDDs attached and
> successfully boot up after a warm reboot and some information on my setup.
> Let me know if you need more information/log to help with debuging. 

The mailing list doesn't like attachments, but it was included in the
re-send to me where it was inline.  An unsuccessful boot log is probably
the most interesting, preferably with the pci=earlydump option (and
continue to use the amd_iommu_dump option as well).  Also, what happens
with amd_iommu=off?  If we're not getting any IOMMU faults, it seems
like the patches are doing their job and I'm at a bit of a loss to
understand how it would fail only on a cold boot.

It might also be useful to test the branch provided in case there's an
issue with backporting the patches to 3.14.

> Best Regards,
> 
> Edward Cheung
> 
> Motherboard: Gigabyte GA-990FXA-UD5 Revision 1.0. Note that the kernel is
> using software IO TLB belief due to broken IVRS table. I am still trying to
> find a fix for this.

What brings you to the conclusion that the IVRS table is broken?  IIRC,
AMD-Vi initializes the swiotlb to support pasthrough devices that can
only do 32bit DMA... or something like that.  So I don't think it's
unusual to see it initialized alongside AMD IOMMU.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems
       [not found] <400175834.3289.65.camel@ul30vt.home>
@ 2014-05-16  6:28 ` colin
  0 siblings, 0 replies; 22+ messages in thread
From: colin @ 2014-05-16  6:28 UTC (permalink / raw)
  To: alex.williamson, iommu, linux-pci; +Cc: acooks, bhelgaas, linux-kernel, linux

Alex Williamson wrote:
> Wow, I didn't think that kind of broken was possible.  Maybe instead of
> a bitmap of function aliases we could have a single devfn alias for a
> device.  That means we'd only be able to support a single alias for a
> device, but since I don't think we've seen devices that use more than a
> single alias, maybe that's ok.

In my (never finished) patch set for the same problem, the first thing
I did was

diff --git a/include/linux/pci.h b/include/linux/pci.h
index a13d6825..7788870a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -251,12 +251,13 @@ struct pci_dev {
 	struct proc_dir_entry *procent;	/* device entry in /proc/bus/pci */
 	struct pci_slot	*slot;		/* Physical slot this device is in */
 
-	unsigned int	devfn;		/* encoded device & function index */
 	unsigned short	vendor;
 	unsigned short	device;
 	unsigned short	subsystem_vendor;
 	unsigned short	subsystem_device;
 	unsigned int	class;		/* 3 bytes: (base,sub,prog-if) */
+	u8		devfn;		/* encoded device & function index */
+	u8		devfn_quirk;	/* zero is non-quirky */
 	u8		revision;	/* PCI revision, low byte of class word */
 	u8		hdr_type;	/* PCI header type (`multi' flag masked out) */
 	u8		pcie_cap;	/* PCIe capability offset */

I encoded "devfn_quirk" as a delta to devfn, so that zero would mean
"no quirk", and no existing intialization would need changing.

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-05-16  6:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-10 15:02 [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems Alex Williamson
2014-05-10 15:02 ` [PATCH v3 01/15] PCI: Add DMA alias iterator Alex Williamson
2014-05-10 15:02 ` [PATCH v3 02/15] PCI: quirk pci_for_each_dma_alias() Alex Williamson
2014-05-10 15:02 ` [PATCH v3 03/15] PCI: quirk dma_func_alias for Ricoh devices Alex Williamson
2014-05-10 15:02 ` [PATCH v3 04/15] PCI: quirk dma_func_alias for Marvell devices Alex Williamson
2014-05-10 15:02 ` [PATCH v3 05/15] PCI: Quirk pci_for_each_dma_alias() for bridges Alex Williamson
2014-05-10 15:02 ` [PATCH v3 06/15] PCI: Add quirks for ASMedia and Tundra bridges Alex Williamson
2014-05-10 15:03 ` [PATCH v3 07/15] PCI: Consolidate isolation domain code Alex Williamson
2014-05-10 15:03 ` [PATCH v3 08/15] iommu/amd: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
2014-05-14 10:34   ` Joerg Roedel
2014-05-14 14:27     ` Alex Williamson
2014-05-10 15:03 ` [PATCH v3 09/15] iommu/amd: Update to use PCI DMA aliases Alex Williamson
2014-05-10 15:03 ` [PATCH v3 10/15] iommu/intel: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
2014-05-10 15:03 ` [PATCH v3 11/15] iommu/intel: Update to use PCI DMA aliases Alex Williamson
2014-05-10 15:03 ` [PATCH v3 12/15] iommu/fsl: Use pci_find_dma_isolation_root() for IOMMU groups Alex Williamson
2014-05-10 15:03 ` [PATCH v3 13/15] iommu: Remove pci.h Alex Williamson
2014-05-10 15:03 ` [PATCH v3 14/15] PCI: Remove pci_find_upstream_pcie_bridge() Alex Williamson
2014-05-10 15:03 ` [PATCH v3 15/15] PCI: Remove pci_get_dma_source() Alex Williamson
2014-05-10 16:51 ` [PATCH v3 00/15] PCI/iommu: Fix DMA alias problems George Spelvin
2014-05-13 22:35 ` eddy0596
2014-05-14 17:18   ` Alex Williamson
     [not found] <400175834.3289.65.camel@ul30vt.home>
2014-05-16  6:28 ` colin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).