linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features
@ 2014-01-20 22:01 Alex Williamson
  2014-01-20 22:01 ` [PATCH 1/2] pci: Add device specific PCI ACS enable Alex Williamson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alex Williamson @ 2014-01-20 22:01 UTC (permalink / raw)
  To: bhelgaas, linux-pci; +Cc: linux-kernel

As described in 2/2 many Intel root ports lack PCIe ACS capabilities
which results in excessively large IOMMU groups.  Many of these root
ports do provide isolation capabilities, we just need to use device
specific mechanisms to enable and verify.  Long term, I hope we can
round out this list (particularly to include X79 root ports) and
more importantly, encourage proper PCIe ACS support in future
products.  I'm really hoping we can get this in during the 3.14 cycle.
Thanks,

Alex

---

Alex Williamson (2):
      pci: Add device specific PCI ACS enable
      pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports


 drivers/pci/pci.c    |   26 +++++-
 drivers/pci/quirks.c |  201 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h  |    2 
 3 files changed, 223 insertions(+), 6 deletions(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] pci: Add device specific PCI ACS enable
  2014-01-20 22:01 [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Alex Williamson
@ 2014-01-20 22:01 ` Alex Williamson
  2014-01-20 22:01 ` [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports Alex Williamson
  2014-01-31 18:49 ` [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Bjorn Helgaas
  2 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2014-01-20 22:01 UTC (permalink / raw)
  To: bhelgaas, linux-pci; +Cc: linux-kernel

Some devices support PCI ACS-like features, but don't report it using
the standard PCIe capabilities.  We already provide hooks for device
specific testing of ACS, but not for device specific enabling of ACS.
This provides that setup hook.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/pci.c    |   26 ++++++++++++++++++++------
 drivers/pci/quirks.c |   25 +++++++++++++++++++++++++
 include/linux/pci.h  |    2 ++
 3 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 07369f3..a57ff7f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2357,21 +2357,18 @@ void pci_request_acs(void)
 }
 
 /**
- * pci_enable_acs - enable ACS if hardware support it
+ * pci_std_enable_acs - enable ACS on devices using standard ACS capabilites
  * @dev: the PCI device
  */
-void pci_enable_acs(struct pci_dev *dev)
+static int pci_std_enable_acs(struct pci_dev *dev)
 {
 	int pos;
 	u16 cap;
 	u16 ctrl;
 
-	if (!pci_acs_enable)
-		return;
-
 	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
 	if (!pos)
-		return;
+		return -ENODEV;
 
 	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
 	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
@@ -2389,6 +2386,23 @@ void pci_enable_acs(struct pci_dev *dev)
 	ctrl |= (cap & PCI_ACS_UF);
 
 	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+
+	return 0;
+}
+
+/**
+ * pci_enable_acs - enable ACS if hardware support it
+ * @dev: the PCI device
+ */
+void pci_enable_acs(struct pci_dev *dev)
+{
+	if (!pci_acs_enable)
+		return;
+
+	if (!pci_std_enable_acs(dev))
+		return;
+
+	pci_dev_specific_enable_acs(dev);
 }
 
 static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 3a02717..4be1cd5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3461,3 +3461,28 @@ int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags)
 
 	return -ENOTTY;
 }
+
+static const struct pci_dev_enable_acs {
+	u16 vendor;
+	u16 device;
+	int (*enable_acs)(struct pci_dev *dev);
+} pci_dev_enable_acs[] = {
+	{ 0 }
+};
+
+void pci_dev_specific_enable_acs(struct pci_dev *dev)
+{
+	const struct pci_dev_enable_acs *i;
+	int ret;
+
+	for (i = pci_dev_enable_acs; i->enable_acs; i++) {
+		if ((i->vendor == dev->vendor ||
+		     i->vendor == (u16)PCI_ANY_ID) &&
+		    (i->device == dev->device ||
+		     i->device == (u16)PCI_ANY_ID)) {
+			ret = i->enable_acs(dev);
+			if (ret >= 0)
+				return;
+		}
+	}
+}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a13d682..7905fd8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1633,6 +1633,7 @@ enum pci_fixup_pass {
 void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
 struct pci_dev *pci_get_dma_source(struct pci_dev *dev);
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
+void pci_dev_specific_enable_acs(struct pci_dev *dev);
 #else
 static inline void pci_fixup_device(enum pci_fixup_pass pass,
 				    struct pci_dev *dev) {}
@@ -1645,6 +1646,7 @@ static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
 {
 	return -ENOTTY;
 }
+static inline void pci_dev_specific_enable_acs(struct pci_dev *dev) {}
 #endif
 
 void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen);


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports
  2014-01-20 22:01 [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Alex Williamson
  2014-01-20 22:01 ` [PATCH 1/2] pci: Add device specific PCI ACS enable Alex Williamson
@ 2014-01-20 22:01 ` Alex Williamson
  2014-01-31 19:26   ` Bjorn Helgaas
  2014-01-31 18:49 ` [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Bjorn Helgaas
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2014-01-20 22:01 UTC (permalink / raw)
  To: bhelgaas, linux-pci; +Cc: linux-kernel

Many of the currently available Intel PCH-based root ports do not
provide PCIe ACS capabilities.  Without this, we must assume that
peer-to-peer traffic between multifunction root ports and between
devices behind root ports is possible.  This lack of isolation is
exposed by grouping the devices together in the same IOMMU group.
If we want to expose these devices to userspace, vfio uses IOMMU
groups as the unit of ownership, thus making it very difficult to
assign individual devices to separate users.

The good news is that the chipset does provide ACS-like isolation
capabilities, but we do need to verify and enable those capabilities
if the BIOS has not done so.  This patch implements the device
specific enabling and testing of equivalent ACS function for these
devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |  176 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4be1cd5..3439a02 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -24,6 +24,7 @@
 #include <linux/ioport.h>
 #include <linux/sched.h>
 #include <linux/ktime.h>
+#include <linux/atomic.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -3423,6 +3424,74 @@ static int pci_quirk_amd_sb_acs(struct pci_dev *dev, u16 acs_flags)
 #endif
 }
 
+/*
+ * Many Intel PCH root ports do provide ACS-like features to disable peer
+ * transactions and validate bus numbers in requests, but do not provide an
+ * actual PCIe ACS capability.  This is the list of device IDs known to fall
+ * into that category as provided by Intel in Red Hat bugzilla 1037684.
+ */
+static const u16 pci_quirk_intel_pch_acs_ids[] = {
+	/* Ibexpeak PCH */
+	0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49,
+	0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51,
+	/* Cougarpoint PCH */
+	0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17,
+	0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f,
+	/* Pantherpoint PCH */
+	0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17,
+	0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f,
+	/* Lynxpoint-H PCH */
+	0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17,
+	0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f,
+	/* Lynxpoint-LP PCH */
+	0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17,
+	0x9c18, 0x9c19, 0x9c1a, 0x9c1b,
+	/* Wildcat PCH */
+	0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97,
+	0x9c98, 0x9c99, 0x9c9a, 0x9c9b,
+};
+
+/*
+ * We can have multiple PCHs in a system and it is possible that we can
+ * fail to enable ACS support on some of them.  To keep it simple we
+ * handle them as all or nothing with the following values:
+ *  0: ACS is not enabled
+ * -1: ACS enable failed
+ * >0: ACS enabled
+ */
+static atomic_t intel_pch_acs_quirk_status = ATOMIC_INIT(0);
+
+#define INTEL_PCH_ACS_FLAGS (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV)
+
+static bool pci_quirk_intel_pch_acs_match(struct pci_dev *dev)
+{
+	int i;
+
+	/* Filter out a few obvious non-matches first */
+	if (!pci_is_pcie(dev) ||
+	    pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT || dev->bus->number)
+		return false;
+
+	for (i = 0; i < ARRAY_SIZE(pci_quirk_intel_pch_acs_ids); i++)
+		if (pci_quirk_intel_pch_acs_ids[i] == dev->device)
+			return true;
+
+	return false;
+}
+
+static int pci_quirk_intel_pch_acs(struct pci_dev *dev, u16 acs_flags)
+{
+	u16 flags = 0;
+
+	if (!pci_quirk_intel_pch_acs_match(dev))
+		return -ENOTTY;
+
+	if (atomic_read(&intel_pch_acs_quirk_status) > 0)
+		flags = INTEL_PCH_ACS_FLAGS;
+
+	return acs_flags & ~flags ? 0 : 1;
+}
+
 static const struct pci_dev_acs_enabled {
 	u16 vendor;
 	u16 device;
@@ -3434,6 +3503,7 @@ static const struct pci_dev_acs_enabled {
 	{ PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs },
 	{ PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs },
 	{ PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs },
+	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs },
 	{ 0 }
 };
 
@@ -3462,11 +3532,117 @@ int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags)
 	return -ENOTTY;
 }
 
+/* Config space offset of Root Complex Base Address register */
+#define INTEL_LPC_RCBA_REG 0xf0
+/* 31:14 RCBA address */
+#define INTEL_LPC_RCBA_MASK 0xffffc000
+/* RCBA Enable */
+#define INTEL_LPC_RCBA_ENABLE (1 << 0)
+
+/* Backbone Scratch Pad Register */
+#define INTEL_BSPR_REG 0x1104
+/* Backbone Peer Non-Posted Disable */
+#define INTEL_BSPR_REG_BPNPD (1 << 8)
+/* Backbone Peer Posted Disable */
+#define INTEL_BSPR_REG_BPPD  (1 << 9)
+
+/* Upstream Peer Decode Configuration Register */
+#define INTEL_UPDCR_REG 0x1114
+/* 5:0 Peer Decode Enable bits */
+#define INTEL_UPDCR_REG_MASK 0x3f
+
+static int pci_quirk_enable_intel_lpc_acs(struct pci_dev *dev)
+{
+	u32 rcba, bspr, updcr;
+	void __iomem *rcba_mem;
+
+	/*
+	 * Read the RCBA register from the LPC (D31:F0).  PCH root ports
+	 * are D28:F* and therefore get probed before LPC, thus we can't
+	 * use pci_get_slot/pci_read_config_dword here.
+	 */
+	pci_bus_read_config_dword(dev->bus, PCI_DEVFN(31, 0),
+				  INTEL_LPC_RCBA_REG, &rcba);
+	if (!(rcba & INTEL_LPC_RCBA_ENABLE))
+		return -EINVAL;
+
+	rcba_mem = ioremap_nocache(rcba & INTEL_LPC_RCBA_MASK,
+				   PAGE_ALIGN(INTEL_UPDCR_REG));
+	if (!rcba_mem)
+		return -ENOMEM;
+
+	/*
+	 * The BSPR can disallow peer cycles, but it's set by soft strap and
+	 * therefore read-only.  If both posted and non-posted peer cycles are
+	 * disallowed, we're ok.  If either are allowed, then we need to use
+	 * the UPDCR to disable peer decodes for each port.  This provides the
+	 * PCIe ACS equivalent of PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF
+	 */
+	bspr = readl(rcba_mem + INTEL_BSPR_REG);
+	bspr &= INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD;
+	if (bspr != (INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD)) {
+		updcr = readl(rcba_mem + INTEL_UPDCR_REG);
+		if (updcr & INTEL_UPDCR_REG_MASK) {
+			dev_info(&dev->dev, "Disabling UPDCR peer decodes\n");
+			updcr &= ~INTEL_UPDCR_REG_MASK;
+			writel(updcr, rcba_mem + INTEL_UPDCR_REG);
+		}
+	}
+
+	iounmap(rcba_mem);
+	return 0;
+}
+
+/* Miscellaneous Port Configuration register */
+#define INTEL_MPC_REG 0xd8
+/* MPC: Invalid Receive Bus Number Check Enable */
+#define INTEL_MPC_REG_IRBNCE (1 << 26)
+
+static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
+{
+	u32 mpc;
+
+	/*
+	 * When enabled, the IRBNCE bit of the MPC register enables the
+	 * equivalent of PCI ACS Source Validation (PCI_ACS_SV), which
+	 * ensures that requester IDs fall within the bus number range
+	 * of the bridge.  Enable if not already.
+	 */
+	pci_read_config_dword(dev, INTEL_MPC_REG, &mpc);
+	if (!(mpc & INTEL_MPC_REG_IRBNCE)) {
+		dev_info(&dev->dev, "Enabling MPC IRBNCE\n");
+		mpc |= INTEL_MPC_REG_IRBNCE;
+		pci_write_config_word(dev, INTEL_MPC_REG, mpc);
+	}
+}
+
+static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
+{
+	if (!pci_quirk_intel_pch_acs_match(dev))
+		return -ENOTTY;
+
+	if (atomic_read(&intel_pch_acs_quirk_status) < 0)
+		return 0;
+
+	if (pci_quirk_enable_intel_lpc_acs(dev)) {
+		dev_warn(&dev->dev, "Failed to enable Intel PCH ACS quirk\n");
+		atomic_set(&intel_pch_acs_quirk_status, -1);
+		return 0;
+	}
+
+	pci_quirk_enable_intel_rp_mpc_acs(dev);
+
+	atomic_inc_unless_negative(&intel_pch_acs_quirk_status);
+
+	return 0;
+}
+
 static const struct pci_dev_enable_acs {
 	u16 vendor;
 	u16 device;
 	int (*enable_acs)(struct pci_dev *dev);
 } pci_dev_enable_acs[] = {
+	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_enable_intel_pch_acs },
 	{ 0 }
 };
 


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features
  2014-01-20 22:01 [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Alex Williamson
  2014-01-20 22:01 ` [PATCH 1/2] pci: Add device specific PCI ACS enable Alex Williamson
  2014-01-20 22:01 ` [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports Alex Williamson
@ 2014-01-31 18:49 ` Bjorn Helgaas
  2014-01-31 19:25   ` Alex Williamson
  2 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2014-01-31 18:49 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-pci, linux-kernel

On Mon, Jan 20, 2014 at 03:01:40PM -0700, Alex Williamson wrote:
> As described in 2/2 many Intel root ports lack PCIe ACS capabilities
> which results in excessively large IOMMU groups.  Many of these root
> ports do provide isolation capabilities, we just need to use device
> specific mechanisms to enable and verify.  Long term, I hope we can
> round out this list (particularly to include X79 root ports) and
> more importantly, encourage proper PCIe ACS support in future
> products.  I'm really hoping we can get this in during the 3.14 cycle.

v3.13 was released Jan 19, so this came during the v3.14 merge window.  I
like to have things in -next for a while before asking Linus to pull them,
and I try to keep it to regression fixes after the merge window, i.e.,
things that used to work, but don't work any more.  But that's all standard
procedure that you already know, and maybe you can make a case for
accelerating this.

Bjorn

> Thanks,
> 
> Alex
> 
> ---
> 
> Alex Williamson (2):
>       pci: Add device specific PCI ACS enable
>       pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports
> 
> 
>  drivers/pci/pci.c    |   26 +++++-
>  drivers/pci/quirks.c |  201 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h  |    2 
>  3 files changed, 223 insertions(+), 6 deletions(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features
  2014-01-31 18:49 ` [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Bjorn Helgaas
@ 2014-01-31 19:25   ` Alex Williamson
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2014-01-31 19:25 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel

On Fri, 2014-01-31 at 11:49 -0700, Bjorn Helgaas wrote:
> On Mon, Jan 20, 2014 at 03:01:40PM -0700, Alex Williamson wrote:
> > As described in 2/2 many Intel root ports lack PCIe ACS capabilities
> > which results in excessively large IOMMU groups.  Many of these root
> > ports do provide isolation capabilities, we just need to use device
> > specific mechanisms to enable and verify.  Long term, I hope we can
> > round out this list (particularly to include X79 root ports) and
> > more importantly, encourage proper PCIe ACS support in future
> > products.  I'm really hoping we can get this in during the 3.14 cycle.
> 
> v3.13 was released Jan 19, so this came during the v3.14 merge window.  I
> like to have things in -next for a while before asking Linus to pull them,
> and I try to keep it to regression fixes after the merge window, i.e.,
> things that used to work, but don't work any more.  But that's all standard
> procedure that you already know, and maybe you can make a case for
> accelerating this.

Sorry, I sent it out as early as I was able to.  It's my understanding
that fixes are always allowed after the merge window and we can
generally assume that 3.14 won't be released for at least 6-8 weeks
after rc1, which gives us plenty of time to let this cook in -next and
still make 3.14.  Obviously it's your prerogative as maintainer where
you feel comfortable.

So what does this fix and how close is it to a regression?  The fix is a
quirk for hardware that lacks proper ACS support, but still provides
device isolation when properly configured.  What that means to a user is
that if they attempt to use vfio to expose a device to a userspace
driver such as QEMU, the device may be artificially tied to other
devices behind the root port and devices behind root ports that are part
of the same multifunction PCH package.  In effect, there are devices
that are isolated or isolate-able that users should be able to use
independently of other devices, but they can't.  I fully acknowledge
that this use case is a small, but important to me, subset of users.

As to a regression, the only remote software regression is that legacy
KVM device assignment takes a much more lax (non-existent) approach to
kernel-base isolation and allows such assignments.  That's a weak
regression, but if you're a downstream trying to switch users over to
vfio-based device assignment, it's an important one.  In some respects
there's also a hardware regression.  Intel root ports used to support
ACS and for whatever reasons they forgot how critical ACS is in
determining isolation sets for exposing devices to VMs.  We obviously
can't go back and fix the existing hardware, but we can fix this
regression for many of those chipsets in software (and beat Intel to
include it in next generation hardware).

I hope you'll consider it for 3.14, I know a number of users who
continue to patch their kernel with the old ACS override patch who would
appreciate it sooner than later, but I fully understand wanting some
soak time in -next first.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports
  2014-01-20 22:01 ` [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports Alex Williamson
@ 2014-01-31 19:26   ` Bjorn Helgaas
  2014-01-31 20:06     ` Alex Williamson
  0 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2014-01-31 19:26 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-pci, linux-kernel

On Mon, Jan 20, 2014 at 03:01:52PM -0700, Alex Williamson wrote:
> Many of the currently available Intel PCH-based root ports do not
> provide PCIe ACS capabilities.  Without this, we must assume that
> peer-to-peer traffic between multifunction root ports and between
> devices behind root ports is possible.  This lack of isolation is
> exposed by grouping the devices together in the same IOMMU group.
> If we want to expose these devices to userspace, vfio uses IOMMU
> groups as the unit of ownership, thus making it very difficult to
> assign individual devices to separate users.
> 
> The good news is that the chipset does provide ACS-like isolation
> capabilities, but we do need to verify and enable those capabilities
> if the BIOS has not done so.  This patch implements the device
> specific enabling and testing of equivalent ACS function for these
> devices.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/quirks.c |  176 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 176 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4be1cd5..3439a02 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -24,6 +24,7 @@
>  #include <linux/ioport.h>
>  #include <linux/sched.h>
>  #include <linux/ktime.h>
> +#include <linux/atomic.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
>  
> @@ -3423,6 +3424,74 @@ static int pci_quirk_amd_sb_acs(struct pci_dev *dev, u16 acs_flags)
>  #endif
>  }
>  
> +/*
> + * Many Intel PCH root ports do provide ACS-like features to disable peer
> + * transactions and validate bus numbers in requests, but do not provide an
> + * actual PCIe ACS capability.  This is the list of device IDs known to fall
> + * into that category as provided by Intel in Red Hat bugzilla 1037684.

Is there any documentation for this?  I'd feel better if we had a public
statement from Intel that "these devices support ACS and this is how you do
it."  The standard PCIe ACS capability is an explicit statement that the
vendor intends to support the feature as documented in the spec.  If we put
in undocumented quirks, we may end up relying on something the vendor has
not qualified and does not intend to actually support.

I see the device ID list in the RH bugzilla, of course, but I don't see the
name of anybody in Intel who stands behind the list and the knobs used in
this patch.  I expect you'll remind me that we don't have any better
documentation for 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge
devices"), which is true.  Mea culpa.

> + */
> +static const u16 pci_quirk_intel_pch_acs_ids[] = {
> +	/* Ibexpeak PCH */
> +	0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49,
> +	0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51,
> +	/* Cougarpoint PCH */
> +	0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17,
> +	0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f,
> +	/* Pantherpoint PCH */
> +	0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17,
> +	0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f,
> +	/* Lynxpoint-H PCH */
> +	0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17,
> +	0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f,
> +	/* Lynxpoint-LP PCH */
> +	0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17,
> +	0x9c18, 0x9c19, 0x9c1a, 0x9c1b,
> +	/* Wildcat PCH */
> +	0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97,
> +	0x9c98, 0x9c99, 0x9c9a, 0x9c9b,
> +};
> +
> +/*
> + * We can have multiple PCHs in a system and it is possible that we can
> + * fail to enable ACS support on some of them.  To keep it simple we
> + * handle them as all or nothing with the following values:
> + *  0: ACS is not enabled
> + * -1: ACS enable failed
> + * >0: ACS enabled
> + */
> +static atomic_t intel_pch_acs_quirk_status = ATOMIC_INIT(0);
> +
> +#define INTEL_PCH_ACS_FLAGS (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV)
> +
> +static bool pci_quirk_intel_pch_acs_match(struct pci_dev *dev)
> +{
> +	int i;
> +
> +	/* Filter out a few obvious non-matches first */
> +	if (!pci_is_pcie(dev) ||
> +	    pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT || dev->bus->number)

Why the bus number test?  Does this work correctly on systems like this?

  ACPI: PCI Root Bridge [PCI0] (0000:00)
  ACPI: PCI Root Bridge [PCI1] (0000:80)
  pci 0000:00:01.0: [8086:3c02] type 01 class 0x060400
  pci 0000:80:01.0: [8086:3c02] type 01 class 0x060400

  00:01.0 PCI bridge: Intel Sandy Bridge IIO PCI Express Root Port 1a
  80:01.0 PCI bridge: Intel Sandy Bridge IIO PCI Express Root Port 1a

> +		return false;
> +
> +	for (i = 0; i < ARRAY_SIZE(pci_quirk_intel_pch_acs_ids); i++)
> +		if (pci_quirk_intel_pch_acs_ids[i] == dev->device)
> +			return true;
> +
> +	return false;
> +}
> +
> +static int pci_quirk_intel_pch_acs(struct pci_dev *dev, u16 acs_flags)
> +{
> +	u16 flags = 0;
> +
> +	if (!pci_quirk_intel_pch_acs_match(dev))
> +		return -ENOTTY;
> +
> +	if (atomic_read(&intel_pch_acs_quirk_status) > 0)
> +		flags = INTEL_PCH_ACS_FLAGS;
> +
> +	return acs_flags & ~flags ? 0 : 1;
> +}
> +
>  static const struct pci_dev_acs_enabled {
>  	u16 vendor;
>  	u16 device;
> @@ -3434,6 +3503,7 @@ static const struct pci_dev_acs_enabled {
>  	{ PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs },
>  	{ PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs },
>  	{ PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs },
> +	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs },
>  	{ 0 }
>  };
>  
> @@ -3462,11 +3532,117 @@ int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags)
>  	return -ENOTTY;
>  }
>  
> +/* Config space offset of Root Complex Base Address register */
> +#define INTEL_LPC_RCBA_REG 0xf0
> +/* 31:14 RCBA address */
> +#define INTEL_LPC_RCBA_MASK 0xffffc000
> +/* RCBA Enable */
> +#define INTEL_LPC_RCBA_ENABLE (1 << 0)
> +
> +/* Backbone Scratch Pad Register */
> +#define INTEL_BSPR_REG 0x1104
> +/* Backbone Peer Non-Posted Disable */
> +#define INTEL_BSPR_REG_BPNPD (1 << 8)
> +/* Backbone Peer Posted Disable */
> +#define INTEL_BSPR_REG_BPPD  (1 << 9)
> +
> +/* Upstream Peer Decode Configuration Register */
> +#define INTEL_UPDCR_REG 0x1114
> +/* 5:0 Peer Decode Enable bits */
> +#define INTEL_UPDCR_REG_MASK 0x3f
> +
> +static int pci_quirk_enable_intel_lpc_acs(struct pci_dev *dev)
> +{
> +	u32 rcba, bspr, updcr;
> +	void __iomem *rcba_mem;
> +
> +	/*
> +	 * Read the RCBA register from the LPC (D31:F0).  PCH root ports
> +	 * are D28:F* and therefore get probed before LPC, thus we can't
> +	 * use pci_get_slot/pci_read_config_dword here.
> +	 */
> +	pci_bus_read_config_dword(dev->bus, PCI_DEVFN(31, 0),
> +				  INTEL_LPC_RCBA_REG, &rcba);
> +	if (!(rcba & INTEL_LPC_RCBA_ENABLE))
> +		return -EINVAL;
> +
> +	rcba_mem = ioremap_nocache(rcba & INTEL_LPC_RCBA_MASK,
> +				   PAGE_ALIGN(INTEL_UPDCR_REG));
> +	if (!rcba_mem)
> +		return -ENOMEM;
> +
> +	/*
> +	 * The BSPR can disallow peer cycles, but it's set by soft strap and
> +	 * therefore read-only.  If both posted and non-posted peer cycles are
> +	 * disallowed, we're ok.  If either are allowed, then we need to use
> +	 * the UPDCR to disable peer decodes for each port.  This provides the
> +	 * PCIe ACS equivalent of PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF
> +	 */
> +	bspr = readl(rcba_mem + INTEL_BSPR_REG);
> +	bspr &= INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD;
> +	if (bspr != (INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD)) {
> +		updcr = readl(rcba_mem + INTEL_UPDCR_REG);
> +		if (updcr & INTEL_UPDCR_REG_MASK) {
> +			dev_info(&dev->dev, "Disabling UPDCR peer decodes\n");
> +			updcr &= ~INTEL_UPDCR_REG_MASK;
> +			writel(updcr, rcba_mem + INTEL_UPDCR_REG);
> +		}
> +	}
> +
> +	iounmap(rcba_mem);
> +	return 0;
> +}
> +
> +/* Miscellaneous Port Configuration register */
> +#define INTEL_MPC_REG 0xd8
> +/* MPC: Invalid Receive Bus Number Check Enable */
> +#define INTEL_MPC_REG_IRBNCE (1 << 26)
> +
> +static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> +{
> +	u32 mpc;
> +
> +	/*
> +	 * When enabled, the IRBNCE bit of the MPC register enables the
> +	 * equivalent of PCI ACS Source Validation (PCI_ACS_SV), which
> +	 * ensures that requester IDs fall within the bus number range
> +	 * of the bridge.  Enable if not already.
> +	 */
> +	pci_read_config_dword(dev, INTEL_MPC_REG, &mpc);
> +	if (!(mpc & INTEL_MPC_REG_IRBNCE)) {
> +		dev_info(&dev->dev, "Enabling MPC IRBNCE\n");
> +		mpc |= INTEL_MPC_REG_IRBNCE;
> +		pci_write_config_word(dev, INTEL_MPC_REG, mpc);
> +	}
> +}
> +
> +static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> +{
> +	if (!pci_quirk_intel_pch_acs_match(dev))
> +		return -ENOTTY;
> +
> +	if (atomic_read(&intel_pch_acs_quirk_status) < 0)
> +		return 0;
> +
> +	if (pci_quirk_enable_intel_lpc_acs(dev)) {
> +		dev_warn(&dev->dev, "Failed to enable Intel PCH ACS quirk\n");
> +		atomic_set(&intel_pch_acs_quirk_status, -1);

This means we handle hot-added Root Ports differently than those present at
boot.  If the system supports ACS, then we hot-add a Root Port that doesn't
support ACS, then remove it, I think the system (with original
configuration) no longer supports ACS.

> +		return 0;
> +	}
> +
> +	pci_quirk_enable_intel_rp_mpc_acs(dev);
> +
> +	atomic_inc_unless_negative(&intel_pch_acs_quirk_status);
> +
> +	return 0;
> +}
> +
>  static const struct pci_dev_enable_acs {
>  	u16 vendor;
>  	u16 device;
>  	int (*enable_acs)(struct pci_dev *dev);
>  } pci_dev_enable_acs[] = {
> +	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_enable_intel_pch_acs },
>  	{ 0 }
>  };
>  
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports
  2014-01-31 19:26   ` Bjorn Helgaas
@ 2014-01-31 20:06     ` Alex Williamson
  2014-01-31 23:25       ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2014-01-31 20:06 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel

On Fri, 2014-01-31 at 12:26 -0700, Bjorn Helgaas wrote:
> On Mon, Jan 20, 2014 at 03:01:52PM -0700, Alex Williamson wrote:
> > Many of the currently available Intel PCH-based root ports do not
> > provide PCIe ACS capabilities.  Without this, we must assume that
> > peer-to-peer traffic between multifunction root ports and between
> > devices behind root ports is possible.  This lack of isolation is
> > exposed by grouping the devices together in the same IOMMU group.
> > If we want to expose these devices to userspace, vfio uses IOMMU
> > groups as the unit of ownership, thus making it very difficult to
> > assign individual devices to separate users.
> > 
> > The good news is that the chipset does provide ACS-like isolation
> > capabilities, but we do need to verify and enable those capabilities
> > if the BIOS has not done so.  This patch implements the device
> > specific enabling and testing of equivalent ACS function for these
> > devices.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/quirks.c |  176 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 176 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 4be1cd5..3439a02 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -24,6 +24,7 @@
> >  #include <linux/ioport.h>
> >  #include <linux/sched.h>
> >  #include <linux/ktime.h>
> > +#include <linux/atomic.h>
> >  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
> >  #include "pci.h"
> >  
> > @@ -3423,6 +3424,74 @@ static int pci_quirk_amd_sb_acs(struct pci_dev *dev, u16 acs_flags)
> >  #endif
> >  }
> >  
> > +/*
> > + * Many Intel PCH root ports do provide ACS-like features to disable peer
> > + * transactions and validate bus numbers in requests, but do not provide an
> > + * actual PCIe ACS capability.  This is the list of device IDs known to fall
> > + * into that category as provided by Intel in Red Hat bugzilla 1037684.
> 
> Is there any documentation for this?  I'd feel better if we had a public
> statement from Intel that "these devices support ACS and this is how you do
> it."  The standard PCIe ACS capability is an explicit statement that the
> vendor intends to support the feature as documented in the spec.  If we put
> in undocumented quirks, we may end up relying on something the vendor has
> not qualified and does not intend to actually support.

I've tried to get a public document, but the bz list is the best I've
been able to achieve and the best I expect to happen.

> I see the device ID list in the RH bugzilla, of course, but I don't see the
> name of anybody in Intel who stands behind the list and the knobs used in
> this patch.  I expect you'll remind me that we don't have any better
> documentation for 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge
> devices"), which is true.  Mea culpa.

The list in the bz is actually posted by an Intel partner, for whatever
reason using their redhat.com login rather than intel.com.  FWIW, I
agree 100% that I would prefer the list and procedure where public
documents, those who have been part of the process can attest to how
hard we've pushed for that, unfortunately this is what we have.

> > + */
> > +static const u16 pci_quirk_intel_pch_acs_ids[] = {
> > +	/* Ibexpeak PCH */
> > +	0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49,
> > +	0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51,
> > +	/* Cougarpoint PCH */
> > +	0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17,
> > +	0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f,
> > +	/* Pantherpoint PCH */
> > +	0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17,
> > +	0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f,
> > +	/* Lynxpoint-H PCH */
> > +	0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17,
> > +	0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f,
> > +	/* Lynxpoint-LP PCH */
> > +	0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17,
> > +	0x9c18, 0x9c19, 0x9c1a, 0x9c1b,
> > +	/* Wildcat PCH */
> > +	0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97,
> > +	0x9c98, 0x9c99, 0x9c9a, 0x9c9b,
> > +};
> > +
> > +/*
> > + * We can have multiple PCHs in a system and it is possible that we can
> > + * fail to enable ACS support on some of them.  To keep it simple we
> > + * handle them as all or nothing with the following values:
> > + *  0: ACS is not enabled
> > + * -1: ACS enable failed
> > + * >0: ACS enabled
> > + */
> > +static atomic_t intel_pch_acs_quirk_status = ATOMIC_INIT(0);
> > +
> > +#define INTEL_PCH_ACS_FLAGS (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV)
> > +
> > +static bool pci_quirk_intel_pch_acs_match(struct pci_dev *dev)
> > +{
> > +	int i;
> > +
> > +	/* Filter out a few obvious non-matches first */
> > +	if (!pci_is_pcie(dev) ||
> > +	    pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT || dev->bus->number)
> 
> Why the bus number test?  Does this work correctly on systems like this?
> 
>   ACPI: PCI Root Bridge [PCI0] (0000:00)
>   ACPI: PCI Root Bridge [PCI1] (0000:80)
>   pci 0000:00:01.0: [8086:3c02] type 01 class 0x060400
>   pci 0000:80:01.0: [8086:3c02] type 01 class 0x060400
> 
>   00:01.0 PCI bridge: Intel Sandy Bridge IIO PCI Express Root Port 1a
>   80:01.0 PCI bridge: Intel Sandy Bridge IIO PCI Express Root Port 1a

That's a bug, thanks for catching it.  In a previous version I assumed
only bus 0 based PCH ports and kept a static byte around to remember
which ports were enabled.  Once I found that we could have multiple PCH
ports in the system, I switched to the all or nothing approach, but left
this.  I'll remove for v2.

> > +		return false;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(pci_quirk_intel_pch_acs_ids); i++)
> > +		if (pci_quirk_intel_pch_acs_ids[i] == dev->device)
> > +			return true;
> > +
> > +	return false;
> > +}
> > +
> > +static int pci_quirk_intel_pch_acs(struct pci_dev *dev, u16 acs_flags)
> > +{
> > +	u16 flags = 0;
> > +
> > +	if (!pci_quirk_intel_pch_acs_match(dev))
> > +		return -ENOTTY;
> > +
> > +	if (atomic_read(&intel_pch_acs_quirk_status) > 0)
> > +		flags = INTEL_PCH_ACS_FLAGS;
> > +
> > +	return acs_flags & ~flags ? 0 : 1;
> > +}
> > +
> >  static const struct pci_dev_acs_enabled {
> >  	u16 vendor;
> >  	u16 device;
> > @@ -3434,6 +3503,7 @@ static const struct pci_dev_acs_enabled {
> >  	{ PCI_VENDOR_ID_ATI, 0x439d, pci_quirk_amd_sb_acs },
> >  	{ PCI_VENDOR_ID_ATI, 0x4384, pci_quirk_amd_sb_acs },
> >  	{ PCI_VENDOR_ID_ATI, 0x4399, pci_quirk_amd_sb_acs },
> > +	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_intel_pch_acs },
> >  	{ 0 }
> >  };
> >  
> > @@ -3462,11 +3532,117 @@ int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags)
> >  	return -ENOTTY;
> >  }
> >  
> > +/* Config space offset of Root Complex Base Address register */
> > +#define INTEL_LPC_RCBA_REG 0xf0
> > +/* 31:14 RCBA address */
> > +#define INTEL_LPC_RCBA_MASK 0xffffc000
> > +/* RCBA Enable */
> > +#define INTEL_LPC_RCBA_ENABLE (1 << 0)
> > +
> > +/* Backbone Scratch Pad Register */
> > +#define INTEL_BSPR_REG 0x1104
> > +/* Backbone Peer Non-Posted Disable */
> > +#define INTEL_BSPR_REG_BPNPD (1 << 8)
> > +/* Backbone Peer Posted Disable */
> > +#define INTEL_BSPR_REG_BPPD  (1 << 9)
> > +
> > +/* Upstream Peer Decode Configuration Register */
> > +#define INTEL_UPDCR_REG 0x1114
> > +/* 5:0 Peer Decode Enable bits */
> > +#define INTEL_UPDCR_REG_MASK 0x3f
> > +
> > +static int pci_quirk_enable_intel_lpc_acs(struct pci_dev *dev)
> > +{
> > +	u32 rcba, bspr, updcr;
> > +	void __iomem *rcba_mem;
> > +
> > +	/*
> > +	 * Read the RCBA register from the LPC (D31:F0).  PCH root ports
> > +	 * are D28:F* and therefore get probed before LPC, thus we can't
> > +	 * use pci_get_slot/pci_read_config_dword here.
> > +	 */
> > +	pci_bus_read_config_dword(dev->bus, PCI_DEVFN(31, 0),
> > +				  INTEL_LPC_RCBA_REG, &rcba);
> > +	if (!(rcba & INTEL_LPC_RCBA_ENABLE))
> > +		return -EINVAL;
> > +
> > +	rcba_mem = ioremap_nocache(rcba & INTEL_LPC_RCBA_MASK,
> > +				   PAGE_ALIGN(INTEL_UPDCR_REG));
> > +	if (!rcba_mem)
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * The BSPR can disallow peer cycles, but it's set by soft strap and
> > +	 * therefore read-only.  If both posted and non-posted peer cycles are
> > +	 * disallowed, we're ok.  If either are allowed, then we need to use
> > +	 * the UPDCR to disable peer decodes for each port.  This provides the
> > +	 * PCIe ACS equivalent of PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF
> > +	 */
> > +	bspr = readl(rcba_mem + INTEL_BSPR_REG);
> > +	bspr &= INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD;
> > +	if (bspr != (INTEL_BSPR_REG_BPNPD | INTEL_BSPR_REG_BPPD)) {
> > +		updcr = readl(rcba_mem + INTEL_UPDCR_REG);
> > +		if (updcr & INTEL_UPDCR_REG_MASK) {
> > +			dev_info(&dev->dev, "Disabling UPDCR peer decodes\n");
> > +			updcr &= ~INTEL_UPDCR_REG_MASK;
> > +			writel(updcr, rcba_mem + INTEL_UPDCR_REG);
> > +		}
> > +	}
> > +
> > +	iounmap(rcba_mem);
> > +	return 0;
> > +}
> > +
> > +/* Miscellaneous Port Configuration register */
> > +#define INTEL_MPC_REG 0xd8
> > +/* MPC: Invalid Receive Bus Number Check Enable */
> > +#define INTEL_MPC_REG_IRBNCE (1 << 26)
> > +
> > +static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> > +{
> > +	u32 mpc;
> > +
> > +	/*
> > +	 * When enabled, the IRBNCE bit of the MPC register enables the
> > +	 * equivalent of PCI ACS Source Validation (PCI_ACS_SV), which
> > +	 * ensures that requester IDs fall within the bus number range
> > +	 * of the bridge.  Enable if not already.
> > +	 */
> > +	pci_read_config_dword(dev, INTEL_MPC_REG, &mpc);
> > +	if (!(mpc & INTEL_MPC_REG_IRBNCE)) {
> > +		dev_info(&dev->dev, "Enabling MPC IRBNCE\n");
> > +		mpc |= INTEL_MPC_REG_IRBNCE;
> > +		pci_write_config_word(dev, INTEL_MPC_REG, mpc);
> > +	}
> > +}
> > +
> > +static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> > +{
> > +	if (!pci_quirk_intel_pch_acs_match(dev))
> > +		return -ENOTTY;
> > +
> > +	if (atomic_read(&intel_pch_acs_quirk_status) < 0)
> > +		return 0;
> > +
> > +	if (pci_quirk_enable_intel_lpc_acs(dev)) {
> > +		dev_warn(&dev->dev, "Failed to enable Intel PCH ACS quirk\n");
> > +		atomic_set(&intel_pch_acs_quirk_status, -1);
> 
> This means we handle hot-added Root Ports differently than those present at
> boot.  If the system supports ACS, then we hot-add a Root Port that doesn't
> support ACS, then remove it, I think the system (with original
> configuration) no longer supports ACS.

That's true, sort of.  IOMMU groups are determined as the devices are
probed and remain static.  So, while this turns off ACS, the IOMMU
groups after Root Port unplug remain as they were before.  Making them
dynamic is a much larger problem.  I was trying to use the atomic to
avoid allocating data structures runtime for this quirk.  The other
question though would be whether any of these particular PCH devices
support hotplug.  Do you know for which Intel chipsets this is even
feasible?  If we need more fine grained tracking we'll need to track the
segment and bus number, then we might as well use another byte with a
bit per function identifying the quirked ports.  Unfortunately with a
list comes some sort of locking and allocation and de-allocation of
entries, so we need an unplug hook.  It's possible, but I'd rather keep
it simple if this is only a "what if" question.  Thanks,

Alex

> > +		return 0;
> > +	}
> > +
> > +	pci_quirk_enable_intel_rp_mpc_acs(dev);
> > +
> > +	atomic_inc_unless_negative(&intel_pch_acs_quirk_status);
> > +
> > +	return 0;
> > +}
> > +
> >  static const struct pci_dev_enable_acs {
> >  	u16 vendor;
> >  	u16 device;
> >  	int (*enable_acs)(struct pci_dev *dev);
> >  } pci_dev_enable_acs[] = {
> > +	{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_enable_intel_pch_acs },
> >  	{ 0 }
> >  };
> >  
> > 




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports
  2014-01-31 20:06     ` Alex Williamson
@ 2014-01-31 23:25       ` Bjorn Helgaas
  0 siblings, 0 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2014-01-31 23:25 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-pci, linux-kernel

On Fri, Jan 31, 2014 at 01:06:13PM -0700, Alex Williamson wrote:
> On Fri, 2014-01-31 at 12:26 -0700, Bjorn Helgaas wrote:
> > On Mon, Jan 20, 2014 at 03:01:52PM -0700, Alex Williamson wrote:

> > > +/*
> > > + * Many Intel PCH root ports do provide ACS-like features to disable peer
> > > + * transactions and validate bus numbers in requests, but do not provide an
> > > + * actual PCIe ACS capability.  This is the list of device IDs known to fall
> > > + * into that category as provided by Intel in Red Hat bugzilla 1037684.
> > 
> > Is there any documentation for this?  I'd feel better if we had a public
> > statement from Intel that "these devices support ACS and this is how you do
> > it."  The standard PCIe ACS capability is an explicit statement that the
> > vendor intends to support the feature as documented in the spec.  If we put
> > in undocumented quirks, we may end up relying on something the vendor has
> > not qualified and does not intend to actually support.
> 
> I've tried to get a public document, but the bz list is the best I've
> been able to achieve and the best I expect to happen.
> 
> > I see the device ID list in the RH bugzilla, of course, but I don't see the
> > name of anybody in Intel who stands behind the list and the knobs used in
> > this patch.  I expect you'll remind me that we don't have any better
> > documentation for 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge
> > devices"), which is true.  Mea culpa.
> 
> The list in the bz is actually posted by an Intel partner, for whatever
> reason using their redhat.com login rather than intel.com.  FWIW, I
> agree 100% that I would prefer the list and procedure where public
> documents, those who have been part of the process can attest to how
> hard we've pushed for that, unfortunately this is what we have.

Heh, I saw "Don Dugger," but for some reason I read "Don Dutile" :)  I
wonder if you could get him (Don Dugger) to ack this patch?

Absent that, and maybe even *with* that, I'd like some sort of dmesg note
about enabling undocumented ACS-like features.  I'm not happy with Linux
claiming to support something that the vendor isn't willing to stand
behind, especially a security-related thing like this.

> > > +static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> > > +{
> > > +	if (!pci_quirk_intel_pch_acs_match(dev))
> > > +		return -ENOTTY;
> > > +
> > > +	if (atomic_read(&intel_pch_acs_quirk_status) < 0)
> > > +		return 0;
> > > +
> > > +	if (pci_quirk_enable_intel_lpc_acs(dev)) {
> > > +		dev_warn(&dev->dev, "Failed to enable Intel PCH ACS quirk\n");
> > > +		atomic_set(&intel_pch_acs_quirk_status, -1);
> > 
> > This means we handle hot-added Root Ports differently than those present at
> > boot.  If the system supports ACS, then we hot-add a Root Port that doesn't
> > support ACS, then remove it, I think the system (with original
> > configuration) no longer supports ACS.
> 
> That's true, sort of.  IOMMU groups are determined as the devices are
> probed and remain static.  So, while this turns off ACS, the IOMMU
> groups after Root Port unplug remain as they were before.  Making them
> dynamic is a much larger problem.  I was trying to use the atomic to
> avoid allocating data structures runtime for this quirk.  The other
> question though would be whether any of these particular PCH devices
> support hotplug.  Do you know for which Intel chipsets this is even
> feasible?  If we need more fine grained tracking we'll need to track the
> segment and bus number, then we might as well use another byte with a
> bit per function identifying the quirked ports.  Unfortunately with a
> list comes some sort of locking and allocation and de-allocation of
> entries, so we need an unplug hook.  It's possible, but I'd rather keep
> it simple if this is only a "what if" question.

This is definitely a "what if" question.  I have no idea whether these
devices can be hotplugged.

My point is that the reader should not *need* to know whether these
particular devices can be hot-plugged.  If we need that knowledge, it
becomes impractical to analyze the code.  So if we can write this in a way
that obviously works for hot-plugged Root Ports, that would be much better.

Can we add an "acs_enabled" bit in the pci_dev or something?

> Thanks,

Don't thank me, I haven't done anything yet except make your life harder :)

Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-01-31 23:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-20 22:01 [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Alex Williamson
2014-01-20 22:01 ` [PATCH 1/2] pci: Add device specific PCI ACS enable Alex Williamson
2014-01-20 22:01 ` [PATCH 2/2] pci/quirks: Enable quirks for PCIe ACS on Intel PCH root ports Alex Williamson
2014-01-31 19:26   ` Bjorn Helgaas
2014-01-31 20:06     ` Alex Williamson
2014-01-31 23:25       ` Bjorn Helgaas
2014-01-31 18:49 ` [PATCH 0/2] Quirk Intel PCH root ports for ACS-like features Bjorn Helgaas
2014-01-31 19:25   ` Alex Williamson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).