From: Jon Mason <jdmason@kudzu.us>
To: Avi Kivity <avi@redhat.com>
Cc: Jon Mason <mason@myri.com>, Sven Schnelle <svens@stackframe.org>,
	Simon Kirby <sim@hostway.ca>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Niels Ole Salscheider <niels_ole@salscheider-online.de>,
	Jesse Barnes <jbarnes@virtuousgeek.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Ben Hutchings <bhutchings@solarflare.com>
Subject: Re: Workaround for Intel MPS errata
Date: Tue, 4 Oct 2011 22:46:39 -0500
Message-ID: <20111005034636.GA8618@kudzu.us>
In-Reply-To: <4E8B04D8.5010107@redhat.com>

Hey Avi,
I believe this will fix the issue (assuming the erratum is the issue in
the first place).  You'll need to apply the patch on top of Linus'
latest code and re-enable the MPS tuning (as it is now off by
default).  This can be done by adding "pci=pcie_bus_safe" to your boot
args.

After thinking about it some more, a PCI quirk is the correct way of
doing things.  We must always disable read completion coalescing due
to the possibility of hotplugging a device with an MPS of 256B.  Also,
I believe everyone will think this is much cleaner.

Let me know how it goes and thanks again for testing this for me.

Thanks,
Jon

commit ada901c643c7d0978a762fbadc2a6526e4ddf860
Author: Jon Mason <mason@myri.com>
Date:   Thu Sep 29 16:56:37 2011 -0500

    PCI: Workaround for Intel MPS errata
    
    Intel 5000 and 5100 series memory controllers have a known issue if read
    completion coalescing is enabled and the PCI-E Maximum Payload Size is
    set to 256B.  To work around this issue, disable read completion
    coalescing in the memory controller and root complexes.  Unfortunately,
    it must always be disabled, even if no 256B MPS devices are present, due
    to the possibility of one being hotplugged.
    
    Links to errata:
    http://www.intel.com/content/dam/doc/specification-update/5000-chipset-memory-controller-hub-specification-update.pdf
    http://www.intel.com/content/dam/doc/specification-update/5100-memory-controller-hub-chipset-specification-update.pdf
    
    Thanks to Jesse Brandeburg and Ben Hutchings for providing insight into
    the problem.
    
    Tested-and-Reported-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Jon Mason <mason@myri.com>

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6ab6bd3..c223d95 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1452,7 +1452,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
 	return 0;
 }
 
-/* pcie_bus_configure_mps requires that pci_walk_bus work in a top-down,
+/* pcie_bus_configure_settings requires that pci_walk_bus work in a top-down,
  * parents then children fashion.  If this changes, then this code will not
  * work as designed.
  */
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 1196f61..70da3f5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2822,6 +2822,75 @@ static void __devinit fixup_ti816x_class(struct pci_dev* dev)
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_TI, 0xb800, fixup_ti816x_class);
 
+/* Intel 5000 and 5100 Memory controllers have an errata with read completion
+ * coalescing (which is enabled by default on some BIOSes) and MPS of 256B.
+ * Since there is no way of knowing what the PCIE MPS on each fabric will be
+ * until all of the devices are discovered and buses walked, read completion
+ * coalescing must be disabled.  Unfortunately, it cannot be re-enabled because
+ * it is possible to hotplug a device with MPS of 256B.
+ */
+static void __devinit quirk_intel_mc_errata(struct pci_dev *dev)
+{
+	int err;
+	u16 rcc;
+
+	if (pcie_bus_config == PCIE_BUS_TUNE_OFF)
+		return;
+
+	/* Intel errata specifies bits to change but does not say what they are.
+	 * Keeping them magical until such time as the registers and values can
+	 * be explained.
+	 */
+	err = pci_read_config_word(dev, 0x48, &rcc);
+	if (err) {
+		dev_err(&dev->dev, "Error attempting to read the read "
+			"completion coalescing register.\n");
+		return;
+	}
+
+	if (!(rcc & (1 << 10)))
+		return;
+
+	rcc &= ~(1 << 10);
+
+	err = pci_write_config_word(dev, 0x48, rcc);
+	if (err) {
+		dev_err(&dev->dev, "Error attempting to write the read "
+			"completion coalescing register.\n");
+		return;
+	}
+
+	pr_info_once("Read completion coalescing disabled due to hardware "
+		     "errata relating to 256B MPS.\n");
+}
+/* Intel 5000 series memory controllers and ports 2-7 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25c0, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d0, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d4, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25d8, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e2, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e3, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e4, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e5, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e6, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25e7, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f7, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f8, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25f9, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x25fa, quirk_intel_mc_errata);
+/* Intel 5100 series memory controllers and ports 2-7 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65c0, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e2, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e3, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e4, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e5, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e6, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65e7, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f7, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f8, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65f9, quirk_intel_mc_errata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x65fa, quirk_intel_mc_errata);
+
 static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
 			  struct pci_fixup *end)
 {
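
For readers who want to try the read-modify-write logic of
quirk_intel_mc_errata() outside the kernel, here is a minimal userspace
sketch.  It is not part of the patch.  The offset 0x48 and bit 10 come
from the patch, which treats both as magic values; the macro names, the
fake_cfg structure and the fake_*_word() helpers are hypothetical
stand-ins for a device's config space and for pci_read_config_word() /
pci_write_config_word().  The sketch assumes little-endian byte order,
as PCI config space uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch; the names are hypothetical. */
#define RCC_REG_OFFSET 0x48
#define RCC_ENABLE_BIT (1u << 10)

/* Hypothetical stand-in for 256 bytes of PCI config space. */
struct fake_cfg {
	uint8_t bytes[256];
};

/* Read a 16-bit little-endian word from the fake config space. */
static uint16_t fake_read_word(const struct fake_cfg *cfg, unsigned int off)
{
	assert(off + 1 < sizeof(cfg->bytes));
	return (uint16_t)(cfg->bytes[off] | (cfg->bytes[off + 1] << 8));
}

/* Write a 16-bit little-endian word to the fake config space. */
static void fake_write_word(struct fake_cfg *cfg, unsigned int off,
			    uint16_t val)
{
	assert(off + 1 < sizeof(cfg->bytes));
	cfg->bytes[off] = (uint8_t)(val & 0xff);
	cfg->bytes[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Mirror the quirk's flow: read the register, return early if bit 10 is
 * already clear, otherwise clear only that bit and write the word back.
 * Returns true if the register was modified.
 */
static bool disable_rcc(struct fake_cfg *cfg)
{
	uint16_t rcc = fake_read_word(cfg, RCC_REG_OFFSET);

	if (!(rcc & RCC_ENABLE_BIT))
		return false;

	rcc &= (uint16_t)~RCC_ENABLE_BIT;
	fake_write_word(cfg, RCC_REG_OFFSET, rcc);
	return true;
}
```

Calling disable_rcc() a second time on the same fake_cfg returns false
and leaves the register untouched, which matches the early return in
the quirk when the bit is already clear.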
