From: Kevin Buettner <kevinb@redhat.com>
To: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>, alex.williamson@redhat.com
Subject: [PATCH] PCI: Avoid FLR for AMD Starship USB 3.0
Date: Sun, 24 May 2020 00:35:29 -0700
Message-ID: <20200524003529.598434ff@f31-4.lan>

Add the AMD Starship USB 3.0 host controller (device ID 0x148c) to
the devices handled by quirk_no_flr, so that FLR is not attempted on
it.
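
As a rough illustration of what such an entry does, here is a simplified
userspace model, not the kernel code. A fixup list keyed on vendor and
device ID runs a hook that sets a "no FLR" flag, and a helper consults
that flag before a reset would be attempted. The names model_dev,
model_fixup, model_apply_fixups and model_can_flr, and the bit chosen for
FLAG_NO_FLR, are hypothetical. Only the device IDs come from the patch;
0x1022 and 0x8086 are the AMD and Intel PCI vendor IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_AMD	0x1022		/* AMD PCI vendor ID */
#define VENDOR_INTEL	0x8086		/* Intel PCI vendor ID */
#define FLAG_NO_FLR	(1u << 0)	/* hypothetical bit, for this model only */

/* Minimal stand-in for a PCI device: just its IDs and a flag word. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int flags;
};

/* One fixup entry: run hook() on devices matching vendor/device. */
struct model_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct model_dev *dev);
};

/* Model of the quirk hook: mark the device so FLR is never attempted. */
static void model_no_flr(struct model_dev *dev)
{
	dev->flags |= FLAG_NO_FLR;
}

/* Same vendor/device pairs as the quirk_no_flr entries in the patch. */
static const struct model_fixup model_fixups[] = {
	{ VENDOR_AMD,   0x1487, model_no_flr },
	{ VENDOR_AMD,   0x148c, model_no_flr },	/* entry added by this patch */
	{ VENDOR_AMD,   0x149c, model_no_flr },
	{ VENDOR_INTEL, 0x1502, model_no_flr },
	{ VENDOR_INTEL, 0x1503, model_no_flr },
};

/* Run every matching fixup against the device. */
void model_apply_fixups(struct model_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < sizeof(model_fixups) / sizeof(model_fixups[0]); i++) {
		if (model_fixups[i].vendor == dev->vendor &&
		    model_fixups[i].device == dev->device)
			model_fixups[i].hook(dev);
	}
}

/* Returns nonzero if this model would allow FLR on the device. */
int model_can_flr(const struct model_dev *dev)
{
	assert(dev != NULL);
	return !(dev->flags & FLAG_NO_FLR);
}
```

In this model, a device with vendor 0x1022 and device 0x148c is flagged
by model_apply_fixups(), so model_can_flr() returns 0 for it and a caller
would have to fall back to some other reset method or skip the reset.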

Tested on a Micro-Star International Co., Ltd. MS-7C59/Creator TRX40
motherboard with an AMD Ryzen Threadripper 3970X.

Without this patch, when attempting to assign (pass through) an AMD
Starship USB 3.0 host controller to a guest OS, the system becomes
increasingly unresponsive over the course of several minutes,
eventually requiring a hard reset.

Shortly after attempting to start the guest, I see these messages:

May 23 22:59:46 mesquite kernel: vfio-pci 0000:05:00.3: not ready 1023ms after FLR; waiting
May 23 22:59:48 mesquite kernel: vfio-pci 0000:05:00.3: not ready 2047ms after FLR; waiting
May 23 22:59:51 mesquite kernel: vfio-pci 0000:05:00.3: not ready 4095ms after FLR; waiting
May 23 22:59:56 mesquite kernel: vfio-pci 0000:05:00.3: not ready 8191ms after FLR; waiting

And then eventually:

May 23 23:01:00 mesquite kernel: vfio-pci 0000:05:00.3: not ready 65535ms after FLR; giving up
May 23 23:01:05 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
May 23 23:01:06 mesquite kernel: perf: interrupt took too long (642744 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
May 23 23:01:07 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 82.270 msecs
May 23 23:01:08 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 680.608 msecs
May 23 23:01:08 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 100.952 msecs
...
 kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]
May 23 23:01:25 mesquite kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]

The above log snippets were obtained using the aforementioned hardware
running Fedora 32 with kernel package kernel-5.6.13-300.fc32.x86_64.  I
applied my fix to a local copy of the F32 kernel package and rebuilt
it.
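
The "not ready ... after FLR" values in the log (1023, 2047, 4095, 8191,
and finally 65535) are consistent with a poll loop that doubles its sleep
after every failed check. The sketch below is a userspace model of such a
loop, not the kernel implementation. The function name
simulate_reset_wait, the ready_after_ms parameter and the 60000 ms limit
used in the example call are assumptions chosen to match the log.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Model a doubling poll loop after a reset.
 *
 * timeout_ms:     give up once the next sleep would exceed this value
 * ready_after_ms: the modelled device responds once this much time has
 *                 passed; a negative value means it never responds
 *
 * Returns the elapsed milliseconds when the device became ready, or -1
 * if the loop gave up. No real sleeping is done.
 */
int simulate_reset_wait(int timeout_ms, int ready_after_ms)
{
	/* Next sleep length; since sleeps are 1, 2, 4, ..., the total
	 * slept so far is always delay - 1. */
	int delay = 1;

	assert(timeout_ms >= 0);

	while (ready_after_ms < 0 || delay - 1 < ready_after_ms) {
		if (delay > timeout_ms) {
			printf("not ready %dms after FLR; giving up\n",
			       delay - 1);
			return -1;
		}

		/* Only report once the wait has become noticeable. */
		if (delay > 1000)
			printf("not ready %dms after FLR; waiting\n",
			       delay - 1);

		/* A real loop would sleep for `delay` ms here. */
		delay *= 2;
	}

	return delay - 1;
}
```

Calling simulate_reset_wait(60000, -1) models a device that never comes
back: the reported values run from 1023 up to 32767 with "waiting", and
the loop gives up at 65535, a little over a minute in total, which lines
up with the interval between the first and last FLR messages above.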

With this patch in place, the host kernel doesn't exhibit these
problems.  The guest OS (also Fedora 32) starts up and works as
expected with the passed-through USB host controller.

Signed-off-by: Kevin Buettner <kevinb@redhat.com>

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 43a0c2ce635e..b1db58d00d2b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5133,6 +5133,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
  * FLR may cause the following to devices to hang:
  *
  * AMD Starship/Matisse HD Audio Controller 0x1487
+ * AMD Starship USB 3.0 Host Controller 0x148c
  * AMD Matisse USB 3.0 Host Controller 0x149c
  * Intel 82579LM Gigabit Ethernet Controller 0x1502
  * Intel 82579V Gigabit Ethernet Controller 0x1503
@@ -5143,6 +5144,7 @@ static void quirk_no_flr(struct pci_dev *dev)
 	dev->dev_flags |= PCI_DEV_FLAGS_NO_FLR_RESET;
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1487, quirk_no_flr);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x148c, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);

