From: Marcos Scriven <marcos@scriven.org>
To: "Deucher, Alexander" <Alexander.Deucher@amd.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Kevin Buettner <kevinb@redhat.com>,
	"Shah, Nehal-bakulchandra" <Nehal-bakulchandra.Shah@amd.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: [PATCH] PCI: Avoid FLR for AMD Starship USB 3.0
Date: Mon, 8 Jun 2020 18:47:18 +0100
Message-ID: <CAAri2Dqm6vGySEFjUYKcED5fJcN2Gr38Cj-02ab5ONuz6r88jw@mail.gmail.com>
In-Reply-To: <CAAri2Dqruwmu19o1V1b_=0-0RR+J_dgmxFi=izLej_m=XQ1VGw@mail.gmail.com>

On Thu, 28 May 2020 at 09:12, Marcos Scriven <marcos@scriven.org> wrote:
>
> On Wed, 27 May 2020 at 22:42, Deucher, Alexander
> <Alexander.Deucher@amd.com> wrote:
> >
> > [AMD Official Use Only - Internal Distribution Only]
> >
> > > -----Original Message-----
> > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > Sent: Wednesday, May 27, 2020 5:32 PM
> > > To: Kevin Buettner <kevinb@redhat.com>
> > > Cc: linux-pci@vger.kernel.org; Bjorn Helgaas <bhelgaas@google.com>; Alex
> > > Williamson <alex.williamson@redhat.com>; Deucher, Alexander
> > > <Alexander.Deucher@amd.com>; Koenig, Christian
> > > <Christian.Koenig@amd.com>
> > > Subject: Re: [PATCH] PCI: Avoid FLR for AMD Starship USB 3.0
> > >
> > > [+cc Alex D, Christian -- do you guys have any contacts or insight into why we
> > > suddenly have three new AMD devices that advertise FLR support but it
> > > doesn't work?  Are we doing something wrong in Linux, or are these devices
> > > defective?
> >
> > +Nehal who handles our USB drivers.
> >
> > Nehal any ideas about FLR or whether it should be advertised?
> >
> > Alex
> >
>
> I had read somewhere that the IO dies in the Ryzen/Threadripper
> packages are identical to the ones used in the motherboard chipsets.
>
> Since the latter do reset OK, it would seem that a BIOS update of the
> AGESA could potentially fix the issue.
>
> Unfortunately, it's not something motherboard manufacturers' customer
> support people know how to deal with or pass back up the chain to AMD
> engineers. Actual use of this feature seems to be fairly niche.
>
> After I added the workaround for the USB and audio controllers on the
> 3rd-gen Ryzen, I tried contacting Kim Phillips (who I found as a
> kernel committer to x86/cpu/amd), but haven't heard back.
>
> It would be wonderful to know if this can potentially be fixed in CPU
> firmware, and whether there's any likelihood of it actually being
> distributed by motherboard manufacturers.
>
> Marcos
>
>
>

Dear Alex/Nehal,

Are you able to comment on whether FLR should be advertised for these
devices?

Is there any chance this could be fixed at the BIOS/AGESA level and
actually rolled out by motherboard manufacturers?

Thanks

Marcos

> > >
> > > >   https://lore.kernel.org/r/20200524003529.598434ff@f31-4.lan
> > > >     AMD Starship USB 3.0 host controller
> > > >   https://lore.kernel.org/r/CAAri2DpkcuQZYbT6XsALhx2e6vRqPHwtbjHYeiH7MNp4zmt1RA@mail.gmail.com
> > > >     AMD Matisse HD Audio & USB 3.0 host controller ]
> > >
> > > On Sun, May 24, 2020 at 12:35:29AM -0700, Kevin Buettner wrote:
> > > > This commit adds an entry to the quirk_no_flr table for the AMD
> > > > Starship USB 3.0 host controller.
> > > >
> > > > Tested on a Micro-Star International Co., Ltd. MS-7C59/Creator TRX40
> > > > motherboard with an AMD Ryzen Threadripper 3970X.
> > > >
> > > > Without this patch, when attempting to assign (pass through) an AMD
> > > > Starship USB 3.0 host controller to a guest OS, the system becomes
> > > > increasingly unresponsive over the course of several minutes,
> > > > eventually requiring a hard reset.
> > > >
> > > > Shortly after attempting to start the guest, I see these messages:
> > > >
> > > > > May 23 22:59:46 mesquite kernel: vfio-pci 0000:05:00.3: not ready 1023ms after FLR; waiting
> > > > > May 23 22:59:48 mesquite kernel: vfio-pci 0000:05:00.3: not ready 2047ms after FLR; waiting
> > > > > May 23 22:59:51 mesquite kernel: vfio-pci 0000:05:00.3: not ready 4095ms after FLR; waiting
> > > > > May 23 22:59:56 mesquite kernel: vfio-pci 0000:05:00.3: not ready 8191ms after FLR; waiting
> > > >
> > > > And then eventually:
> > > >
> > > > > May 23 23:01:00 mesquite kernel: vfio-pci 0000:05:00.3: not ready 65535ms after FLR; giving up
> > > > > May 23 23:01:05 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
> > > > > May 23 23:01:06 mesquite kernel: perf: interrupt took too long (642744 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
> > > > > May 23 23:01:07 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 82.270 msecs
> > > > > May 23 23:01:08 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 680.608 msecs
> > > > > May 23 23:01:08 mesquite kernel: INFO: NMI handler (perf_event_nmi_handler) took too long to run: 100.952 msecs
> > > > > ...
> > > > >  kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]
> > > > > May 23 23:01:25 mesquite kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]
> > > >
> > > > The above log snippets were obtained using the aforementioned hardware
> > > > running Fedora 32 w/ kernel package kernel-5.6.13-300.fc32.x86_64.  My
> > > > fix was applied to a local copy of the F32 kernel package, then
> > > > rebuilt, etc.
> > > >
> > > > With this patch in place, the host kernel doesn't exhibit these
> > > > problems.  The guest OS (also Fedora 32) starts up and works as
> > > > expected with the passed-through USB host controller.
> > > >
> > > > Signed-off-by: Kevin Buettner <kevinb@redhat.com>
> > >
> > > Applied to pci/virtualization for v5.8, thanks!
> > >
> > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > > index 43a0c2ce635e..b1db58d00d2b 100644
> > > > > --- a/drivers/pci/quirks.c
> > > > > +++ b/drivers/pci/quirks.c
> > > > > @@ -5133,6 +5133,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
> > > > >   * FLR may cause the following to devices to hang:
> > > > >   *
> > > > >   * AMD Starship/Matisse HD Audio Controller 0x1487
> > > > > + * AMD Starship USB 3.0 Host Controller 0x148c
> > > > >   * AMD Matisse USB 3.0 Host Controller 0x149c
> > > > >   * Intel 82579LM Gigabit Ethernet Controller 0x1502
> > > > >   * Intel 82579V Gigabit Ethernet Controller 0x1503
> > > > > @@ -5143,6 +5144,7 @@ static void quirk_no_flr(struct pci_dev *dev)
> > > > >  	dev->dev_flags |= PCI_DEV_FLAGS_NO_FLR_RESET;
> > > > >  }
> > > > >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1487, quirk_no_flr);
> > > > > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x148c, quirk_no_flr);
> > > > >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
> > > > >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
> > > > >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
> > > >
