From: Matthew Ruffell <matthew.ruffell@canonical.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: linux-pci@vger.kernel.org, lkml <linux-kernel@vger.kernel.org>,
kvm@vger.kernel.org,
nathan.langford@xcelesunifiedtechnologies.com
Subject: Re: [PROBLEM] Frequently get "irq 31: nobody cared" when passing through 2x GPUs that share same pci switch via vfio
Date: Thu, 16 Sep 2021 17:13:21 +1200
Message-ID: <4d9d0366-1769-691f-fcb0-3b14d468e36e@canonical.com>
In-Reply-To: <20210915103235.097202d2.alex.williamson@redhat.com>
On 16/09/21 4:32 am, Alex Williamson wrote:
> On Wed, 15 Sep 2021 16:44:38 +1200
> Matthew Ruffell <matthew.ruffell@canonical.com> wrote:
>> On 15/09/21 4:43 am, Alex Williamson wrote:
>>>
>>> FWIW, I have access to a system with an NVIDIA K1 and M60, both use
>>> this same switch on-card and I've not experienced any issues assigning
>>> all the GPUs to a single VM. Topo:
>>>
>>> +-[0000:40]-+-02.0-[42-47]----00.0-[43-47]--+-08.0-[44]----00.0
>>> |                                           +-09.0-[45]----00.0
>>> |                                           +-10.0-[46]----00.0
>>> |                                           \-11.0-[47]----00.0
>>> \-[0000:00]-+-03.0-[04-07]----00.0-[05-07]--+-08.0-[06]----00.0
>>>                                             \-10.0-[07]----00.0
>
>
> I've actually found that the above configuration, assigning all 6 GPUs
> to a VM reproduces this pretty readily by simply rebooting the VM. In
> my case, I don't have the panic-on-warn/oops that must be set on your
> kernel, so the result is far more benign, the IRQ gets masked until
> it's re-registered.
>
> The fact that my upstream ports are using MSI seems irrelevant.

Hi Alex,

It is good news that you can reproduce an interrupt storm locally. Did a single
reboot trigger the storm, or did you have to loop the VM a few times?

On our system, if we don't have panic-on-warn/oops set, the system will
eventually grind to a halt and lock up. That is why we panic and reset on the
first oops instead, but we then get stuck in the crash kernel while it copies
the interrupt remapping (IR) tables from DMAR.

>
> Adding debugging to the vfio-pci interrupt handler, it's correctly
> deferring the interrupt as the GPU device is not identifying itself as
> the source of the interrupt via the status register. In fact, setting
> the disable INTx bit in the GPU command register while the interrupt
> storm occurs does not stop the interrupts.
>

Interesting. So the interrupts could be coming from the PEX switch itself?

We did a run with DisINTx+ set on the PEX switches, but it didn't make any
difference. Serial log showing DisINTx+ and full dmesg below:

https://paste.ubuntu.com/p/n3XshCxPT8/
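
For reference, the two bits discussed above live in the standard PCI
configuration header: Interrupt Disable is bit 10 of the Command register
(offset 0x04, shown by lspci as DisINTx+), and Interrupt Status is bit 3 of the
Status register (offset 0x06, shown by lspci as INTx+). Below is a minimal
sketch of decoding them from a raw config-space dump, such as one read from
/sys/bus/pci/devices/<BDF>/config. The function name decode_intx and the sample
header bytes are made up for illustration and are not taken from our logs.

```python
import struct

PCI_COMMAND = 0x04                  # Command register offset
PCI_STATUS = 0x06                   # Status register offset
PCI_COMMAND_INTX_DISABLE = 0x0400   # Command bit 10: Interrupt Disable
PCI_STATUS_INTERRUPT = 0x0008       # Status bit 3: Interrupt Status


def decode_intx(config: bytes) -> dict:
    """Decode the INTx disable and pending bits from PCI config space."""
    if len(config) < 8:
        raise ValueError("need at least 8 bytes of config space")
    # Command and Status are adjacent little-endian 16-bit registers.
    command, status = struct.unpack_from("<HH", config, PCI_COMMAND)
    return {
        "intx_disabled": bool(command & PCI_COMMAND_INTX_DISABLE),
        "intx_pending": bool(status & PCI_STATUS_INTERRUPT),
    }


# Illustrative header only: vendor ID, device ID, Command = 0x0407, Status = 0x0010.
sample = struct.pack("<HHHH", 0x10B5, 0x8749, 0x0407, 0x0010)
print(decode_intx(sample))
```

With these sample bytes the device has INTx disabled and is not asserting
Interrupt Status. A device in that state does not claim to be the source of the
interrupt, which is the condition under which the vfio-pci handler defers it.
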
> The interrupt storm does seem to be related to the bus resets, but I
> can't figure out yet how multiple devices per switch factors into the
> issue. Serializing all bus resets via a mutex doesn't seem to change
> the behavior.
Very interesting indeed.
> I'm still investigating, but if anyone knows how to get access to the
> Broadcom datasheet or errata for this switch, please let me know.

I have contacted Broadcom to ask for the datasheet and errata, but I am unsure
whether they will get back to me.

They list the errata as publicly available on their website, under the
Documentation > Errata tab:

https://www.broadcom.com/products/pcie-switches-bridges/pcie-switches/pex8749#documentation

However, the file "PEX 8749/48/47/33/32/25/24/23/17/16/13/12 Errata" seems to
be missing:

https://docs.broadcom.com/docs/PEX8749-48-47-33-32-25-24-23-17-16-13-12%20Errata-and-Cautions

An Intel document refers to the errata for the PEX 8749:

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/rn/rn-ias-n3000-n.pdf

It links to the following URL, which is also missing:

https://docs.broadcom.com/docs/pub-005018

I did, however, find an older errata document:

PEX 87xx Errata Version 1.14, September 25, 2015
https://docs.broadcom.com/doc/pub-005017

I will keep trying, and I will let you know if we come across any further
documents.

Thank you for your efforts.
Matthew
> Thanks,
> Alex
>