From: Matthew Ruffell <matthew.ruffell@canonical.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: linux-pci@vger.kernel.org, lkml <linux-kernel@vger.kernel.org>,
	kvm@vger.kernel.org,
	nathan.langford@xcelesunifiedtechnologies.com
Subject: Re: [PROBLEM] Frequently get "irq 31: nobody cared" when passing through 2x GPUs that share same pci switch via vfio
Date: Thu, 16 Sep 2021 17:13:21 +1200
Message-ID: <4d9d0366-1769-691f-fcb0-3b14d468e36e@canonical.com>
In-Reply-To: <20210915103235.097202d2.alex.williamson@redhat.com>

On 16/09/21 4:32 am, Alex Williamson wrote:
> On Wed, 15 Sep 2021 16:44:38 +1200
> Matthew Ruffell <matthew.ruffell@canonical.com> wrote:
>> On 15/09/21 4:43 am, Alex Williamson wrote:
>>>
>>> FWIW, I have access to a system with an NVIDIA K1 and M60, both use
>>> this same switch on-card and I've not experienced any issues assigning
>>> all the GPUs to a single VM.  Topo:
>>>
>>>  +-[0000:40]-+-02.0-[42-47]----00.0-[43-47]--+-08.0-[44]----00.0
>>>  |                                           +-09.0-[45]----00.0
>>>  |                                           +-10.0-[46]----00.0
>>>  |                                           \-11.0-[47]----00.0
>>>  \-[0000:00]-+-03.0-[04-07]----00.0-[05-07]--+-08.0-[06]----00.0
>>>                                              \-10.0-[07]----00.0
> 
> 
> I've actually found that the above configuration, assigning all 6 GPUs
> to a VM reproduces this pretty readily by simply rebooting the VM.  In
> my case, I don't have the panic-on-warn/oops that must be set on your
> kernel, so the result is far more benign, the IRQ gets masked until
> it's re-registered.
> 
> The fact that my upstream ports are using MSI seems irrelevant.

Hi Alex,

It is good news that you can reproduce an interrupt storm locally. Did a single
reboot trigger the storm, or did you have to loop the VM a few times?

On our system, if we don't have panic-on-warn/oops set, the system will
eventually grind to a halt and lock up, so we try to reset earlier on the first
oops, but we still get stuck in the crashkernel copying the IR tables from dmar.

> 
> Adding debugging to the vfio-pci interrupt handler, it's correctly
> deferring the interrupt as the GPU device is not identifying itself as
> the source of the interrupt via the status register.  In fact, setting
> the disable INTx bit in the GPU command register while the interrupt
> storm occurs does not stop the interrupts.
> 

Interesting. So the source of the interrupts could be the PEX switch
itself?

We did a run with DisINTx+ set on the PEX switches, but it didn't make any
difference. Serial log showing DisINTx+ and full dmesg below:

https://paste.ubuntu.com/p/n3XshCxPT8/
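
For readers who want to check the same bits on their own hardware, here is a
minimal sketch (not taken from this thread) that decodes the INTx Disable bit
of the PCI Command register and the Interrupt Status bit of the PCI Status
register from a device's config space. The register offsets and bit positions
are the standard PCI ones; the helper names decode_intx and read_config are
made up for illustration, and the sysfs path assumes a Linux host.

```python
import struct

# Standard PCI configuration-space offsets and bit positions.
PCI_COMMAND = 0x04                   # 16-bit Command register
PCI_STATUS = 0x06                    # 16-bit Status register
PCI_COMMAND_INTX_DISABLE = 1 << 10   # lspci shows this as DisINTx+
PCI_STATUS_INTERRUPT = 1 << 3        # lspci shows this as INTx+


def decode_intx(config: bytes) -> dict:
    """Decode the INTx-related bits from the start of PCI config space."""
    if len(config) < 8:
        raise ValueError("need at least the first 8 bytes of config space")
    # Command and Status are adjacent little-endian 16-bit registers.
    command, status = struct.unpack_from("<HH", config, PCI_COMMAND)
    return {
        "intx_disabled": bool(command & PCI_COMMAND_INTX_DISABLE),
        "intx_pending": bool(status & PCI_STATUS_INTERRUPT),
    }


def read_config(bdf: str) -> bytes:
    """Read the standard 64-byte header from sysfs (hypothetical helper)."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(64)
```

Calling decode_intx(read_config("0000:44:00.0")) would report both bits for
one device; the address here is only a placeholder. A device that shows
intx_pending as False while the IRQ line keeps firing is not claiming the
interrupt, which matches the behaviour described for the GPUs above.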

> The interrupt storm does seem to be related to the bus resets, but I
> can't figure out yet how multiple devices per switch factors into the
> issue.  Serializing all bus resets via a mutex doesn't seem to change
> the behavior.

Very interesting indeed.

> I'm still investigating, but if anyone knows how to get access to the
> Broadcom datasheet or errata for this switch, please let me know.

I have tried reaching out to Broadcom asking for the datasheet and errata, but
I am unsure if they will get back to me.

They list the errata as publicly available on their website, in the
Documentation > errata tab.
https://www.broadcom.com/products/pcie-switches-bridges/pcie-switches/pex8749#documentation

The file "PEX 8749/48/47/33/32/25/24/23/17/16/13/12 Errata" seems to be missing
though.
https://docs.broadcom.com/docs/PEX8749-48-47-33-32-25-24-23-17-16-13-12%20Errata-and-Cautions

An Intel document talks about the errata for the PEX 8749:
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/rn/rn-ias-n3000-n.pdf
It links to the following URL, also missing.
https://docs.broadcom.com/docs/pub-005018

I did however find an older errata document at:

PEX 87xx Errata Version 1.14, September 25, 2015
https://docs.broadcom.com/doc/pub-005017

I will keep trying, and I will let you know if we manage to come across any
documents.

Thank you for your efforts.
Matthew

> Thanks,
> Alex
> 
