From: Thomas Gleixner <tglx@linutronix.de>
To: Salvatore Bonaccorso <carnil@debian.org>,
Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Cc: Dusty Mabe <dustymabe@redhat.com>, Stefan Roese <sr@denx.de>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
Bjorn Helgaas <bhelgaas@google.com>,
Michal Simek <michal.simek@xilinx.com>,
Marek Vasut <marex@denx.de>,
x86@kernel.org, maz@kernel.org,
Andrew Cooper <andrew.cooper3@citrix.com>,
Juergen Gross <jgross@suse.com>
Subject: Re: [tip: irq/urgent] PCI/MSI: Mask MSI-X vectors only on success
Date: Wed, 27 Apr 2022 19:35:55 +0200
Message-ID: <87v8uuwhs4.ffs@tglx>
In-Reply-To: <Ymj3zzjQ9PwYaX/p@eldamar.lan>

On Wed, Apr 27 2022 at 09:59, Salvatore Bonaccorso wrote:
> On Mon, Mar 14, 2022 at 09:29:53PM +0100, Jeremi Piotrowski wrote:
>
> Does anyone here know the current state of the AWS instances using
> the older Xen version which causes the issues?
>
> AFAIU upstream is not planning to revert 83dbf898a2d4 ("PCI/MSI: Mask
> MSI-X vectors only on success") as it fixed a bug. Now several
> downstream distros carry a revert of this commit, which I believe
> is an unfortunate situation, and I wonder if this can be addressed
> upstream to deal with the AWS m4.large instance issues.
The problem is that except for a bisect result we've not seen much
information which might help to debug and analyze the problem.
The guest uses MSI-X on that NIC:
    Capabilities: [70] MSI-X: Enable+ Count=3 Masked-
So looking at the commit in question:

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 48e3f4e47b29..6748cf9d7d90 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -722,9 +722,6 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 		goto out_disable;
 	}
 
-	/* Ensure that all table entries are masked. */
-	msix_mask_all(base, tsize);
-
 	ret = msix_setup_entries(dev, base, entries, nvec, affd);
 	if (ret)
 		goto out_disable;
@@ -751,6 +748,16 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	/* Set MSI-X enabled bits and unmask the function */
 	pci_intx_for_msi(dev, 0);
 	dev->msix_enabled = 1;
+
+	/*
+	 * Ensure that all table entries are masked to prevent
+	 * stale entries from firing in a crash kernel.
+	 *
+	 * Done late to deal with a broken Marvell NVME device
+	 * which takes the MSI-X mask bits into account even
+	 * when MSI-X is disabled, which prevents MSI delivery.
+	 */
+	msix_mask_all(base, tsize);
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);

IOW, it moves the invocation of msix_mask_all() into the success
path.
As the device uses MSI-X this change does not make any difference from a
hardware perspective simply because _all_ MSI-X interrupts are masked
via the CTRL register already. And it does not matter whether the kernel
masks them individually _before_ or _after_ the allocation. At least not
on real hardware or on a sane emulation.
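
To make that argument concrete, here is a minimal user-space sketch. It is a toy model, not kernel code: struct msix_model, can_deliver() and init_window_leaks() are hypothetical names, and the model assumes hardware that follows the PCI rule that a vector delivers only when MSI-X is enabled, the function mask (MASKALL) is clear and the per-vector mask bit is clear.

```c
#include <stdbool.h>
#include <stddef.h>

#define NVEC 3	/* matches "Count=3" in the lspci output */

/* Toy model of the MSI-X state of one device, not a kernel structure. */
struct msix_model {
	bool enabled;			/* MSI-X Enable bit */
	bool maskall;			/* Function Mask (MASKALL) bit */
	bool vector_masked[NVEC];	/* per-entry mask bit */
};

/* Assumption: spec-conforming hardware checks all three conditions. */
static bool can_deliver(const struct msix_model *m, size_t vec)
{
	return m->enabled && !m->maskall && !m->vector_masked[vec];
}

static void mask_all_entries(struct msix_model *m)
{
	for (size_t i = 0; i < NVEC; i++)
		m->vector_masked[i] = true;
}

static bool any_deliverable(const struct msix_model *m)
{
	for (size_t i = 0; i < NVEC; i++)
		if (can_deliver(m, i))
			return true;
	return false;
}

/*
 * Walk through the init sequence with the per-vector masking done
 * either before (early) or after the entry setup. Returns true if any
 * vector could have fired at one of the checked points.
 */
static bool init_window_leaks(bool early)
{
	/* Stale entries start out unmasked, the function is masked. */
	struct msix_model m = { .enabled = true, .maskall = true };
	bool leaked = false;

	if (early)
		mask_all_entries(&m);
	leaked |= any_deliverable(&m);	/* during entry setup */
	if (!early)
		mask_all_entries(&m);
	leaked |= any_deliverable(&m);	/* before MASKALL is cleared */

	m.maskall = false;		/* function unmasked last */
	leaked |= any_deliverable(&m);
	return leaked;
}
```

In this model init_window_leaks() returns false for both orderings, which is the point: as long as MASKALL is honoured, the position of the per-vector masking is invisible to the device.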
Now this is XEN and neither real hardware nor sane emulation.
It must be a XEN_HVM guest because XEN_PV guests disable PCI/MSI[-X]
masking completely which makes the invocation of msix_mask_all() a NOP.
If it's not a XEN_HVM guest, then you can stop reading further as I'm
unable to decode why moving a NOP makes a difference. That belongs in
the realm of voodoo, but then XEN is voodoo, at least for me. :)
XEN guests do not use the common PCI mask/unmask machinery which would
unmask the interrupt on request_irq().
So I assume that the following happens:

    Guest                      Hypervisor

    msix_capability_init()
      ....
      alloc_irq()
        xen_magic()         -> alloc_msix_interrupt()
                               request_irq()

      msix_mask_all()       -> trap
                               do_magic()
    request_irq()
      unmask()
        xen_magic()
          unmask_evtchn()   -> do_more_magic()

So I assume further that msix_mask_all() actually is able to mask the
interrupts in the hardware (ctrl word of the vector table) despite the
hypervisor having allocated and requested the interrupt already.
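
The assumed sequence can be written down as a small state model. Everything in it is a guess that mirrors the assumptions above and not a description of what Xen actually does: the hypervisor is assumed to leave the hardware entry unmasked when it allocates the interrupt, the guest's msix_mask_all() is assumed to reach the hardware, and the guest's request_irq() is assumed to unmask only the event channel. All names are hypothetical.

```c
#include <stdbool.h>

/* Toy state of a single vector as seen from both sides. */
struct vec_state {
	bool hw_masked;		/* mask bit in the vector table entry */
	bool evtchn_masked;	/* Xen event channel mask in the guest */
};

/* Assumption: the hypervisor sets up the entry and leaves it unmasked. */
static void hyp_alloc_and_request(struct vec_state *v)
{
	v->hw_masked = false;
}

/* Assumption: the guest write to the vector table reaches the hardware. */
static void guest_msix_mask_all(struct vec_state *v)
{
	v->hw_masked = true;
}

/* Assumption: request_irq() in a Xen guest only unmasks the evtchn. */
static void guest_request_irq(struct vec_state *v)
{
	v->evtchn_masked = false;
}

/*
 * Replay the init sequence with msix_mask_all() either before the
 * allocation (old order) or after it (order after the commit).
 * Returns true if the interrupt can reach the guest afterwards.
 */
static bool irq_delivered_after_init(bool mask_after_alloc)
{
	struct vec_state v = { .hw_masked = false, .evtchn_masked = true };

	if (!mask_after_alloc)
		guest_msix_mask_all(&v);
	hyp_alloc_and_request(&v);
	if (mask_after_alloc)
		guest_msix_mask_all(&v);
	guest_request_irq(&v);

	return !v.hw_masked && !v.evtchn_masked;
}
```

Under these assumptions the old order ends with a deliverable interrupt, while the new order leaves the hardware entry masked with nothing in the guest that would ever unmask it again.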
Nothing in XEN_HVM handles PCI/MSI[-X] mask/unmask in the guest, so I
really have to ask why XEN_HVM does not disable PCI/MSI[-X] masking like
XEN_PV does. I can only assume the answer is voodoo...
Maybe the XEN people have some more enlightened answers to that.
Thanks,
tglx