linux-pci.vger.kernel.org archive mirror
From: Sinan Kaya <okaya@kernel.org>
To: "Guilherme G. Piccoli" <gpiccoli@canonical.com>,
	linux-pci@vger.kernel.org, kexec@lists.infradead.org,
	x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bhelgaas@google.com,
	dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	hpa@zytor.com, andi@firstfloor.org, lukas@wunner.de,
	billy.olsen@canonical.com, cascardo@canonical.com,
	ddstreet@canonical.com, fabiomirmar@canonical.com,
	gavin.guo@canonical.com, jay.vosburgh@canonical.com,
	kernel@gpiccoli.net, mfo@canonical.com,
	shan.gavin@linux.alibaba.com
Subject: Re: [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
Date: Thu, 18 Oct 2018 16:30:22 -0400
Message-ID: <50d84d48-eebf-ed91-8148-be727f76883f@kernel.org>
In-Reply-To: <12d6175b-7f09-872a-61c4-700e905579c7@canonical.com>

On 10/18/2018 4:13 PM, Guilherme G. Piccoli wrote:
>> This kind of issue is usually fixed by fixing the network driver's
>> shutdown routine to ensure that MSI interrupts are cleared there.
> 
> Sinan, I'm not sure shutdown handlers for drivers are called in panic
> kexec (I remember an old experiment I did: loading a kernel
> with "kexec -p" didn't trigger the handlers).

AFAIK, all shutdown (not remove) routines are called before launching the next
kernel, even in a crash scenario. It is not safe to start the new kernel while
hardware is still doing DMA to system memory and triggering interrupts.

The shutdown routine in the PCI core used to disable MSI/MSI-X on behalf of all
endpoints, but it was later decided that this is the responsibility of the
endpoint driver:

commit fda78d7a0ead144f4b2cdb582dcba47911f4952c
Author: Prarit Bhargava <prarit@redhat.com>
Date:   Thu Jan 26 14:07:47 2017 -0500

     PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()

     The pci_bus_type .shutdown method, pci_device_shutdown(), is called from
     device_shutdown() in the kernel restart and shutdown paths.

     Previously, pci_device_shutdown() called pci_msi_shutdown() and
     pci_msix_shutdown().  This disables MSI and MSI-X, which causes the device
     to fall back to raising interrupts via INTx.  But the driver is still bound
     to the device, it doesn't know about this change, and it likely doesn't
     have an INTx handler, so these INTx interrupts cause "nobody cared"
     warnings like this:

       irq 16: nobody cared (try booting with the "irqpoll" option)
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
       Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/
       ...

     The MSI disabling code was added by d52877c7b1af ("pci/irq: let
     pci_device_shutdown to call pci_msi_shutdown v2") because a driver left MSI
     enabled and kdump failed because the kexeced kernel wasn't prepared to
     receive the MSI interrupts.

     Subsequent commits 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even
     if kernel doesn't support MSI") and e80e7edc55ba ("PCI/MSI: Initialize MSI
     capability for all architectures") changed the kexeced kernel to disable
     all MSIs itself so it no longer depends on the crashed kernel to clean up
     after itself.

     Stop disabling MSI/MSI-X in pci_device_shutdown().  This resolves the
     "nobody cared" unhandled IRQ issue above.  It also allows PCI serial
     devices, which may rely on the MSI interrupts, to continue outputting
     messages during reboot/shutdown.

     [bhelgaas: changelog, drop pci_msi_shutdown() and pci_msix_shutdown() calls
     altogether]
     Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=187351
     Signed-off-by: Prarit Bhargava <prarit@redhat.com>
     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
     CC: Alex Williamson <alex.williamson@redhat.com>
     CC: David Arcari <darcari@redhat.com>
     CC: Myron Stowe <mstowe@redhat.com>
     CC: Lukas Wunner <lukas@wunner.de>
     CC: Keith Busch <keith.busch@intel.com>
     CC: Mika Westerberg <mika.westerberg@linux.intel.com>
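
For reference, here is a rough user-space sketch of what "disabling MSI/MSI-X"
amounts to at the config-space level: walk the capability list and clear the
enable bit in the Message Control word of each MSI (ID 0x05) and MSI-X
(ID 0x11) capability. The register offsets and bit positions follow the PCI
spec; everything else is an assumption made for illustration. The function
names (cfg_read16, cfg_write16, clear_msi_enable) are hypothetical, the code
operates on an in-memory copy of a 256-byte config space rather than on real
hardware, and it is not the kernel's implementation nor the code from this
patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SIZE             256     /* conventional PCI config space */
#define PCI_STATUS           0x06
#define PCI_STATUS_CAP_LIST  0x10    /* Status bit 4: capability list present */
#define PCI_CAPABILITY_LIST  0x34    /* pointer to first capability */
#define PCI_CAP_ID_MSI       0x05
#define PCI_CAP_ID_MSIX      0x11
#define MSI_FLAGS_ENABLE     0x0001  /* MSI Message Control bit 0 */
#define MSIX_FLAGS_ENABLE    0x8000  /* MSI-X Message Control bit 15 */

/* Hypothetical helpers: little-endian 16-bit access to a config space copy. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned int off, uint16_t val)
{
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Clear the MSI and MSI-X enable bits found in the capability list.
 * Returns how many capabilities had their enable bit set and cleared.
 */
int clear_msi_enable(uint8_t *cfg)
{
    int cleared = 0;
    int ttl = 48;                    /* bound the walk in case the list loops */
    unsigned int pos;

    assert(cfg != NULL);

    if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return 0;                    /* device has no capability list */

    pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
    while (pos >= 0x40 && pos <= CFG_SIZE - 4 && ttl-- > 0) {
        uint8_t id = cfg[pos];
        uint16_t ctrl = cfg_read16(cfg, pos + 2);
        uint16_t mask = 0;

        if (id == PCI_CAP_ID_MSI)
            mask = MSI_FLAGS_ENABLE;
        else if (id == PCI_CAP_ID_MSIX)
            mask = MSIX_FLAGS_ENABLE;

        if (ctrl & mask) {
            cfg_write16(cfg, pos + 2, (uint16_t)(ctrl & ~mask));
            cleared++;
        }
        pos = cfg[pos + 1] & ~3u;    /* next capability pointer */
    }
    return cleared;
}
```

On real hardware the reads and writes would go through config-space accessors
instead of a byte array, but the walk and the bits touched would be the same.
Only the enable bits are cleared; the rest of Message Control is left alone.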



> 
> But this case is even worse, because the NICs were in PCI passthrough
> mode, using vfio. So, they were completely unaware of what happened
> in the host kernel.
> 
> Also, this is spec compliant - system reset events should guarantee the
> bits are cleared (although kexec is not exactly a system reset, it's
> similar)



