From: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
To: Sinan Kaya <okaya@kernel.org>,
	linux-pci@vger.kernel.org, kexec@lists.infradead.org,
	x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bhelgaas@google.com,
	dyoung@redhat.com, bhe@redhat.com, vgoyal@redhat.com,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	hpa@zytor.com, andi@firstfloor.org, lukas@wunner.de,
	billy.olsen@canonical.com, cascardo@canonical.com,
	ddstreet@canonical.com, fabiomirmar@canonical.com,
	gavin.guo@canonical.com, jay.vosburgh@canonical.com,
	kernel@gpiccoli.net, mfo@canonical.com,
	shan.gavin@linux.alibaba.com
Subject: Re: [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
Date: Thu, 18 Oct 2018 17:13:10 -0300
Message-ID: <12d6175b-7f09-872a-61c4-700e905579c7@canonical.com>
In-Reply-To: <6fd4e2d2-c0ac-b26d-9a14-0379b4421679@kernel.org>



On 18/10/2018 17:08, Sinan Kaya wrote:
> On 10/18/2018 2:37 PM, Guilherme G. Piccoli wrote:
>> We observed a kdump failure in x86 that was narrowed down to MSI irq
>> storm coming from a PCI network device. The bug manifests as a lack of
>> progress in the boot process of kdump kernel, and a flood of kernel
>> messages like:
>>
>> [...]
>> [ 342.265294] do_IRQ: 0.155 No irq handler for vector
>> [ 342.266916] do_IRQ: 0.155 No irq handler for vector
>> [ 347.258422] do_IRQ: 14053260 callbacks suppressed
>> [...]
> 
> These kind of issues are usually fixed by fixing the network driver's
> shutdown routine to ensure that MSI interrupts are cleared there.


Sinan, I'm not sure drivers' shutdown handlers are called in a panic
kexec (I remember an old experiment in which loading a kernel
with "kexec -p" didn't trigger the handlers).

But this case is even worse, because the NICs were in PCI passthrough
mode, using vfio. So, they were completely unaware of what happened
in the host kernel.

Also, this is spec compliant: system reset events should guarantee the
bits are cleared (although kexec is not exactly a system reset, it's
similar).
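
For readers who want to see what "clearing the bits" amounts to, here is a
minimal sketch. It is not the code from the patch: it works on an in-memory
copy of a device's 256-byte PCI configuration space rather than on real
hardware, and the helper names (cfg_read16, cfg_write16, clear_msi_enable,
make_demo_config) are hypothetical. The register offsets and bit masks are
the standard PCI ones, and the macro names are meant to mirror those in
Linux's pci_regs.h. The sketch walks the capability list and clears the
enable bit in the Message Control word of any MSI or MSI-X capability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard PCI configuration space offsets and bits. */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10   /* device has a capability list */
#define PCI_CAPABILITY_LIST   0x34   /* offset of first capability */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define PCI_MSI_FLAGS         2      /* Message Control, same offset for MSI and MSI-X */
#define PCI_MSI_FLAGS_ENABLE  0x0001
#define PCI_MSIX_FLAGS_ENABLE 0x8000

/* Hypothetical accessors over an in-memory config space (little-endian). */
static uint16_t cfg_read16(const uint8_t *cfg, int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, int off, uint16_t val)
{
	cfg[off] = (uint8_t)(val & 0xff);
	cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Walk the capability list and clear the MSI / MSI-X enable bits.
 * Returns the number of capabilities whose enable bit was actually cleared.
 */
static int clear_msi_enable(uint8_t *cfg)
{
	int cleared = 0;
	int ttl = 48;		/* guard against a looping capability list */
	uint8_t pos;

	assert(cfg != NULL);

	if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST] & ~3;
	while (pos >= 0x40 && ttl--) {
		uint8_t id = cfg[pos];
		uint16_t mask = 0;

		if (id == PCI_CAP_ID_MSI)
			mask = PCI_MSI_FLAGS_ENABLE;
		else if (id == PCI_CAP_ID_MSIX)
			mask = PCI_MSIX_FLAGS_ENABLE;

		if (mask) {
			uint16_t ctrl = cfg_read16(cfg, pos + PCI_MSI_FLAGS);

			if (ctrl & mask) {
				cfg_write16(cfg, pos + PCI_MSI_FLAGS,
					    (uint16_t)(ctrl & ~mask));
				cleared++;
			}
		}
		pos = cfg[pos + 1] & ~3;	/* next capability pointer */
	}
	return cleared;
}

/* Hypothetical test fixture: a device with MSI and MSI-X both enabled. */
static void make_demo_config(uint8_t *cfg)
{
	memset(cfg, 0, 256);
	cfg_write16(cfg, PCI_STATUS, PCI_STATUS_CAP_LIST);
	cfg[PCI_CAPABILITY_LIST] = 0x50;
	cfg[0x50] = PCI_CAP_ID_MSI;   cfg[0x51] = 0x70;
	cfg_write16(cfg, 0x52, 0x0081);	/* MSI enabled, 64-bit capable */
	cfg[0x70] = PCI_CAP_ID_MSIX;  cfg[0x71] = 0x00;
	cfg_write16(cfg, 0x72, 0x8007);	/* MSI-X enabled, 8 table entries */
}
```

An early-boot version would presumably replace the two accessors with real
configuration space reads and writes for each bus/device/function it scans;
that part is left out here because it depends on the patch itself.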

Cheers,


Guilherme
