From: Yinghai Lu <yinghai@kernel.org>
To: linux-kernel@vger.kernel.org, mingo@redhat.com, hpa@zytor.com,
	yinghai@kernel.org, torvalds@linux-foundation.org,
	kexec@lists.infradead.org, vgoyal@redhat.com,
	ebiederm@xmission.com, akpm@linux-foundation.org,
	tglx@linutronix.de, dzickus@redhat.com, mingo@elte.hu
Cc: linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/debug] x86/kdump: No need to disable ioapic/ lapic in crash path
Date: Sat, 11 Feb 2012 17:04:15 -0800
Message-ID: <CAE9FiQWRjvu_uNcx_bJXYHeg4D4k6YzHHmSudsq7EyubLi7+nw@mail.gmail.com>
In-Reply-To: <tip-d9bc9be89629445758670220787683e37c93f6c1@git.kernel.org>

On Sat, Feb 11, 2012 at 3:09 PM, tip-bot for Don Zickus
<dzickus@redhat.com> wrote:
> Commit-ID:  d9bc9be89629445758670220787683e37c93f6c1
> Gitweb:     http://git.kernel.org/tip/d9bc9be89629445758670220787683e37c93f6c1
> Author:     Don Zickus <dzickus@redhat.com>
> AuthorDate: Thu, 9 Feb 2012 16:53:41 -0500
> Committer:  Ingo Molnar <mingo@elte.hu>
> CommitDate: Sat, 11 Feb 2012 15:38:53 +0100
>
> x86/kdump: No need to disable ioapic/lapic in crash path
>
> A customer of ours noticed when their machine crashed, kdump did
> not work but hung instead.  Using their firmware dumping
> solution they grabbed a vmcore and decoded the stacks on the
> cpus.  What they noticed seemed to be a rare deadlock with the
> ioapic_lock.
>
>  CPU4:
>  machine_crash_shutdown
>  -> machine_ops.crash_shutdown
>    -> native_machine_crash_shutdown
>       -> kdump_nmi_shootdown_cpus ------> Send NMI to other CPUs
>       -> disable_IO_APIC
>          -> clear_IO_APIC
>             -> clear_IO_APIC_pin
>                -> ioapic_read_entry
>                   -> spin_lock_irqsave(&ioapic_lock, flags)
>                   ---Infinite loop here---
>
>  CPU0:
>  do_IRQ
>  -> handle_irq
>    -> handle_edge_irq
>        -> ack_apic_edge
>           -> move_native_irq
>               -> mask_IO_APIC_irq
>                  -> mask_IO_APIC_irq_desc
>                     -> spin_lock_irqsave(&ioapic_lock, flags)
>                     ---Receive NMI here after getting spinlock---
>                        -> nmi
>                           -> do_nmi
>                              -> crash_nmi_callback
>                              ---Infinite loop here---
>
> The problem is that although kdump tries to shut down minimal
> hardware, it still needs to disable the IO APIC.  This requires
> spinlocks which may be held by another cpu.  This other cpu is
> being held indefinitely in an NMI context by kdump in order to
> serialize the crashing path.  Instant deadlock.
>
> Eric brought up a point that because the boot code was
> restructured we may not need to disable the io apic any more in
> the crash path.  The original concern that led to the
> development of disable_IO_APIC, was that the jiffies calibration
> on boot up relied on the PIT timer for reference.  Access to the
> PIT required 8259 interrupts to be working.  This wouldn't work
> if the ioapic needed to be configured.  So on the panic path, the
> ioapic was reconfigured to use virtual wire mode to allow the
> 8259 to pass through.
>
> Those concerns don't hold true now, thanks to the jiffies
> calibration code not needing the PIT.  As a result, we can
> remove this call and simplify the locking needed in the panic
> path.
>
> The same work allowed us to remove the need to disable the local
> apic on shutdown too.  This should allow us to jump to the
> second kernel a little faster.
>
> I tested kdump on an Ivy Bridge platform, a Pentium4 and an old
> athlon that did not have an ioapic.  All three were successful.
>
> I also tested using lkdtm that would use jprobes to panic the
> system when entering do_IRQ.  The idea was to see how the system
> reacted with an interrupt pending in the second kernel.  My
> core2 quad successfully kdump'd 3 times in a row with no issues.
>
> v2: removed the disable lapic code too

With this commit, kdump no longer works on my setups with
Nehalem, Westmere, and Sandy Bridge.
These setups all have VT-d enabled.

After reverting this commit, kdump works again.

So I assume you need to drop this patch.

Thanks

Yinghai Lu
