linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Seiji Aguchi <seiji.aguchi@hds.com>
To: Don Zickus <dzickus@redhat.com>,
	"ebiederm@xmission.com" <ebiederm@xmission.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kexec-list <kexec@lists.infradead.org>,
	Vivek Goyal <vgoyal@redhat.com>
Subject: RE: [PATCH] x86, kdump: No need to disable ioapic in crash path
Date: Wed, 2 May 2012 15:10:34 -0400	[thread overview]
Message-ID: <5C4C569E8A4B9B42A84A977CF070A35B2E4D461583@USINDEVS01.corp.hds.com> (raw)
In-Reply-To: <20120430205354.GA32472@redhat.com>


> Perhaps calling setup_IO_APIC before setup_local_APIC would be a better fix?

I checked Intel develper's manual and there is no restriction about the order of enabling IO_APIC/local_APIC.
So, it may work.

But, I don't understand why we have to change the stable boot-up code.
If kdump disables both local_apic and IO_APIC in proper way in 1st kernel,  2nd kernel works without any change.

I think busting spinlocks ,like io_apic_lock, in 1st kernel is reasonable.

Seiji

> -----Original Message-----
> From: Don Zickus [mailto:dzickus@redhat.com]
> Sent: Monday, April 30, 2012 4:54 PM
> To: Seiji Aguchi; ebiederm@xmission.com
> Cc: x86@kernel.org; LKML; kexec-list; Vivek Goyal
> Subject: Re: [PATCH] x86, kdump: No need to disable ioapic in crash path
> 
> On Thu, Mar 15, 2012 at 05:16:50PM -0400, Seiji Aguchi wrote:
> > Don,
> >
> > What do you think about following scenario?
> > Disabling I/O APIC seems to be needed before booting kdump kernel.
> 
> For some reason I actually believed this was cleared before interrupts were enabled on bootup.  Apparently not.  On a virt guest I can
> easily create a scenario in which scp'ing a file then kdumping, leaves the ethernet interrupt in a triggered state.
> 
> Before this patch, it would be masked by disable_IO_APIC.  With my patch the irq nevers gets masked and during setup_local_APIC
> the kernel falls over once the local APIC is enabled (as setup_IO_APIC is called later).
> Perhaps calling setup_IO_APIC before setup_local_APIC would be a better fix?
> 
> Just like NMIs prohibit the abilty to remove the disable local apic code, an actively triggered interrupt seems to prevent us from
> removing the disable io apic.
> 
> This leaves me with my original problem of deadlocking in the disable_IO_APIC path.
> 
> Thoughts?
> 
> Cheers,
> Don
> 
> >
> > Seiji
> >
> >
> > commit 1e75b31d638d5242ca8e9771dfdcbd28a5f041df
> > Author: Suresh Siddha <suresh.b.siddha@intel.com>
> > Date:   Thu Aug 25 12:01:11 2011 -0700
> >
> >     x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC
> >
> >     In the kdump scenario mentioned below, we can have a case where
> >     the device using level triggered interrupt will not generate any
> >     interrupts in the kdump kernel.
> >
> >     1. IO-APIC sends a level triggered interrupt to the CPU's local APIC.
> >
> >     2. Kernel crashed before the CPU services this interrupt, leaving
> >        the remote-IRR in the IO-APIC set.
> >
> >     3. kdump kernel boot sequence does clear_IO_APIC() as part of IO-APIC
> >        initialization. But this fails to reset remote-IRR bit of the
> >        IO-APIC RTE as the remote-IRR bit is read-only.
> >
> >     4. Device using that level triggered entry can't generate any
> >        more interrupts because of the remote-IRR bit.
> >
> >     In clear_IO_APIC_pin(), check if the remote-IRR bit is set and if
> >     so do an explicit attempt to clear it (by doing EOI write on
> >     modern io-apic's and changing trigger mode to edge/level on
> >     older io-apic's). Also before doing the explicit EOI to the
> >     io-apic, ensure that the trigger mode is indeed set to level.
> >     This will enable the explicit EOI to the io-apic to reset the
> >     remote-IRR bit.
> >
> >     Tested-by: Leonardo Chiquitto <lchiquitto@novell.com>
> >     Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> >     Fixes: https://bugzilla.novell.com/show_bug.cgi?id=701686
> >     Cc: Rafael Wysocki <rjw@novell.com>
> >     Cc: Maciej W. Rozycki <macro@linux-mips.org>
> >     Cc: Thomas Renninger <trenn@suse.de>
> >     Cc: jbeulich@novell.com
> >     Cc: yinghai@kernel.org
> >     Link: http://lkml.kernel.org/r/20110825190657.157502602@sbsiddha-desk.sc.intel.com
> >     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> >
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org
> > > [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Don Zickus
> > > Sent: Thursday, March 15, 2012 4:27 PM
> > > To: x86@kernel.org
> > > Cc: LKML; kexec-list; Eric W. Biederman; Vivek Goyal
> > > Subject: Re: [PATCH] x86, kdump: No need to disable ioapic in crash
> > > path
> > >
> > > On Wed, Feb 29, 2012 at 03:08:49PM -0500, Don Zickus wrote:
> > > > A customer of ours noticed when their machine crashed, kdump did
> > > > not work but hung instead.  Using their firmware dumping solution
> > > > they grabbed a vmcore and decoded the stacks on the cpus.  What
> > > > they noticed seemed to be a rare deadlock with the ioapic_lock.
> > >
> > > While we are discussing the NMI stuff in another thread, does anyone
> > > have any objection to committing this patch.  It fixes a real problem today.
> > >
> > > Cheers,
> > > Don
> > >
> > > >
> > > >  CPU4:
> > > >  machine_crash_shutdown
> > > >  -> machine_ops.crash_shutdown
> > > >     -> native_machine_crash_shutdown
> > > >        -> kdump_nmi_shootdown_cpus ------> Send NMI to other CPUs
> > > >        -> disable_IO_APIC
> > > >           -> clear_IO_APIC
> > > >              -> clear_IO_APIC_pin
> > > >                 -> ioapic_read_entry
> > > >                    -> spin_lock_irqsave(&ioapic_lock, flags)
> > > >                    ---Infinite loop here---
> > > >
> > > >  CPU0:
> > > >  do_IRQ
> > > >  -> handle_irq
> > > >     -> handle_edge_irq
> > > >         -> ack_apic_edge
> > > >            -> move_native_irq
> > > >                -> mask_IO_APIC_irq
> > > >                   -> mask_IO_APIC_irq_desc
> > > >                      -> spin_lock_irqsave(&ioapic_lock, flags)
> > > >                      ---Receive NMI here after getting spinlock---
> > > >                         -> nmi
> > > >                            -> do_nmi
> > > >                               -> crash_nmi_callback
> > > >                               ---Infinite loop here---
> > > >
> > > > The problem is that although kdump tries to shutdown minimal
> > > > hardware, it still needs to disable the IO APIC.  This requires
> > > > spinlocks which may be held by another cpu.  This other cpu is
> > > > being held infinitely in an NMI context by kdump in order to serialize the crashing path.
> > > > Instant deadlock.
> > > >
> > > > Eric, brought up a point that because the boot code was
> > > > restructured we may not need to disable the io apic any more in the crash path.
> > > > The original concern that led to the development of
> > > > disable_IO_APIC, was that the jiffies calibration on boot up
> > > > relied on the PIT timer for reference.  Access to the PIT required
> > > > 8259 interrupts to be working.  This wouldn't work if the ioapic needed to be configured.
> > > > So on panic path, the ioapic was reconfigured to use virtual wire mode to allow the 8259 to passthrough.
> > > >
> > > > Those concerns don't hold true now, thanks to the jiffies
> > > > calibration code not needing the PIT.  As a result, we can remove
> > > > this call and simplify the locking needed in the panic path.
> > > >
> > > > I tested kdump on an Ivy Bridge platform, a Pentium4 and an old
> > > > athlon that did not have an ioapic.  All three were successful.
> > > >
> > > > I also tested using lkdtm that would use jprobes to panic the
> > > > system when entering do_IRQ.  The idea was to see how the system
> > > > reacted with an interrupt pending in the second kernel.  My core2
> > > > quad successfully kdump'd
> > > > 3 times in a row with no issues.
> > > >
> > > > v2: removed the disable lapic code too
> > > > v3: re-add disabling of lapic code
> > > >
> > > > Cc: Eric W. Biederman <ebiederm@xmission.com>
> > > > Cc: Vivek Goyal <vgoyal@redhat.com>
> > > > Signed-off-by: Don Zickus <dzickus@redhat.com>
> > > > ---
> > > >
> > > > There are really two problems here.  One is the deadlock of the
> > > > ioapic_lock that I describe above.  Removing the code to disable
> > > > the ioapic seems to resolve that.
> > > >
> > > > The second issue is handling non-IRQ exceptions like NMIs.  Eric
> > > > asked me to include removing the disable lapic code too.  However,
> > > > because the nmi watchdog is stil active and kexec zeros out the
> > > > idt before it jumps to purgatory, an NMI that comes in during the
> > > > transition between the first kernel and second kernel will see an empty idt and reset the cpu.
> > > >
> > > > Leaving the code to disable the lapic in, turns off perf and
> > > > blocks those NMIs from happening (though an external NMI would
> > > > still be an issue but that is no different than right now).
> > > >
> > > > I tried playing with a stub idt and leaving it in place through
> > > > the transition to the second kernel, but I can't quite get it to
> > > > work correctly.  Spinning in the first kernel before the purgatory
> > > > jump catches the idt properly.  Spinning in purgatory before the
> > > > second kernel jump doesn't.  I even disabled the zero'ing out of the idt in the purgatory code.
> > > >
> > > > I would like to get resolution on the ioapic deadlock to fix a
> > > > customer issue while working the idt and NMI thing on the side,
> > > > hence the split of this patchset.
> > > >
> > > > Hopefully, people recognize there are two issues here and that
> > > > this patch resolves the first one and the second one needs more debugging and time.
> > > > ---
> > > >  arch/x86/kernel/crash.c |    3 ---
> > > >  1 files changed, 0 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> > > > index
> > > > 13ad899..b053cf9 100644
> > > > --- a/arch/x86/kernel/crash.c
> > > > +++ b/arch/x86/kernel/crash.c
> > > > @@ -96,9 +96,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
> > > >  	cpu_emergency_svm_disable();
> > > >
> > > >  	lapic_shutdown();
> > > > -#if defined(CONFIG_X86_IO_APIC)
> > > > -	disable_IO_APIC();
> > > > -#endif
> > > >  #ifdef CONFIG_HPET_TIMER
> > > >  	hpet_disable();
> > > >  #endif
> > > > --
> > > > 1.7.7.6
> > > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe
> > > linux-kernel" in the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at  http://www.tux.org/lkml/

  reply	other threads:[~2012-05-02 19:10 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-29 20:08 [PATCH] x86, kdump: No need to disable ioapic in crash path Don Zickus
2012-03-15 20:26 ` Don Zickus
2012-03-15 21:16   ` Seiji Aguchi
2012-03-15 21:33     ` Don Zickus
2012-03-15 21:37       ` Seiji Aguchi
2012-04-30 20:53     ` Don Zickus
2012-05-02 19:10       ` Seiji Aguchi [this message]
2012-05-02 19:39         ` Eric W. Biederman
2012-05-02 19:59           ` Don Zickus
2012-05-02 20:24             ` Eric W. Biederman
2012-03-29 16:02 ` Don Zickus
  -- strict thread matches above, loose matches on Subject: below --
2012-02-02 18:12 Don Zickus
2012-02-02 23:24 ` Eric W. Biederman
2012-02-07 21:57   ` Don Zickus
2012-02-07 22:19     ` Vivek Goyal
2012-02-07 23:35       ` Eric W. Biederman
2012-02-08 20:11         ` Don Zickus
2012-02-08 22:55           ` Eric W. Biederman
2012-02-09 14:48             ` Don Zickus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5C4C569E8A4B9B42A84A977CF070A35B2E4D461583@USINDEVS01.corp.hds.com \
    --to=seiji.aguchi@hds.com \
    --cc=dzickus@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vgoyal@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).