Date: Thu, 16 Feb 2012 12:27:35 -0500
From: Don Zickus
To: Yinghai Lu
Cc: "Eric W. Biederman", linux-kernel@vger.kernel.org, mingo@redhat.com,
    hpa@zytor.com, torvalds@linux-foundation.org, kexec@lists.infradead.org,
    vgoyal@redhat.com, akpm@linux-foundation.org, tglx@linutronix.de,
    mingo@elte.hu, linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/debug] x86/kdump: No need to disable ioapic/lapic in crash path
Message-ID: <20120216172735.GX9751@redhat.com>

On Mon, Feb 13, 2012 at 10:16:00AM -0800, Yinghai Lu wrote:
> On Mon, Feb 13, 2012 at 8:51 AM, Yinghai Lu wrote:
> >> So I suspect we have a bug in our apic initialization somewhere, but
> >> apic initialization should happen after printk are enabled.  Or at least
> >> after early printks so the reset YH is reporting doesn't make much sense.
> >
> > will try Don's first version patch that only removing disable_IO_APIC.
>
> first version patch (only removing disable_IO_APIC) is working.

So I think I figured it out.  I went through and commented out code in
disable_local_APIC until I narrowed it down to the piece of code that
needs to be disabled for it to work.

Surprise, surprise... it's LVTPC or perf! :-)  Actually it is the
nmi_watchdog, which uses perf.

My theory is that NMIs are not disabled, so one gets generated by the
local apic during decompression (just bad timing) and *splat*.

Yinghai, you can probably prove this by running

	echo 0 > /proc/sys/kernel/nmi_watchdog

and then doing your kdump crash test.  At least that test worked for me.

So either we explicitly shut down perf or just mask off LVTPC in a
modified disable_local_APIC?

Eric, thoughts, preferences?

Cheers,
Don
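
For illustration, the second option (masking off LVTPC) comes down to
setting the mask bit in that one LVT entry and leaving everything else
alone.  Below is a minimal userspace sketch, not kernel code and not a
proposed patch.  It assumes the APIC_LVTPC, APIC_LVT_MASKED and
APIC_DM_NMI values from arch/x86/include/asm/apicdef.h; fake_apic[],
fake_apic_read(), fake_apic_write() and mask_lvtpc() are hypothetical
names that stand in for the real memory-mapped register accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in arch/x86/include/asm/apicdef.h. */
#define APIC_LVTPC      0x340u       /* LVT performance counter register offset */
#define APIC_LVT_MASKED (1u << 16)   /* mask bit, common to all LVT entries */
#define APIC_DM_NMI     0x00400u     /* delivery mode: NMI */

/* Hypothetical stand-in for the memory-mapped local APIC register page. */
#define FAKE_APIC_WORDS (0x400u / 4u)
static uint32_t fake_apic[FAKE_APIC_WORDS];

/* Hypothetical accessor: read a 32-bit register at byte offset 'reg'. */
static uint32_t fake_apic_read(uint32_t reg)
{
    assert(reg % 4u == 0u && reg / 4u < FAKE_APIC_WORDS);
    return fake_apic[reg / 4u];
}

/* Hypothetical accessor: write a 32-bit register at byte offset 'reg'. */
static void fake_apic_write(uint32_t reg, uint32_t value)
{
    assert(reg % 4u == 0u && reg / 4u < FAKE_APIC_WORDS);
    fake_apic[reg / 4u] = value;
}

/*
 * Hypothetical helper: mask only the perf counter LVT entry so that a
 * counter overflow can no longer be delivered as an NMI.  The delivery
 * mode and the other bits of the entry are preserved.
 */
static void mask_lvtpc(void)
{
    uint32_t v = fake_apic_read(APIC_LVTPC);

    fake_apic_write(APIC_LVTPC, v | APIC_LVT_MASKED);
}
```

With the entry programmed the way the watchdog would arm it, i.e.
fake_apic_write(APIC_LVTPC, APIC_DM_NMI), a call to mask_lvtpc() leaves
the NMI delivery mode in place and only adds the mask bit.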