From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
  id S1751996AbdAXBpA (ORCPT ); Mon, 23 Jan 2017 20:45:00 -0500
Received: from mx1.redhat.com ([209.132.183.28]:43730 "EHLO mx1.redhat.com"
  rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
  id S1751838AbdAXBo7 (ORCPT ); Mon, 23 Jan 2017 20:44:59 -0500
Reply-To: xlpang@redhat.com
Subject: Re: [PATCH] x86/mce: Keep quiet in case of broadcasted mce after system panic
References: <1485158511-22374-1-git-send-email-xlpang@redhat.com>
  <20170123125157.u2kefedwpvgcdyfo@pd.tnic>
  <588606B9.3070604@redhat.com>
  <20170123145056.fyraeehjfnwmmfb6@pd.tnic>
  <20170123174008.GA4945@intel.com>
  <20170123175130.l7c7mnmu74ln5v6h@pd.tnic>
To: Borislav Petkov, "Luck, Tony"
Cc: xlpang@redhat.com, x86@kernel.org, linux-kernel@vger.kernel.org,
  kexec@lists.infradead.org, Ingo Molnar, Dave Young, Prarit Bhargava,
  Junichi Nomura, Kiyoshi Ueda, Naoya Horiguchi
From: Xunlei Pang
Message-ID: <5886B208.90804@redhat.com>
Date: Tue, 24 Jan 2017 09:46:48 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <20170123175130.l7c7mnmu74ln5v6h@pd.tnic>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
  (mx1.redhat.com [10.5.110.39]); Tue, 24 Jan 2017 01:44:54 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/24/2017 at 01:51 AM, Borislav Petkov wrote:
> Hey Tony,
>
> a "welcome back" is in order? :-)
>
> On Mon, Jan 23, 2017 at 09:40:09AM -0800, Luck, Tony wrote:
>> If the system had experienced some memory corruption, but
>> recovered ... then there would be some pages sitting around
>> that the old kernel had marked as POISON and stopped using.
>> The kexec'd kernel doesn't know about these, so may touch that
>> memory while taking a crash dump ...
> Hmm, pass a list of poisoned pages to the kdump kernel so as not to
> touch. Looks like there's already functionality for that:
>
> "makedumpfile can exclude the following types of pages while copying
> VMCORE to DUMPFILE, and a user can choose which type of pages will be
> excluded.
>
> - Pages filled with zero
> - Cache pages
> - User process data pages
> - Free pages"
>
> (there is a makedumpfile manpage somewhere)
>
> And apparently crash knows about poisoned pages and handles them:
>
> static int __init crash_save_vmcoreinfo_init(void)
> {
> 	...
> #ifdef CONFIG_MEMORY_FAILURE
> 	VMCOREINFO_NUMBER(PG_hwpoison);
> #endif
>
> so if that works, the kexeced kernel should know about that list.

From the log in my previous reply, the MCE occurred before makedumpfile
started dumping, so I wonder whether the poisoned pages belong to the
crash-reserved memory, or whether some other type of event is involved.

Besides, some kdump setups may not use makedumpfile at all; for example,
a simple "cp" is also allowed to process "/proc/vmcore".

>
>> and then you have a broadcast machine check (on older[1] Intel CPUs
>> that don't support local machine check).
> Right.
>
>> This is hard to work around. You really need all the CPUs to have set
>> CR4.MCE=1 (if any didn't, then they will force a reset when they see
>> the machine check). Also you need to make sure that they jump to the
>> copy of do_machine_check() in the new kernel, not the old kernel.
> Doesn't matter, right? The new copy is as clueless as the old one about
> those MCEs.
>

It's the code in mce_start(): it waits for all the online CPUs,
including the CPUs that kdump boots on, to synchronize.

For the new MCE handler in the kdump kernel this is fine, because its
count of online CPUs is correct. For the old MCE handler in the 1st
kernel it is not: some CPUs that the 1st kernel still regards as online
are now running the 2nd kernel, so they cannot respond to the old MCE
handler, and the old handler times out.
Regards,
Xunlei