From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752541AbcGMCEf (ORCPT ); Tue, 12 Jul 2016 22:04:35 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47518 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750908AbcGMCE0 (ORCPT ); Tue, 12 Jul 2016 22:04:26 -0400 Date: Wed, 13 Jul 2016 10:03:36 +0800 From: "'Dave Young'" To: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?= Cc: Michal Hocko , Toshi Kani , "Peter Zijlstra (Intel)" , Vitaly Kuznetsov , Minfei Huang , "H. Peter Anvin" , Daniel Walker , Ingo Molnar , Takao Indoh , Baoquan He , "x86@kernel.org" , "Lee, Chun-Yi" , Borislav Petkov , Vivek Goyal , Masami Hiramatsu , Petr Mladek , Josh Poimboeuf , Thomas Gleixner , Ingo Molnar , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , HATAYAMA Daisuke , "Eric W. Biederman" , Tejun Heo , Andrew Morton Subject: Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version Message-ID: <20160713020336.GA3222@dhcp-128-65.nay.redhat.com> References: <20160705113323.5864.66834.stgit@softrs> <20160705113325.5864.74840.stgit@softrs> <20160711083449.GA19320@dhcp-128-65.nay.redhat.com> <04EAB7311EE43145B2D3536183D1A84454C6F889@GSjpTKYDCembx31.service.hitachi.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <04EAB7311EE43145B2D3536183D1A84454C6F889@GSjpTKYDCembx31.service.hitachi.net> User-Agent: Mutt/1.5.24 (2015-08-30) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Wed, 13 Jul 2016 02:03:46 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote: > Hi Dave, > > Thanks for the comments. > > > From: Dave Young [mailto:dyoung@redhat.com] > > Sent: Monday, July 11, 2016 5:35 PM > > > > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote: > > > This patch fixes one of the problems reported by Daniel Walker > > > (https://lkml.org/lkml/2015/6/24/44). > > > > > > If crash_kexec_post_notifiers boot option is specified, other CPUs > > > are stopped by smp_send_stop() instead of machine_crash_shutdown() > > > in crash_kexec() path. This behavior change leads two problems. > > > > > > Problem 1: > > > octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are > > > still online and try to stop their watchdog timer. If > > > smp_send_stop() is called before octeon_generic_shutdown(), stopping > > > watchdog timer will fail because other CPUs have been offlined by > > > smp_send_stop(). > > > > > > panic() > > > if crash_kexec_post_notifiers == 1 > > > smp_send_stop() > > > atomic_notifier_call_chain() > > > kmsg_dump() > > > crash_kexec() > > > machine_crash_shutdown() > > > octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs > > > > > > Problem 2: > > > Most of architectures stop other CPUs in machine_crash_shutdown() > > > path, and they also do something needed for kdump. For example, > > > they save registers, disable virtualization extensions, and so on. > > > However, if smp_send_stop() stops other CPUs before > > > machine_crash_shutdown(), we miss those operations. > > > > > > How do we fix these problems? In the first place, we should stop > > > other CPUs as soon as possible when panic() was called, otherwise > > > other CPUs may wipe out a clue to the cause of the failure. So, we > > > replace smp_send_stop() with more suitable one for kdump. > > > > We have been avoiding extra things in panic path, but unfortunately > > crash_kexec_post_notifiers were added. I tend to agree the best place > > for this stuff is in 2nd kernel or purgatory instead of in 1st kernel. > > Several months ago, I posted a patch set which writes regs to SEL, generate > an event to send SNMP message, and start/stop BMC's watchdog timer in > purgatory. This feature requires BMC with KCS (Keyboard Controller Style) > I/F, but the most of enterprise grade server would have it. > (http://thread.gmane.org/gmane.linux.kernel.kexec/15382) > > Doing kmsg_dump things in purgatory wouldn't be suitable (should be done > in the 2nd kernel before enabling devices and IRQs?) In theory it is doable maybe do something like oldmem_kmsg_dump while /proc/vmcore initializing? > > > As for this patch I'm not sure it is safe to replace the smp_send_stop > > with the kdump friendly function. I'm also not sure if the kdump friendly > > function is safe for kdump. Will glad to hear opinions from other > > arch experts. > > This stuff depends on architectures, so I speak only about > x86 (the logic doesn't change on other architectures at this time). > > kdump path with crash_kexec_post_notifiers disabled: > panic() > __crash_kexec() > crash_setup_regs() > crash_save_vmcoreinfo() > machine_crash_shutdown() > native_machine_crash_shutdown() > panic_smp_send_stop() /* mostly same as original > * kdump_nmi_shootdown_cpus() > */ > > kdump path with crash_kexec_post_notifiers enabled: > panic() > panic_smp_send_stop() > __crash_kexec() > crash_setup_regs() > crash_save_vmcoreinfo() > machine_crash_shutdown() > native_machine_crash_shutdown() > panic_smp_send_stop() // do nothing > > The difference is that stopping other CPUs before crash_setup_regs() > and crash_save_vmcoreinfo() or not. Since crash_setup_regs() and > crash_save_vmcoreinfo() just save information to some memory area, > they wouldn't be affected by panic_smp_send_stop(). This means > placing panic_smp_send_stop before __crash_kexec is safe. > > BTW, I noticed my patch breaks Xen kernel. I'll fix it in the > next version. But it does breaks stuff which depends on cpu not being disabled like problem 1 you mentioned in patch log. > > > BTW, if one want to use crash_kexec_post_notifiers he should take the > > risk of unreliable kdump. How about only call smp_send_stop in case no > > crash_kexec_post_notifiers being used. > > Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(), smp_send_stop() > for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI. > This would be because NMI IPI has a risk of deadlock. We checked if > the kdump path has a risk of deadlock in the case of NMI panic and fixed > it. But I'm not sure about normal panic path. I agree with that use > smp_send_stop if crash_kexec_post_notifiers or kdump is disabled. What I mean is like below, problem 1 will not exist in this way, but kdump will be unreliable: if (!crash_kexec_post_notifiers) smp_send_stop() > > > > This patch solves Problem 2 by replacing smp_send_stop() in panic() > > > with panic_smp_send_stop(). This is a weak function which calls > > > smp_send_stop(), and architecture dependent code may override this > > > with appropriate one. This patch only provides x86-specific version. > > > > It does not fix the Problem 1, it seem not possible to fix it? > > Problem 1 depends on architectures, and at least it doesn't happen > on x86. I can try to fix the Problem 1 for MIPS, but I can't test it. > Possible solution will be to use an smp_send_stop variant which stop > the CPU without offlining. > > Best regards, > > Hidehiro Kawai > Hitachi, Ltd. Research & Development Group > [snip] Thanks Dave