From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751825AbbKYJqq (ORCPT ); Wed, 25 Nov 2015 04:46:46 -0500 Received: from mail9.hitachi.co.jp ([133.145.228.44]:56797 "EHLO mail9.hitachi.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750726AbbKYJqm (ORCPT ); Wed, 25 Nov 2015 04:46:42 -0500 From: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?= To: "'Borislav Petkov'" CC: Jonathan Corbet , Peter Zijlstra , Ingo Molnar , "Eric W. Biederman" , "H. Peter Anvin" , Andrew Morton , Thomas Gleixner , Vivek Goyal , Baoquan He , "linux-doc@vger.kernel.org" , "x86@kernel.org" , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Michal Hocko , Ingo Molnar , =?utf-8?B?5bmz5p2+6ZuF5bezIC8gSElSQU1BVFXvvIxNQVNBTUk=?= Subject: RE: [V5 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context Thread-Topic: [V5 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context Thread-Index: AQHRI3eQMDnSwb7qM0y43uYtcLxsUp6qbZyAgAGg9/D//9HtgIAAobHg Date: Wed, 25 Nov 2015 09:46:37 +0000 Message-ID: <04EAB7311EE43145B2D3536183D1A84454A2B929@GSjpTKYDCembx31.service.hitachi.net> References: <20151120093641.4285.97253.stgit@softrs> <20151120093646.4285.62259.stgit@softrs> <20151124104853.GC3785@pd.tnic> <04EAB7311EE43145B2D3536183D1A84454A2A425@GSjpTKYDCembx31.service.hitachi.net> <20151125085620.GA29499@pd.tnic> In-Reply-To: <20151125085620.GA29499@pd.tnic> Accept-Language: ja-JP, en-US Content-Language: ja-JP X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.198.220.44] Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id tAP9kqPc027028 > On Wed, Nov 25, 2015 at 05:51:59AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote: > > > > `Infinite loop in NMI context' can happen: > > > > > > > > a. when a cpu panics on NMI while another cpu is processing panic > > > > b. when a cpu received an external or unknown NMI while another > > > > cpu is processing panic on NMI > > > > > > > > In the case of a, it loops in panic_smp_self_stop(). In the case > > > > of b, it loops in raw_spin_lock() of nmi_reason_lock. > > > > > > Please describe those two cases more verbosely - it takes slow people > > > like me a while to figure out what exactly can happen. > > > > a. when a cpu panics on NMI while another cpu is processing panic > > Ex. > > CPU 0 CPU 1 > > ================= ================= > > panic() > > panic_cpu <-- 0 > > check panic_cpu > > crash_kexec() > > receive an unknown NMI > > unknown_nmi_error() > > nmi_panic() > > panic() > > check panic_cpu > > panic_smp_self_stop() > > infinite loop in NMI context > > > > b. when a cpu received an external or unknown NMI while another > > cpu is processing panic on NMI > > Ex. > > CPU 0 CPU 1 > > ====================== ================== > > receive an unknown NMI > > unknown_nmi_error() > > nmi_panic() receive an unknown NMI > > panic_cpu <-- 0 unknown_nmi_error() > > panic() nmi_panic() > > check panic_cpu panic > > crash_kexec() check panic_cpu > > panic_smp_self_stop() > > infinite loop in NMI context > > Ok, that's what I saw too, thanks for confirming. > > But please write those examples with the *old* code in the commit > message, i.e. without panic_cpu and nmi_panic() which you're adding. Does *old* code mean the code without this patch *series* ? panic_cpu and nmi_panic() are introduced by PATCH 1/4, not this patch. > Basically, you want to structure your commit message this way: > > This is the problem the current code has: ... > > But we need to do this: ... > > We fix it by doing that: ... Good suggestion! I'll revise a bit with following your comment. > > > > + * directly. This function is used when we have already been in NMI handler. > > > > + */ > > > > +void poll_crash_ipi_and_callback(struct pt_regs *regs) > > > > > > Why "poll"? We won't return from crash_nmi_callback() if we're not the > > > crashing CPU. > > > > This function polls that crash IPI has been issued by checking > > crash_ipi_done, then invokes the callback. This is different > > from so-called "poll" functions. Do you have some good name? > > Maybe something as simple as "run_crash_callback"? I prefer this, but we might want to add some more prefix or suffix. For example, "conditionally_run_crash_nmi_callback". > Or since we're calling it from other places, maybe add the "crash" > prefix: > > while (!raw_spin_trylock(&nmi_reason_lock)) > crash_run_callback(regs); > > Looks even better to me in code context. :) Thanks for your deep review! -- Hidehiro Kawai Hitachi, Ltd. Research & Development Group {.n++%ݶw{.n+{G{ayʇڙ,jfhz_(階ݢj"mG?&~iOzv^m ?I From mboxrd@z Thu Jan 1 00:00:00 1970 Return-path: Received: from mail9.hitachi.co.jp ([133.145.228.44]) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1a1WfH-0001Ij-4Q for kexec@lists.infradead.org; Wed, 25 Nov 2015 09:47:04 +0000 From: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?= Subject: RE: [V5 PATCH 2/4] panic/x86: Allow cpus to save registers even if they are looping in NMI context Date: Wed, 25 Nov 2015 09:46:37 +0000 Message-ID: <04EAB7311EE43145B2D3536183D1A84454A2B929@GSjpTKYDCembx31.service.hitachi.net> References: <20151120093641.4285.97253.stgit@softrs> <20151120093646.4285.62259.stgit@softrs> <20151124104853.GC3785@pd.tnic> <04EAB7311EE43145B2D3536183D1A84454A2A425@GSjpTKYDCembx31.service.hitachi.net> <20151125085620.GA29499@pd.tnic> In-Reply-To: <20151125085620.GA29499@pd.tnic> Content-Language: ja-JP MIME-Version: 1.0 List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Sender: "kexec" Errors-To: kexec-bounces+dwmw2=infradead.org@lists.infradead.org To: 'Borislav Petkov' Cc: "x86@kernel.org" , Baoquan He , Jonathan Corbet , Peter Zijlstra , "linux-doc@vger.kernel.org" , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Michal Hocko , Ingo Molnar , Thomas Gleixner , "Eric W. Biederman" , "H. Peter Anvin" , =?utf-8?B?5bmz5p2+6ZuF5bezIC8gSElSQU1BVFXvvIxNQVNBTUk=?= , Andrew Morton , Ingo Molnar , Vivek Goyal PiBPbiBXZWQsIE5vdiAyNSwgMjAxNSBhdCAwNTo1MTo1OUFNICswMDAwLCDmsrPlkIjoi7Hlro8g LyBLQVdBSe+8jEhJREVISVJPIHdyb3RlOg0KPiA+ID4gPiBgSW5maW5pdGUgbG9vcCBpbiBOTUkg Y29udGV4dCcgY2FuIGhhcHBlbjoNCj4gPiA+ID4NCj4gPiA+ID4gICBhLiB3aGVuIGEgY3B1IHBh bmljcyBvbiBOTUkgd2hpbGUgYW5vdGhlciBjcHUgaXMgcHJvY2Vzc2luZyBwYW5pYw0KPiA+ID4g PiAgIGIuIHdoZW4gYSBjcHUgcmVjZWl2ZWQgYW4gZXh0ZXJuYWwgb3IgdW5rbm93biBOTUkgd2hp bGUgYW5vdGhlcg0KPiA+ID4gPiAgICAgIGNwdSBpcyBwcm9jZXNzaW5nIHBhbmljIG9uIE5NSQ0K PiA+ID4gPg0KPiA+ID4gPiBJbiB0aGUgY2FzZSBvZiBhLCBpdCBsb29wcyBpbiBwYW5pY19zbXBf c2VsZl9zdG9wKCkuICBJbiB0aGUgY2FzZQ0KPiA+ID4gPiBvZiBiLCBpdCBsb29wcyBpbiByYXdf c3Bpbl9sb2NrKCkgb2Ygbm1pX3JlYXNvbl9sb2NrLg0KPiA+ID4NCj4gPiA+IFBsZWFzZSBkZXNj cmliZSB0aG9zZSB0d28gY2FzZXMgbW9yZSB2ZXJib3NlbHkgLSBpdCB0YWtlcyBzbG93IHBlb3Bs ZQ0KPiA+ID4gbGlrZSBtZSBhIHdoaWxlIHRvIGZpZ3VyZSBvdXQgd2hhdCBleGFjdGx5IGNhbiBo YXBwZW4uDQo+ID4NCj4gPiAgIGEuIHdoZW4gYSBjcHUgcGFuaWNzIG9uIE5NSSB3aGlsZSBhbm90 aGVyIGNwdSBpcyBwcm9jZXNzaW5nIHBhbmljDQo+ID4gICAgICBFeC4NCj4gPiAgICAgIENQVSAw ICAgICAgICAgICAgICAgICAgICAgQ1BVIDENCj4gPiAgICAgID09PT09PT09PT09PT09PT09ICAg ICAgICAgPT09PT09PT09PT09PT09PT0NCj4gPiAgICAgIHBhbmljKCkNCj4gPiAgICAgICAgcGFu aWNfY3B1IDwtLSAwDQo+ID4gICAgICAgIGNoZWNrIHBhbmljX2NwdQ0KPiA+ICAgICAgICBjcmFz aF9rZXhlYygpDQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHJlY2VpdmUgYW4g dW5rbm93biBOTUkNCj4gPiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgdW5rbm93bl9u bWlfZXJyb3IoKQ0KPiA+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIG5taV9wYW5p YygpDQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBwYW5pYygpDQo+ID4g ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGNoZWNrIHBhbmljX2NwdQ0KPiA+ ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBwYW5pY19zbXBfc2VsZl9zdG9w KCkNCj4gPiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBpbmZpbml0ZSBs b29wIGluIE5NSSBjb250ZXh0DQo+ID4NCj4gPiAgIGIuIHdoZW4gYSBjcHUgcmVjZWl2ZWQgYW4g ZXh0ZXJuYWwgb3IgdW5rbm93biBOTUkgd2hpbGUgYW5vdGhlcg0KPiA+ICAgICAgY3B1IGlzIHBy b2Nlc3NpbmcgcGFuaWMgb24gTk1JDQo+ID4gICAgICBFeC4NCj4gPiAgICAgIENQVSAwICAgICAg ICAgICAgICAgICAgICAgQ1BVIDENCj4gPiAgICAgID09PT09PT09PT09PT09PT09PT09PT0gICAg PT09PT09PT09PT09PT09PT09DQo+ID4gICAgICByZWNlaXZlIGFuIHVua25vd24gTk1JDQo+ID4g ICAgICB1bmtub3duX25taV9lcnJvcigpDQo+ID4gICAgICAgIG5taV9wYW5pYygpICAgICAgICAg ICAgIHJlY2VpdmUgYW4gdW5rbm93biBOTUkNCj4gPiAgICAgICAgICBwYW5pY19jcHUgPC0tIDAg ICAgICAgdW5rbm93bl9ubWlfZXJyb3IoKQ0KPiA+ICAgICAgICAgIHBhbmljKCkgICAgICAgICAg ICAgICAgIG5taV9wYW5pYygpDQo+ID4gICAgICAgICAgICBjaGVjayBwYW5pY19jcHUgICAgICAg ICBwYW5pYw0KPiA+ICAgICAgICAgICAgY3Jhc2hfa2V4ZWMoKSAgICAgICAgICAgICBjaGVjayBw YW5pY19jcHUNCj4gPiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgcGFuaWNf c21wX3NlbGZfc3RvcCgpDQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgaW5maW5pdGUgbG9vcCBpbiBOTUkgY29udGV4dA0KPiANCj4gT2ssIHRoYXQncyB3aGF0IEkg c2F3IHRvbywgdGhhbmtzIGZvciBjb25maXJtaW5nLg0KPiANCj4gQnV0IHBsZWFzZSB3cml0ZSB0 aG9zZSBleGFtcGxlcyB3aXRoIHRoZSAqb2xkKiBjb2RlIGluIHRoZSBjb21taXQNCj4gbWVzc2Fn ZSwgaS5lLiB3aXRob3V0IHBhbmljX2NwdSBhbmQgbm1pX3BhbmljKCkgd2hpY2ggeW91J3JlIGFk ZGluZy4NCg0KRG9lcyAqb2xkKiBjb2RlIG1lYW4gdGhlIGNvZGUgd2l0aG91dCB0aGlzIHBhdGNo ICpzZXJpZXMqID8NCnBhbmljX2NwdSBhbmQgbm1pX3BhbmljKCkgYXJlIGludHJvZHVjZWQgYnkg UEFUQ0ggMS80LCBub3QgdGhpcyBwYXRjaC4NCg0KPiBCYXNpY2FsbHksIHlvdSB3YW50IHRvIHN0 cnVjdHVyZSB5b3VyIGNvbW1pdCBtZXNzYWdlIHRoaXMgd2F5Og0KPiANCj4gVGhpcyBpcyB0aGUg cHJvYmxlbSB0aGUgY3VycmVudCBjb2RlIGhhczogLi4uDQo+IA0KPiBCdXQgd2UgbmVlZCB0byBk byB0aGlzOiAuLi4NCj4gDQo+IFdlIGZpeCBpdCBieSBkb2luZyB0aGF0OiAuLi4NCg0KR29vZCBz dWdnZXN0aW9uISAgSSdsbCByZXZpc2UgYSBiaXQgd2l0aCBmb2xsb3dpbmcgeW91ciBjb21tZW50 Lg0KDQo+ID4gPiA+ICsgKiBkaXJlY3RseS4gIFRoaXMgZnVuY3Rpb24gaXMgdXNlZCB3aGVuIHdl IGhhdmUgYWxyZWFkeSBiZWVuIGluIE5NSSBoYW5kbGVyLg0KPiA+ID4gPiArICovDQo+ID4gPiA+ ICt2b2lkIHBvbGxfY3Jhc2hfaXBpX2FuZF9jYWxsYmFjayhzdHJ1Y3QgcHRfcmVncyAqcmVncykN Cj4gPiA+DQo+ID4gPiBXaHkgInBvbGwiPyBXZSB3b24ndCByZXR1cm4gZnJvbSBjcmFzaF9ubWlf Y2FsbGJhY2soKSBpZiB3ZSdyZSBub3QgdGhlDQo+ID4gPiBjcmFzaGluZyBDUFUuDQo+ID4NCj4g PiBUaGlzIGZ1bmN0aW9uIHBvbGxzIHRoYXQgY3Jhc2ggSVBJIGhhcyBiZWVuIGlzc3VlZCBieSBj aGVja2luZw0KPiA+IGNyYXNoX2lwaV9kb25lLCB0aGVuIGludm9rZXMgdGhlIGNhbGxiYWNrLiAg VGhpcyBpcyBkaWZmZXJlbnQNCj4gPiBmcm9tIHNvLWNhbGxlZCAicG9sbCIgZnVuY3Rpb25zLiAg RG8geW91IGhhdmUgc29tZSBnb29kIG5hbWU/DQo+IA0KPiBNYXliZSBzb21ldGhpbmcgYXMgc2lt cGxlIGFzICJydW5fY3Jhc2hfY2FsbGJhY2siPw0KDQpJIHByZWZlciB0aGlzLCBidXQgd2UgbWln aHQgd2FudCB0byBhZGQgc29tZSBtb3JlIHByZWZpeCBvciBzdWZmaXguDQpGb3IgZXhhbXBsZSwg ImNvbmRpdGlvbmFsbHlfcnVuX2NyYXNoX25taV9jYWxsYmFjayIuDQoNCj4gT3Igc2luY2Ugd2Un cmUgY2FsbGluZyBpdCBmcm9tIG90aGVyIHBsYWNlcywgbWF5YmUgYWRkIHRoZSAiY3Jhc2giDQo+ IHByZWZpeDoNCj4gDQo+IAl3aGlsZSAoIXJhd19zcGluX3RyeWxvY2soJm5taV9yZWFzb25fbG9j aykpDQo+IAkJY3Jhc2hfcnVuX2NhbGxiYWNrKHJlZ3MpOw0KPiANCj4gTG9va3MgZXZlbiBiZXR0 ZXIgdG8gbWUgaW4gY29kZSBjb250ZXh0LiA6KQ0KDQpUaGFua3MgZm9yIHlvdXIgZGVlcCByZXZp ZXchDQoNCi0tDQpIaWRlaGlybyBLYXdhaQ0KSGl0YWNoaSwgTHRkLiBSZXNlYXJjaCAmIERldmVs b3BtZW50IEdyb3VwDQoNCg0KX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX18Ka2V4ZWMgbWFpbGluZyBsaXN0CmtleGVjQGxpc3RzLmluZnJhZGVhZC5vcmcKaHR0 cDovL2xpc3RzLmluZnJhZGVhZC5vcmcvbWFpbG1hbi9saXN0aW5mby9rZXhlYwo=