From: ebiederm@xmission.com (Eric W. Biederman)
To: Sergey Senozhatsky
Cc: Ye Xiaolong, Sergey Senozhatsky, Steven Rostedt, Petr Mladek, Jan Kara, Andrew Morton, Linus Torvalds, Peter Zijlstra, "Rafael J . Wysocki", Greg Kroah-Hartman, Jiri Slaby, Pavel Machek, Len Brown, linux-kernel@vger.kernel.org, lkp@01.org
Subject: Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage
Date: Fri, 31 Mar 2017 10:28:15 -0500
Message-ID: <87a881v52o.fsf@xmission.com>
In-Reply-To: <20170331144730.GA10578@tigerII.localdomain> (Sergey Senozhatsky's message of "Fri, 31 Mar 2017 23:47:30 +0900")
References: <20170329092511.3958-9-sergey.senozhatsky@gmail.com> <20170330213829.GA21476@inn.lkp.intel.com> <20170331023506.GB3493@jagdpanzerIV.localdomain> <20170331040438.GA366@jagdpanzerIV.localdomain> <20170331063913.GE20961@yexl-desktop> <20170331144730.GA10578@tigerII.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

Sergey Senozhatsky writes:

> On (03/31/17 14:39), Ye Xiaolong wrote:
>> On 03/31, Sergey Senozhatsky wrote:
>> >On (03/31/17 11:35), Sergey Senozhatsky wrote:
>> >[..]
>> >> > [ 21.009531] VFS: Warning: trinity-c2 using old stat() call. Recompile your binary.
>> >> > [ 21.148898] VFS: Warning: trinity-c0 using old stat() call. Recompile your binary.
>> >> > [ 22.298208] warning: process `trinity-c2' used the deprecated sysctl system call with
>> >> >
>> >> > Elapsed time: 310
>> >> > BUG: kernel reboot-without-warning in test stage
>> >>
>> >> so as far as I understand, this is the "missing kernel messages"
>> >> type of bug report. a worst case scenario.
>> >
>> >panic() should have called console_flush_on_panic(), which should have
>> >flushed the messages regardless of the printk_kthread state. so it probably
>> >was not panic() that rebooted the kernel. (probably).
>> >
>> >kernel_restart() and kernel_halt() have pr_emerg() messages, printk switches
>> >to printk_emergency mode the first time it sees an EMERG level message. (maybe
>> >we switch too late).
>> >
>> >on the other hand, there is an emergency_restart(), where we don't switch
>> >to printk_emergency mode and don't flush the existing kernel messages.
>> >there is a bunch of places that call emergency_restart(), including sysrq.
>> >
>> >may I ask you, how do you usually restart the vm after the test?
>> >`echo X > /proc/sysrq-trigger'?
>>
>> Yes.
>>
>> >
>> >does this patch make it any better?
>>
>> I am trying it and will post the result once I get it.
>
>
> ... I'd also probably add pr_emerg() print-out to emergency_restart(),
> the same way kernel_restart()/kernel_halt()/kernel_power_off() do.
>
> for those cases when emergency_restart() is called with printk in
> kthreaded mode, not in emergency mode.

No. No. No.

emergency_restart should be the equivalent of a watchdog going off. AKA
it is long past the point where you want to be coordinating with other
parts of the kernel. Rebooting is the priority.

A print statement absolutely does not belong in emergency_restart.

The fact that nothing managed to get printed out without magic flushing
code is highly disturbing. Looking from the outside this patchset
appears to be broken by design.

If you don't want kernel functions suffering from the overhead of
printing to a slow output device, don't do that then. The point of
printk is to give debugging output. You have fundamentally
incapacitated printk from serving its primary purpose.

NAK to the entire concept.

Eric
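
For readers following the thread, the behaviour under dispute can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code and not the patchset's implementation: the class `ToyPrintk` and its `pending`/`console` fields are hypothetical names invented for illustration, and the two restart functions merely borrow the names of the kernel functions discussed above. It assumes, as described in the quoted text, that ordinary messages are deferred to a printing thread and that the first EMERG-level message switches printk to synchronous output.

```python
from collections import deque


class ToyPrintk:
    """Hypothetical toy model of deferred console output (not kernel code)."""

    def __init__(self):
        self.pending = deque()  # messages logged but not yet written out
        self.console = []       # messages that actually reached the console
        self.emergency = False  # once set, messages are written synchronously

    def printk(self, msg, emerg=False):
        self.pending.append(msg)
        if emerg:
            # Assumption from the thread: the first EMERG-level message
            # switches printk into emergency (synchronous) mode.
            self.emergency = True
        if self.emergency:
            self.flush()

    def flush(self):
        # Drain everything that was deferred to the printing thread.
        while self.pending:
            self.console.append(self.pending.popleft())


def kernel_restart(log):
    # Models a restart path that emits an EMERG-level message first.
    log.printk("Restarting system", emerg=True)
    return list(log.console)


def emergency_restart(log):
    # Models a restart path that reboots immediately: no message, no flush.
    return list(log.console)
```

In this model, a log that goes through `kernel_restart()` ends up with every deferred message on the console, while the same log going through `emergency_restart()` loses whatever the printing thread had not yet written. That loss is the "reboot-without-warning" symptom in the report, and it is also why the reply objects to fixing it inside emergency_restart rather than in the deferral itself.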