From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Sergey Senozhatsky
Cc: Peter Zijlstra, Pavel Machek, Sergey Senozhatsky, Jan Kara,
 Ye Xiaolong, Steven Rostedt, Petr Mladek, Andrew Morton, Linus Torvalds,
 "Rafael J . Wysocki", Greg Kroah-Hartman, Jiri Slaby, Len Brown,
 linux-kernel@vger.kernel.org, lkp@01.org
References: <20170403093152.GB15168@quack2.suse.cz>
 <20170406173306.GD10363@amd>
 <20170407044334.GA487@jagdpanzerIV.localdomain>
 <20170407071558.GA11792@amd>
 <20170407074634.GB1091@jagdpanzerIV.localdomain>
 <20170407081449.GA12859@amd>
 <20170407121021.GA379@jagdpanzerIV.localdomain>
 <20170407124455.GC4756@amd>
 <20170407151306.GA384@tigerII.localdomain>
 <20170407152304.bkbceqmg3kxeqvur@hirez.programming.kicks-ass.net>
 <20170407154047.GB384@tigerII.localdomain>
Date: Sun, 09 Apr 2017 13:21:44 -0500
In-Reply-To: <20170407154047.GB384@tigerII.localdomain> (Sergey Senozhatsky's
 message of "Sat, 8 Apr 2017 00:40:47 +0900")
Message-ID: <87shlhv3uv.fsf@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [printk] fbc14616f4: BUG:kernel_reboot-without-warning_in_test_stage
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Sergey Senozhatsky writes:

> On (04/07/17 17:23), Peter Zijlstra wrote:
> [..]
>> > we are looking at different typical setups :) serial console being 45
>> > seconds behind logbuf does not surprise me anymore.
>>
>> That does sound like you're doing something wrong and should look at
>> reducing printk() more than anything else.
>
> yeah, 45sec is an extreme case that simply doesn't surprise me anymore ;)
> that's not a normal/usual delay, of course, we are not this mad. on average
> it's much better and may be not so far 2 seconds after all.
> a massive OOM report, of course, appends logbuf messages at a much higher
> rate than UART serial console can swallow, so the delay is getting larger,
> expectedly.
> and, no, I don't add any printk-s, I'm looking at the lockup reports

Are you running your serial consoles at 9600 baud?

I would think the first thing to do would be to raise your serial console
baud rate to 115200, or at least 38400.

Similarly, for anything the kernel is certain to survive, I would set the
loglevel so that it is logged somewhere with syslog rather than going out
through printk to the console.

Of course, my expectation on a production machine is to have panic_on_oom
set, so that it prints the huge OOM message and then reboots.

So I don't see how offloading to another thread and then switching right
back to emergency mode could be a practical way to solve the delay in a
serious situation like OOM.

It sounds like you are blaming printk when the problem is a very slow
logging device.

Eric
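
A rough back-of-the-envelope sketch of the baud-rate point above. It assumes
8N1 framing (10 bits on the wire per byte) and a hypothetical 64 KiB burst of
log text; neither number comes from this thread, and the function name is
made up for illustration.

```python
def drain_seconds(backlog_bytes: int, baud: int, bits_per_char: int = 10) -> float:
    """Seconds needed to push backlog_bytes through a UART at the given baud rate.

    bits_per_char=10 assumes 8N1 framing: 1 start bit + 8 data bits + 1 stop bit.
    """
    bytes_per_second = baud / bits_per_char
    return backlog_bytes / bytes_per_second


if __name__ == "__main__":
    # Hypothetical backlog size, e.g. a large OOM report sitting in logbuf.
    backlog = 64 * 1024
    for baud in (9600, 38400, 115200):
        print(f"{baud:>6} baud: {drain_seconds(backlog, baud):6.1f} s")
```

Under those assumptions the same backlog takes about 68 seconds to drain at
9600 baud, about 17 seconds at 38400, and under 6 seconds at 115200, which is
the sense in which the logging device, not printk, sets the delay.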