From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760260AbdADNjJ (ORCPT ); Wed, 4 Jan 2017 08:39:09 -0500 Received: from mail-pf0-f195.google.com ([209.85.192.195]:35971 "EHLO mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964822AbdADNfP (ORCPT ); Wed, 4 Jan 2017 08:35:15 -0500 Date: Wed, 4 Jan 2017 22:34:48 +0900 From: Sergey Senozhatsky To: Petr Mladek Cc: Sergey Senozhatsky , Sergey Senozhatsky , Andrew Morton , Steven Rostedt , Peter Hurley , linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] printk: always report lost messages on serial console Message-ID: <20170104133448.GA384@tigerII.localdomain> References: <20161224140902.1962-1-sergey.senozhatsky@gmail.com> <20161224140902.1962-3-sergey.senozhatsky@gmail.com> <20170103145532.GL14894@pathway.suse.cz> <20170103154745.GA496@tigerII.localdomain> <20170103165356.GN14894@pathway.suse.cz> <20170104024649.GA2362@jagdpanzerIV.localdomain> <20170104105235.GP14894@pathway.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170104105235.GP14894@pathway.suse.cz> User-Agent: Mutt/1.7.2 (2016-11-26) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On (01/04/17 11:52), Petr Mladek wrote: [..] > > this is from the real serial logs I'm looking at right now. we attach > > "bad news" to 'critical' messages only: > > > > ... > > [ 32.941061] bc00: b65dc0d8 b65dc6d0 ae1fbc7c b65c11c5 b563c9c4 00000001 b65dc6d0 b0a62f74 > > ** 150 printk messages dropped ** [ 32.941614] cee0: 00000081 ae1fcef0 00000038 00000000 b5369000 ae1fcf00 ae1fd0f0 00000000 > > ** 75 printk messages dropped ** [ 32.941892] d860: 00056608 af203848 00000000 0004088c 000000d0 00000000 00000000 00000000 > > ** 12 printk messages dropped ** [ 32.941940] .. > > ** 2 printk messages dropped ** [ 32.941951] .. > > ** 10 printk messages dropped ** [ 32.941992] .. > > ** 1 printk messages dropped ** [ 32.941999] .. > > ... > > Do you see how useless the above messages are, please? what... these lost messages were of extreme importance. I can't tell the exactly the loglevel, but I'm sure it was at least pr_err() level. these were like really important messages, unlike the ones that got suppressed/filtered-out. along with these lost messages that were supposed to be printed, I have regions of lost kernel messages (?) with no reports of lost messages from console_unlock() (!). and that's the only thing I'm fixing here. and the only thing we can fix. no permutation of console_unlock() lines will make the buffer bigger or printk flooding CPUs nicer or serial console driver faster. once we unlocked the logbuf lock in console_unlock() we lost the race against the printk() flooding CPUs. all, or some, of the remaining messages, no matter how important and critical, will be drained. the next time we lock the logbuf again the logbuf will not be the same. the only meaningful thing we can print now is "XXX printk messages dropped". we don't know what was in those messages and no one will ever find out. it's gone. and the options here are "print 1 random message out of XXX or XXXX lost messages" vs "print 1 random message out of XXX or XXXX lost messages" and that's really it. except that in the second case we shuffle console_unlock() lines with no gain. and I'm not sure I understand why we even discuss it. both options are absolutely equally terrible, because the root cause of the problem is in completely different place. -ss