From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751112AbdAMFT3 (ORCPT ); Fri, 13 Jan 2017 00:19:29 -0500 Received: from mail-pf0-f194.google.com ([209.85.192.194]:33976 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750713AbdAMFT1 (ORCPT ); Fri, 13 Jan 2017 00:19:27 -0500 Date: Fri, 13 Jan 2017 14:11:12 +0900 From: Sergey Senozhatsky To: Petr Mladek Cc: Sergey Senozhatsky , Sergey Senozhatsky , Andrew Morton , Steven Rostedt , Peter Hurley , linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] printk: always report lost messages on serial console Message-ID: <20170113051112.GF9360@jagdpanzerIV.localdomain> References: <20161224140902.1962-1-sergey.senozhatsky@gmail.com> <20161224140902.1962-3-sergey.senozhatsky@gmail.com> <20170109165623.GE20785@pathway.suse.cz> <20170110084939.GA544@jagdpanzerIV.localdomain> <20170111165038.GK20785@pathway.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170111165038.GK20785@pathway.suse.cz> User-Agent: Mutt/1.7.2 (2016-11-26) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On (01/11/17 17:50), Petr Mladek wrote: > Hi Sergey, > > first, thanks a lot for the detailed description. I have finally > understood what was important on the "non-important" messages > and how you used them. I am sorry that I was not able to get > it earlier. sure, no prob. I was mostly offline for the past few days for personal reasons but now I'm back. [..] > > now, if the loss of messages was caused by: > > > > a) flood of suppressed loglevel messages > > then printing at least some of those messages makes *a lot* of sense. > > > > b) flood of visible loglevel messages > > then may be those messages are not so important. there a whole logbuf of > > them. per my experience, it is quite hard to overflow the logbuf with > > really important, unique, sensible messages of 'visible' loglevel with > > active loglevel filtering. > > Just for record, I guess that the same is true also for the messages > with lower level. I mean that they are repeating as well. right. those are 100% reproducible, quick to spot and easy to fix, I guess. a spontaneous explosion is a different/bigger problem. though the end result is the same -- we lose messages, may be very important ones. our best effort/goal is to print logbuf content. that's why we are playing with nmi/safe printk; zap locks; ignore locks state in some cases; and so on and on. but when we lose messages that were meant to be printed even _before_ we try to print them out, then our best effort is sort of void/undefined. > It would be great to make it easier to throttle the same messages or > do it a generic way. But this a food for the future work. yes. syslog tracks "duplicate messages". but I kinda couldn't understand how helpful it will be in vprintk_emit() /* because it's too late to track duplicates in console_unlock(). duplicates should not be stored in multiple instances in the first place */. backtraces are hard to suppress, besides we shouldn't suppress backtraces I think. log_store() would have to strcmp() or "hash+compare hashes" current message and the most recent logbuf message. but bigger concern is -- do people see dropped messages that often to add duplicate messages tracker to vprintk_emit()? I see dropped messages quite a lot, but that's just my setup. > #define KERN_EMERG KERN_SOH "0" /* system is unusable */ > #define KERN_ALERT KERN_SOH "1" /* action must be taken immediately */ > #define KERN_CRIT KERN_SOH "2" /* critical conditions */ > #define KERN_ERR KERN_SOH "3" /* error conditions */ > > The flood of messages usually means something pretty wrong. But > it might also be caused by too many or forgotten debug messages. well. it's still really a lot of forgotten messages. so much that we have to drop other messages. so I'd say the root cause is less important (if important at all) as long as the result is "lost messages". > It think that lost messages belong to the level "2". Note that > the warning about lost NMI messages and recent printk recursion > were printed with loglevel '2' as well. > > Would it make sense and be acceptable to ignore the log level > only when console_level allows to show KERN_CRIT messages? need to think. what will it improve? // I'm catching up with the emails, it'll take some time. -ss