From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752616AbdJPNNO (ORCPT ); Mon, 16 Oct 2017 09:13:14 -0400
Received: from mail-pg0-f47.google.com ([74.125.83.47]:54981 "EHLO
        mail-pg0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752460AbdJPNNM (ORCPT );
        Mon, 16 Oct 2017 09:13:12 -0400
X-Google-Smtp-Source: AOwi7QATggimMzQ+I224XnUYTqY2/ZME1CAbSw3soKomTEQnVCoXi+k9p9VHPS6U9z1e528ur2LKtw==
Date: Mon, 16 Oct 2017 22:13:05 +0900
From: Sergey Senozhatsky
To: Petr Mladek
Cc: Linus Torvalds, Steven Rostedt, LKML, Sergey Senozhatsky,
        Peter Zijlstra, Andrew Morton, Thomas Gleixner, Ingo Molnar
Subject: Re: NMI watchdog dump does not print on hard lockup
Message-ID: <20171016131305.GE6316@tigerII.localdomain>
References: <20171012121658.187c5af6@gandalf.local.home>
        <20171013111444.GB2795@pathway.suse.cz>
        <20171013091857.4afe8a7a@gandalf.local.home>
        <20171016111239.GK2795@pathway.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171016111239.GK2795@pathway.suse.cz>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On (10/16/17 13:12), Petr Mladek wrote:
[..]
> > I think an NMI watchdog should just force the flush - the same way an
> > oops should. Deadlocks aren't really relevant if something doesn't get
> > printed out anyway.
>
> We explicitly flush the NMI buffers in panic() when there is
> no other way to see them. But it is questionable in other situations.
> Sometimes the flush might be the only way to see the messages
> and sometimes printk() might unnecessarily cause a deadlock.
> IMHO, the only solution is to make it optional.

just "brainstorming" it... with some silly ideas.

pushing the data from NMI panic might look like we are replacing one
deadlock scenario with another deadlock scenario.
some of the console drivers are soooo complex internally. so I have been
thinking about... maybe we can extend struct console and add a
->write_on_panic() handler. that handler must be as lockless as
possible; so lockless that calling it from anything that is not panic()
is a severe bug.

an absolutely trivial case: if the serial console does

    static void console_write_cb(struct console *co, const char *s, unsigned int count)
    {
        unsigned long flags;

        /* port: this console's struct uart_port (lookup omitted) */
        spin_lock_irqsave(&port->lock, flags);
        uart_console_write(port, s, count, console_putchar);
        spin_unlock_irqrestore(&port->lock, flags);
    }

then the panic callback can look like

    static void console_write_on_panic_cb(struct console *co, const char *s, unsigned int count)
    {
        /* no, we don't take the port lock here */
        uart_console_write(port, s, count, console_putchar);
    }

a less trivial case might look more involved. but in general that
write_on_panic() callback must do the absolute minimum of work. so it's
sort of an early console, but as part of the normal console driver.

I also got some other serial console crazy ideas, but they are not
related to this topic.

        -ss
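
P.S. to make the dispatch side of the idea a bit more concrete, here is a
rough userspace model, not kernel code. everything in it is hypothetical
and exists only for illustration: struct console_model, the port_locked
flag standing in for port->lock, the deadlocks counter, and the
console_emit() helper. it only shows the selection logic: use
->write_on_panic() when we are in panic and the driver provides one,
otherwise fall back to the normal locked ->write().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct console. */
struct console_model {
	void (*write)(struct console_model *co, const char *s, unsigned int count);
	/* Proposed hook; may be NULL if the driver has no lockless path. */
	void (*write_on_panic)(struct console_model *co, const char *s, unsigned int count);
	bool port_locked;        /* models port->lock being held */
	char out[128];           /* models the bytes that reached the UART */
	size_t out_len;
	unsigned int deadlocks;  /* writes that would have spun forever */
};

/* Append bytes to the modelled UART output, truncating when full. */
static void model_emit(struct console_model *co, const char *s, unsigned int count)
{
	for (unsigned int i = 0; i < count && co->out_len < sizeof(co->out) - 1; i++)
		co->out[co->out_len++] = s[i];
	co->out[co->out_len] = '\0';
}

/* Normal path: takes the (modelled) port lock around the write. */
static void model_write(struct console_model *co, const char *s, unsigned int count)
{
	if (co->port_locked) {
		/* A real spin_lock() would never return here; just count it. */
		co->deadlocks++;
		return;
	}
	co->port_locked = true;
	model_emit(co, s, count);
	co->port_locked = false;
}

/* Panic path: deliberately ignores the lock and does the minimum work. */
static void model_write_on_panic(struct console_model *co, const char *s, unsigned int count)
{
	model_emit(co, s, count);
}

/* Pick the lockless hook only in panic, and only if the driver has one. */
static void console_emit(struct console_model *co, const char *s, bool in_panic)
{
	unsigned int count = (unsigned int)strlen(s);

	if (in_panic && co->write_on_panic)
		co->write_on_panic(co, s, count);
	else
		co->write(co, s, count);
}
```

with port_locked set to true (as if the lock owner had been stopped), a
normal console_emit() call only bumps the deadlocks counter, while a call
with in_panic set still gets its text into out[]. a driver that leaves
->write_on_panic NULL keeps today's behaviour.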