From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752616AbdJPNNO (ORCPT <rfc822;w@1wt.eu>);
        Mon, 16 Oct 2017 09:13:14 -0400
Received: from mail-pg0-f47.google.com ([74.125.83.47]:54981 "EHLO
        mail-pg0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752460AbdJPNNM (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 16 Oct 2017 09:13:12 -0400
X-Google-Smtp-Source: AOwi7QATggimMzQ+I224XnUYTqY2/ZME1CAbSw3soKomTEQnVCoXi+k9p9VHPS6U9z1e528ur2LKtw==
Date: Mon, 16 Oct 2017 22:13:05 +0900
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Steven Rostedt <rostedt@goodmis.org>,
        LKML <linux-kernel@vger.kernel.org>,
        Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@kernel.org>
Subject: Re: NMI watchdog dump does not print on hard lockup
Message-ID: <20171016131305.GE6316@tigerII.localdomain>
References: <20171012121658.187c5af6@gandalf.local.home>
 <20171013111444.GB2795@pathway.suse.cz>
 <20171013091857.4afe8a7a@gandalf.local.home>
 <CA+55aFxCnmd8+9qYs1pBG+N3ULQjMOV5S6yZCjA-w_pwc6kXyA@mail.gmail.com>
 <20171016111239.GK2795@pathway.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171016111239.GK2795@pathway.suse.cz>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On (10/16/17 13:12), Petr Mladek wrote:
[..]
> > I think an NMI watchdog should just force the flush - the same way an
> > oops should. Deadlocks aren't really relevant if something doesn't get
> > printed out anyway.
> 
> We explicitly flush the NMI buffers in panic() when there is
> no other way to see them. But it is questionable in other situations.
> Sometimes the flush might be the only way to see the messages
> and sometimes printk() might unnecessarily cause a deadlock.
> IMHO, the only solution is to make it optional.

just "brainstorming" it... with some silly ideas.

pushing the data from NMI panic might look like we are replacing one
deadlock scenario with another deadlock scenario. some of the console
drivers are soooo complex internally. so I have been thinking about...
maybe we can extend struct console and add ->write_on_panic() and that
handler must be as lockless as possible; so lockless that calling it
from anything that is not panic() is a severe bug.

an absolutely trivial case,
if serial console does

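	static void console_write_cb(struct console *co, const char *s,
				     unsigned int count)
	{
		/* co_to_port(): placeholder for the driver's own lookup */
		struct uart_port *port = co_to_port(co);
		unsigned long flags;

		spin_lock_irqsave(&port->lock, flags);
		uart_console_write(port, s, count, console_putchar);
		spin_unlock_irqrestore(&port->lock, flags);
	}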

then panic callback can look like

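	static void console_write_on_panic_cb(struct console *co, const char *s,
					      unsigned int count)
	{
		/* co_to_port(): placeholder for the driver's own lookup */
		struct uart_port *port = co_to_port(co);

		/* no, we don't take the port lock here */
		uart_console_write(port, s, count, console_putchar);
	}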

a less trivial case might look more involved. but in general that
write_on_panic() callback must do the absolute minimum of work. so
it's sort of an early console, but as part of the normal console driver.
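
to make the dispatch part a bit more concrete, here is a userspace
mock-up. it is not kernel code: struct mock_console, panic_flush() and
the lock_taken counter are all made up for illustration, and falling
back to ->write() when a driver has no ->write_on_panic() is just my
assumption, nothing we have agreed on.

```c
#include <stddef.h>

/* userspace stand-in for struct console; only the two hooks matter here */
struct mock_console {
	void (*write)(struct mock_console *co, const char *s,
		      unsigned int count);
	/* hypothetical hook; NULL if the driver has no lockless path */
	void (*write_on_panic)(struct mock_console *co, const char *s,
			       unsigned int count);
	char out[128];		/* captured output, always NUL terminated */
	unsigned int len;
	int lock_taken;		/* how many times the fake port lock was taken */
};

/* stands in for uart_console_write(): copy bytes to the "device" */
static void mock_putchars(struct mock_console *co, const char *s,
			  unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count && co->len < sizeof(co->out) - 1; i++)
		co->out[co->len++] = s[i];
	co->out[co->len] = '\0';
}

/* regular path: takes the (fake) port lock around the output */
static void mock_write(struct mock_console *co, const char *s,
		       unsigned int count)
{
	co->lock_taken++;	/* stands in for spin_lock_irqsave() */
	mock_putchars(co, s, count);
	/* the matching spin_unlock_irqrestore() would go here */
}

/* panic path: same output, but the lock is never touched */
static void mock_write_on_panic(struct mock_console *co, const char *s,
				unsigned int count)
{
	mock_putchars(co, s, count);
}

/*
 * Push one message to every console, preferring the lockless hook.
 * Returns how many consoles went through ->write_on_panic().
 */
static int panic_flush(struct mock_console **cons, size_t n,
		       const char *s, unsigned int count)
{
	int lockless = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cons[i]->write_on_panic) {
			cons[i]->write_on_panic(cons[i], s, count);
			lockless++;
		} else if (cons[i]->write) {
			/* assumed fallback: may still deadlock */
			cons[i]->write(cons[i], s, count);
		}
	}
	return lockless;
}
```

a driver that sets ->write_on_panic() never touches its lock on this
path; a driver that doesn't keeps today's behaviour, deadlock risk
included.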

I also have some other crazy serial console ideas, but they are not
related to this topic.

	-ss