From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752339AbeAXCuC (ORCPT <rfc822;w@1wt.eu>);
        Tue, 23 Jan 2018 21:50:02 -0500
Received: from mail-pg0-f52.google.com ([74.125.83.52]:34595 "EHLO
        mail-pg0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752055AbeAXCuA (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 23 Jan 2018 21:50:00 -0500
X-Google-Smtp-Source: AH8x225gZQKImbVkQ1vwX/77FeWuaxuX5k0wFcAsT4teNZ8SR1OVKiyCo1E2EnoNhqmrtT4LG5OB2w==
Date: Wed, 24 Jan 2018 11:49:55 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>, Rik van Riel <riel@fb.com>,
        Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>,
        linux-kernel@vger.kernel.org, kernel-team@fb.com,
        Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
        Petr Mladek <pmladek@suse.com>
Subject: Re: [PATCH] lockdep: Avoid triggering hardlockup from
 debug_show_all_locks()
Message-ID: <20180124024955.GB651@jagdpanzerIV>
References: <20180122220055.GB1771050@devbig577.frc2.facebook.com>
 <1516734237.31954.17.camel@fb.com>
 <20180123205706.GH1771050@devbig577.frc2.facebook.com>
 <20180123160054.325ff326@gandalf.local.home>
 <20180123211154.GI1771050@devbig577.frc2.facebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180123211154.GI1771050@devbig577.frc2.facebook.com>
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On (01/23/18 13:11), Tejun Heo wrote:
[..]
> > What about if every printk were to touch NMI watchdog?
> > 
> > NMI watchdog is really there for when the system locks up. If the
> > system is locked up doing printk, at least we see what is happening,
> > and not a total freeze.
> 
> Yeah, that would definitely be a solution.  The downside is that when
> the system completely locks up from printk storm while holding
> critical locks (say, tasklist_lock), the watchdog won't be able to
> reset the system.

Agreed.

It's not only the NMI watchdog. RCU might also get stalled by printk.

> I guess the judgement would depend on what one expects of the NMI watchdog,
> but I personally would be happier with printk touching NMI automatically.

In the long term I think I'd rather move printk to a batched mode: print
for X seconds tops (X depending on the watchdog threshold), then offload
the rest instead of staying in the same context.
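
A minimal userspace sketch of the batched idea, not kernel code. Every name
here (log_queue, flush_batched, budget_from_watchdog, fake_clock) is
hypothetical, the clock is injected so the loop can be exercised without
real time, and "half of the watchdog threshold" is only an assumed way to
derive the budget:

```c
#include <stddef.h>

/* Hypothetical model of a log buffer; not a real kernel structure. */
struct log_queue {
	size_t pending;		/* messages still waiting to be printed */
	size_t offloaded;	/* messages handed over to another context */
};

/* Clock callback, injected so the budget can be tested deterministically. */
typedef unsigned long (*clock_fn)(void *ctx);

/*
 * Assumption: the printing budget is half of the watchdog threshold.
 * The original text only says the budget depends on the threshold.
 */
static unsigned long budget_from_watchdog(unsigned long watchdog_thresh)
{
	return watchdog_thresh / 2;
}

/*
 * Print pending messages until the budget is used up, then hand the
 * remainder to another context instead of looping here.
 * Returns the number of messages printed in the calling context.
 */
static size_t flush_batched(struct log_queue *q, unsigned long budget,
			    clock_fn now, void *clock_ctx)
{
	unsigned long start = now(clock_ctx);
	size_t printed = 0;

	while (q->pending > 0) {
		if (now(clock_ctx) - start >= budget) {
			/* Budget exhausted: offload the rest and leave. */
			q->offloaded += q->pending;
			q->pending = 0;
			break;
		}
		q->pending--;	/* stands in for emitting one message */
		printed++;
	}
	return printed;
}

/* Fake clock for experiments: advances by one tick on every call. */
static unsigned long fake_clock(void *ctx)
{
	unsigned long *ticks = ctx;

	return (*ticks)++;
}
```

The point of the sketch is only the control flow: the caller's time in the
printing loop is bounded by the budget, and whatever is left becomes
somebody else's job rather than keeping the same context busy.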

It seems, sometimes, that the "offloading will ruin printk" thing might be
a bit exaggerated. IMHO.

	-ss

P.S.
Another problem, and I mentioned it somewhere in another email, is that
upstream printk people don't receive enough feedback [if any at all] from
people who face printk issues. That's why every time printk_kthread
resurfaces the reaction is "this is not a real problem, no one is seeing
printk issues like these, you idiot!". It'd be great to have more "we need
ABC, because of XYZ, but printk crashes the system. Here is the backtrace,
fix it" reports. As of now, those things mostly go unreported, which is why
people are not convinced. Just my 5 cents.