From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754544AbcDEFQK (ORCPT <rfc822;w@1wt.eu>);
	Tue, 5 Apr 2016 01:16:10 -0400
Received: from mail-pa0-f46.google.com ([209.85.220.46]:34573 "EHLO
	mail-pa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751292AbcDEFQI (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 5 Apr 2016 01:16:08 -0400
Date: Tue, 5 Apr 2016 14:17:17 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
        Jan Kara <jack@suse.com>, Petr Mladek <pmladek@suse.com>,
        Tejun Heo <tj@kernel.org>,
        Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
        linux-kernel@vger.kernel.org, Byungchul Park <byungchul.park@lge.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
        Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v10 1/2] printk: Make printk() completely async
Message-ID: <20160405051717.GB1954@swordfish>
References: <1459789048-1337-1-git-send-email-sergey.senozhatsky@gmail.com>
 <1459789048-1337-2-git-send-email-sergey.senozhatsky@gmail.com>
 <20160404155149.a3e3307def2d1315e2099c63@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160404155149.a3e3307def2d1315e2099c63@linux-foundation.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Andrew,

On (04/04/16 15:51), Andrew Morton wrote:
[..]
> The whole idea remains worrisome.  It is definitely making printk()
> less reliable in the vast majority of cases: what happens if the
> scheduler is busted or random memory has been scribbled on, etc.

yes.

well, printk, in some sense, already depend on the scheduler: console
semaphore on its own; cond_resched() from console_unlock() with console_sem
being locked, etc. neither memory corruption is something that printk() can
always handle nicely. I saw logbuf_lock corruption and recursive spin_dump
from vprintk_emit(), was quite dramatic.


> All this downside to handle (afaict) one special case.

well, it's not just to make zillions-of-scsi-disks happy. I'm facing
different scenarios, more general ones, a 'moderate printk abuse'
(combined with slow serial console). printk is not always friendly
and shiny, it has some secrets and it can bite easily (lockups, stalls,
starvations, sched throttling, et cetera), and this is not something
that every developer know.

> Surely there is another way?  For example (but feel free to suggest other
> approaches!) can we put some limit on the number of extra characters which
> the printing task will print?  Once that limit is hit, new printk callers
> will spin until they can get in and do some printing themselves.  Or
> something else?

hm... there are not so many options, I think. this busy wait, depending on
the number of CPUs (and some other factors), can provoke mass softlockups
on other CPUs and a number of bad things. printk() and its deferred version
can be called from any context, so in some cases spinning in printk is as
good as doing console_unlock()->call_concosle_drivers() loop (iow, not really
good). things are not so bad (well, Tetsuo has some issues here though) if
printk() is called from non-atomic context, since now we have cond_resched()
in console_unlock() for console_lock()/console_trylock() callers; but printk()
from atomic context is still a problem -- we need to offload the actual
printing, unless we can guarantee that every atomic printk will be followed
by non-atomic printk (which will do the printing).

> > +	printk.synchronous=
..
> Well, it's good that we have this.
> 
> It would be better if it was runtime-controllable - changing boot
> parameters is a bit of a pain.  In fact with this approach, your
> zillions-of-scsi-disks scenario becomes less problematic: do the async
> offloading during the boot process then switch back to the more
> reliable sync printing late in boot.

well, I can add it if you insist.
my personal opinion is to keep it RO; RO->RW transition is easier than
RW->RO. giving the control over printk behaviour to user space can
potentially be even worse than drop_caches. besides I couldn't clearly
understand based on what observations user space may decide to switch
printk back to sync mode? and what may cause user space to switch printk
back from sync to async... lockups in dmesg output? any hint?

..
> > + * When true, printing to console will happen synchronously.
> > + * The default value on UP systems is 'true'.
> 
> That's rather obvious from the code.  Comments should explain "why",
> not "what".

fair enough.

> > +		 * By default we print message to console asynchronously so
> 
> Nit: this comment down here shouldn't know what the default is.  That
> should be documented up at the printk_sync definition site.

ok.

> > +		 * that kernel doesn't get stalled due to slow serial console.
> 
> s/kernel/the kernel/

ok.

> > +		 * requested by kernel parameter, or when console_verbose() was
> 
> s/kernel/a kernel/

ok.

> 
> > +		 * called to print everything during panic / oops.
> 
> We're missing a description of *why* console_verbose() is handled
> specially.

ok.

> > -		if (console_trylock())
> > -			console_unlock();
> > +		if (!in_panic && printk_kthread) {
> 
> We don't really need local variable in_panic.  I guess it has some
> documentary value.

just a shorter version of "console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH".

> > +
> > +	thread = kthread_run(printk_kthread_func, NULL, "printk");
> 
> This gets normal scheduling policy, so a spinning userspace SCHED_FIFO
> task will block printk for ever.  This seems bad.

yes, using SCHED_RR/FIFO policy here makes sense.

> > +late_initcall(init_printk_kthread);
> 
> Could do with a comment explaining why late_initcall was chosen.

late_initcall was chosen because of workqueue early_initcall, and
I just decided not to change it when I switched from wq to a
dedicated printk kthread. late_initcall seemed to be OK. can do
init_printk_kthread() somewhere in init/main start_kernel().

	-ss