From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751858AbdDNEmh (ORCPT <rfc822;w@1wt.eu>);
        Fri, 14 Apr 2017 00:42:37 -0400
Received: from mail-pf0-f196.google.com ([209.85.192.196]:32943 "EHLO
        mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750747AbdDNEme (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 14 Apr 2017 00:42:34 -0400
Date: Fri, 14 Apr 2017 13:42:35 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
        Pavel Machek <pavel@ucw.cz>, Jan Kara <jack@suse.cz>,
        "Eric W. Biederman" <ebiederm@xmission.com>,
        Ye Xiaolong <xiaolong.ye@intel.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Peter Zijlstra <peterz@infradead.org>,
        "Rafael J . Wysocki" <rjw@rjwysocki.net>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Jiri Slaby <jslaby@suse.com>, Len Brown <len.brown@intel.com>,
        linux-kernel@vger.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [printk]  fbc14616f4:
 BUG:kernel_reboot-without-warning_in_test_stage
Message-ID: <20170414044235.GA462@jagdpanzerIV.localdomain>
References: <20170407081449.GA12859@amd>
 <20170407121021.GA379@jagdpanzerIV.localdomain>
 <20170407124455.GC4756@amd>
 <20170407151306.GA384@tigerII.localdomain>
 <20170409101230.GB27363@amd>
 <20170410045339.GB2793@jagdpanzerIV.localdomain>
 <20170410184858.GA24226@amd>
 <20170411014635.GA3836@jagdpanzerIV.localdomain>
 <20170411161953.GA447@tigerII.localdomain>
 <20170413140321.GH16938@pathway.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170413140321.GH16938@pathway.suse.cz>
User-Agent: Mutt/1.8.0 (2017-02-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Petr,

thanks for taking a look!

On (04/13/17 16:03), Petr Mladek wrote:
> > +static inline bool console_offload_printing(void)
> > +{
> > +	static struct task_struct *printing_task = NULL;
> > +	static unsigned long lines_printed = 0;
> > +
> > +	if (!atomic_print_limit || !printk_kthread_enabled())
> > +		return false;
> > +
> > +	/* We rescheduled - reset the counters. */
> > +	if (printing_task != current) {
> > +		lines_printed = 0;
> > +		printing_task = current;
> > +		return false;
> > +	}
> 
> If we want to check that the process rescheduled, we should
> store/check also current->nvcsw + current->nivcsw.

ok.

[..]
> My only fear is that it is getting more and more complicated.
> On the other hand, any partial solution is asking for
> troubles and complains.

yeah. we have to aim slightly different and conflicting targets - introducing
a new printk behavior, while preserving an already existing guarantees. which
is a bit tricky.


[..]
> > +	if (lines_printed > 2 * (unsigned long)atomic_print_limit) {
> > +		printk_enforce_emergency = true;
> > +		pr_crit("Declaring printk emergency mode.\n");
> > +		return false;
> > +	}
> 
> The only messages that are printed on my workstation are the same
> few lines everytime I connect my phone over USB to get it charged.

you are right. this is a known and yet to be resolved issue.


> It might help to check the number of process switch counts as
> suggested above.

will take a look at your 'current->nvcsw + current->nivcsw' idea.
thanks.


[..]
> > +		/*
> > +		 * Sometimes we may lock console_sem before printk_kthread.
> > +		 * In this case we will jump to `again' label (if there are
> > +		 * pending messages), print one more line from current
> > +		 * process, break out of printing loop (we don't reset the
> > +		 * counter of printed lines) and do up_console_sem() to
> > +		 * wakeup printk_kthread again.
> > +		 *
> > +		 * If printk_kthread never wakes up (which may indicate that
> > +		 * the system is unstable or something weird is going on),
> > +		 * then we will keep jumping to `again' label and printing
> > +		 * one message from the logbuf. This is a bit ugly, but at
> > +		 * least we will print out the logbuf.
> > +		 *
> > +		 * If such condition occurs, console_offload_printing() can
> > +		 * declare `printk_emergency' at some point.
> 
> I am a bit confused by the comment above. The again goto target is
> used only when there is a race between leaving the loop and releasing
> the console_sem. It is a corner case.

not really.

this is the part where "preserve printk guarantees" jumps in.

when we limit the number of lines we can print we have to leave this loop
with not fully flushed logbuf. so `goto again' is not solely for corner
case anymore. when we prematurely leave the printing loop, we wake_up
printk_kthread, unlock console_sem... and then we have no idea if
printk_kthread going to wake_up at all, and, if it's going to, how much
time will it take. at the same time we have a task that is already in
console_unlock() and, probably, we still have pending messages in the logbuf.
that's why the process that just has left the printing loop [and there easily
might be pending messages in the logbuf] does the whole 'retry' thing. we can
have a misbehaving high priority process or something, that would prevent
printk_kthread from becoming running just when we need it. so, at least
sometimes, the printing process (the one that breaks ouf of printing loop
and wakes up printk_kthread) can re-acquire console_sem and print one more
line, then it up() console_sem, which, hopefully, will wake_up printk_kthread.
if printk_kthread did become running then if would be in console_sem wait
list at this point. if it didn't - then we

	a) wake up some other process that is probably in console_sem list
	   (hopefully there is one)

or

	b) continue printing from the current process. because printk_kthread
	   is still not running and there are no other processes that want to
	   console_lock(). not much we can do at this point.


so in expected/normal scenario, we fail to re-acquire the console_sem lock
(console_trylock()), which means that either printk_kthread or some other
process from console_sem wait list acquired the console_sem and will take
over printing.


I do something like this

---
@@ -2427,6 +2427,8 @@ void console_unlock(void)
                console_seq++;
                raw_spin_unlock(&logbuf_lock);
 
+               sprintf(text + 7, "{%s}", current->comm);
+
                stop_critical_timings();        /* don't trace print latency */
                call_console_drivers(ext_text, ext_len, text, len);
                start_critical_timings();
---


and fire up some silly printk tests (I don't care what exactly it prints,
I'm curious what process prints it). it sort of makes it easier to observe
the behaviour.


> If there are messages from other CPUs, they most likely appear during
> the slow call_console_drivers(). They are handled inside the
> for(;;) cycle unless we reach the atomic_print_limit and
> wake up the printk_kthread.

but `atomic_print_limit' can be quite restrictive. we break out of the
printing loop then `atomic_print_limit' tells us to do so, not when we
the logbuf is empty.


[..]
> > @@ -2276,6 +2445,8 @@ void console_unlock(void)
> >  	 */
> >  	raw_spin_lock(&logbuf_lock);
> >  	retry = console_seq != log_next_seq;
> 
> This should be:
> 
> 	retry = console_seq != log_next_seq && !did_offload;
> 
> Otherwise, it would never leave.

it is expected to leave at some point, when we would know for sure that
a successful offloading took place. otherwise we probably shouldn't leave.
we invoked wake_up_process(), but that does not really buy us anything
(from printk guarantees POV). we trust one thing only - a process holding
the console_sem lock. because that's "a fact"; wake_up_process() is nothing
but "a promise".


so, in short, the basic idea is:
- the worse the situation is, the closer printk behavior to the original one.


the bigger the scheduling latencies are, the more time we spent
in console_unlock(), printing more than `atomic_print_limit' chars.


in more details:

once we cross the `atomic_print_limit' we start this thing


@again label:

1) call_console_drivers()	- print a single logbuf entry

1) up()				- "somebody please take over" (we expect that at
				  least one process will do this - printk_kthread.
				  but we don't know when, and not entirely sure
				  if it will. so we are increasing our chances
				  by waking up printk_kthread, but we don't
				  completely count on it).

2) console_trylock()		- if successful
					then "eeehhh, OK... I'll print one more line"
					goto @again
				  else
					return, someone console_lock()-ed

yes, in absolutely bad scenario -- even printk_kthread can't lock
console_sem - this means old printk behaviour. but that's sort of exactly
what we want to do with the printk_emergency_begin/end annotations anyway.
in exchange we print out the logbuf eventually.


I'm trying to find a compromise that would make everyone happy. may be
I'm missing an easier/better solution, wouldn't be the first time ever :)

	-ss