Re: [RFC][PATCHv3 2/5] printk: introduce printing kernel thread

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Eric Biederman <ebiederm@xmission.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jslaby@suse.com>, Pavel Machek <pavel@ucw.cz>,
	Andreas Mohr <andi@lisas.de>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCHv3 2/5] printk: introduce printing kernel thread
Date: Fri, 30 Jun 2017 21:42:24 +0900
Message-ID: <20170630124224.GA792@jagdpanzerIV.localdomain>
In-Reply-To: <20170630115457.GE23069@pathway.suse.cz>

Hello,

On (06/30/17 13:54), Petr Mladek wrote:
> > but.....
> > the opposite possibility is that messages either won't be printed
> > soon (until next printk or console_unlock()) or won't be printed
> > ever at all (in case of sudden system death). I don't think it's
> > a good alternative.
> 
> I see it like a weighing machine. There is "guaranteed" output on
> one side and softlockup prevention on the other side. The more
> we prevent softlockups, the less we guarantee the output.

I apply a very simple litmus test. if the answer to the question
"so we leave console_unlock() and there are pending messages,
who and when is going to flush the remaining messages?" is
"something sometime in the future" then it's a no-no.

"something sometime in the future" is equal to "no one".

we must stay and continue printing. because it gives the right
answer - "current process and right now. until someone else
(+printk_kthread) takes over".
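
to make the litmus test concrete, here is a rough userspace model of the
two policies. it is only a sketch, not kernel code: the struct, the
function and all of its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. Every name below is hypothetical. */
struct flush_result {
	int printed;		/* messages the current task pushed out */
	int orphaned;		/* messages left pending with no console owner */
	bool handed_over;	/* another task took over the console */
};

/*
 * pending:        messages waiting in the log buffer
 * budget:         messages the task prints before it wants to offload
 * takeover_after: number of printed messages after which another task
 *                 (e.g. a printing kthread) takes over; negative: never
 * may_leave:      true  = leave once the budget is spent, even if nobody
 *                         took over
 *                 false = stay and keep printing until handoff or empty
 */
static struct flush_result model_console_unlock(int pending, int budget,
						int takeover_after,
						bool may_leave)
{
	struct flush_result res = { 0, 0, false };

	assert(pending >= 0 && budget >= 0);

	while (pending > 0) {
		if (takeover_after >= 0 && res.printed >= takeover_after) {
			/* the new owner is responsible for the rest */
			res.handed_over = true;
			return res;
		}
		if (may_leave && res.printed >= budget) {
			/* "something sometime in the future" must flush these */
			res.orphaned = pending;
			return res;
		}
		pending--;
		res.printed++;
	}
	return res;
}
```

with may_leave set and no takeover, the model returns with orphaned
messages. with may_leave cleared, the only ways out of the loop are an
empty buffer or a handoff, so nothing is left without an owner.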

> We do not have the same opinion about the balance. My solution
> completely prevents softlockups.

not at all costs. especially not if we are talking about the possibility
of losing messages. if we accept that possibility, then let's just
unconditionally drop logbuf entries every time console_unlock() is
looping for too long.

> Your one tries to be more conservative.

Linus wrote:

: If those two things aren't the absolutely primary goals, the whole
: thing is pointless to even discuss. No amount of cool features,
: performance, or theoretical deadlock avoidance matters ONE WHIT
: compared to the two things above.

and

Linus wrote:

: But not having any messages at all, because we were trying so hard to
: abstract things out and put them in buffers so that we couldn't
: deadlock with the IO routines, and the timer or workqueue that was
: supposed to do it is never going to happen any more because of the bug
: that is trying to be printed out?
:
: THAT is bad.

I think our priorities should be quite clear here.

> My main unresolved doubts about this patchset are:
> 
> 1. It gives other processes only a very small chance to take
>    over the job. They either need to call console_trylock()
>    in a very small "race" window or they need to call
>    console_lock(), where console_lock() only signals
>    that the caller is willing to take the job and puts
>    it to sleep.

printk_kthread does console_lock().

we may (and need to) improve the retry path in console_unlock().
but we must not leave it until another process locks console_sem.

>    Another detailed description of this problem can be found
>    in my previous mail, see
>    https://lkml.kernel.org/r/20170628121925.GN1538@pathway.suse.cz
> 
> 
> 2. It adds a rather complex dependency on the scheduler. I know
>    that my simplified solution does this as well, but in another way[*].
>    Let me explain. I would split the dependency into the code
>    relation and the behavior relation.
> 
>    From the code side: The current printk() calls wake_up_process()
>    and we need to use printk_deferred() in the related scheduler code.
>    This patch does this as well, so there is no win and no loss.
>    Well, you talk about changing the affinity and other tricks
>    in the other mails. This might add more locations where
>    printk_deferred() would be needed.

we are in printk_safe all the way through console_offload_printing(),
the context is derived from console_unlock().

why would we need printk_deferred()?
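
a rough userspace model of the idea, assuming printk_safe works as a
nesting counter plus a side buffer: while the counter is non-zero, a
nested printk only appends to the buffer and never reaches the normal
path. all names and the buffer size are invented for the sketch, and
the real thing is per-CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SAFE_BUF_SIZE 256	/* arbitrary size for the model */

static int safe_nesting;		/* models the "safe context" nesting counter */
static char safe_buf[SAFE_BUF_SIZE];	/* models the side buffer */
static size_t safe_len;
static int direct_calls;		/* times the normal printk path ran */

static void model_safe_enter(void)
{
	safe_nesting++;
}

static void model_safe_exit(void)
{
	assert(safe_nesting > 0);
	safe_nesting--;
}

static void model_printk(const char *msg)
{
	if (safe_nesting > 0) {
		size_t room = SAFE_BUF_SIZE - 1 - safe_len;
		size_t n = strlen(msg);

		if (n > room)
			n = room;	/* truncate on overflow */
		memcpy(safe_buf + safe_len, msg, n);
		safe_len += n;
		safe_buf[safe_len] = '\0';
		return;
	}
	/* normal path: this is where locks and consoles would be touched */
	direct_calls++;
}

/* Run later, outside of the safe section, to push out buffered text. */
static void model_safe_flush(void)
{
	assert(safe_nesting == 0);
	if (safe_len) {
		model_printk(safe_buf);
		safe_len = 0;
		safe_buf[0] = '\0';
	}
}
```

in this model a printk issued between model_safe_enter() and
model_safe_exit() cannot recurse into the normal path, no matter which
code calls it.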

>    From the behavior side: The current code wakes the process
>    and is done. The code in this patch wakes the process and
>    waits until it[**] gets CPU and really runs. It switches to
>    the emergency mode when the other process does not run in time.
>    In other words, this patch depends on more actions done
>    by the scheduler and changes behavior based on it. IMHO,
>    this will be very hard to code, tune, and debug.
>    A proper solution might require more code dependency.
> 
>    [*] My solution depends on the scheduler in the sense
>        that messages will get lost when nobody else will take
>        over the console job.

which is precisely and exactly the thing that we should never
let happen. there is no _win_, because we _lost_ the messages.

> 3. The prevention of soft-lockups is questionable. If you are in
>    a soft-lockup prone situation, the only prevention is to do an
>    offload. But if you switch to the emergency mode and give
>    up offloading too early, the new code stops preventing
>    the softlockup.
> 
>    Of course, the patchset does not make it worse but the question
>    is how much it really helps. It would be bad to add a lot of
>    code/complexity with almost no gain.
> 
> 
> IMHO, if we try to solve the 1st problem (chance of offloading),
> it might add a risk of deadlocks and/or make the 2nd problem
> (dependency on scheduler) worse. Also I am afraid that we would
> repeat many dead ends already tried by Jan Kara.

what deadlock?

> If you try to improve the 3rd problem and make some guarantees
> of the soft-lockup prevention, it would make the 2nd problem
> (dependency on scheduler) worse. Also the code might be
> very hard to understand and tune.
> 
> 
> This is why I look for a rather simple solution. IMHO, we both
> agree that:
> 
>    +  the offload will be activated only when there is
>       a flood of messages
> 
>    + the only reason to wait for the other handler is to
>      better handle sudden death where panic() is not called.
> 
> IMHO, the only one who brought the problem of sudden death
> was Pavel Machek.

we gave up on printk-async. the last bug report was titled "blah blah
missing backtrace". and I really would rather see that backtrace +
soft lockup or even hard lockup. it still would have been better than
seeing nothing at all.

	-ss

> AFAIK, he works on embedded systems and hardware enablement.
> I guess the combination of the flood of messages and sudden
> death is rare there. Also I doubt that the current code handles
> it well. The flood is badly handled in general. In any case,
> I wonder how long we could survive flushing messages when there
> is sudden death and scheduling does not work.
> 
> One problem here is that some questions/doubts are hard to
> answer/prove without wide testing.
> 
> A compromise might be to start with the simple code
> and disable the offloading by default. I am sure that
> there will be volunteers that would want to play with it,
> e.g. Tetsuo. We would enable it in SUSE as well because
> there should not be any regression against what we have
> used for years now. We could gradually make it more complex
> according to the feedback and eventually enable it
> by default.
> 
> Best Regards,
> Petr
>