From: Steven Rostedt <rostedt@goodmis.org>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Tejun Heo <tj@kernel.org>, Petr Mladek <pmladek@suse.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
Cong Wang <xiyou.wangcong@gmail.com>,
Dave Hansen <dave.hansen@intel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
Vlastimil Babka <vbabka@suse.cz>,
Peter Zijlstra <peterz@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jan Kara <jack@suse.cz>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
rostedt@rostedt.homelinux.com,
Byungchul Park <byungchul.park@lge.com>,
Pavel Machek <pavel@ucw.cz>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Date: Sun, 21 Jan 2018 16:04:41 -0500 [thread overview]
Message-ID: <20180121160441.7ea4b6d9@gandalf.local.home> (raw)
In-Reply-To: <20180121141521.GA429@tigerII.localdomain>
On Sun, 21 Jan 2018 23:15:21 +0900
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> so.... fix the console drivers ;)
Totally agree!
>
>
>
>
> just kidding. ok...
Darn it! ;-)
> the problem is that we flush printk_safe right when console_unlock() printing
> loop enables local IRQs via printk_safe_exit_irqrestore() [given that IRQs
> were enabled in the first place when the CPU went to console_unlock()].
> this forces that CPU to loop in console_unlock() as long as we have
> printk-s coming from call_console_drivers(). but we probably can postpone
> printk_safe flush. basically, we can declare a new rule - we don't flush
> printk_safe buffer as long as console_sem is locked. because this is how
> that printing CPU stuck in the console_unlock() printing loop. printk_safe
> buffer is very important when it comes to storing a non-repetitive stuff, like
> a lockdep splat, which is a single shot event. but the more repetitive the
> message is, like millions of similar kmalloc() dump_stack()-s over and over
> again, the less value in it. we should have printk_safe buffer big enough for
> important info, like a lockdep splat, but millions of similar kmalloc()
> messages are pretty invaluable - one is already enough, we can drop the rest.
> and we should not flush new messages while there is a CPU looping in
> console_unlock(), because it already has messages to print, which were
> log_store()-ed the normal way.
The above is really hard to read without any capitalization. Everything
seems to be a run-on sentence and gives me a head ache. So you lost me
there.
>
> this is where the "postpone thing" jumps in. so how do we postpone printk_safe
> flush.
>
> we can't console_trylock()/console_unlock() in printk_safe flush code.
> but there is a `console_locked' flag and is_console_locked() function which
> tell us if the console_sem is locked. as long as we are in console_unlock()
> printing loop that flag is set, even if we enabled local IRQs and printk_safe
> flush work arrived. so now printk_safe flush does extra check and does
> not flush printk_safe buffer content as long as someone is currently
> printing or soon will start printing. but we need to take extra step and
> to re-queue flush on CPUs that did postpone it [console_unlock() can
> reschedule]. so now we flush only when printing CPU printed all pending
> logbuf messages, hit the "console_seq == log_next_seq" and up()
> console_sem. this sets a boundary -- no matter how many times during the
> current printing loop we called console drivers and how many times those
> drivers caused printk recursion, we will flush only SAFE_LOG_BUF_LEN chars.
Another big paragraph with no capitals (besides macros and CPU ;-)
I guess this is what it is like when people listen to me talk too fast.
>
>
> IOW, what we have now, looks like this:
>
> a) printk_safe is for important stuff, we don't guarantee that a flood
> of messages will be preserved.
>
> b) we extend the previously existing "will flush messages later on from
> a safer context" and now we also consider console_unlock() printing loop
> as unsafe context. so the unsafe context it's not only the one that can
> deadlock, but also the one that can lockup CPU in a printing loop because
> of recursive printk messages.
Sure.
>
>
> so this
>
> printk
> console_unlock
> {
> for (;;) {
> call_console_drivers
> net_console
> printk
> printk_save -> irq_work queue
>
> IRQ work
> prink_safe_flush
> printk_deferred -> log_store()
> iret
> }
> up();
> }
>
>
> // which can never break out, because we can always append new messages
> // from prink_safe_flush.
>
> becomes this
>
> printk
> console_unlock
> {
> for (;;) {
> call_console_drivers
> net_console
> printk
> printk_save -> irq_work queue
>
> }
> up();
>
> IRQ work
> prink_safe_flush
> printk_deferred -> log_store()
> iret
> }
But we do eventually send this data out to the consoles, and if the
consoles cause more printks, wouldn't this still never end?
>
>
>
> something completely untested, sketchy and ugly.
>
> ---
>
> kernel/printk/internal.h | 2 ++
> kernel/printk/printk.c | 1 +
> kernel/printk/printk_safe.c | 37 +++++++++++++++++++++++++++++++++++--
> 3 files changed, 38 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
> index 2a7d04049af4..e85517818a49 100644
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
> @@ -30,6 +30,8 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args);
> void __printk_safe_enter(void);
> void __printk_safe_exit(void);
>
> +void printk_safe_requeue_flushing(void);
> +
> #define printk_safe_enter_irqsave(flags) \
> do { \
> local_irq_save(flags); \
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9cb943c90d98..7aca23e8d7b2 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2428,6 +2428,7 @@ void console_unlock(void)
> raw_spin_lock(&logbuf_lock);
> retry = console_seq != log_next_seq;
> raw_spin_unlock(&logbuf_lock);
> + printk_safe_requeue_flushing();
> printk_safe_exit_irqrestore(flags);
>
> if (retry && console_trylock())
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index 3e3c2004bb23..45d5b292d7e1 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -22,6 +22,7 @@
> #include <linux/cpumask.h>
> #include <linux/irq_work.h>
> #include <linux/printk.h>
> +#include <linux/console.h>
>
> #include "internal.h"
>
> @@ -51,6 +52,7 @@ struct printk_safe_seq_buf {
> atomic_t message_lost;
> struct irq_work work; /* IRQ work that flushes the buffer */
> unsigned char buffer[SAFE_LOG_BUF_LEN];
> + bool need_requeue;
> };
>
> static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
> @@ -196,6 +198,7 @@ static void __printk_safe_flush(struct irq_work *work)
> size_t len;
> int i;
>
> + s->need_requeue = false;
> /*
> * The lock has two functions. First, one reader has to flush all
> * available message to make the lockless synchronization with
> @@ -243,6 +246,36 @@ static void __printk_safe_flush(struct irq_work *work)
> raw_spin_unlock_irqrestore(&read_lock, flags);
> }
>
> +/* NMI buffers are always flushed */
> +static void flush_nmi_buffer(struct irq_work *work)
> +{
> + __printk_safe_flush(work);
> +}
> +
> +/* printk_safe buffers flushing, on the contrary, can be postponed */
> +static void flush_printk_safe_buffer(struct irq_work *work)
> +{
> + struct printk_safe_seq_buf *s =
> + container_of(work, struct printk_safe_seq_buf, work);
> +
> + if (is_console_locked()) {
> + s->need_requeue = true;
> + return;
> + }
> +
> + __printk_safe_flush(work);
> +}
> +
> +void printk_safe_requeue_flushing(void)
> +{
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + if (per_cpu(safe_print_seq, cpu).need_requeue)
> + queue_flush_work(&per_cpu(safe_print_seq, cpu));
> + }
> +}
> +
> /**
> * printk_safe_flush - flush all per-cpu nmi buffers.
> *
> @@ -387,11 +420,11 @@ void __init printk_safe_init(void)
> struct printk_safe_seq_buf *s;
>
> s = &per_cpu(safe_print_seq, cpu);
> - init_irq_work(&s->work, __printk_safe_flush);
> + init_irq_work(&s->work, flush_printk_safe_buffer);
>
> #ifdef CONFIG_PRINTK_NMI
> s = &per_cpu(nmi_print_seq, cpu);
> - init_irq_work(&s->work, __printk_safe_flush);
> + init_irq_work(&s->work, flush_nmi_buffer);
> #endif
> }
>
> ---
>
>
>
> > > lose all of them now? then we can do a much simpler thing - steal one
> > > bit from `printk_context' and use if for a new PRINTK_NOOP_CONTEXT, which
> > > will be set around call_console_drivers(). vprintk_func() would redirect
> > > printks to vprintk_noop(fmt, args), which will do nothing.
> >
> > Not sure what you mean here. Have some pseudo code to demonstrate with?
>
> sure, I meant that if we want to disable printk recursion from
> call_console_drivers(), then we can add another printk_safe section, say
> printk_noop_begin()/printk_noop_end(), which would set a PRINTK_NOOP
> bit of `printk_context', so when we have printk() under PRINTK_NOOP
> then vprintk_func() goes to a special vprintk_noop(fmt, args), which
> simply drops the message [does not store any in the per-cpu printk
> safe buffer, so we don't flush it and don't add new messages to the
> logbuf]. and we annotate call_console_drivers() as a pintk_noop
> function. but that a no-brainer and I'd prefer to have another solution.
>
Another big paragraph without caps, but I figured it out.
I say we try that solution and see if it fixes the current issues.
Because right now, the bug I see Tejun presented was if something in
printk causes printks, it will start a printk bomb and lock up the
system. The only reasonable answer I see to that is to throttle printk
in such a case.
-- Steve
next prev parent reply other threads:[~2018-01-21 21:04 UTC|newest]
Thread overview: 140+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-10 13:24 [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Petr Mladek
2018-01-10 13:24 ` [PATCH v5 1/2] printk: Add console owner and waiter logic to load balance console writes Petr Mladek
2018-01-10 16:50 ` Steven Rostedt
2018-01-12 16:54 ` Steven Rostedt
2018-01-12 17:11 ` Steven Rostedt
2018-01-17 19:13 ` Rasmus Villemoes
2018-01-17 19:33 ` Steven Rostedt
2018-01-19 9:51 ` Sergey Senozhatsky
2018-01-18 22:03 ` Pavel Machek
2018-01-19 0:20 ` Steven Rostedt
2018-01-17 2:19 ` Byungchul Park
2018-01-17 4:54 ` Byungchul Park
2018-01-17 7:34 ` Byungchul Park
2018-01-17 12:04 ` Petr Mladek
2018-01-18 1:53 ` Byungchul Park
2018-01-18 1:57 ` Byungchul Park
2018-01-18 2:19 ` Steven Rostedt
2018-01-18 4:01 ` Byungchul Park
2018-01-18 15:21 ` Steven Rostedt
2018-01-19 2:37 ` Byungchul Park
2018-01-19 3:27 ` Steven Rostedt
2018-01-22 2:31 ` Byungchul Park
2018-01-10 13:24 ` [PATCH v5 2/2] printk: Hide console waiter logic into helpers Petr Mladek
2018-01-10 17:52 ` Steven Rostedt
2018-01-11 12:03 ` Petr Mladek
2018-01-12 15:37 ` Steven Rostedt
2018-01-12 16:08 ` Petr Mladek
2018-01-12 16:36 ` Steven Rostedt
2018-01-15 16:08 ` Petr Mladek
2018-01-16 5:05 ` Sergey Senozhatsky
2018-01-10 14:05 ` [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Tejun Heo
2018-01-10 16:29 ` Petr Mladek
2018-01-10 17:02 ` Tejun Heo
2018-01-10 18:21 ` Peter Zijlstra
2018-01-10 18:30 ` Tejun Heo
2018-01-10 18:41 ` Peter Zijlstra
2018-01-10 19:05 ` Tejun Heo
2018-01-11 5:15 ` Sergey Senozhatsky
2018-01-10 18:22 ` Steven Rostedt
2018-01-10 18:36 ` Tejun Heo
2018-01-10 18:40 ` Mathieu Desnoyers
2018-01-11 7:36 ` Sergey Senozhatsky
2018-01-11 11:24 ` Petr Mladek
2018-01-11 13:19 ` Sergey Senozhatsky
2018-01-24 9:36 ` Peter Zijlstra
2018-01-24 18:46 ` Tejun Heo
2018-05-09 8:58 ` Sergey Senozhatsky
2018-01-10 18:54 ` Steven Rostedt
2018-01-11 5:10 ` Sergey Senozhatsky
2018-01-10 18:05 ` Steven Rostedt
2018-01-10 18:12 ` Tejun Heo
2018-01-10 18:14 ` Tejun Heo
2018-01-10 18:45 ` Steven Rostedt
2018-01-10 18:41 ` Steven Rostedt
2018-01-10 18:57 ` Tejun Heo
2018-01-10 19:17 ` Steven Rostedt
2018-01-10 19:34 ` Tejun Heo
2018-01-10 19:44 ` Steven Rostedt
2018-01-10 22:44 ` Tejun Heo
2018-01-11 5:35 ` Sergey Senozhatsky
2018-01-11 4:58 ` Sergey Senozhatsky
2018-01-11 9:34 ` Petr Mladek
2018-01-11 10:38 ` Sergey Senozhatsky
2018-01-11 11:50 ` Petr Mladek
2018-01-11 16:29 ` Steven Rostedt
2018-01-12 1:30 ` Steven Rostedt
2018-01-12 2:55 ` Steven Rostedt
2018-01-12 4:20 ` Steven Rostedt
2018-01-16 19:44 ` Tejun Heo
2018-01-17 9:12 ` Petr Mladek
2018-01-17 15:15 ` Tejun Heo
2018-01-17 17:12 ` Steven Rostedt
2018-01-17 18:42 ` Steven Rostedt
2018-01-19 18:20 ` Steven Rostedt
2018-01-20 7:14 ` Sergey Senozhatsky
2018-01-20 15:49 ` Steven Rostedt
2018-01-21 14:15 ` Sergey Senozhatsky
2018-01-21 21:04 ` Steven Rostedt [this message]
2018-01-22 8:56 ` Sergey Senozhatsky
2018-01-22 10:28 ` Sergey Senozhatsky
2018-01-22 10:36 ` Sergey Senozhatsky
2018-01-23 6:40 ` Sergey Senozhatsky
2018-01-23 7:05 ` Sergey Senozhatsky
2018-01-23 7:31 ` Sergey Senozhatsky
2018-01-23 14:56 ` Steven Rostedt
2018-01-23 15:21 ` Sergey Senozhatsky
2018-01-23 15:41 ` Steven Rostedt
2018-01-23 15:43 ` Tejun Heo
2018-01-23 16:12 ` Sergey Senozhatsky
2018-01-23 16:13 ` Steven Rostedt
2018-01-23 17:21 ` Tejun Heo
2018-04-23 5:35 ` Sergey Senozhatsky
2018-01-23 16:01 ` Sergey Senozhatsky
2018-01-23 16:24 ` Steven Rostedt
2018-01-24 2:11 ` Sergey Senozhatsky
2018-01-24 2:52 ` Steven Rostedt
2018-01-24 4:44 ` Sergey Senozhatsky
2018-01-23 17:22 ` Tejun Heo
2018-01-20 12:19 ` Tejun Heo
2018-01-20 14:51 ` Steven Rostedt
2018-01-17 20:05 ` Tejun Heo
2018-01-18 5:43 ` Sergey Senozhatsky
2018-01-18 11:51 ` Petr Mladek
2018-01-18 5:42 ` Sergey Senozhatsky
2018-01-12 3:12 ` Sergey Senozhatsky
2018-01-12 2:56 ` Sergey Senozhatsky
2018-01-12 3:21 ` Steven Rostedt
2018-01-12 10:05 ` Sergey Senozhatsky
2018-01-12 12:21 ` Steven Rostedt
2018-01-12 12:55 ` Petr Mladek
2018-01-13 7:31 ` Sergey Senozhatsky
2018-01-15 8:51 ` Petr Mladek
2018-01-15 9:48 ` Sergey Senozhatsky
2018-01-16 5:16 ` Sergey Senozhatsky
2018-01-16 9:08 ` Petr Mladek
2018-01-15 12:08 ` Steven Rostedt
2018-01-16 4:51 ` Sergey Senozhatsky
2018-01-13 7:28 ` Sergey Senozhatsky
2018-01-15 10:17 ` Petr Mladek
2018-01-15 11:50 ` Petr Mladek
2018-01-16 6:10 ` Sergey Senozhatsky
2018-01-16 9:36 ` Petr Mladek
2018-01-16 10:10 ` Sergey Senozhatsky
2018-01-16 16:06 ` Steven Rostedt
2018-01-16 5:23 ` Sergey Senozhatsky
2018-01-15 12:06 ` Steven Rostedt
2018-01-15 14:45 ` Petr Mladek
2018-01-16 2:23 ` Sergey Senozhatsky
2018-01-16 4:47 ` Sergey Senozhatsky
2018-01-16 10:19 ` Petr Mladek
2018-01-17 2:24 ` Sergey Senozhatsky
2018-01-16 15:45 ` Steven Rostedt
2018-01-17 2:18 ` Sergey Senozhatsky
2018-01-17 13:04 ` Petr Mladek
2018-01-17 15:24 ` Steven Rostedt
2018-01-18 4:31 ` Sergey Senozhatsky
2018-01-18 15:22 ` Steven Rostedt
2018-01-16 10:13 ` Petr Mladek
2018-01-17 6:29 ` Sergey Senozhatsky
2018-01-16 1:46 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180121160441.7ea4b6d9@gandalf.local.home \
--to=rostedt@goodmis.org \
--cc=akpm@linux-foundation.org \
--cc=byungchul.park@lge.com \
--cc=dave.hansen@intel.com \
--cc=hannes@cmpxchg.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=pavel@ucw.cz \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@rostedt.homelinux.com \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).