linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Tejun Heo <tj@kernel.org>, Petr Mladek <pmladek@suse.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	rostedt@rostedt.homelinux.com,
	Byungchul Park <byungchul.park@lge.com>,
	Pavel Machek <pavel@ucw.cz>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Date: Sun, 21 Jan 2018 23:15:21 +0900	[thread overview]
Message-ID: <20180121141521.GA429@tigerII.localdomain> (raw)
In-Reply-To: <20180120104931.1942483e@gandalf.local.home>

On (01/20/18 10:49), Steven Rostedt wrote:
[..]
> > printks from console_unlock()->call_console_drivers() are redirected
> > to printk_safe buffer. we need irq_work on that CPU to flush its
> > printk_safe buffer.
> 
> So is the issue that we keep triggering this irq work then? Then this
> solution does seem to be one that would work. Because after x amount of
> recursive printks (printk called by printk) it would just stop printing
> them, and end the irq work.
> 
> Perhaps what Tejun is seeing is:
> 
>  printk()
>    net_console()
>      printk() --> redirected to irq work
> 
>  <irq work>
>   printk
>     net_console()
>       printk() --> redirected to another irq work
> 
> and so on and so on.

it's a bit trickier than that, I think.

we have printk recursion from console drivers. it's redirected to
printk_safe and we queue an IRQ work to flush the buffer

 printk
  console_unlock
   call_console_drivers
    net_console
     printk
      printk_save -> irq_work queue

now console_unlock() enables local IRQs, we have the printk_safe
flush. but printk_safe flush does not call into the console_unlock(),
it uses printk_deferred() version of printk

IRQ work

 prink_safe_flush
  printk_deferred -> irq_work queue


so we schedule another IRQ work (deferred printk work), which eventually
tries to lock console_sem

IRQ work
 wake_up_klogd_work_func()
  if (console_trylock())
   console_unlock()

if it succeeds then it goes to console_unlock(), where console driver
can cause another printk recursion. but, once again, it will be
redirected to printk_safe buffer first. if it fails then we have either
the original CPU to print out those irq_work messages, which is sort
of bad, or another CPU which already acquired the console_sem and will
print out.

> This solution would need to be tweaked to add a timer to allow only so
> many nested printks in a given time. Otherwise it too would be an issue:
[..]
> > how are we going to distinguish between lockdep splats, for instance,
> > or WARNs from call_console_drivers() -> foo_write(), which are valuable,
> > and kmalloc() print outs, which might be less valuable? are we going to
> 
> The problem is that printk causing more printks is extremely dangerous,
> and ANY printk that is caused by a printk is of equal value, whether it
> is a console driver running out of memory or a lockdep splat. And
> the chances of having two hit at the same time is extremely low.

so.... fix the console drivers ;)




just kidding. ok...
the problem is that we flush printk_safe right when console_unlock() printing
loop enables local IRQs via printk_safe_exit_irqrestore() [given that IRQs
were enabled in the first place when the CPU went to console_unlock()].
this forces that CPU to loop in console_unlock() as long as we have
printk-s coming from call_console_drivers(). but we probably can postpone
printk_safe flush. basically, we can declare a new rule - we don't flush
printk_safe buffer as long as console_sem is locked. because this is how
that printing CPU stuck in the console_unlock() printing loop. printk_safe
buffer is very important when it comes to storing a non-repetitive stuff, like
a lockdep splat, which is a single shot event. but the more repetitive the
message is, like millions of similar kmalloc() dump_stack()-s over and over
again, the less value in it. we should have printk_safe buffer big enough for
important info, like a lockdep splat, but millions of similar kmalloc()
messages are pretty invaluable - one is already enough, we can drop the rest.
and we should not flush new messages while there is a CPU looping in
console_unlock(), because it already has messages to print, which were
log_store()-ed the normal way.

this is where the "postpone thing" jumps in. so how do we postpone printk_safe
flush.

we can't console_trylock()/console_unlock() in printk_safe flush code.
but there is a `console_locked' flag and is_console_locked() function which
tell us if the console_sem is locked. as long as we are in console_unlock()
printing loop that flag is set, even if we enabled local IRQs and printk_safe
flush work arrived. so now printk_safe flush does extra check and does
not flush printk_safe buffer content as long as someone is currently
printing or soon will start printing. but we need to take extra step and
to re-queue flush on CPUs that did postpone it [console_unlock() can
reschedule]. so now we flush only when printing CPU printed all pending
logbuf messages, hit the "console_seq == log_next_seq" and up()
console_sem. this sets a boundary -- no matter how many times during the
current printing loop we called console drivers and how many times those
drivers caused printk recursion, we will flush only SAFE_LOG_BUF_LEN chars.


IOW, what we have now, looks like this:

a) printk_safe is for important stuff, we don't guarantee that a flood
   of messages will be preserved.

b) we extend the previously existing "will flush messages later on from
   a safer context" and now we also consider console_unlock() printing loop
   as unsafe context. so the unsafe context it's not only the one that can
   deadlock, but also the one that can lockup CPU in a printing loop because
   of recursive printk messages.


so this

 printk
  console_unlock
  {
   for (;;) {
     call_console_drivers
      net_console
       printk
        printk_save -> irq_work queue

	   IRQ work
	     prink_safe_flush
	       printk_deferred -> log_store()
           iret
    }
    up();
  }


   // which can never break out, because we can always append new messages
   // from prink_safe_flush.

becomes this

printk
  console_unlock
  {
   for (;;) {
     call_console_drivers
      net_console
       printk
        printk_save -> irq_work queue

    }
    up();

  IRQ work
   prink_safe_flush
    printk_deferred -> log_store()
  iret
}



something completely untested, sketchy and ugly.

---

 kernel/printk/internal.h    |  2 ++
 kernel/printk/printk.c      |  1 +
 kernel/printk/printk_safe.c | 37 +++++++++++++++++++++++++++++++++++--
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 2a7d04049af4..e85517818a49 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -30,6 +30,8 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args);
 void __printk_safe_enter(void);
 void __printk_safe_exit(void);
 
+void printk_safe_requeue_flushing(void);
+
 #define printk_safe_enter_irqsave(flags)	\
 	do {					\
 		local_irq_save(flags);		\
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9cb943c90d98..7aca23e8d7b2 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2428,6 +2428,7 @@ void console_unlock(void)
 	raw_spin_lock(&logbuf_lock);
 	retry = console_seq != log_next_seq;
 	raw_spin_unlock(&logbuf_lock);
+	printk_safe_requeue_flushing();
 	printk_safe_exit_irqrestore(flags);
 
 	if (retry && console_trylock())
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c2004bb23..45d5b292d7e1 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -22,6 +22,7 @@
 #include <linux/cpumask.h>
 #include <linux/irq_work.h>
 #include <linux/printk.h>
+#include <linux/console.h>
 
 #include "internal.h"
 
@@ -51,6 +52,7 @@ struct printk_safe_seq_buf {
 	atomic_t		message_lost;
 	struct irq_work		work;	/* IRQ work that flushes the buffer */
 	unsigned char		buffer[SAFE_LOG_BUF_LEN];
+	bool			need_requeue;
 };
 
 static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
@@ -196,6 +198,7 @@ static void __printk_safe_flush(struct irq_work *work)
 	size_t len;
 	int i;
 
+	s->need_requeue = false;
 	/*
 	 * The lock has two functions. First, one reader has to flush all
 	 * available message to make the lockless synchronization with
@@ -243,6 +246,36 @@ static void __printk_safe_flush(struct irq_work *work)
 	raw_spin_unlock_irqrestore(&read_lock, flags);
 }
 
+/* NMI buffers are always flushed */
+static void flush_nmi_buffer(struct irq_work *work)
+{
+	__printk_safe_flush(work);
+}
+
+/* printk_safe buffers flushing, on the contrary, can be postponed */
+static void flush_printk_safe_buffer(struct irq_work *work)
+{
+	struct printk_safe_seq_buf *s =
+		container_of(work, struct printk_safe_seq_buf, work);
+
+	if (is_console_locked()) {
+		s->need_requeue = true;
+		return;
+	}
+
+	__printk_safe_flush(work);
+}
+
+void printk_safe_requeue_flushing(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (per_cpu(safe_print_seq, cpu).need_requeue)
+			queue_flush_work(&per_cpu(safe_print_seq, cpu));
+	}
+}
+
 /**
  * printk_safe_flush - flush all per-cpu nmi buffers.
  *
@@ -387,11 +420,11 @@ void __init printk_safe_init(void)
 		struct printk_safe_seq_buf *s;
 
 		s = &per_cpu(safe_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
+		init_irq_work(&s->work, flush_printk_safe_buffer);
 
 #ifdef CONFIG_PRINTK_NMI
 		s = &per_cpu(nmi_print_seq, cpu);
-		init_irq_work(&s->work, __printk_safe_flush);
+		init_irq_work(&s->work, flush_nmi_buffer);
 #endif
 	}
 
---



> > lose all of them now? then we can do a much simpler thing - steal one
> > bit from `printk_context' and use if for a new PRINTK_NOOP_CONTEXT, which
> > will be set around call_console_drivers(). vprintk_func() would redirect
> > printks to vprintk_noop(fmt, args), which will do nothing.
> 
> Not sure what you mean here. Have some pseudo code to demonstrate with?

sure, I meant that if we want to disable printk recursion from
call_console_drivers(), then we can add another printk_safe section, say
printk_noop_begin()/printk_noop_end(), which would set a PRINTK_NOOP
bit of `printk_context', so when we have printk() under PRINTK_NOOP
then vprintk_func() goes to a special vprintk_noop(fmt, args), which
simply drops the message [does not store any in the per-cpu printk
safe buffer, so we don't flush it and don't add new messages to the
logbuf]. and we annotate call_console_drivers() as a pintk_noop
function. but that a no-brainer and I'd prefer to have another solution.

	-ss

  reply	other threads:[~2018-01-21 14:15 UTC|newest]

Thread overview: 140+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-10 13:24 [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Petr Mladek
2018-01-10 13:24 ` [PATCH v5 1/2] printk: Add console owner and waiter logic to load balance console writes Petr Mladek
2018-01-10 16:50   ` Steven Rostedt
2018-01-12 16:54   ` Steven Rostedt
2018-01-12 17:11     ` Steven Rostedt
2018-01-17 19:13       ` Rasmus Villemoes
2018-01-17 19:33         ` Steven Rostedt
2018-01-19  9:51         ` Sergey Senozhatsky
2018-01-18 22:03     ` Pavel Machek
2018-01-19  0:20       ` Steven Rostedt
2018-01-17  2:19   ` Byungchul Park
2018-01-17  4:54     ` Byungchul Park
2018-01-17  7:34     ` Byungchul Park
2018-01-17 12:04     ` Petr Mladek
2018-01-18  1:53       ` Byungchul Park
2018-01-18  1:57         ` Byungchul Park
2018-01-18  2:19         ` Steven Rostedt
2018-01-18  4:01           ` Byungchul Park
2018-01-18 15:21             ` Steven Rostedt
2018-01-19  2:37               ` Byungchul Park
2018-01-19  3:27                 ` Steven Rostedt
2018-01-22  2:31                   ` Byungchul Park
2018-01-10 13:24 ` [PATCH v5 2/2] printk: Hide console waiter logic into helpers Petr Mladek
2018-01-10 17:52   ` Steven Rostedt
2018-01-11 12:03     ` Petr Mladek
2018-01-12 15:37       ` Steven Rostedt
2018-01-12 16:08         ` Petr Mladek
2018-01-12 16:36           ` Steven Rostedt
2018-01-15 16:08             ` Petr Mladek
2018-01-16  5:05               ` Sergey Senozhatsky
2018-01-10 14:05 ` [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Tejun Heo
2018-01-10 16:29   ` Petr Mladek
2018-01-10 17:02     ` Tejun Heo
2018-01-10 18:21       ` Peter Zijlstra
2018-01-10 18:30         ` Tejun Heo
2018-01-10 18:41           ` Peter Zijlstra
2018-01-10 19:05             ` Tejun Heo
2018-01-11  5:15         ` Sergey Senozhatsky
2018-01-10 18:22       ` Steven Rostedt
2018-01-10 18:36         ` Tejun Heo
2018-01-10 18:40       ` Mathieu Desnoyers
2018-01-11  7:36         ` Sergey Senozhatsky
2018-01-11 11:24           ` Petr Mladek
2018-01-11 13:19             ` Sergey Senozhatsky
2018-01-24  9:36       ` Peter Zijlstra
2018-01-24 18:46         ` Tejun Heo
2018-05-09  8:58       ` Sergey Senozhatsky
2018-01-10 18:54     ` Steven Rostedt
2018-01-11  5:10     ` Sergey Senozhatsky
2018-01-10 18:05   ` Steven Rostedt
2018-01-10 18:12     ` Tejun Heo
2018-01-10 18:14       ` Tejun Heo
2018-01-10 18:45         ` Steven Rostedt
2018-01-10 18:41       ` Steven Rostedt
2018-01-10 18:57         ` Tejun Heo
2018-01-10 19:17           ` Steven Rostedt
2018-01-10 19:34             ` Tejun Heo
2018-01-10 19:44               ` Steven Rostedt
2018-01-10 22:44                 ` Tejun Heo
2018-01-11  5:35             ` Sergey Senozhatsky
2018-01-11  4:58     ` Sergey Senozhatsky
2018-01-11  9:34       ` Petr Mladek
2018-01-11 10:38         ` Sergey Senozhatsky
2018-01-11 11:50           ` Petr Mladek
2018-01-11 16:29           ` Steven Rostedt
2018-01-12  1:30             ` Steven Rostedt
2018-01-12  2:55               ` Steven Rostedt
2018-01-12  4:20                 ` Steven Rostedt
2018-01-16 19:44                 ` Tejun Heo
2018-01-17  9:12                   ` Petr Mladek
2018-01-17 15:15                     ` Tejun Heo
2018-01-17 17:12                       ` Steven Rostedt
2018-01-17 18:42                         ` Steven Rostedt
2018-01-19 18:20                           ` Steven Rostedt
2018-01-20  7:14                             ` Sergey Senozhatsky
2018-01-20 15:49                               ` Steven Rostedt
2018-01-21 14:15                                 ` Sergey Senozhatsky [this message]
2018-01-21 21:04                                   ` Steven Rostedt
2018-01-22  8:56                                     ` Sergey Senozhatsky
2018-01-22 10:28                                       ` Sergey Senozhatsky
2018-01-22 10:36                                         ` Sergey Senozhatsky
2018-01-23  6:40                                   ` Sergey Senozhatsky
2018-01-23  7:05                                     ` Sergey Senozhatsky
2018-01-23  7:31                                     ` Sergey Senozhatsky
2018-01-23 14:56                                     ` Steven Rostedt
2018-01-23 15:21                                       ` Sergey Senozhatsky
2018-01-23 15:41                                         ` Steven Rostedt
2018-01-23 15:43                                           ` Tejun Heo
2018-01-23 16:12                                             ` Sergey Senozhatsky
2018-01-23 16:13                                             ` Steven Rostedt
2018-01-23 17:21                                               ` Tejun Heo
2018-04-23  5:35                                             ` Sergey Senozhatsky
2018-01-23 16:01                                           ` Sergey Senozhatsky
2018-01-23 16:24                                             ` Steven Rostedt
2018-01-24  2:11                                               ` Sergey Senozhatsky
2018-01-24  2:52                                                 ` Steven Rostedt
2018-01-24  4:44                                                   ` Sergey Senozhatsky
2018-01-23 17:22                                             ` Tejun Heo
2018-01-20 12:19                             ` Tejun Heo
2018-01-20 14:51                               ` Steven Rostedt
2018-01-17 20:05                         ` Tejun Heo
2018-01-18  5:43                           ` Sergey Senozhatsky
2018-01-18 11:51                           ` Petr Mladek
2018-01-18  5:42                         ` Sergey Senozhatsky
2018-01-12  3:12               ` Sergey Senozhatsky
2018-01-12  2:56             ` Sergey Senozhatsky
2018-01-12  3:21               ` Steven Rostedt
2018-01-12 10:05                 ` Sergey Senozhatsky
2018-01-12 12:21                   ` Steven Rostedt
2018-01-12 12:55                     ` Petr Mladek
2018-01-13  7:31                       ` Sergey Senozhatsky
2018-01-15  8:51                         ` Petr Mladek
2018-01-15  9:48                           ` Sergey Senozhatsky
2018-01-16  5:16                           ` Sergey Senozhatsky
2018-01-16  9:08                             ` Petr Mladek
2018-01-15 12:08                       ` Steven Rostedt
2018-01-16  4:51                         ` Sergey Senozhatsky
2018-01-13  7:28                     ` Sergey Senozhatsky
2018-01-15 10:17                       ` Petr Mladek
2018-01-15 11:50                         ` Petr Mladek
2018-01-16  6:10                           ` Sergey Senozhatsky
2018-01-16  9:36                             ` Petr Mladek
2018-01-16 10:10                               ` Sergey Senozhatsky
2018-01-16 16:06                             ` Steven Rostedt
2018-01-16  5:23                         ` Sergey Senozhatsky
2018-01-15 12:06                       ` Steven Rostedt
2018-01-15 14:45                         ` Petr Mladek
2018-01-16  2:23                           ` Sergey Senozhatsky
2018-01-16  4:47                             ` Sergey Senozhatsky
2018-01-16 10:19                               ` Petr Mladek
2018-01-17  2:24                                 ` Sergey Senozhatsky
2018-01-16 15:45                               ` Steven Rostedt
2018-01-17  2:18                                 ` Sergey Senozhatsky
2018-01-17 13:04                                   ` Petr Mladek
2018-01-17 15:24                                     ` Steven Rostedt
2018-01-18  4:31                                     ` Sergey Senozhatsky
2018-01-18 15:22                                       ` Steven Rostedt
2018-01-16 10:13                             ` Petr Mladek
2018-01-17  6:29                               ` Sergey Senozhatsky
2018-01-16  1:46                         ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180121141521.GA429@tigerII.localdomain \
    --to=sergey.senozhatsky@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=byungchul.park@lge.com \
    --cc=dave.hansen@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=pavel@ucw.cz \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=rostedt@rostedt.homelinux.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).