From: Steven Rostedt <rostedt@goodmis.org>
To: Tejun Heo <tj@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
Cong Wang <xiyou.wangcong@gmail.com>,
Dave Hansen <dave.hansen@intel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
Vlastimil Babka <vbabka@suse.cz>,
Peter Zijlstra <peterz@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jan Kara <jack@suse.cz>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
rostedt@home.goodmis.org, Byungchul Park <byungchul.park@lge.com>,
Pavel Machek <pavel@ucw.cz>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Date: Wed, 17 Jan 2018 12:12:51 -0500
Message-ID: <20180117121251.7283a56e@gandalf.local.home>
In-Reply-To: <20180117151509.GT3460072@devbig577.frc2.facebook.com>
On Wed, 17 Jan 2018 07:15:09 -0800
Tejun Heo <tj@kernel.org> wrote:
> It's great that Steven's patches solve a good number of problems. It
> is also true that there's a class of problems that it doesn't solve,
> which other approaches do. The productive thing to do here is trying
> to solve the unsolved one too, especially given that it doesn't seem
> too difficult to do so on top of what's proposed.
OK, let's talk about the other problems, as this is no longer related
to my patch.
From your previous email:
> 1. Console is IPMI emulated serial console. Super slow. Also
> netconsole is in use.
> 2. System runs out of memory, OOM triggers.
> 3. OOM handler is printing out OOM debug info.
> 4. While trying to emit the messages for netconsole, the network stack
> / driver tries to allocate memory and then fail, which in turn
> triggers allocation failure or other warning messages. printk was
> already flushing, so the messages are queued on the ring.
> 5. OOM handler keeps flushing but 4 repeats and the queue is never
> shrinking. Because OOM handler is trapped in printk flushing, it
> never manages to free memory and no one else can enter OOM path
> either, so the system is trapped in this state.
From what I gathered, you said an OOM would trigger, and then the
network console would not be able to allocate memory and it would
trigger a printk too, and cause an infinite amount of printks.
This could very well be a great place to force offloading. If a printk
is called from within a printk, at the same context (normal, softirq,
irq or NMI), then we should trigger the offloading.
My ftrace ring buffer has a context-level recursion check. We could use
that, and even tie it into my previous patch, with something like this
(not compile tested or anything, and kick_offload_thread() would need
to be implemented):
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9cb943c90d98..b80b23a0ca13 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2261,6 +2261,63 @@ static int have_callable_console(void)
return 0;
}
+/*
+ * Identifies which context the printk is in.
+ * NMI = 0
+ * IRQ = 1
+ * SOFTIRQ = 2
+ * NORMAL = 3
+ *
+ * Stack ordered, where the lower number can preempt
+ * the higher number: mask &= mask - 1, will only clear
+ * the lowest set bit.
+ */
+enum {
+ CTX_NMI,
+ CTX_IRQ,
+ CTX_SOFTIRQ,
+ CTX_NORMAL,
+};
+
+static DEFINE_PER_CPU(int, recursion_bits);
+
+static bool recursion_check_start(void)
+{
+ unsigned long pc = preempt_count();
+ int val = this_cpu_read(recursion_bits);
+ int bit;	/* context level of this call, one of CTX_* */
+
+ if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
+ bit = CTX_NORMAL;
+ else
+ bit = pc & NMI_MASK ? CTX_NMI :
+ pc & HARDIRQ_MASK ? CTX_IRQ : CTX_SOFTIRQ;
+
+ if (unlikely(val & (1 << bit)))
+ return true;
+
+ val |= (1 << bit);
+ this_cpu_write(recursion_bits, val);
+ return false;
+}
+
+static void recursion_check_finish(bool offload)
+{
+ int val = this_cpu_read(recursion_bits);
+
+ if (offload)
+ return;
+
+ val &= val - 1;
+ this_cpu_write(recursion_bits, val);
+}
+
+static void kick_offload_thread(void)
+{
+ /*
+ * Consoles are triggering printks, offload the printks
+ * to another CPU to hopefully avoid a lockup.
+ */
+}
/*
* Can we actually use the console at this time on this cpu?
@@ -2333,6 +2390,7 @@ void console_unlock(void)
for (;;) {
struct printk_log *msg;
+ bool offload;
size_t ext_len = 0;
size_t len;
@@ -2393,15 +2451,20 @@ void console_unlock(void)
* waiter waiting to take over.
*/
console_lock_spinning_enable();
+ offload = recursion_check_start();
stop_critical_timings(); /* don't trace print latency */
call_console_drivers(ext_text, ext_len, text, len);
start_critical_timings();
+ recursion_check_finish(offload);
+
if (console_lock_spinning_disable_and_check()) {
printk_safe_exit_irqrestore(flags);
return;
}
+ if (offload)
+ kick_offload_thread();
printk_safe_exit_irqrestore(flags);
-- Steve