* [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task.
@ 2018-10-17 10:06 Tetsuo Handa
From: Tetsuo Handa @ 2018-10-17 10:06 UTC
  To: Michal Hocko
  Cc: Johannes Weiner, linux-mm, syzkaller-bugs, guro, kirill.shutemov,
	linux-kernel, rientjes, yang.s, Andrew Morton,
	Sergey Senozhatsky, Petr Mladek, Sergey Senozhatsky,
	Steven Rostedt, Tetsuo Handa, Michal Hocko, syzbot

syzbot is hitting an RCU stall at shmem_fault() [1].
This is because memcg-OOM events with no eligible task (the current
thread is marked as OOM-unkillable) keep calling dump_header() from
out_of_memory(), a path enabled by commit 3100dab2aa09dc6e ("mm:
memcontrol: print proper OOM header when no eligible victim left.").

Michal proposed ratelimiting dump_header() [2]. But I don't think that
patch is appropriate, because it does not ratelimit the

  "%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n"
  "Out of memory and no killable processes...\n"

messages, which can be printed every few milliseconds (i.e. effectively
a denial of service for console users) until the OOM situation is solved.

Let's make sure that the next dump_header() waits for at least 60 seconds
after the previous "Out of memory and no killable processes..." message.
Michal thinks that any interval is meaningless without knowing the
printk() throughput. But since printk() is synchronous unless handed over
to somebody else by commit dbdda842fe96f893 ("printk: Add console owner
and waiter logic to load balance console writes"), it is likely that all
OOM messages from this out_of_memory() request are already flushed to
consoles
when pr_warn("Out of memory and no killable processes...\n") returns.
Thus, console users will be able to do what they need to do.
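
For illustration only, the 60 second window described above can be
modelled in userspace. This is a minimal sketch, not the kernel code:
HZ is assumed to be 100, in_range() is a stand-in for the kernel's
time_in_range(), report_no_victim() and count_reports() are hypothetical
names, and the is_sysrq_oom()/is_memcg_oom() condition is left out.

```c
#include <stdbool.h>

#define HZ 100UL /* assumed tick rate; the real value is a kernel config choice */

/* Userspace stand-in for the kernel's wrap-safe time_in_range(a, b, c). */
static bool in_range(unsigned long a, unsigned long b, unsigned long c)
{
	return (long)(a - b) >= 0 && (long)(c - a) >= 0;
}

/*
 * Hypothetical helper: returns true when the report would be printed,
 * false when it falls inside the 60 second window and is skipped.
 */
static bool report_no_victim(unsigned long *last_warned, unsigned long now)
{
	if (in_range(now, *last_warned, *last_warned + 60 * HZ))
		return false;
	/* dump_header() and pr_warn() would run here. */
	*last_warned = now;
	return true;
}

/* Hypothetical driver: one OOM event every 'step' ticks, 'events' times. */
static unsigned int count_reports(unsigned long start, unsigned long step,
				  unsigned int events)
{
	unsigned long last_warned = 0; /* like a zero-initialized static */
	unsigned int printed = 0;

	for (unsigned int i = 0; i < events; i++)
		if (report_no_victim(&last_warned, start + i * step))
			printed++;
	return printed;
}
```

Under these assumptions, count_reports(100000, 1, 12001) models one event
per tick for two minutes and yields only two printed reports, since every
event within 60 * HZ ticks of the last printed one is skipped.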

To summarize, this patch allows threads in the requested memcg to complete
memory allocation requests needed for recovery, and also allows
administrators to perform recovery manually from the console if an
OOM-unkillable thread is failing to solve the OOM situation automatically.

[1] https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
[2] https://lkml.kernel.org/r/20181010151135.25766-1-mhocko@kernel.org

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+77e6b28a7a7106ad0def@syzkaller.appspotmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa53..9056f9b 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1106,6 +1106,11 @@ bool out_of_memory(struct oom_control *oc)
 	select_bad_process(oc);
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
+		static unsigned long last_warned;
+
+		if ((is_sysrq_oom(oc) || is_memcg_oom(oc)) &&
+		    time_in_range(jiffies, last_warned, last_warned + 60 * HZ))
+			return false;
 		dump_header(oc, NULL);
 		pr_warn("Out of memory and no killable processes...\n");
 		/*
@@ -1115,6 +1120,7 @@ bool out_of_memory(struct oom_control *oc)
 		 */
 		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
 			panic("System is deadlocked on memory\n");
+		last_warned = jiffies;
 	}
 	if (oc->chosen && oc->chosen != (void *)-1UL)
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
-- 
1.8.3.1
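
A note on the time_in_range() check used in the hunk above: the jiffies
counter is an unsigned long that wraps around, so the kernel's time
comparison macros compare signed differences instead of raw values. The
userspace sketch below contrasts the two approaches. Both function names
are hypothetical, and the cast to long assumes the usual two's complement
behaviour of mainstream compilers.

```c
#include <stdbool.h>
#include <limits.h>

/* Naive check: gives wrong answers once start + len wraps past ULONG_MAX. */
static bool naive_in_window(unsigned long now, unsigned long start,
			    unsigned long len)
{
	return now >= start && now <= start + len;
}

/* Wrap-safe check: signed differences stay small even across the wrap. */
static bool wrapsafe_in_window(unsigned long now, unsigned long start,
			       unsigned long len)
{
	return (long)(now - start) >= 0 && (long)(start + len - now) >= 0;
}
```

With start = ULONG_MAX - 10, len = 6000 and now = start + 20 (which wraps
to 9), the naive check reports that now is outside the window, while the
wrap-safe check correctly reports that it is inside.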

Thread overview: 26+ messages
2018-10-17 10:06 [PATCH v3] mm: memcontrol: Don't flood OOM messages with no eligible task Tetsuo Handa
2018-10-17 10:28 ` Michal Hocko
2018-10-17 11:17   ` Sergey Senozhatsky
2018-10-17 11:29     ` Michal Hocko
2018-10-18  2:46     ` Tetsuo Handa
2018-10-18  2:46       ` Tetsuo Handa
2018-10-18  4:27       ` Sergey Senozhatsky
2018-10-18  5:26         ` Tetsuo Handa
2018-10-18  5:26           ` Tetsuo Handa
2018-10-18  6:10           ` Sergey Senozhatsky
2018-10-18  7:56             ` Michal Hocko
2018-10-18  8:13               ` Sergey Senozhatsky
2018-10-18 11:58                 ` Tetsuo Handa
2018-10-18 23:54                   ` Sergey Senozhatsky
2018-10-19 10:35                     ` Tetsuo Handa
2018-10-19 10:35                       ` Tetsuo Handa
2018-10-23  0:47                       ` Sergey Senozhatsky
2018-10-23  8:37                       ` Petr Mladek
2018-10-23  8:54                         ` Michal Hocko
2018-10-18 14:30         ` Petr Mladek
2018-10-19  0:18           ` Tetsuo Handa
2018-10-23  8:21             ` Petr Mladek
2018-10-23 10:23               ` Tetsuo Handa
2018-10-18  6:55       ` Michal Hocko
2018-10-18 10:37         ` Tetsuo Handa
2018-10-18 11:23           ` Michal Hocko