From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: syzbot <syzbot+77e6b28a7a7106ad0def@syzkaller.appspotmail.com>,
	hannes@cmpxchg.org, akpm@linux-foundation.org, guro@fb.com,
	kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, rientjes@google.com,
	syzkaller-bugs@googlegroups.com, yang.s@alibaba-inc.com,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Petr Mladek <pmladek@suse.com>
Subject: Re: INFO: rcu detected stall in shmem_fault
Date: Wed, 10 Oct 2018 13:35:00 +0200
Message-ID: <20181010113500.GH5873@dhcp22.suse.cz>
In-Reply-To: <e72f799e-0634-f958-1af0-291f8577f4e8@i-love.sakura.ne.jp>

On Wed 10-10-18 19:43:38, Tetsuo Handa wrote:
> On 2018/10/10 17:59, Michal Hocko wrote:
> > On Wed 10-10-18 09:12:45, Tetsuo Handa wrote:
> >> syzbot is hitting RCU stall due to memcg-OOM event.
> >> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
> > 
> > This is really interesting. If we do not have any eligible oom victim we
> > simply force the charge (allow to proceed and go over the hard limit)
> > and break the isolation. That means that the caller gets back to running
> > and release all locks taken on the way.
> 
> What happens if the caller continued trying to allocate more memory
> because the caller cannot be noticed by SIGKILL from the OOM killer?

It could eventually trigger the global OOM.

> >                                         I am wondering how come we are
> > seeing the RCU stall. Who is holding the rcu lock? Certainly not the
> > charge path and neither should the caller because you have to be in a
> > sleepable context to trigger the OOM killer. So there must be something
> > more going on.
> 
> Just flooding out of memory messages can trigger RCU stall problems.
> For example, a severe skbuff_head_cache or kmalloc-512 leak bug is causing

[...]

Quite a few of them, indeed! I guess we want to rate limit the output.
What about the following?

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa5360616..4ee393c85e27 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -430,6 +430,9 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 
 static void dump_header(struct oom_control *oc, struct task_struct *p)
 {
+	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+
 	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
 		current->comm, oc->gfp_mask, &oc->gfp_mask,
 		nodemask_pr_args(oc->nodemask), oc->order,
@@ -437,6 +440,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
 		pr_warn("COMPACTION is disabled!!!\n");
 
+	if (!__ratelimit(&oom_rs))
+		return;
+
 	cpuset_print_current_mems_allowed();
 	dump_stack();
 	if (is_memcg_oom(oc))
@@ -931,8 +937,6 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	struct task_struct *t;
 	struct mem_cgroup *oom_group;
 	unsigned int victim_points = 0;
-	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
-					      DEFAULT_RATELIMIT_BURST);
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
@@ -949,8 +953,7 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	}
 	task_unlock(p);
 
-	if (__ratelimit(&oom_rs))
-		dump_header(oc, p);
+	dump_header(oc, p);
 
 	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
 		message, task_pid_nr(p), p->comm, points);
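
To illustrate the interval/burst behaviour the patch relies on, here is a
minimal userspace sketch. It is not the kernel's __ratelimit()
implementation: struct rl_state, rl_allow() and rl_count_allowed() are
hypothetical names, and time is passed in as a plain tick counter instead
of jiffies. Note that with the diff above the first pr_warn() line in
dump_header() is still printed on every call; only the part after the
__ratelimit() check is throttled.

```c
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct ratelimit_state. */
struct rl_state {
	long interval;	/* window length, in arbitrary ticks */
	int burst;	/* events allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* events allowed so far in the current window */
	int missed;	/* events suppressed so far in the current window */
};

/* Return true if an event at tick `now` may be reported. */
static bool rl_allow(struct rl_state *rs, long now)
{
	/* Open a new window once the interval has elapsed. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Count how many of `n` back-to-back events at tick `now` are allowed. */
static int rl_count_allowed(struct rl_state *rs, long now, int n)
{
	int allowed = 0;

	for (int i = 0; i < n; i++)
		if (rl_allow(rs, now))
			allowed++;

	return allowed;
}
```

With interval = 5 and burst = 10, a flood of 100 events in one window gets
only 10 through; the rest are counted as missed until the next window
opens.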
 
> >> What should we do if memcg-OOM found no killable task because the allocating task
> >> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires 
> >> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
> >> OOM header when no eligible victim left") because syzbot was terminating the test
> >> upon WARN(1) removed by that commit) is not a good behavior.
> > 
> > We definitely want to inform about ineligible oom victim. We might
> > consider some rate limiting for the memcg state but that is a valuable
> > information to see under normal situation (when you do not have floods
> > of these situations).
> > 
> 
> But if the caller cannot be noticed by SIGKILL from the OOM killer,
> allowing the caller to trigger the OOM killer again and again (until
> global OOM killer triggers) is bad.

There is simply no other option. Well, except for failing the charge,
which has been considered and rejected because it could trigger
unexpected error paths, and because breaking the isolation in the rare
cases of misconfiguration is acceptable. We can reconsider that
but you should bring really good arguments on the table. I was very
successful doing that.
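
The two options can be contrasted with a toy userspace model. This is only
a sketch under stated assumptions, not the memcg charge path: struct
toy_cgroup, toy_charge() and the policy enum are hypothetical, and -EAGAIN
merely stands in for "a victim would be killed and the charge retried".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model of a hard-limited counter; not kernel code. */
struct toy_cgroup {
	long usage;			/* pages currently charged */
	long hard_limit;		/* configured hard limit, in pages */
	bool has_eligible_victim;	/* false if e.g. every task is unkillable */
};

/* What to do when over the limit and there is nothing to kill. */
enum no_victim_policy { FORCE_CHARGE, FAIL_CHARGE };

static int toy_charge(struct toy_cgroup *cg, long pages,
		      enum no_victim_policy policy)
{
	/* Fast path: the charge fits under the hard limit. */
	if (cg->usage + pages <= cg->hard_limit) {
		cg->usage += pages;
		return 0;
	}

	/* Stand-in for "kill a victim, then retry the charge". */
	if (cg->has_eligible_victim)
		return -EAGAIN;

	/* No victim: either break isolation or hand the caller an error. */
	if (policy == FORCE_CHARGE) {
		cg->usage += pages;	/* goes over the hard limit */
		return 0;
	}

	return -ENOMEM;
}
```

With FORCE_CHARGE the caller keeps running and usage exceeds hard_limit;
with FAIL_CHARGE the caller sees -ENOMEM and has to cope with it in
whatever error path it has.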

-- 
Michal Hocko
SUSE Labs
