From: Linus Torvalds <torvalds@linux-foundation.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Michal Hocko <mhocko@kernel.org>,
David Rientjes <rientjes@google.com>,
Oleg Nesterov <oleg@redhat.com>, Kyle Walker <kwalker@redhat.com>,
Christoph Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov@parallels.com>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Stanislav Kozina <skozina@redhat.com>
Subject: Re: Silent hang up caused by pages being not scanned?
Date: Mon, 12 Oct 2015 14:23:45 -0700 [thread overview]
Message-ID: <CA+55aFwapaED7JV6zm-NVkP-jKie+eQ1vDXWrKD=SkbshZSgmw@mail.gmail.com> (raw)
In-Reply-To: <201510130025.EJF21331.FFOQJtVOMLFHSO@I-love.SAKURA.ne.jp>
On Mon, Oct 12, 2015 at 8:25 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> I examined this hang up using additional debug printk() patch. And it was
> observed that when this silent hang up occurs, zone_reclaimable() called from
> shrink_zones() called from a __GFP_FS memory allocation request is returning
> true forever. Since the __GFP_FS memory allocation request can never call
> out_of_memory() due to did_some_progree > 0, the system will silently hang up
> with 100% CPU usage.
I wouldn't blame the zones_reclaimable() logic itself, but yeah, that looks bad.
So the do_try_to_free_pages() logic that does that
/* Any of the zones still reclaimable? Don't OOM. */
if (zones_reclaimable)
return 1;
is rather dubious. The history of that odd line is pretty dubious too:
it used to be that we would return success if "shrink_zones()"
succeeded or if "nr_reclaimed" was non-zero, but that "shrink_zones()"
logic got rewritten, and I don't think the current situation is all
that sane.
And returning 1 there is actively misleading to callers, since it
makes them think that it made progress.
So I think you should look at what happens if you just remove that
illogical and misleading return value.
HOWEVER.
I think that it's very true that we have then tuned all our *other*
heuristics for taking this thing into account, so I suspect that we'll
find that we'll need to tweak other places. But this crazy "let's say
that we made progress even when we didn't" thing looks just wrong.
In particular, I think that you'll find that you will have to change
the heuristics in __alloc_pages_slowpath() where we currently do
if ((did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER) || ..
when the "did_some_progress" logic changes that radically.
Because while the current return value looks insane, all the other
testing and tweaking has been done with that very odd return value in
place.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-10-12 21:23 UTC|newest]
Thread overview: 110+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-17 17:59 [PATCH] mm/oom_kill.c: don't kill TASK_UNINTERRUPTIBLE tasks Kyle Walker
2015-09-17 19:22 ` Oleg Nesterov
2015-09-18 15:41 ` Christoph Lameter
2015-09-18 16:24 ` Oleg Nesterov
2015-09-18 16:39 ` Tetsuo Handa
2015-09-18 16:54 ` Oleg Nesterov
2015-09-18 17:00 ` Christoph Lameter
2015-09-18 19:07 ` Oleg Nesterov
2015-09-18 19:19 ` Christoph Lameter
2015-09-18 21:28 ` Kyle Walker
2015-09-18 22:07 ` Christoph Lameter
2015-09-19 8:32 ` Michal Hocko
2015-09-19 14:33 ` Tetsuo Handa
2015-09-19 15:51 ` Michal Hocko
2015-09-21 23:33 ` David Rientjes
2015-09-22 5:33 ` Tetsuo Handa
2015-09-22 23:32 ` David Rientjes
2015-09-23 12:03 ` Kyle Walker
2015-09-24 11:50 ` Tetsuo Handa
2015-09-19 14:44 ` Oleg Nesterov
2015-09-21 23:27 ` David Rientjes
2015-09-19 8:25 ` Michal Hocko
2015-09-19 8:22 ` Michal Hocko
2015-09-21 23:08 ` David Rientjes
2015-09-19 15:03 ` can't oom-kill zap the victim's memory? Oleg Nesterov
2015-09-19 15:10 ` Oleg Nesterov
2015-09-19 15:58 ` Michal Hocko
2015-09-20 13:16 ` Oleg Nesterov
2015-09-19 22:24 ` Linus Torvalds
2015-09-19 22:54 ` Raymond Jennings
2015-09-19 23:00 ` Raymond Jennings
2015-09-19 23:13 ` Linus Torvalds
2015-09-20 9:33 ` Michal Hocko
2015-09-20 13:06 ` Oleg Nesterov
2015-09-20 12:56 ` Oleg Nesterov
2015-09-20 18:05 ` Linus Torvalds
2015-09-20 18:21 ` Raymond Jennings
2015-09-20 18:23 ` Raymond Jennings
2015-09-20 19:07 ` Raymond Jennings
2015-09-21 13:57 ` Oleg Nesterov
2015-09-21 13:44 ` Oleg Nesterov
2015-09-21 14:24 ` Michal Hocko
2015-09-21 15:32 ` Oleg Nesterov
2015-09-21 16:12 ` Michal Hocko
2015-09-22 16:06 ` Oleg Nesterov
2015-09-22 23:04 ` David Rientjes
2015-09-23 20:59 ` Michal Hocko
2015-09-24 21:15 ` David Rientjes
2015-09-25 9:35 ` Michal Hocko
2015-09-25 16:14 ` Tetsuo Handa
2015-09-28 16:18 ` Tetsuo Handa
2015-09-28 22:28 ` David Rientjes
2015-10-02 12:36 ` Michal Hocko
2015-10-02 19:01 ` Linus Torvalds
2015-10-05 14:44 ` Michal Hocko
2015-10-07 5:16 ` Vlastimil Babka
2015-10-07 10:43 ` Tetsuo Handa
2015-10-08 9:40 ` Vlastimil Babka
2015-10-06 7:55 ` Eric W. Biederman
2015-10-06 8:49 ` Linus Torvalds
2015-10-06 8:55 ` Linus Torvalds
2015-10-06 14:52 ` Eric W. Biederman
2015-10-03 6:02 ` Can't we use timeout based OOM warning/killing? Tetsuo Handa
2015-10-06 14:51 ` Tetsuo Handa
2015-10-12 6:43 ` Tetsuo Handa
2015-10-12 15:25 ` Silent hang up caused by pages being not scanned? Tetsuo Handa
2015-10-12 21:23 ` Linus Torvalds [this message]
2015-10-13 12:21 ` Tetsuo Handa
2015-10-13 16:37 ` Linus Torvalds
2015-10-14 12:21 ` Tetsuo Handa
2015-10-15 13:14 ` Michal Hocko
2015-10-16 15:57 ` Michal Hocko
2015-10-16 18:34 ` Linus Torvalds
2015-10-16 18:49 ` Tetsuo Handa
2015-10-19 12:57 ` Michal Hocko
2015-10-19 12:53 ` Michal Hocko
2015-10-13 13:32 ` Michal Hocko
2015-10-13 16:19 ` Tetsuo Handa
2015-10-14 13:22 ` Michal Hocko
2015-10-14 14:38 ` Tetsuo Handa
2015-10-14 14:59 ` Michal Hocko
2015-10-14 15:06 ` Tetsuo Handa
2015-10-26 11:44 ` Newbie's question: memory allocation when reclaiming memory Tetsuo Handa
2015-11-05 8:46 ` Vlastimil Babka
2015-10-06 15:25 ` Can't we use timeout based OOM warning/killing? Linus Torvalds
2015-10-08 15:33 ` Tetsuo Handa
2015-10-10 12:50 ` Tetsuo Handa
2015-09-28 22:24 ` can't oom-kill zap the victim's memory? David Rientjes
2015-09-29 7:57 ` Tetsuo Handa
2015-09-29 22:56 ` David Rientjes
2015-09-30 4:25 ` Tetsuo Handa
2015-09-30 10:21 ` Tetsuo Handa
2015-09-30 21:11 ` David Rientjes
2015-10-01 12:13 ` Tetsuo Handa
2015-10-01 14:48 ` Michal Hocko
2015-10-02 13:06 ` Tetsuo Handa
2015-10-06 18:45 ` Oleg Nesterov
2015-10-07 11:03 ` Tetsuo Handa
2015-10-07 12:00 ` Oleg Nesterov
2015-10-08 14:04 ` Michal Hocko
2015-10-08 14:01 ` Michal Hocko
2015-09-21 16:51 ` Tetsuo Handa
2015-09-22 12:43 ` Oleg Nesterov
2015-09-22 14:30 ` Tetsuo Handa
2015-09-22 14:45 ` Oleg Nesterov
2015-09-21 23:42 ` David Rientjes
2015-09-21 16:55 ` Linus Torvalds
2015-09-20 14:50 ` Tetsuo Handa
2015-09-20 14:55 ` Oleg Nesterov
2015-10-14 8:03 Silent hang up caused by pages being not scanned? Hillf Danton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CA+55aFwapaED7JV6zm-NVkP-jKie+eQ1vDXWrKD=SkbshZSgmw@mail.gmail.com' \
--to=torvalds@linux-foundation.org \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=hannes@cmpxchg.org \
--cc=kwalker@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=oleg@redhat.com \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=rientjes@google.com \
--cc=skozina@redhat.com \
--cc=vdavydov@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).