From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com,
	vdavydov@parallels.com, mgorman@techsingularity.net,
	hughd@google.com, riel@redhat.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: mm, oom_reaper: How to handle race with oom_killer_disable() ?
Date: Wed, 22 Jun 2016 19:57:17 +0900
Message-ID: <201606221957.DBC18723.LOFQSMHVJOFFOt@I-love.SAKURA.ne.jp>
In-Reply-To: <20160622065016.GD7520@dhcp22.suse.cz>

Michal Hocko wrote:
> On Wed 22-06-16 08:40:15, Michal Hocko wrote:
> > On Wed 22-06-16 06:47:48, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Wed 22-06-16 00:32:29, Tetsuo Handa wrote:
> > > > > Michal Hocko wrote:
> > > > [...]
> > > > > > > Hmm, what about the following instead? It is more of a workaround
> > > > > > > than a full-fledged fix, but it seems much easier and shouldn't
> > > > > > > introduce new issues.
> > > > > 
> > > > > > Yes, I think that will work. But I think the patch below (marking
> > > > > > signal_struct to ignore TIF_MEMDIE instead of clearing TIF_MEMDIE from
> > > > > > task_struct) on top of current linux.git will implement the no-lockup
> > > > > > requirement. No race is possible, unlike
> > > > > > "[PATCH 10/10] mm, oom: hide mm which is shared with kthread or global init".
> > > > 
> > > > Not really. Because without the exit_oom_victim from oom_reaper you have
> > > > no guarantee that the oom_killer_disable will ever return. I have
> > > > mentioned that in the changelog. There is simply no guarantee the oom
> > > > victim will ever reach exit_mm->exit_oom_victim.
> > > 
> > > Why? Since any allocation after setting oom_killer_disabled = true will be
> > > forced to fail, nobody will be blocked on waiting for memory allocation. Thus,
> > > the TIF_MEMDIE tasks will eventually reach exit_mm->exit_oom_victim, won't it?
> > 
> > What if it gets blocked waiting for an operation which cannot make any
> > forward progress because it cannot proceed with an allocation (e.g.
> > an open coded allocation retry loop - not that uncommon when sending
> > > a bio)? I mean, if we want to guarantee forward progress, then there has
> > > to be something to clear the flag no matter what state the oom victim
> > > is in, or we give up on oom_killer_disable.

That sounds as if CONFIG_MMU=n kernels can OOM-livelock at __mmput() regardless
of oom_killer_disabled.

> 
> That being said I guess the patch to try_to_freeze_tasks after
> oom_killer_disable should be simple enough to go for now and stable
> trees and we can come up with something less hackish later. I do not
> like the fact that oom_killer_disable doesn't act as a full "barrier"
> anymore.
> 
> What do you think?

I'm OK with calling try_to_freeze_tasks(true) again for Linux 4.6 and 4.7 kernels.
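
A minimal userspace model of that ordering, as a sketch under stated
assumptions rather than kernel code: struct task, model_freeze_tasks(),
model_victim_reaped() and model_all_frozen() are hypothetical names that only
mimic the behaviour discussed here (an OOM victim is thawed so that it can
exit, and the OOM reaper may drop its victim status before it has exited).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; not the kernel's task_struct. */
struct task {
	bool frozen;
	bool oom_victim;	/* stands in for TIF_MEMDIE */
};

/*
 * Model of one freezing pass: freeze every task that is neither frozen
 * nor an OOM victim. Returns how many tasks were newly frozen.
 */
static int model_freeze_tasks(struct task *tasks, size_t n)
{
	int newly_frozen = 0;

	assert(tasks != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].oom_victim || tasks[i].frozen)
			continue;
		tasks[i].frozen = true;
		newly_frozen++;
	}
	return newly_frozen;
}

/*
 * Model of the race: the task was thawed as an OOM victim, and the reaper
 * cleared its victim status while the task is still runnable.
 */
static void model_victim_reaped(struct task *t)
{
	t->frozen = false;
	t->oom_victim = false;
}

/* True only if no task is left runnable. */
static bool model_all_frozen(const struct task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!tasks[i].frozen)
			return false;
	return true;
}
```

In this model, a first model_freeze_tasks() pass followed by
model_victim_reaped() leaves one runnable task behind; only a second pass makes
model_all_frozen() true again, which is the reason for repeating the freeze
after oom_killer_disable().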

But if free memory is so low that oom_killer_disable() cannot expect TIF_MEMDIE
threads to clear TIF_MEMDIE by themselves (and therefore has to depend on the OOM
reaper to clear TIF_MEMDIE on their behalf after it has reaped some memory),
subsequent operations would likewise be blocked waiting for an operation which
cannot make any forward progress because it cannot proceed with an allocation.
In that case, having oom_killer_disable() return false after some timeout
(i.e. "do not try to suspend when the system is almost OOM") would be a safer
reaction.
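
To make the timeout idea concrete, here is a small userspace model, not a
patch. struct oom_model, model_tick() and model_oom_killer_disable() are
hypothetical names, and "ticks" stand in for whatever time unit a real timeout
would use. The sketch only shows the proposed control flow: wait a bounded time
for the victims to go away, and otherwise undo the disable and report failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these fields are kernel symbols. */
struct oom_model {
	int victims;		/* tasks still marked as OOM victims */
	int exits_per_tick;	/* victims able to exit per tick; 0 means stuck */
	bool disabled;		/* stands in for oom_killer_disabled */
};

/* Advance simulated time by one tick: some victims may finish exiting. */
static void model_tick(struct oom_model *m)
{
	int done = m->exits_per_tick < m->victims ?
		   m->exits_per_tick : m->victims;

	m->victims -= done;
}

/*
 * Disable the OOM killer and wait at most timeout_ticks for all victims
 * to exit. On timeout, re-enable the OOM killer and return false so that
 * the caller can abort the suspend attempt.
 */
static bool model_oom_killer_disable(struct oom_model *m, int timeout_ticks)
{
	assert(timeout_ticks >= 0);

	m->disabled = true;
	for (int t = 0; m->victims > 0 && t < timeout_ticks; t++)
		model_tick(m);

	if (m->victims > 0) {
		m->disabled = false;	/* give up: system is too close to OOM */
		return false;
	}
	return true;
}
```

With three victims that each need one tick to exit and a timeout of five ticks,
the model returns true. With victims that never exit, it returns false and
leaves the OOM killer enabled.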

