linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com,
	vdavydov@parallels.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/10 -v3] Handle oom bypass more gracefully
Date: Wed, 8 Jun 2016 09:27:41 +0200	[thread overview]
Message-ID: <20160608072741.GE22570@dhcp22.suse.cz> (raw)
In-Reply-To: <201606080649.DGF51523.FLMOSHVtFFOJOQ@I-love.SAKURA.ne.jp>

On Wed 08-06-16 06:49:24, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > OK, so you are arming the timer for each mark_oom_victim regardless
> > of the oom context. This means that you have replaced one potential
> > lockup by other potential livelocks. Tasks from different oom domains
> > might interfere here...
> > 
> > Also this code doesn't even seem easier. It is surely less lines of
> > code but it is really hard to realize how would the timer behave for
> > different oom contexts.
> 
> If you worry about interference, we can use per signal_struct timestamp.
> I used per task_struct timestamp in my earlier versions (where per
> task_struct TIF_MEMDIE check was used instead of per signal_struct
> oom_victims).

This would allow pre-mature new victim selection for very large victims
(note that exit_mmap can take a while depending on the mm size). It also
pushed the timeout heuristic for everybody which will sooner or later
open a question why is this $NUMBER rathen than $NUMBER+$FOO.

[...]
> But expiring timeout by sleeping inside oom_kill_process() prevents other
> threads which are OOM-killed from obtaining TIF_MEMDIE, for anybody needs
> to wait for oom_lock in order to obtain TIF_MEMDIE.

True, but please note that this will happen only for the _unlikely_ case
when the mm is shared with kthread or init. All other cases would rely
on the oom_reaper which has a feedback mechanism to tell the oom killer
to move on if something bad is going on.

> Unless you set TIF_MEMDIE to all OOM-killed threads from
> oom_kill_process() or allow the caller context to use
> ALLOC_NO_WATERMARKS by checking whether current was already OOM-killed
> rather than TIF_MEMDIE, attempt to expiring timeout by sleeping inside
> oom_kill_process() is useless.

Well this is a rather strong statement for a highly unlikely corner
case, don't you think? I do not mind fortifying this class of cases some
more if we ever find out they are a real problem but I would rather make
sure they cannot lockup at this stage rather than optimize for them.

To be honest I would rather explore ways to handle kthread case (which
is the only real one IMHO from the two) gracefully and made them a
nonissue - e.g. enforce EFAULT on a dead mm during the kthread page fault
or something similar.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2016-06-08  7:27 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-03  9:16 [PATCH 0/10 -v3] Handle oom bypass more gracefully Michal Hocko
2016-06-03  9:16 ` [PATCH 01/10] proc, oom: drop bogus task_lock and mm check Michal Hocko
2016-06-03  9:16 ` [PATCH 02/10] proc, oom: drop bogus sighand lock Michal Hocko
2016-06-03  9:16 ` [PATCH 03/10] proc, oom_adj: extract oom_score_adj setting into a helper Michal Hocko
2016-06-03  9:16 ` [PATCH 04/10] mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj Michal Hocko
2016-06-03  9:16 ` [PATCH 05/10] mm, oom: skip vforked tasks from being selected Michal Hocko
2016-06-03  9:16 ` [PATCH 06/10] mm, oom: kill all tasks sharing the mm Michal Hocko
2016-06-06 22:27   ` David Rientjes
2016-06-06 23:20     ` Oleg Nesterov
2016-06-07  6:37       ` Michal Hocko
2016-06-07 22:15       ` David Rientjes
2016-06-08  6:22         ` Michal Hocko
2016-06-08 22:51           ` David Rientjes
2016-06-09  6:46             ` Michal Hocko
2016-06-03  9:16 ` [PATCH 07/10] mm, oom: fortify task_will_free_mem Michal Hocko
2016-06-03  9:16 ` [PATCH 08/10] mm, oom: task_will_free_mem should skip oom_reaped tasks Michal Hocko
2016-06-03  9:16 ` [RFC PATCH 09/10] mm, oom_reaper: do not attempt to reap a task more than twice Michal Hocko
2016-06-03  9:16 ` [RFC PATCH 10/10] mm, oom: hide mm which is shared with kthread or global init Michal Hocko
2016-06-03 15:16   ` Tetsuo Handa
2016-06-06  8:15     ` Michal Hocko
2016-06-06 13:26     ` Michal Hocko
2016-06-07  6:26       ` Michal Hocko
2016-06-03 12:00 ` [PATCH 0/10 -v3] Handle oom bypass more gracefully Tetsuo Handa
2016-06-03 12:20   ` Michal Hocko
2016-06-03 12:22     ` Michal Hocko
2016-06-04 10:57       ` Tetsuo Handa
2016-06-06  8:39         ` Michal Hocko
2016-06-03 15:17     ` Tetsuo Handa
2016-06-06  8:36       ` Michal Hocko
2016-06-07 14:30         ` Tetsuo Handa
2016-06-07 15:05           ` Michal Hocko
2016-06-07 21:49             ` Tetsuo Handa
2016-06-08  7:27               ` Michal Hocko [this message]
2016-06-08 14:55                 ` Tetsuo Handa
2016-06-08 16:05                   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160608072741.GE22570@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).