From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, rientjes@google.com,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper
Date: Fri, 8 Apr 2016 13:34:25 +0200
Message-ID: <20160408113425.GF29820@dhcp22.suse.cz>
In-Reply-To: <201604072055.GAI52128.tHLVOFJOQMFOFS@I-love.SAKURA.ne.jp>

On Thu 07-04-16 20:55:34, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > The first obvious one is when the oom victim clears its mm and gets
> > stuck later on. oom_reaper would back off on find_lock_task_mm returning
> > NULL. We can safely try to clear TIF_MEMDIE in this case because such a
> > task would be ignored by the oom killer anyway. Most of the time the
> > flag would already have been cleared by then anyway.
> 
> I didn't understand what this is trying to say. The OOM victim will clear
> TIF_MEMDIE as soon as it sets current->mm = NULL.

No, it clears the flag _after_ it returns from mmput. There is no
guarantee it won't get stuck somewhere along the way - e.g. exit_aio
waits for outstanding requests to complete, and who knows what else
might get stuck.
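
To illustrate the ordering (a simplified sketch only, not verbatim
kernel source - the details differ between releases):

/*
 * Simplified sketch of the exit path ordering discussed here.
 */
static void exit_mm_sketch(struct task_struct *tsk)
{
	struct mm_struct *mm = tsk->mm;

	task_lock(tsk);
	tsk->mm = NULL;		/* the victim drops its mm pointer here ... */
	task_unlock(tsk);

	/*
	 * ... but mmput() can block before the address space is torn
	 * down, e.g. exit_aio() waits for outstanding AIO to complete.
	 */
	mmput(mm);

	/* TIF_MEMDIE is only dropped after mmput() has returned. */
	if (test_thread_flag(TIF_MEMDIE))
		exit_oom_victim();
}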

> Even if the oom victim
> clears its mm and gets stuck later on (e.g. at exit_task_work()),
> TIF_MEMDIE has already been cleared by the OOM victim by that point.
> 
> > 
> > The less obvious one is when the oom reaper fails due to mmap_sem
> > contention. Even if we clear TIF_MEMDIE for this task, it is not
> > very likely that we would select another task too easily because
> > we haven't reaped the last victim and so it would still be the #1
> > candidate. There is a rare race possible where the current victim
> > terminates before the next select_bad_process, but considering
> > that oom_reap_task has retried several times before giving up,
> > this sounds like a borderline case.
> 
> Is that helpful? Allowing the OOM killer to select the same thread again
> simply floods the kernel log buffer with OOM kill messages.

I am trying to be as conservative as possible here. The likelihood of
mmap_sem contention will be reduced considerably once my
down_write_killable series gets merged. If this turns out to be a
problem (trivial to spot, as the same task will be killed again) then
we can think about a fix for that (e.g. ignore the task if it has been
selected more than N times).
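
Not something I am proposing right now, but such a check could look
roughly like the following (purely hypothetical - neither the counter
nor the limit exist in the kernel, they are made up for illustration):

#define OOM_SELECT_LIMIT	3

/*
 * Hypothetical: skip a victim which has already been selected too many
 * times, assuming a per-signal counter bumped in select_bad_process().
 */
static bool oom_skip_repeated_victim(struct task_struct *task)
{
	return task->signal->oom_times_selected > OOM_SELECT_LIMIT;
}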

> I think we should not allow the OOM killer to select the same thread again
> by e.g. doing tsk->signal->oom_score_adj = OOM_SCORE_ADJ_MIN regardless of
> whether reaping that thread's memory succeeded or not.

I think this comes with some risk, so it should go in as a separate
patch with a full justification of why the outcome is better -
especially once mmap_sem contention has been reduced by other means.
-- 
Michal Hocko
SUSE Labs
