All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: <linux-mm@kvack.org>
Cc: David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: [PATCH 3/3] mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper
Date: Wed,  6 Apr 2016 16:13:16 +0200	[thread overview]
Message-ID: <1459951996-12875-4-git-send-email-mhocko@kernel.org> (raw)
In-Reply-To: <1459951996-12875-1-git-send-email-mhocko@kernel.org>

From: Michal Hocko <mhocko@suse.com>

Right now the oom reaper will clear TIF_MEMDIE only for tasks which
were successfully reaped. This is the safest option because we know
that such an oom victim would only block forward progress of the oom
killer without a good reason because it is highly unlikely it would
release much more memory. Basically most of its memory has been already
torn down.

We can relax this assumption to catch more corner cases though.

The first obvious one is when the oom victim clears its mm and gets
stuck later on. oom_reaper would back of on find_lock_task_mm returning
NULL. We can safely try to clear TIF_MEMDIE in this case because such a
task would be ignored by the oom killer anyway. The flag would be
cleared by that time already most of the time anyway.

The less obvious one is when the oom reaper fails due to mmap_sem
contention. Even if we clear TIF_MEMDIE for this task then it is not
very likely that we would select another task too easily because
we haven't reaped the last victim and so it would be still the #1
candidate. There is a rare race condition possible when the current
victim terminates before the next select_bad_process but considering
that oom_reap_task had retried several times before giving up then
this sounds like a borderline thing.

After this patch we should have a guarantee that the OOM killer will
not be block for unbounded amount of time for most cases.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 74c38f5fffef..7098104b7475 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -510,14 +510,10 @@ static bool __oom_reap_task(struct task_struct *tsk)
 	up_read(&mm->mmap_sem);
 
 	/*
-	 * Clear TIF_MEMDIE because the task shouldn't be sitting on a
-	 * reasonably reclaimable memory anymore. OOM killer can continue
-	 * by selecting other victim if unmapping hasn't led to any
-	 * improvements. This also means that selecting this task doesn't
-	 * make any sense.
+	 * This task can be safely ignored because we cannot do much more
+	 * to release its memory.
 	 */
 	tsk->signal->oom_score_adj = OOM_SCORE_ADJ_MIN;
-	exit_oom_victim(tsk);
 out:
 	mmput(mm);
 	return ret;
@@ -538,6 +534,14 @@ static void oom_reap_task(struct task_struct *tsk)
 		debug_show_all_locks();
 	}
 
+	/*
+	 * Clear TIF_MEMDIE because the task shouldn't be sitting on a
+	 * reasonably reclaimable memory anymore or it is not a good candidate
+	 * for the oom victim right now because it cannot release its memory
+	 * itself nor by the oom reaper.
+	 */
+	exit_oom_victim(tsk);
+
 	/* Drop a reference taken by wake_oom_reaper */
 	put_task_struct(tsk);
 }
-- 
2.8.0.rc3

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: linux-mm@kvack.org
Cc: David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: [PATCH 3/3] mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper
Date: Wed,  6 Apr 2016 16:13:16 +0200	[thread overview]
Message-ID: <1459951996-12875-4-git-send-email-mhocko@kernel.org> (raw)
In-Reply-To: <1459951996-12875-1-git-send-email-mhocko@kernel.org>

From: Michal Hocko <mhocko@suse.com>

Right now the oom reaper will clear TIF_MEMDIE only for tasks which
were successfully reaped. This is the safest option because we know
that such an oom victim would only block forward progress of the oom
killer without a good reason because it is highly unlikely it would
release much more memory. Basically most of its memory has been already
torn down.

We can relax this assumption to catch more corner cases though.

The first obvious one is when the oom victim clears its mm and gets
stuck later on. oom_reaper would back of on find_lock_task_mm returning
NULL. We can safely try to clear TIF_MEMDIE in this case because such a
task would be ignored by the oom killer anyway. The flag would be
cleared by that time already most of the time anyway.

The less obvious one is when the oom reaper fails due to mmap_sem
contention. Even if we clear TIF_MEMDIE for this task then it is not
very likely that we would select another task too easily because
we haven't reaped the last victim and so it would be still the #1
candidate. There is a rare race condition possible when the current
victim terminates before the next select_bad_process but considering
that oom_reap_task had retried several times before giving up then
this sounds like a borderline thing.

After this patch we should have a guarantee that the OOM killer will
not be block for unbounded amount of time for most cases.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 74c38f5fffef..7098104b7475 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -510,14 +510,10 @@ static bool __oom_reap_task(struct task_struct *tsk)
 	up_read(&mm->mmap_sem);
 
 	/*
-	 * Clear TIF_MEMDIE because the task shouldn't be sitting on a
-	 * reasonably reclaimable memory anymore. OOM killer can continue
-	 * by selecting other victim if unmapping hasn't led to any
-	 * improvements. This also means that selecting this task doesn't
-	 * make any sense.
+	 * This task can be safely ignored because we cannot do much more
+	 * to release its memory.
 	 */
 	tsk->signal->oom_score_adj = OOM_SCORE_ADJ_MIN;
-	exit_oom_victim(tsk);
 out:
 	mmput(mm);
 	return ret;
@@ -538,6 +534,14 @@ static void oom_reap_task(struct task_struct *tsk)
 		debug_show_all_locks();
 	}
 
+	/*
+	 * Clear TIF_MEMDIE because the task shouldn't be sitting on a
+	 * reasonably reclaimable memory anymore or it is not a good candidate
+	 * for the oom victim right now because it cannot release its memory
+	 * itself nor by the oom reaper.
+	 */
+	exit_oom_victim(tsk);
+
 	/* Drop a reference taken by wake_oom_reaper */
 	put_task_struct(tsk);
 }
-- 
2.8.0.rc3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2016-04-06 14:13 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-06 14:13 [PATCH 0/3] oom reaper follow ups v1 Michal Hocko
2016-04-06 14:13 ` Michal Hocko
2016-04-06 14:13 ` [PATCH 1/3] mm, oom: move GFP_NOFS check to out_of_memory Michal Hocko
2016-04-06 14:13   ` Michal Hocko
2016-04-06 14:13 ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip regular OOM killer path Michal Hocko
2016-04-06 14:13   ` Michal Hocko
2016-04-07 11:38   ` Tetsuo Handa
2016-04-07 11:38     ` Tetsuo Handa
2016-04-08 11:19     ` Tetsuo Handa
2016-04-08 11:19       ` Tetsuo Handa
2016-04-08 11:50       ` Michal Hocko
2016-04-08 11:50         ` Michal Hocko
2016-04-09  4:39         ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skipregular " Tetsuo Handa
2016-04-09  4:39           ` Tetsuo Handa
2016-04-11 12:02           ` Michal Hocko
2016-04-11 12:02             ` Michal Hocko
2016-04-11 13:26             ` [PATCH 2/3] oom, oom_reaper: Try to reap tasks which skip regular " Tetsuo Handa
2016-04-11 13:26               ` Tetsuo Handa
2016-04-11 13:43               ` Michal Hocko
2016-04-11 13:43                 ` Michal Hocko
2016-04-13 11:08                 ` Tetsuo Handa
2016-04-13 11:08                   ` Tetsuo Handa
2016-04-08 11:34     ` Michal Hocko
2016-04-08 11:34       ` Michal Hocko
2016-04-08 13:14   ` Michal Hocko
2016-04-08 13:14     ` Michal Hocko
2016-04-06 14:13 ` Michal Hocko [this message]
2016-04-06 14:13   ` [PATCH 3/3] mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper Michal Hocko
2016-04-07 11:55   ` Tetsuo Handa
2016-04-07 11:55     ` Tetsuo Handa
2016-04-08 11:34     ` Michal Hocko
2016-04-08 11:34       ` Michal Hocko
2016-04-16  2:51       ` Tetsuo Handa
2016-04-17 11:54         ` Michal Hocko
2016-04-18 11:59           ` Tetsuo Handa
2016-04-19 14:17             ` Michal Hocko
2016-04-19 15:07               ` Tetsuo Handa
2016-04-19 19:32                 ` Michal Hocko
2016-04-08 13:07   ` Michal Hocko
2016-04-08 13:07     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1459951996-12875-4-git-send-email-mhocko@kernel.org \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.