From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>, Roman Gushchin <guro@fb.com>,
	Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>
Subject: [PATCH 7/8] mm,oom: Do not sleep with oom_lock held.
Date: Tue,  3 Jul 2018 23:25:08 +0900
Message-ID: <1530627910-3415-8-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
In-Reply-To: <1530627910-3415-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>

Since oom_reap_mm() might take quite a long time, it is not a good idea to
block other threads in different OOM domains while it runs. This patch
allows oom_reap_mm() to be called from multiple concurrently allocating
threads. With this change, the page allocator can spend CPU resources on
oom_reap_mm() in the OOM domain each allocating thread is interested in.

Also, out_of_memory() no longer sleeps with oom_lock held (except for
cond_resched() and CONFIG_PREEMPT=y cases), because both the OOM notifiers
and oom_reap_mm() are now called outside of oom_lock. This means that
oom_lock is almost a spinlock now. But this patch does not convert
oom_lock to a spinlock, because letting waiters sleep instead of spin
saves CPU resources for selecting OOM victims, printk() etc., which is
still a good thing to do.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: CVE-2016-10723
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
---
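The locking pattern used in the diff below (drop the lock around the slow
call, retake it, then rescan the list from the head while skipping the
positions already handled) can be sketched in userspace. This is only an
illustration under stated assumptions, not the kernel code: a pthread mutex
stands in for oom_lock, a fixed array stands in for oom_victim_list, and
struct victim, slow_reap() and scan_victims() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on oom_victim_list. */
struct victim {
	bool skip;	/* plays the role of MMF_OOM_SKIP */
	int pages;	/* immutable: what reaping this entry would free */
};

/* Stand-in for oom_lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for oom_reap_mm(): slow work, called without list_lock held. */
static int slow_reap(const struct victim *v)
{
	return v->pages;
}

/*
 * Caller must hold list_lock. Returns the total number of pages "freed".
 * list_lock is dropped around slow_reap() and retaken afterwards; since
 * the list may have changed meanwhile in the real code, the scan restarts
 * from the head and uses pos/last_pos to skip entries already handled.
 */
static int scan_victims(struct victim *v, size_t n)
{
	unsigned int pos;
	unsigned int last_pos = 0;
	int total = 0;
	size_t i;

	assert(v != NULL);
retry:
	pos = 0;
	for (i = 0; i < n; i++) {
		int freed;

		if (pos++ < last_pos)
			continue;	/* handled in an earlier pass */
		last_pos = pos;
		if (v[i].skip)
			continue;	/* nothing left to reap here */

		pthread_mutex_unlock(&list_lock);
		freed = slow_reap(&v[i]);
		pthread_mutex_lock(&list_lock);

		/* Bookkeeping is done with the lock held again. */
		v[i].skip = true;
		total += freed;
		goto retry;
	}
	return total;
}
```

Restarting from the head instead of continuing from the saved iterator is
the conservative choice here: once the lock has been dropped, neither the
current entry nor its successor can be assumed to still be on the list.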
 mm/oom_kill.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index a1d3616..d534684 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -921,11 +921,18 @@ static bool oom_has_pending_victims(struct oom_control *oc)
 	struct task_struct *p, *tmp;
 	bool ret = false;
 	bool gaveup = false;
+	unsigned int pos = 0;
+	unsigned int last_pos = 0;
 
+ retry:
 	lockdep_assert_held(&oom_lock);
 	list_for_each_entry_safe(p, tmp, &oom_victim_list, oom_victim_list) {
 		struct mm_struct *mm = p->signal->oom_mm;
 
+		if (pos++ < last_pos)
+			continue;
+		last_pos = pos;
+
 		/* Skip OOM victims which current thread cannot select. */
 		if (oom_unkillable_task(p, oc->memcg, oc->nodemask))
 			continue;
@@ -937,8 +944,23 @@ static bool oom_has_pending_victims(struct oom_control *oc)
 		 */
 		if (down_read_trylock(&mm->mmap_sem)) {
 			if (!test_bit(MMF_OOM_SKIP, &mm->flags) &&
-			    !mm_has_blockable_invalidate_notifiers(mm))
+			    !mm_has_blockable_invalidate_notifiers(mm)) {
+				get_task_struct(p);
+				mmgrab(mm);
+				mutex_unlock(&oom_lock);
 				oom_reap_mm(mm);
+				up_read(&mm->mmap_sem);
+				mmdrop(mm);
+				put_task_struct(p);
+				mutex_lock(&oom_lock);
+				/*
+				 * Since ret == true, skipping some OOM victims
+				 * by racing with exit_oom_mm() will not cause
+				 * premature OOM victim selection.
+				 */
+				pos = 0;
+				goto retry;
+			}
 			up_read(&mm->mmap_sem);
 		}
 #endif
-- 
1.8.3.1

Thread overview: 25+ messages
2018-07-03 14:25 [PATCH 0/8] OOM killer/reaper changes for avoiding OOM lockup problem Tetsuo Handa
2018-07-03 14:25 ` [PATCH 1/8] mm,oom: Don't call schedule_timeout_killable() with oom_lock held Tetsuo Handa
2018-07-03 14:38   ` Michal Hocko
2018-07-03 14:25 ` [PATCH 2/8] mm,oom: Check pending victims earlier in out_of_memory() Tetsuo Handa
2018-07-03 14:25 ` [PATCH 3/8] mm,oom: Fix unnecessary killing of additional processes Tetsuo Handa
2018-07-03 14:58   ` Michal Hocko
2018-07-03 14:25 ` [PATCH 4/8] mm,page_alloc: Make oom_reserves_allowed() even Tetsuo Handa
2018-07-03 14:25 ` [PATCH 5/8] mm,oom: Bring OOM notifier to outside of oom_lock Tetsuo Handa
2018-07-03 14:59   ` Michal Hocko
2018-07-03 14:25 ` [PATCH 6/8] mm,oom: Make oom_lock static variable Tetsuo Handa
2018-07-03 14:25 ` Tetsuo Handa [this message]
2018-07-03 14:25 ` [PATCH 8/8] mm,page_alloc: Move the short sleep to should_reclaim_retry() Tetsuo Handa
2018-07-03 15:12 ` [PATCH 0/8] OOM killer/reaper changes for avoiding OOM lockup problem Michal Hocko
2018-07-03 15:29   ` Michal Hocko
2018-07-04  2:22     ` penguin-kernel
2018-07-04  7:16       ` Michal Hocko
2018-07-04  7:22         ` Michal Hocko
2018-07-05  3:05           ` Tetsuo Handa
2018-07-05  7:24             ` Michal Hocko
2018-07-06  2:40               ` Tetsuo Handa
2018-07-06  2:49                 ` Linus Torvalds
2018-07-07  1:12                   ` Tetsuo Handa
2018-07-09  7:45                     ` Michal Hocko
2018-07-06  5:56                 ` Michal Hocko
2018-07-10  3:57                   ` Tetsuo Handa
