linux-kernel.vger.kernel.org archive mirror
* [PATCH] mmotm: mm-oom-fortify-task_will_free_mem-fix
@ 2016-06-29 11:59 Michal Hocko
  2016-06-29 19:22 ` Oleg Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2016-06-29 11:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tetsuo Handa, Oleg Nesterov, Vladimir Davydov, David Rientjes,
	linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

"mm, oom: fortify task_will_free_mem" has dropped task_lock around
task_will_free_mem in oom_kill_process because it assumed that a
potential race when the selected task exits will not be a problem
as the oom_reaper will call exit_oom_victim.

Tetsuo objected that nommu doesn't have the oom_reaper, so the race
would still be possible.  Without the oom reaper the code would
theoretically be racy and lockup prone in other respects anyway, so I
didn't consider this a big deal. But it seems that further changes I am
planning in this area will benefit from a stable task->mm in this path
as well. So let's drop find_lock_task_mm from task_will_free_mem and
call it under task_lock as we did previously. Just pull the task->mm !=
NULL check inside the function.

Andrew, could you please fold this into
mm-oom-fortify-task_will_free_mem-fix.patch?

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 41 +++++++++++++++--------------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 4c21f744daa6..7d0a275df822 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -757,45 +757,35 @@ static inline bool __task_will_free_mem(struct task_struct *task)
  * Checks whether the given task is dying or exiting and likely to
  * release its address space. This means that all threads and processes
  * sharing the same mm have to be killed or exiting.
+ * Caller has to make sure that task->mm is stable (hold task_lock or
+ * it operates on the current).
  */
 bool task_will_free_mem(struct task_struct *task)
 {
-	struct mm_struct *mm;
+	struct mm_struct *mm = task->mm;
 	struct task_struct *p;
 	bool ret;
 
-	if (!__task_will_free_mem(task))
-		return false;
-
 	/*
-	 * If the process has passed exit_mm we have to skip it because
-	 * we have lost a link to other tasks sharing this mm, we do not
-	 * have anything to reap and the task might then get stuck waiting
-	 * for parent as zombie and we do not want it to hold TIF_MEMDIE
+	 * Skip tasks without mm because it might have passed its exit_mm and
+	 * exit_oom_victim. oom_reaper could have rescued that but do not rely
+	 * on that for now. We can consider find_lock_task_mm in future.
 	 */
-	p = find_lock_task_mm(task);
-	if (!p)
+	if (!mm)
 		return false;
 
-	mm = p->mm;
+	if (!__task_will_free_mem(task))
+		return false;
 
 	/*
 	 * This task has already been drained by the oom reaper so there are
 	 * only small chances it will free some more
 	 */
-	if (test_bit(MMF_OOM_REAPED, &mm->flags)) {
-		task_unlock(p);
+	if (test_bit(MMF_OOM_REAPED, &mm->flags))
 		return false;
-	}
 
-	if (atomic_read(&mm->mm_users) <= 1) {
-		task_unlock(p);
+	if (atomic_read(&mm->mm_users) <= 1)
 		return true;
-	}
-
-	/* pin the mm to not get freed and reused */
-	atomic_inc(&mm->mm_count);
-	task_unlock(p);
 
 	/*
 	 * This is really pessimistic but we do not have any reliable way
@@ -812,7 +802,6 @@ bool task_will_free_mem(struct task_struct *task)
 			break;
 	}
 	rcu_read_unlock();
-	mmdrop(mm);
 
 	return ret;
 }
@@ -838,12 +827,15 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	 * If the task is already exiting, don't alarm the sysadmin or kill
 	 * its children or threads, just set TIF_MEMDIE so it can die quickly
 	 */
+	task_lock(p);
 	if (task_will_free_mem(p)) {
 		mark_oom_victim(p);
 		wake_oom_reaper(p);
+		task_unlock(p);
 		put_task_struct(p);
 		return;
 	}
+	task_unlock(p);
 
 	if (__ratelimit(&oom_rs))
 		dump_header(oc, p);
@@ -1014,11 +1006,8 @@ bool out_of_memory(struct oom_control *oc)
 	 * If current has a pending SIGKILL or is exiting, then automatically
 	 * select it.  The goal is to allow it to allocate so that it may
 	 * quickly exit and free its memory.
-	 *
-	 * But don't select if current has already released its mm and cleared
-	 * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur.
 	 */
-	if (current->mm && task_will_free_mem(current)) {
+	if (task_will_free_mem(current)) {
 		mark_oom_victim(current);
 		wake_oom_reaper(current);
 		return true;
-- 
2.8.1
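
The calling convention this patch introduces (the caller takes task_lock,
then task_will_free_mem reads task->mm without pinning it) can be modelled
in user space. The following is only a rough single-threaded sketch, not
kernel code: struct task_model, struct mm_model, the `locked` flag and all
*_model helpers are hypothetical stand-ins, and the walk over processes
sharing the mm is replaced by a pessimistic "false".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct mm_model {
	int mm_users;		/* models atomic_read(&mm->mm_users) */
	bool oom_reaped;	/* models the MMF_OOM_REAPED flag */
};

struct task_model {
	bool locked;		/* a flag standing in for task_lock, not a real lock */
	struct mm_model *mm;	/* NULL once the task has passed exit_mm */
	bool exiting;		/* stands in for __task_will_free_mem() */
};

static void task_lock_model(struct task_model *t)
{
	assert(!t->locked);
	t->locked = true;
}

static void task_unlock_model(struct task_model *t)
{
	assert(t->locked);
	t->locked = false;
}

/* Contract from the patch: the caller keeps t->mm stable, so no pinning here. */
static bool task_will_free_mem_model(struct task_model *t)
{
	struct mm_model *mm;

	assert(t != NULL && t->locked);
	mm = t->mm;

	if (!mm)
		return false;	/* already past exit_mm, nothing to free */
	if (!t->exiting)
		return false;
	if (mm->oom_reaped)
		return false;	/* already drained by the reaper */
	if (mm->mm_users <= 1)
		return true;	/* sole user of the address space */
	/*
	 * Simplification: the kernel walks every process sharing the mm
	 * at this point; the model just gives the pessimistic answer.
	 */
	return false;
}

/* Mirrors the oom_kill_process hunk: lock, check, unlock. */
static bool check_victim_model(struct task_model *t)
{
	bool ret;

	task_lock_model(t);
	ret = task_will_free_mem_model(t);
	task_unlock_model(t);
	return ret;
}
```

Because the lock is held across the whole check, the function needs
neither the mm_count pin nor the task_unlock calls on each early return,
which is where most of the deleted lines in the diff come from.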


* Re: [PATCH] mmotm: mm-oom-fortify-task_will_free_mem-fix
  2016-06-29 11:59 [PATCH] mmotm: mm-oom-fortify-task_will_free_mem-fix Michal Hocko
@ 2016-06-29 19:22 ` Oleg Nesterov
  2016-06-30  8:19   ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Oleg Nesterov @ 2016-06-29 19:22 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Tetsuo Handa, Vladimir Davydov, David Rientjes,
	linux-mm, LKML, Michal Hocko

On 06/29, Michal Hocko wrote:
>
> But it seems that further changes I am
> planning in this area will benefit from stable task->mm in this path

Oh, so I hope you will clean this up later.

> Just pull the task->mm !=
> NULL check inside the function.

OK, but this means it will always return false if the task is a zombie
leader.

I am not really arguing and this is not that bad, but this doesn't look
nice and imo asks for cleanup.

Oleg.
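
The zombie-leader case raised above can be illustrated with a small
user-space sketch. It is a model under stated assumptions, not kernel code:
struct thread_sketch, struct group_sketch and both helpers are
hypothetical, locking is omitted, and find_task_mm_sketch only
approximates what find_lock_task_mm does (return some thread of the group
that still has an mm).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct, a thread, and a thread group. */
struct mm_sketch {
	int mm_users;
};

struct thread_sketch {
	struct mm_sketch *mm;	/* NULL after the thread passed exit_mm */
};

struct group_sketch {
	struct thread_sketch *threads;	/* threads[0] is the group leader */
	size_t nr_threads;
};

/* Patched check: only the leader's own mm pointer is consulted. */
static bool leader_has_mm(const struct group_sketch *g)
{
	assert(g->nr_threads > 0);
	return g->threads[0].mm != NULL;
}

/* Approximation of the old lookup: any thread that still has an mm will do. */
static struct thread_sketch *find_task_mm_sketch(struct group_sketch *g)
{
	for (size_t i = 0; i < g->nr_threads; i++)
		if (g->threads[i].mm)
			return &g->threads[i];
	return NULL;
}
```

With a group whose leader has exited (mm == NULL) while a sibling thread
still holds the mm, leader_has_mm() reports false even though
find_task_mm_sketch() would still find a usable thread. That is the
behaviour change being pointed out here.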


* Re: [PATCH] mmotm: mm-oom-fortify-task_will_free_mem-fix
  2016-06-29 19:22 ` Oleg Nesterov
@ 2016-06-30  8:19   ` Michal Hocko
  0 siblings, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2016-06-30  8:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Tetsuo Handa, Vladimir Davydov, David Rientjes,
	linux-mm, LKML

On Wed 29-06-16 21:22:33, Oleg Nesterov wrote:
> On 06/29, Michal Hocko wrote:
> >
> > But it seems that further changes I am
> > planning in this area will benefit from stable task->mm in this path
> 
> Oh, so I hope you will clean this up later.
> 
> > Just pull the task->mm !=
> > NULL check inside the function.
> 
> OK, but this means it will always return false if the task is a zombie
> leader.
> 
> I am not really arguing and this is not that bad, but this doesn't look
> nice and imo asks for cleanup.

I will keep that in mind and hopefully we can make this less obscure.
Who would like zombie leaders lurking around ;)
-- 
Michal Hocko
SUSE Labs
