[PATCH] exit: clear TIF_MEMDIE after exit_task_work

From: Vladimir Davydov @ 2016-02-29 17:02 UTC
  To: Andrew Morton
  Cc: Michal Hocko, Tetsuo Handa, David Rientjes, linux-mm, linux-kernel

An mm_struct may be pinned by a file. One example is a vhost-net device
created by qemu/kvm (see vhost_net_ioctl -> vhost_net_set_owner ->
vhost_dev_set_owner). If such a process gets OOM-killed, the reference
to its mm_struct is only released from exit_task_work -> ____fput ->
__fput -> vhost_net_release -> vhost_dev_cleanup, which runs after
exit_mm, where TIF_MEMDIE is cleared. As a result, the OOM killer can
start selecting the next victim before the last one has had a chance to
free its memory. In practice, this leads to several VMs being killed
along with the fattest one.

Fix this by clearing TIF_MEMDIE after exit_task_work.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
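Note (not part of the commit message): the ordering problem described
above can be illustrated with a small user-space model. This is only a
sketch under simplifying assumptions, not kernel code. Every name in it
(toy_mm, toy_task, toy_do_exit, run_scenario) is hypothetical; "users"
merely stands in for the mm reference count, and "memdie" for
TIF_MEMDIE.

```c
#include <stdbool.h>

/* Toy model only: these types do not exist in the kernel. */
struct toy_mm {
	int users;	/* stands in for the mm reference count */
	bool freed;	/* set once the last reference is dropped */
};

struct toy_task {
	struct toy_mm *mm;
	bool memdie;		/* stands in for TIF_MEMDIE */
	bool file_pins_mm;	/* e.g. an open vhost-net file holding an mm ref */
};

/* Drop one reference; the memory is "freed" when the count hits zero. */
static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->users == 0)
		mm->freed = true;
}

/* Models exit_mm(): drops only the task's own reference. */
static void toy_exit_mm(struct toy_task *t)
{
	toy_mmput(t->mm);
}

/* Models exit_task_work(): the final fput releases the file's reference. */
static void toy_exit_task_work(struct toy_task *t)
{
	if (t->file_pins_mm) {
		toy_mmput(t->mm);
		t->file_pins_mm = false;
	}
}

/*
 * Runs the exit sequence and returns whether the memory had already been
 * freed at the moment the memdie flag was cleared.
 */
static bool toy_do_exit(struct toy_task *t, bool clear_after_task_work)
{
	bool freed_at_clear = false;

	toy_exit_mm(t);
	if (!clear_after_task_work) {	/* old ordering */
		t->memdie = false;
		freed_at_clear = t->mm->freed;
	}

	toy_exit_task_work(t);
	if (clear_after_task_work) {	/* ordering proposed by this patch */
		t->memdie = false;
		freed_at_clear = t->mm->freed;
	}

	return freed_at_clear;
}

/* One victim whose mm is pinned by a file: two references in total. */
static bool run_scenario(bool clear_after_task_work)
{
	struct toy_mm mm = { .users = 2, .freed = false };
	struct toy_task t = { .mm = &mm, .memdie = true, .file_pins_mm = true };

	return toy_do_exit(&t, clear_after_task_work);
}
```

In this model run_scenario(false) returns false: the flag is cleared
while the file still holds a reference, so nothing has been freed yet.
run_scenario(true) returns true, because the flag is cleared only after
the last reference is gone.
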
 kernel/exit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index fd90195667e1..cc50e12165f7 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -434,8 +434,6 @@ static void exit_mm(struct task_struct *tsk)
 	task_unlock(tsk);
 	mm_update_next_owner(mm);
 	mmput(mm);
-	if (test_thread_flag(TIF_MEMDIE))
-		exit_oom_victim(tsk);
 }
 
 static struct task_struct *find_alive_thread(struct task_struct *p)
@@ -746,6 +744,8 @@ void do_exit(long code)
 		disassociate_ctty(1);
 	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
+	if (test_thread_flag(TIF_MEMDIE))
+		exit_oom_victim(tsk);
 	exit_thread();
 
 	/*
-- 
2.1.4
