From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Oleg Nesterov <oleg@redhat.com>, Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH] mm, oom: allow oom reaper to race with exit_mmap
Date: Mon, 24 Jul 2017 17:00:08 +0300
Message-ID: <20170724140008.sd2n6af6izjyjtda@node.shutemov.name>
In-Reply-To: <20170724072332.31903-1-mhocko@kernel.org>

On Mon, Jul 24, 2017 at 09:23:32AM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> David has noticed that the oom killer might kill additional tasks while
> the exiting oom victim hasn't terminated yet because the oom_reaper marks
> the current victim MMF_OOM_SKIP too early when mm->mm_users has dropped
> to 0. The race is as follows:
> 
> oom_reap_task				do_exit
> 					  exit_mm
>   __oom_reap_task_mm
> 					    mmput
> 					      __mmput
>     mmget_not_zero # fails
>     						exit_mmap # frees memory
>   set_bit(MMF_OOM_SKIP)
> 
> The victim is still visible to the OOM killer until it is unhashed.
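
The interleaving above can be replayed in a small user-space model. This is only an illustrative sketch: struct mm_model and the model_* helpers are hypothetical names, not kernel definitions, and the model simplifies the diagram by treating exit_mmap as finishing only after the reaper has set the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of this is kernel code. */
struct mm_model {
	int mm_users;		/* stands in for mm->mm_users */
	bool oom_skip;		/* stands in for MMF_OOM_SKIP */
	bool memory_freed;	/* true once the modelled exit_mmap is done */
};

/* Models mmget_not_zero(): pin the mm only if it still has users. */
static bool model_mmget_not_zero(struct mm_model *mm)
{
	if (mm->mm_users == 0)
		return false;
	mm->mm_users++;
	return true;
}

/* Models the final mmput(): the last user drops its reference. */
static void model_mmput_last(struct mm_model *mm)
{
	assert(mm->mm_users == 1);
	mm->mm_users = 0;
}

/*
 * Replays the diagram step by step and reports whether the skip flag
 * was set while the victim's memory was still not freed.
 */
static bool replay_race(void)
{
	struct mm_model mm = { .mm_users = 1, .oom_skip = false,
			       .memory_freed = false };
	bool skip_before_free;

	model_mmput_last(&mm);			/* do_exit -> exit_mm -> mmput */
	if (!model_mmget_not_zero(&mm))		/* reaper: pin attempt fails */
		mm.oom_skip = true;		/* reaper: set_bit(MMF_OOM_SKIP) */
	skip_before_free = mm.oom_skip && !mm.memory_freed;
	mm.memory_freed = true;			/* exit_mmap completes only now */
	return skip_before_free;
}
```

In this model replay_race() returns true: the flag is visible before any memory has been released, which is the window in which another victim could be selected.
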
> 
> Currently we try to reduce the risk of this race by taking oom_lock
> and waiting for the out_of_memory sleep while holding the lock to give the
> victim some time to exit. This is quite a suboptimal approach because
> there is no guarantee the victim (especially a large one) will manage
> to unmap its address space and free enough memory to the particular oom
> domain which needs memory (e.g. a specific NUMA node).
> 
> Fix this problem by allowing __oom_reap_task_mm and the __mmput path to
> race. __oom_reap_task_mm is basically MADV_DONTNEED and that is allowed
> to run in parallel with other unmappers (hence the mmap_sem for read).
> 
> The only tricky part is to exclude page table teardown and all
> operations which modify the address space in the __mmput path. exit_mmap
> doesn't expect any other users so it doesn't use any locking. Nothing
> really forbids us from using mmap_sem for write, though. In fact we are
> already relying on this lock earlier in the __mmput path to synchronize
> with ksm and khugepaged.

That's true, but we take mmap_sem there only for a small portion of cases.

That's quite different from taking the lock unconditionally. I'm worried
about the scalability implications of such a move. On bigger machines it
can be a big hit.

Should we do a performance/scalability evaluation of the patch before
it gets applied?
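
To make the difference concrete, here is a rough user-space sketch that only counts how often an exit path would take the lock for write under each policy. Everything in it is an assumption for illustration: struct exiting_mm, its `tracked` flag (standing in for an mm that ksm or khugepaged has registered) and both helpers are hypothetical, and it says nothing about the actual cost of the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an exiting address space; not a kernel type. */
struct exiting_mm {
	bool tracked;	/* assumed: ksm/khugepaged knows about this mm */
};

/* Conditional policy: the write lock is taken only for tracked mms. */
static unsigned long write_locks_conditional(const struct exiting_mm *mms,
					     size_t n)
{
	unsigned long taken = 0;

	assert(mms != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (mms[i].tracked)
			taken++;	/* would be down_write()/up_write() */
	return taken;
}

/* Unconditional policy: every exiting mm takes the write lock once. */
static unsigned long write_locks_unconditional(const struct exiting_mm *mms,
					       size_t n)
{
	assert(mms != NULL || n == 0);
	return n;
}
```

With four exiting mms of which one is tracked, the first helper counts one acquisition and the second counts four. Whether the extra acquisitions matter in practice is exactly what a measurement would have to show.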

-- 
 Kirill A. Shutemov
