linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Oleg Nesterov <oleg@redhat.com>, Hugh Dickins <hughd@google.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm, oom: allow oom reaper to race with exit_mmap
Date: Thu, 27 Jul 2017 08:50:24 +0200	[thread overview]
Message-ID: <20170727065023.GB20970@dhcp22.suse.cz> (raw)
In-Reply-To: <20170726162912.GA29716@redhat.com>

On Wed 26-07-17 18:29:12, Andrea Arcangeli wrote:
> On Wed, Jul 26, 2017 at 07:45:57AM +0200, Michal Hocko wrote:
> > On Tue 25-07-17 21:19:52, Andrea Arcangeli wrote:
> > > On Tue, Jul 25, 2017 at 06:04:00PM +0200, Michal Hocko wrote:
> > > > -	down_write(&mm->mmap_sem);
> > > > +	if (tsk_is_oom_victim(current))
> > > > +		down_write(&mm->mmap_sem);
> > > >  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
> > > >  	tlb_finish_mmu(&tlb, 0, -1);
> > > >  
> > > > @@ -3012,7 +3014,8 @@ void exit_mmap(struct mm_struct *mm)
> > > >  	}
> > > >  	mm->mmap = NULL;
> > > >  	vm_unacct_memory(nr_accounted);
> > > > -	up_write(&mm->mmap_sem);
> > > > +	if (tsk_is_oom_victim(current))
> > > > +		up_write(&mm->mmap_sem);
> > > 
> > > How is this possibly safe? mark_oom_victim can run while exit_mmap is
> > > running.
> > 
> > I believe it cannot. We always call mark_oom_victim (on !current) with
> > task_lock held and check task->mm != NULL and we call do_exit->mmput after
> > mm is set to NULL under the same lock.
> 
> Holding the mmap_sem for writing and setting mm->mmap to NULL to
> filter which tasks already released the mmap_sem for writing post
> free_pgtables still look unnecessary to solve this.
> 
> Using MMF_OOM_SKIP as flag had side effects of oom_badness() skipping
> it, but we can use the same tsk_is_oom_victim instead and relay on the
> locking in mark_oom_victim you pointed out above instead of the
> test_and_set_bit of my patch, because current->mm is already NULL at
> that point.
> 
> A race at the light of the above now is, because current->mm is NULL by the
> time mmput is called, how can you start the oom_reap_task on a process
> with current->mm NULL that called the last mmput and is blocked
> in exit_aio?

Because we have that mm available. See tsk->signal->oom_mm in
oom_reap_task

> It looks like no false positive can get fixed until this
> is solved first because 
> 
> Isn't this enough? If this is enough it avoids other modification to
> the exit_mmap runtime that looks unnecessary: mm->mmap = NULL replaced
> by MMF_OOM_SKIP that has to be set anyway by __mmput later and one
> unnecessary branch to call the up_write.
> 
[...]
> diff --git a/mm/mmap.c b/mm/mmap.c
> index f19efcf75418..bdab595ce25c 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2993,6 +2993,23 @@ void exit_mmap(struct mm_struct *mm)
>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>  	unmap_vmas(&tlb, vma, 0, -1);
>  
> +	set_bit(MMF_OOM_SKIP, &mm->flags);
> +	if (tsk_is_oom_victim(current)) {
> +		/*
> +		 * Wait for oom_reap_task() to stop working on this
> +		 * mm. Because MMF_OOM_SKIP is already set before
> +		 * calling down_read(), oom_reap_task() will not run
> +		 * on this "mm" post up_write().
> +		 *
> +		 * tsk_is_oom_victim() cannot be set from under us
> +		 * either because current->mm is already set to NULL
> +		 * under task_lock before calling mmput and oom_mm is
> +		 * set not NULL by the OOM killer only if current->mm
> +		 * is found not NULL while holding the task_lock.
> +		 */
> +		down_write(&mm->mmap_sem);
> +		up_write(&mm->mmap_sem);
> +	}
>  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
>  	tlb_finish_mmu(&tlb, 0, -1);

Yes this will work and it won't depend on the oom_lock. But isn't it
just more ugly than simply doing

	if (tsk_is_oom_victim) {
		down_write(&mm->mmap_sem);
		locked = true;
	}
	free_pgtables(...)
	[...]
	if (locked)
		down_up(&mm->mmap_sem);

in general I do not like empty locked sections much, to be honest. Now
with the conditional locking my patch looks as follows. It should be
pretty much equivalent to your patch. Would that be acceptable for you
or do you think there is a strong reason to go with yours?
---

  parent reply	other threads:[~2017-07-27  6:50 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-24  7:23 [PATCH] mm, oom: allow oom reaper to race with exit_mmap Michal Hocko
2017-07-24 14:00 ` Kirill A. Shutemov
2017-07-24 14:15   ` Michal Hocko
2017-07-24 14:51     ` Kirill A. Shutemov
2017-07-24 16:11       ` Michal Hocko
2017-07-25 14:17         ` Kirill A. Shutemov
2017-07-25 14:26           ` Michal Hocko
2017-07-25 15:07             ` Kirill A. Shutemov
2017-07-25 15:15               ` Michal Hocko
2017-07-25 14:26         ` Michal Hocko
2017-07-25 15:17           ` Kirill A. Shutemov
2017-07-25 15:23             ` Michal Hocko
2017-07-25 15:31               ` Kirill A. Shutemov
2017-07-25 16:04                 ` Michal Hocko
2017-07-25 19:19                   ` Andrea Arcangeli
2017-07-26  5:45                     ` Michal Hocko
2017-07-26 16:29                       ` Andrea Arcangeli
2017-07-26 16:43                         ` Andrea Arcangeli
2017-07-27  6:50                         ` Michal Hocko [this message]
2017-07-27 14:55                           ` Andrea Arcangeli
2017-07-28  6:23                             ` Michal Hocko
2017-07-28  1:58                         ` [PATCH 1/1] mm: oom: let oom_reap_task and exit_mmap to run kbuild test robot
2017-08-15  0:20                         ` [PATCH] mm, oom: allow oom reaper to race with exit_mmap David Rientjes
2017-07-24 15:27 ` Michal Hocko
2017-07-24 16:42 ` kbuild test robot
2017-07-24 18:12   ` Michal Hocko
2017-07-25 15:26 ` Andrea Arcangeli
2017-07-25 15:45   ` Michal Hocko
2017-07-25 18:26     ` Andrea Arcangeli
2017-07-26  5:45       ` Michal Hocko
2017-07-26 16:39         ` Andrea Arcangeli
2017-07-27  6:32           ` Michal Hocko
2017-08-10  8:16 Michal Hocko
2017-08-10 18:05 ` Andrea Arcangeli
2017-08-10 18:51   ` Michal Hocko
2017-08-10 20:36     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170727065023.GB20970@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).