linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, michel@lespinasse.org,
	jglisse@google.com, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mgorman@techsingularity.net,
	dave@stgolabs.net, willy@infradead.org, peterz@infradead.org,
	ldufour@linux.ibm.com, paulmck@kernel.org, mingo@redhat.com,
	will@kernel.org, luto@kernel.org, songliubraving@fb.com,
	peterx@redhat.com, david@redhat.com, dhowells@redhat.com,
	hughd@google.com, bigeasy@linutronix.de,
	kent.overstreet@linux.dev, punit.agrawal@bytedance.com,
	lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com,
	chriscli@google.com, axelrasmussen@google.com, joelaf@google.com,
	minchan@google.com, rppt@kernel.org, jannh@google.com,
	shakeelb@google.com, tatashin@google.com, edumazet@google.com,
	gthelen@google.com, gurua@google.com, arjunroy@google.com,
	soheil@google.com, leewalsh@google.com, posk@google.com,
	michalechner92@googlemail.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v3 21/35] mm/mmap: write-lock adjacent VMAs if they can grow into unmapped area
Date: Fri, 17 Feb 2023 09:50:52 -0500	[thread overview]
Message-ID: <20230217145052.y526nmjudi6t2ael@revolver> (raw)
In-Reply-To: <CAJuCfpEkujbHNxNWcWr8bmrsMhXGcpDyraOfQaPAcOH=RQPv5A@mail.gmail.com>

* Suren Baghdasaryan <surenb@google.com> [230216 14:36]:
> On Thu, Feb 16, 2023 at 7:34 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> >
> > First, sorry I didn't see this before v3..
> 
> Feedback at any time is highly appreciated!
> 
> >
> > * Suren Baghdasaryan <surenb@google.com> [230216 00:18]:
> > > While unmapping VMAs, adjacent VMAs might be able to grow into the area
> > > being unmapped. In such cases write-lock adjacent VMAs to prevent this
> > > growth.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > ---
> > >  mm/mmap.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 118b2246bba9..00f8c5798936 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2399,11 +2399,13 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > >        * down_read(mmap_lock) and collide with the VMA we are about to unmap.
> > >        */
> > >       if (downgrade) {
> > > -             if (next && (next->vm_flags & VM_GROWSDOWN))
> > > +             if (next && (next->vm_flags & VM_GROWSDOWN)) {
> > > +                     vma_start_write(next);
> > >                       downgrade = false;
> >
> > If the mmap write lock is insufficient to protect us from next/prev
> > modifications then we need to move *most* of this block above the maple
> > tree write operation, otherwise we have a race here.  When I say most, I
> > mean everything besides the call to mmap_write_downgrade() needs to be
> > moved.
> 
> Which prior maple tree write operation are you referring to? I see
> __split_vma() and munmap_sidetree() which both already lock the VMAs
> they operate on, so page faults can't happen in those VMAs.

The write that removes the VMAs from the maple tree a few lines above..
/* Point of no return */

If the mmap lock is not sufficient, then we need to move the
vma_start_write() of prev/next to above the call to
vma_iter_clear_gfp() in do_vmi_align_munmap().

But I still think it IS enough.

> 
> >
> > If the mmap write lock is sufficient to protect us from next/prev
> > modifications then we don't need to write lock the vmas themselves.
> 
> mmap write lock is not sufficient because with per-VMA locks we do not
> take mmap lock at all.

Understood, but it also does not expand VMAs.

> 
> >
> > I believe this is for expand_stack() protection, so I believe it's okay
> > to not vma write lock these vmas.. I don't think there are other areas
> > where we can modify the vmas without holding the mmap lock, but others
> > on the CC list please chime in if I've forgotten something.
> >
> > So, if I am correct, then you shouldn't lock next/prev and allow the
> > vma locking fault method on these vmas.  This will work because
> > lock_vma_under_rcu() uses mas_walk() on the faulting address.  That is,
> > your lock_vma_under_rcu() will fail to find anything that needs to be
> > grown and go back to mmap lock protection.  As it is written today, the
> > vma locking fault handler will fail and we will wait for the mmap lock
> > to be released even when the vma isn't going to expand.
> 
> So, let's consider a case when the next VMA is not being removed (so
> it was neither removed nor locked by munmap_sidetree()) and it is
> found by lock_vma_under_rcu() in the page fault handling path.

By this point next VMA is either NULL or outside the munmap area, so
what you said here is always true.

>Page
> fault handler can now expand it and push into the area we are
> unmapping in unmap_region(). That is the race I'm trying to prevent
> here by locking the next/prev VMAs which can be expanded before
> unmap_region() unmaps them. Am I missing something?

Yes, I think the part you are missing (or I am missing..) is that
expand_stack() will never be called without the mmap lock.  We don't use
the vma locking to expand the stack.

...


  reply	other threads:[~2023-02-17 14:52 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-16  5:17 [PATCH v3 00/35] Per-VMA locks Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 01/35] maple_tree: Be more cautious about dead nodes Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 02/35] maple_tree: Detect dead nodes in mas_start() Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 03/35] maple_tree: Fix freeing of nodes in rcu mode Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 04/35] maple_tree: remove extra smp_wmb() from mas_dead_leaves() Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 05/35] maple_tree: Fix write memory barrier of nodes once dead for RCU mode Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 06/35] maple_tree: Add smp_rmb() to dead node detection Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 07/35] maple_tree: Add RCU lock checking to rcu callback functions Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 08/35] mm: Enable maple tree RCU mode by default Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 09/35] mm: introduce CONFIG_PER_VMA_LOCK Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 10/35] mm: rcu safe VMA freeing Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 11/35] mm: move mmap_lock assert function definitions Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 12/35] mm: add per-VMA lock and helper functions to control it Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 13/35] mm: mark VMA as being written when changing vm_flags Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 14/35] mm/mmap: move VMA locking before vma_adjust_trans_huge call Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 15/35] mm/khugepaged: write-lock VMA while collapsing a huge page Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 16/35] mm/mmap: write-lock VMAs before merging, splitting or expanding them Suren Baghdasaryan
2023-02-23 14:51   ` Hyeonggon Yoo
2023-02-23 14:59     ` Hyeonggon Yoo
2023-02-23 17:46     ` Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 17/35] mm/mmap: write-lock VMA before shrinking or expanding it Suren Baghdasaryan
2023-02-23 20:20   ` Liam R. Howlett
2023-02-23 20:28     ` Liam R. Howlett
2023-02-23 21:16       ` Suren Baghdasaryan
2023-02-24  1:46         ` Liam R. Howlett
2023-02-24  2:06           ` Suren Baghdasaryan
2023-02-24 16:14             ` Liam R. Howlett
2023-02-24 16:19               ` Suren Baghdasaryan
2023-02-27 17:33                 ` Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 18/35] mm/mremap: write-lock VMA while remapping it to a new address range Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 19/35] mm: write-lock VMAs before removing them from VMA tree Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 20/35] mm: conditionally write-lock VMA in free_pgtables Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 21/35] mm/mmap: write-lock adjacent VMAs if they can grow into unmapped area Suren Baghdasaryan
2023-02-16 15:34   ` Liam R. Howlett
2023-02-16 19:36     ` Suren Baghdasaryan
2023-02-17 14:50       ` Liam R. Howlett [this message]
2023-02-17 15:54         ` Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 22/35] kernel/fork: assert no VMA readers during its destruction Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 23/35] mm/mmap: prevent pagefault handler from racing with mmu_notifier registration Suren Baghdasaryan
2023-02-23 20:06   ` Liam R. Howlett
2023-02-23 20:29     ` Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 24/35] mm: introduce vma detached flag Suren Baghdasaryan
2023-02-23 20:08   ` Liam R. Howlett
2023-02-23 20:34     ` Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 25/35] mm: introduce lock_vma_under_rcu to be used from arch-specific code Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 26/35] mm: fall back to mmap_lock if vma->anon_vma is not yet set Suren Baghdasaryan
2023-02-16 15:44   ` Matthew Wilcox
2023-02-16 19:43     ` Suren Baghdasaryan
2023-02-17  2:14       ` Suren Baghdasaryan
2023-02-17 10:21         ` Hyeonggon Yoo
2023-02-17 16:13           ` Suren Baghdasaryan
2023-02-17 18:49             ` Hyeonggon Yoo
2023-02-17 16:05         ` Matthew Wilcox
2023-02-17 16:10           ` Suren Baghdasaryan
2023-04-03 19:49             ` Matthew Wilcox
2023-02-16  5:17 ` [PATCH v3 27/35] mm: add FAULT_FLAG_VMA_LOCK flag Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 28/35] mm: prevent do_swap_page from handling page faults under VMA lock Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 29/35] mm: prevent userfaults to be handled under per-vma lock Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 30/35] mm: introduce per-VMA lock statistics Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 31/35] x86/mm: try VMA lock-based page fault handling first Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 32/35] arm64/mm: " Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 33/35] powerc/mm: " Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 34/35] mm/mmap: free vm_area_struct without call_rcu in exit_mmap Suren Baghdasaryan
2023-02-16  5:17 ` [PATCH v3 35/35] mm: separate vma->lock from vm_area_struct Suren Baghdasaryan
2023-02-24  9:21 ` [PATCH v3 00/35] Per-VMA locks freak07
2023-02-27 16:50   ` Davidlohr Bueso
2023-02-27 17:22     ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230217145052.y526nmjudi6t2ael@revolver \
    --to=liam.howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjunroy@google.com \
    --cc=axelrasmussen@google.com \
    --cc=bigeasy@linutronix.de \
    --cc=chriscli@google.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=gthelen@google.com \
    --cc=gurua@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=jglisse@google.com \
    --cc=joelaf@google.com \
    --cc=kent.overstreet@linux.dev \
    --cc=kernel-team@android.com \
    --cc=ldufour@linux.ibm.com \
    --cc=leewalsh@google.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lstoakes@gmail.com \
    --cc=luto@kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=michalechner92@googlemail.com \
    --cc=michel@lespinasse.org \
    --cc=minchan@google.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterjung1337@gmail.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=posk@google.com \
    --cc=punit.agrawal@bytedance.com \
    --cc=rientjes@google.com \
    --cc=rppt@kernel.org \
    --cc=shakeelb@google.com \
    --cc=soheil@google.com \
    --cc=songliubraving@fb.com \
    --cc=surenb@google.com \
    --cc=tatashin@google.com \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).