From: Jason Gunthorpe <jgg@ziepe.ca>
To: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Paul McKenney <paulmckrcu@fb.com>
Subject: Re: synchronize_rcu in munmap?
Date: Tue, 9 Feb 2021 13:38:22 -0400
Message-ID: <20210209173822.GH4718@ziepe.ca>
In-Reply-To: <17e3b4d0-8a16-75ba-e1c7-b678e4cf2089@linux.ibm.com>

On Tue, Feb 09, 2021 at 06:19:35PM +0100, Laurent Dufour wrote:
> On 09/02/2021 at 15:29, Matthew Wilcox wrote:
> > On Mon, Feb 08, 2021 at 01:26:43PM +0000, Matthew Wilcox wrote:
> > > Next problem: /proc/$pid/smaps calls walk_page_vma() which starts out by
> > > saying:
> > >          mmap_assert_locked(walk.mm);
> > > which made me realise that smaps is also going to walk the page tables.
> > > So the page tables have to be pinned by the existence of the VMA.
> > > Which means the page tables must be freed by the same RCU callback that
> > > frees the VMA.  But doing that means that a task which calls mmap();
> > > munmap(); mmap(); must avoid allocating the same address for the second
> > > mmap (until the RCU grace period has elapsed), otherwise threads on
> > > other CPUs may see the stale PTEs instead of the new ones.
> > > 
> > > Solution 1: Move the page table freeing into the RCU callback, call
> > > synchronize_rcu() in munmap().
> > > 
> > > Solution 2: Refcount the VMA and free the page tables on refcount
> > > dropping to zero.  This doesn't actually work because the stale PTE
> > > problem still exists.
> > > 
> > > Solution 3: When unmapping a VMA, instead of erasing the VMA from the
> > > maple tree, put a "dead" entry in its place.  Once the RCU freeing and the
> > > TLB shootdown has happened, erase the entry and it can then be allocated.
> > > If we do that MAP_FIXED will have to synchronize_rcu() if it overlaps
> > > a dead entry.
> > 
> > Solution 4: RCU free the page table pages and teach pagewalk.c to
> > be RCU-safe.  That means that it will have to use rcu_dereference()
> > or READ_ONCE to dereference (eg) pmdp, but also allows GUP-fast to run
> > under the rcu read lock instead of disabling interrupts.
> 
> I might be wrong, but my understanding is that the RCU window cannot be
> closed on a CPU where IRQs are disabled. So as a first step GUP-fast might
> continue to disable interrupts in order to walk the page directories safely.

Yes, this is right. PPC already uses RCU for the TLB flush and the
GUP-fast trick is safe against that.

The comments for PPC say the downside of RCU is having to do an
allocation in paths that really don't want to fail on memory
exhaustion.

pagewalk.c needs to call its ops in a sleepable context; otherwise
it could just use the normal page table locks. A plain rcu_read_lock()
section must not sleep, so I'm not sure RCU can be made to fit here?

Jason
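
Editorial sketch, not part of the original message: the "Solution 4" idea
quoted above (load each page-table pointer exactly once, then work only from
that snapshot) can be illustrated in userspace C11. Everything here is a
hypothetical stand-in: struct upper, struct leaf, load_once(), walk_one() and
unpublish() are invented names, the acquire load only approximates what
rcu_dereference()/READ_ONCE() provide in the kernel, and the grace period
before freeing is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ENTRIES 4

/* Hypothetical two-level table: each upper slot points to a leaf table. */
struct leaf {
	_Atomic unsigned long pte[ENTRIES];
};

struct upper {
	struct leaf *_Atomic slot[ENTRIES];
};

/* Userspace stand-in for rcu_dereference(): one load of the pointer. */
static struct leaf *load_once(struct leaf *_Atomic *p)
{
	return atomic_load_explicit(p, memory_order_acquire);
}

/*
 * Reader: returns 1 and stores the entry in *out if the upper slot is
 * populated, 0 otherwise. The slot is read once into a local; the walker
 * never re-reads it, so a concurrent unpublish() cannot change what this
 * walk sees halfway through.
 */
static int walk_one(struct upper *top, unsigned int i, unsigned int j,
		    unsigned long *out)
{
	struct leaf *l = load_once(&top->slot[i]);

	if (!l)
		return 0;
	*out = atomic_load_explicit(&l->pte[j], memory_order_relaxed);
	return 1;
}

/*
 * Writer: detach a leaf table and hand it back to the caller. In the
 * scheme under discussion the caller would free it only after a grace
 * period; this sketch leaves that step out.
 */
static struct leaf *unpublish(struct upper *top, unsigned int i)
{
	return atomic_exchange(&top->slot[i], (struct leaf *)NULL);
}
```

A reader that loaded the slot before unpublish() ran still holds a valid
pointer, which is why the leaf must not be freed until all such readers are
done. That deferral is the part the thread is debating.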


