linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Michel Lespinasse <michel@lespinasse.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Linux-MM <linux-mm@kvack.org>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Michal Hocko <mhocko@suse.com>, Rik van Riel <riel@surriel.com>,
	Paul McKenney <paulmck@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Rom Lemarchand <romlem@google.com>,
	Linux-Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 24/37] mm: implement speculative handling in __do_fault()
Date: Thu, 8 Apr 2021 12:28:08 +0100	[thread overview]
Message-ID: <20210408112808.GK2531743@casper.infradead.org> (raw)
In-Reply-To: <20210408083734.GB27824@lespinasse.org>

On Thu, Apr 08, 2021 at 01:37:34AM -0700, Michel Lespinasse wrote:
> On Thu, Apr 08, 2021 at 08:13:43AM +0100, Matthew Wilcox wrote:
> > On Thu, Apr 08, 2021 at 09:00:26AM +0200, Peter Zijlstra wrote:
> > > On Wed, Apr 07, 2021 at 10:27:12PM +0100, Matthew Wilcox wrote:
> > > > Doing I/O without any lock held already works; it just uses the file
> > > > refcount.  It would be better to use a vma refcount, as I already said.
> > > 
> > > The original workload that I developed SPF for (waaaay back when) was
> > > prefaulting a single huge vma. Using a vma refcount was a total loss
> > > because it resulted in the same cacheline contention that down_read()
> > > was having.
> > > 
> > > As such, I'm always incredibly sad to see mention of vma refcounts.
> > > They're fundamentally not solving the problem :/
> > 
> > OK, let me outline my locking scheme because I think it's rather better
> > than Michel's.  The vma refcount is the slow path.
> > 
> > 1. take the RCU read lock
> > 2. walk the pgd/p4d/pud/pmd
> > 3. allocate page tables if necessary.  *handwave GFP flags*.
> > 4. walk the vma tree
> > 5. call ->map_pages
> > 6. take ptlock
> > 7. insert page(s)
> > 8. drop ptlock
> > if this all worked out, we're done, drop the RCU read lock and return.
> > 9. increment vma refcount
> > 10. drop RCU read lock
> > 11. call ->fault
> > 12. decrement vma refcount
> 
> Note that most of your proposed steps seem similar in principle to mine.
> Looking at the fast path (steps 1-8):
> - step 2 sounds like the speculative part of __handle_mm_fault()
> - (step 3 not included in my proposal)
> - step 4 is basically the lookup I currently have in the arch fault handler
> - step 6 sounds like the speculative part of map_pte_lock()
> 
> I have working implementations for each step, while your proposal
> summarizes each as a point item. It's not clear to me what to make of it;
> presumably you would be "filling in the blanks" in a different way
> than I have but you are not explaining how. Are you suggesting that
> the precautions taken in each step to avoid races with mmap writers
> would not be necessary in your proposal ? if that is the case, what is
> the alternative mechanism would you use to handle such races ?

I don't know if you noticed, I've been a little busy with memory folios.
I did tell you that on the call, but you don't seem to retain anything
I tell you on the call, so maybe I shouldn't bother calling in any more.

> Going back to the source of this, you suggested not copying the VMA,
> what is your proposed alternative ? Do you suggest that fault handlers
> should deal with the vma potentially mutating under them ? Or should
> mmap writers consider vmas as immutable and copy them whenever they
> want to change them ? or are you implying a locking mechanism that would
> prevent mmap writers from executing while the fault is running ?

The VMA should be immutable, as I explained to you before.

  reply	other threads:[~2021-04-08 11:29 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-07  1:44 [RFC PATCH 00/37] Speculative page faults Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 01/37] mmap locking API: mmap_lock_is_contended returns a bool Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 02/37] mmap locking API: name the return values Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 03/37] do_anonymous_page: use update_mmu_tlb() Michel Lespinasse
2021-04-07  2:06   ` Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 04/37] do_anonymous_page: reduce code duplication Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 05/37] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 06/37] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 07/37] mm: add FAULT_FLAG_SPECULATIVE flag Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 08/37] mm: add do_handle_mm_fault() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 09/37] mm: add per-mm mmap sequence counter for speculative page fault handling Michel Lespinasse
2021-04-07 14:47   ` Peter Zijlstra
2021-04-07 20:50     ` Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 10/37] mm: rcu safe vma freeing Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 11/37] x86/mm: attempt speculative mm faults first Michel Lespinasse
2021-04-07 14:48   ` Peter Zijlstra
2021-04-07 15:35     ` Matthew Wilcox
2021-04-07 20:32       ` Michel Lespinasse
2021-04-07 20:14     ` Michel Lespinasse
2021-04-07 20:18       ` Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 12/37] mm: refactor __handle_mm_fault() / handle_pte_fault() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm_fault() Michel Lespinasse
2021-04-07 15:36   ` Andy Lutomirski
2021-04-28 14:58     ` Michel Lespinasse
2021-04-28 15:13       ` Andy Lutomirski
2021-04-28 16:11         ` Paul E. McKenney
2021-04-29  0:02           ` Michel Lespinasse
2021-04-29  0:05             ` Andy Lutomirski
2021-04-29 16:12               ` Matthew Wilcox
2021-04-29 18:04                 ` Andy Lutomirski
2021-04-29 19:14                 ` Michel Lespinasse
2021-04-29 19:34                   ` Matthew Wilcox
2021-04-29 23:56                     ` Michel Lespinasse
2021-04-29 15:52             ` Paul E. McKenney
2021-04-29 18:34               ` Paul E. McKenney
2021-04-29 18:49                 ` Matthew Wilcox
2021-05-03  3:14                   ` Paul E. McKenney
2021-04-29 21:17                 ` Michel Lespinasse
2021-05-03  3:40                   ` Paul E. McKenney
2021-05-03  4:34                     ` Michel Lespinasse
2021-05-03 16:32                       ` Paul E. McKenney
2021-04-07  1:44 ` [RFC PATCH 14/37] mm: add pte_map_lock() and pte_spinlock() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 15/37] mm: implement speculative handling in do_anonymous_page() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 16/37] mm: enable speculative fault handling through do_anonymous_page() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 17/37] mm: implement speculative handling in do_numa_page() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 18/37] mm: enable speculative fault " Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 19/37] mm: implement speculative handling in wp_page_copy() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 20/37] mm: implement and enable speculative fault handling in handle_pte_fault() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 21/37] mm: implement speculative handling in do_swap_page() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 22/37] mm: enable speculative fault handling through do_swap_page() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 23/37] mm: rcu safe vma->vm_file freeing Michel Lespinasse
2021-04-08  5:12   ` [mm] 87b1c39af4: nvml.blk_rw_mt_TEST0_check_pmem_debug.fail kernel test robot
2021-04-07  1:44 ` [RFC PATCH 24/37] mm: implement speculative handling in __do_fault() Michel Lespinasse
2021-04-07  2:35   ` Matthew Wilcox
2021-04-07  2:53     ` Michel Lespinasse
2021-04-07  3:01       ` Matthew Wilcox
2021-04-07 14:40   ` Peter Zijlstra
2021-04-07 21:20     ` Michel Lespinasse
2021-04-07 21:27       ` Matthew Wilcox
2021-04-08  7:00         ` Peter Zijlstra
2021-04-08  7:13           ` Matthew Wilcox
2021-04-08  8:18             ` Peter Zijlstra
2021-04-08  8:37             ` Michel Lespinasse
2021-04-08 11:28               ` Matthew Wilcox [this message]
2021-04-07  1:44 ` [RFC PATCH 25/37] mm: implement speculative handling in filemap_fault() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 26/37] mm: implement speculative fault handling in finish_fault() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 27/37] mm: implement speculative handling in do_fault_around() Michel Lespinasse
2021-04-07  2:37   ` Matthew Wilcox
2021-04-07  1:44 ` [RFC PATCH 28/37] mm: implement speculative handling in filemap_map_pages() Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 29/37] fs: list file types that support speculative faults Michel Lespinasse
2021-04-07  2:39   ` Matthew Wilcox
2021-04-07  1:44 ` [RFC PATCH 30/37] mm: enable speculative fault handling for supported file types Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 31/37] ext4: implement speculative fault handling Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 32/37] f2fs: " Michel Lespinasse
2021-04-07  1:44 ` [RFC PATCH 33/37] mm: enable speculative fault handling only for multithreaded user space Michel Lespinasse
2021-04-07  2:48   ` Matthew Wilcox
2021-04-07  1:44 ` [RFC PATCH 34/37] mm: rcu safe vma freeing " Michel Lespinasse
2021-04-07  2:50   ` Matthew Wilcox
2021-04-08  7:53     ` Michel Lespinasse
2021-04-07  1:45 ` [RFC PATCH 35/37] mm: spf statistics Michel Lespinasse
2021-04-07  1:45 ` [RFC PATCH 36/37] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-07  1:45 ` [RFC PATCH 37/37] arm64/mm: attempt speculative mm faults first Michel Lespinasse
2021-04-21  1:44 ` [RFC PATCH 00/37] Speculative page faults Chinwen Chang
2021-06-28 22:14 ` Axel Rasmussen
2021-07-21 11:33 ` vjitta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210408112808.GK2531743@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=joelaf@google.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=michel@lespinasse.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=romlem@google.com \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).