From: Michel Lespinasse <michel@lespinasse.org>
To: Linux-MM <linux-mm@kvack.org>
Cc: Laurent Dufour <ldufour@linux.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Matthew Wilcox <willy@infradead.org>,
	Rik van Riel <riel@surriel.com>,
	Paul McKenney <paulmck@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Rom Lemarchand <romlem@google.com>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	Michel Lespinasse <michel@lespinasse.org>
Subject: [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm_fault().
Date: Tue,  6 Apr 2021 18:44:38 -0700
Message-ID: <20210407014502.24091-14-michel@lespinasse.org>
In-Reply-To: <20210407014502.24091-1-michel@lespinasse.org>

The page table tree is walked with local irqs disabled, which prevents
page table reclamation (similarly to what fast GUP does). The logic is
otherwise similar to the non-speculative path, but with additional
restrictions: the speculative path does not handle huge pages and does
not wire new page tables.

Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
 include/linux/mm.h |  4 +++
 mm/memory.c        | 77 ++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 79 insertions(+), 2 deletions(-)
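
Note for readers (not part of the commit): the bail-out pattern used by
the speculative walk can be modelled in userspace roughly as below. This
is only a sketch under stated assumptions, not the kernel code. Every
name in it (spec_walk, top_table, leaf_table, ENTRY_HUGE, WALK_RETRY) is
hypothetical, the table has two levels instead of five, and an atomic
load stands in for READ_ONCE(). What it tries to show is that each
upper-level entry is snapshotted once, that a missing or huge entry makes
the walk give up rather than allocate, and that an empty leaf entry is
not a failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum walk_result { WALK_OK, WALK_RETRY };

#define ENTRIES    4
#define ENTRY_HUGE ((uintptr_t)1)	/* low bit tags a "huge" upper entry */

struct leaf_table { _Atomic uintptr_t slot[ENTRIES]; };
struct top_table  { _Atomic uintptr_t slot[ENTRIES]; };

static enum walk_result spec_walk(struct top_table *top, unsigned int hi,
				  unsigned int lo, uintptr_t *out)
{
	uintptr_t e;
	struct leaf_table *leaf;

	assert(hi < ENTRIES && lo < ENTRIES);

	/* Snapshot the entry once so every check sees the same value. */
	e = atomic_load_explicit(&top->slot[hi], memory_order_acquire);

	/* Missing or huge: never allocate here, let the caller retry. */
	if (e == 0 || (e & ENTRY_HUGE))
		return WALK_RETRY;

	leaf = (struct leaf_table *)e;

	/* An empty leaf slot is still a successful walk; *out is then 0. */
	*out = atomic_load_explicit(&leaf->slot[lo], memory_order_relaxed);
	return WALK_OK;
}
```

In this model a caller that gets WALK_RETRY would fall back to a locked
slow path that is allowed to populate the top-level slot, which mirrors
the "at least one non-speculative fault per PMD" remark in the patch.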

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d5988e78e6ab..dee8a4833779 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -525,6 +525,10 @@ struct vm_fault {
 	};
 	unsigned int flags;		/* FAULT_FLAG_xxx flags
 					 * XXX: should really be 'const' */
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	unsigned long seq;
+	pmd_t orig_pmd;
+#endif
 	pmd_t *pmd;			/* Pointer to pmd entry matching
 					 * the 'address' */
 	pud_t *pud;			/* Pointer to pud entry matching
diff --git a/mm/memory.c b/mm/memory.c
index 66e7a4554c54..a17704aac019 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4307,7 +4307,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
  * return value.  See filemap_fault() and __lock_page_or_retry().
  */
 static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
-		unsigned long address, unsigned int flags)
+		unsigned long address, unsigned int flags, unsigned long seq)
 {
 	struct vm_fault vmf = {
 		.vma = vma,
@@ -4322,6 +4322,79 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	p4d_t *p4d;
 	vm_fault_t ret;
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+	if (flags & FAULT_FLAG_SPECULATIVE) {
+		pgd_t pgdval;
+		p4d_t p4dval;
+		pud_t pudval;
+
+		vmf.seq = seq;
+
+		local_irq_disable();
+		pgd = pgd_offset(mm, address);
+		pgdval = READ_ONCE(*pgd);
+		if (pgd_none(pgdval) || unlikely(pgd_bad(pgdval)))
+			goto spf_fail;
+
+		p4d = p4d_offset(pgd, address);
+		p4dval = READ_ONCE(*p4d);
+		if (p4d_none(p4dval) || unlikely(p4d_bad(p4dval)))
+			goto spf_fail;
+
+		vmf.pud = pud_offset(p4d, address);
+		pudval = READ_ONCE(*vmf.pud);
+		if (pud_none(pudval) || unlikely(pud_bad(pudval)) ||
+		    unlikely(pud_trans_huge(pudval)) ||
+		    unlikely(pud_devmap(pudval)))
+			goto spf_fail;
+
+		vmf.pmd = pmd_offset(vmf.pud, address);
+		vmf.orig_pmd = READ_ONCE(*vmf.pmd);
+
+		/*
+		 * pmd_none could mean that a hugepage collapse is in
+		 * progress behind our back, as collapse_huge_page() marks
+		 * the pmd before invalidating the pte (which is done once
+		 * the IPI has been caught by all CPUs, and we have
+		 * interrupts disabled).  For this reason we cannot handle
+		 * THP speculatively: we can't safely identify an
+		 * in-progress collapse operation done behind our back on
+		 * that PMD.
+		 */
+		if (unlikely(pmd_none(vmf.orig_pmd) ||
+			     is_swap_pmd(vmf.orig_pmd) ||
+			     pmd_trans_huge(vmf.orig_pmd) ||
+			     pmd_devmap(vmf.orig_pmd)))
+			goto spf_fail;
+
+		/*
+		 * The above does not allocate/instantiate page-tables because
+		 * doing so would lead to the possibility of instantiating
+		 * page-tables after free_pgtables() -- and consequently
+		 * leaking them.
+		 *
+		 * The result is that we take at least one non-speculative
+		 * fault per PMD in order to instantiate it.
+		 */
+
+		vmf.pte = pte_offset_map(vmf.pmd, address);
+		vmf.orig_pte = READ_ONCE(*vmf.pte);
+		barrier();
+		if (pte_none(vmf.orig_pte)) {
+			pte_unmap(vmf.pte);
+			vmf.pte = NULL;
+		}
+
+		local_irq_enable();
+
+		return handle_pte_fault(&vmf);
+
+spf_fail:
+		local_irq_enable();
+		return VM_FAULT_RETRY;
+	}
+#endif	/* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 	pgd = pgd_offset(mm, address);
 	p4d = p4d_alloc(mm, pgd, address);
 	if (!p4d)
@@ -4541,7 +4614,7 @@ vm_fault_t do_handle_mm_fault(struct vm_area_struct *vma,
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
 	else
-		ret = __handle_mm_fault(vma, address, flags);
+		ret = __handle_mm_fault(vma, address, flags, seq);
 
 	if (flags & FAULT_FLAG_USER) {
 		mem_cgroup_exit_user_fault();
-- 
2.20.1

