From: Michel Lespinasse <michel@lespinasse.org>
To: Linux-MM <linux-mm@kvack.org>,
Linux-Kernel <linux-kernel@vger.kernel.org>
Cc: Laurent Dufour <ldufour@linux.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Rik van Riel <riel@surriel.com>,
Paul McKenney <paulmck@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
Joel Fernandes <joelaf@google.com>,
Andy Lutomirski <luto@kernel.org>,
Michel Lespinasse <michel@lespinasse.org>
Subject: [PATCH 15/29] mm: implement speculative handling in __handle_mm_fault().
Date: Fri, 30 Apr 2021 12:52:16 -0700 [thread overview]
Message-ID: <20210430195232.30491-16-michel@lespinasse.org> (raw)
In-Reply-To: <20210430195232.30491-1-michel@lespinasse.org>
The speculative path calls speculative_page_walk_begin() before walking
the page table tree to prevent page table reclamation. The logic is
otherwise similar to the non-speculative path, but with additional
restrictions: in the speculative path, we do not handle huge pages or
wiring new pages tables.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
include/linux/mm.h | 4 +++
mm/memory.c | 77 ++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 79 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d5988e78e6ab..dee8a4833779 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -525,6 +525,10 @@ struct vm_fault {
};
unsigned int flags; /* FAULT_FLAG_xxx flags
* XXX: should really be 'const' */
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+ unsigned long seq;
+ pmd_t orig_pmd;
+#endif
pmd_t *pmd; /* Pointer to pmd entry matching
* the 'address' */
pud_t *pud; /* Pointer to pud entry matching
diff --git a/mm/memory.c b/mm/memory.c
index 45696166b10f..3f5c3d6c0197 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4329,7 +4329,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
* return value. See filemap_fault() and __lock_page_or_retry().
*/
static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
- unsigned long address, unsigned int flags)
+ unsigned long address, unsigned int flags, unsigned long seq)
{
struct vm_fault vmf = {
.vma = vma,
@@ -4344,6 +4344,79 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
p4d_t *p4d;
vm_fault_t ret;
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+ if (flags & FAULT_FLAG_SPECULATIVE) {
+ pgd_t pgdval;
+ p4d_t p4dval;
+ pud_t pudval;
+
+ vmf.seq = seq;
+
+ speculative_page_walk_begin();
+ pgd = pgd_offset(mm, address);
+ pgdval = READ_ONCE(*pgd);
+ if (pgd_none(pgdval) || unlikely(pgd_bad(pgdval)))
+ goto spf_fail;
+
+ p4d = p4d_offset(pgd, address);
+ p4dval = READ_ONCE(*p4d);
+ if (p4d_none(p4dval) || unlikely(p4d_bad(p4dval)))
+ goto spf_fail;
+
+ vmf.pud = pud_offset(p4d, address);
+ pudval = READ_ONCE(*vmf.pud);
+ if (pud_none(pudval) || unlikely(pud_bad(pudval)) ||
+ unlikely(pud_trans_huge(pudval)) ||
+ unlikely(pud_devmap(pudval)))
+ goto spf_fail;
+
+ vmf.pmd = pmd_offset(vmf.pud, address);
+ vmf.orig_pmd = READ_ONCE(*vmf.pmd);
+
+ /*
+ * pmd_none could mean that a hugepage collapse is in
+ * progress in our back as collapse_huge_page() mark
+ * it before invalidating the pte (which is done once
+ * the IPI is catched by all CPU and we have interrupt
+ * disabled). For this reason we cannot handle THP in
+ * a speculative way since we can't safely identify an
+ * in progress collapse operation done in our back on
+ * that PMD.
+ */
+ if (unlikely(pmd_none(vmf.orig_pmd) ||
+ is_swap_pmd(vmf.orig_pmd) ||
+ pmd_trans_huge(vmf.orig_pmd) ||
+ pmd_devmap(vmf.orig_pmd)))
+ goto spf_fail;
+
+ /*
+ * The above does not allocate/instantiate page-tables because
+ * doing so would lead to the possibility of instantiating
+ * page-tables after free_pgtables() -- and consequently
+ * leaking them.
+ *
+ * The result is that we take at least one non-speculative
+ * fault per PMD in order to instantiate it.
+ */
+
+ vmf.pte = pte_offset_map(vmf.pmd, address);
+ vmf.orig_pte = READ_ONCE(*vmf.pte);
+ barrier();
+ if (pte_none(vmf.orig_pte)) {
+ pte_unmap(vmf.pte);
+ vmf.pte = NULL;
+ }
+
+ speculative_page_walk_end();
+
+ return handle_pte_fault(&vmf);
+
+ spf_fail:
+ speculative_page_walk_end();
+ return VM_FAULT_RETRY;
+ }
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+
pgd = pgd_offset(mm, address);
p4d = p4d_alloc(mm, pgd, address);
if (!p4d)
@@ -4563,7 +4636,7 @@ vm_fault_t do_handle_mm_fault(struct vm_area_struct *vma,
if (unlikely(is_vm_hugetlb_page(vma)))
ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
else
- ret = __handle_mm_fault(vma, address, flags);
+ ret = __handle_mm_fault(vma, address, flags, seq);
if (flags & FAULT_FLAG_USER) {
mem_cgroup_exit_user_fault();
--
2.20.1
next prev parent reply other threads:[~2021-04-30 19:53 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-30 19:52 [PATCH 00/29] Speculative page faults (anon vmas only) Michel Lespinasse
2021-04-30 19:52 ` [PATCH 01/29] mm: export dump_mm Michel Lespinasse
2021-04-30 19:52 ` [PATCH 02/29] mmap locking API: mmap_lock_is_contended returns a bool Michel Lespinasse
2021-04-30 19:52 ` [PATCH 03/29] mmap locking API: name the return values Michel Lespinasse
2021-04-30 19:52 ` [PATCH 04/29] do_anonymous_page: use update_mmu_tlb() Michel Lespinasse
2021-06-10 0:38 ` Suren Baghdasaryan
2021-04-30 19:52 ` [PATCH 05/29] do_anonymous_page: reduce code duplication Michel Lespinasse
2021-04-30 19:52 ` [PATCH 06/29] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-30 19:52 ` [PATCH 07/29] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-30 19:52 ` [PATCH 08/29] mm: add FAULT_FLAG_SPECULATIVE flag Michel Lespinasse
2021-06-10 0:58 ` Suren Baghdasaryan
2021-04-30 19:52 ` [PATCH 09/29] mm: add do_handle_mm_fault() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 10/29] mm: add per-mm mmap sequence counter for speculative page fault handling Michel Lespinasse
2021-04-30 19:52 ` [PATCH 11/29] mm: rcu safe vma freeing Michel Lespinasse
2021-04-30 19:52 ` [PATCH 12/29] x86/mm: attempt speculative mm faults first Michel Lespinasse
2021-04-30 19:52 ` [PATCH 13/29] mm: add speculative_page_walk_begin() and speculative_page_walk_end() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 14/29] mm: refactor __handle_mm_fault() / handle_pte_fault() Michel Lespinasse
2021-04-30 19:52 ` Michel Lespinasse [this message]
2021-04-30 19:52 ` [PATCH 16/29] mm: add pte_map_lock() and pte_spinlock() Michel Lespinasse
2021-04-30 23:33 ` kernel test robot
2021-04-30 23:45 ` kernel test robot
2021-04-30 19:52 ` [PATCH 17/29] mm: implement speculative handling in do_anonymous_page() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 18/29] mm: enable speculative fault handling through do_anonymous_page() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 19/29] mm: implement speculative handling in do_numa_page() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 20/29] mm: enable speculative fault " Michel Lespinasse
2021-04-30 19:52 ` [PATCH 21/29] mm: implement speculative handling in wp_page_copy() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 22/29] mm: implement and enable speculative fault handling in handle_pte_fault() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 23/29] mm: implement speculative handling in do_swap_page() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 24/29] mm: enable speculative fault handling through do_swap_page() Michel Lespinasse
2021-04-30 19:52 ` [PATCH 25/29] mm: disable speculative faults for single threaded user space Michel Lespinasse
2021-04-30 19:52 ` [PATCH 26/29] mm: disable rcu safe vma freeing " Michel Lespinasse
2021-04-30 19:52 ` [PATCH 27/29] mm: anon spf statistics Michel Lespinasse
2021-04-30 22:52 ` kernel test robot
2021-04-30 19:52 ` [PATCH 28/29] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-30 19:52 ` [PATCH 29/29] arm64/mm: attempt speculative mm faults first Michel Lespinasse
2021-04-30 19:52 ` [PATCH 30/31] powerpc/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Michel Lespinasse
2021-04-30 19:52 ` [PATCH 31/31] powerpc/mm: attempt speculative mm faults first Michel Lespinasse
2021-04-30 22:46 ` [PATCH 00/29] Speculative page faults (anon vmas only) Michel Lespinasse
2021-05-03 18:11 ` Michel Lespinasse
2021-05-17 17:57 ` Paul E. McKenney
2021-05-20 22:10 ` Suren Baghdasaryan
2021-05-20 23:08 ` Paul E. McKenney
2021-06-01 7:41 ` Michel Lespinasse
2021-06-01 20:18 ` Paul E. McKenney
2021-06-01 20:23 ` Paul E. McKenney
2021-06-14 7:04 ` Michel Lespinasse
2021-05-01 19:56 ` Theodore Ts'o
2021-05-01 21:19 ` Michel Lespinasse
2021-06-17 13:46 ` David Hildenbrand
2021-07-09 10:41 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210430195232.30491-16-michel@lespinasse.org \
--to=michel@lespinasse.org \
--cc=akpm@linux-foundation.org \
--cc=joelaf@google.com \
--cc=ldufour@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mhocko@suse.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=riel@surriel.com \
--cc=surenb@google.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).