From: Peter Zijlstra <peterz@infradead.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>, paulmck@linux.vnet.ibm.com,
	akpm@linux-foundation.org, ak@linux.intel.com, mhocko@kernel.org,
	dave@stgolabs.net, jack@suse.cz, Matthew Wilcox <willy@infradead.org>,
	benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
	hpa@zytor.com, Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com,
	npiggin@gmail.com, bsingharora@gmail.com,
	Tim Chen <tim.c.chen@linux.intel.com>, linuxppc-dev@lists.ozlabs.org,
	x86@kernel.org
Subject: Re: [PATCH v2 14/20] mm: Provide speculative fault infrastructure
Date: Mon, 28 Aug 2017 11:37:27 +0200
Message-ID: <20170828093727.5wldedputadanssh@hirez.programming.kicks-ass.net>
In-Reply-To: <20170827001823.n5wgkfq36z6snvf2@node.shutemov.name>

On Sun, Aug 27, 2017 at 03:18:23AM +0300, Kirill A. Shutemov wrote:
> On Fri, Aug 18, 2017 at 12:05:13AM +0200, Laurent Dufour wrote:
> > +	/*
> > +	 * Can't call vm_ops service has we don't know what they would do
> > +	 * with the VMA.
> > +	 * This include huge page from hugetlbfs.
> > +	 */
> > +	if (vma->vm_ops)
> > +		goto unlock;
>
> I think we need to have a way to white-list safe ->vm_ops.

Either that, or simply teach all ->fault() callbacks about speculative
faults. Shouldn't be too hard, just 'work'.

> > +
> > +	if (unlikely(!vma->anon_vma))
> > +		goto unlock;
>
> It deserves a comment.

Yes, that was very much not intended. It wrecks most of the fun. This
really _should_ work for file maps too.

> > +	/*
> > +	 * Do a speculative lookup of the PTE entry.
> > +	 */
> > +	local_irq_disable();
> > +	pgd = pgd_offset(mm, address);
> > +	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
> > +		goto out_walk;
> > +
> > +	p4d = p4d_alloc(mm, pgd, address);
> > +	if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
> > +		goto out_walk;
> > +
> > +	pud = pud_alloc(mm, p4d, address);
> > +	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
> > +		goto out_walk;
> > +
> > +	pmd = pmd_offset(pud, address);
> > +	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
> > +		goto out_walk;
> > +
> > +	/*
> > +	 * The above does not allocate/instantiate page-tables because doing so
> > +	 * would lead to the possibility of instantiating page-tables after
> > +	 * free_pgtables() -- and consequently leaking them.
> > +	 *
> > +	 * The result is that we take at least one !speculative fault per PMD
> > +	 * in order to instantiate it.
> > +	 */
> >
> Doing all this job and just give up because we cannot allocate page tables
> looks very wasteful to me.
>
> Have you considered to look how we can hand over from speculative to
> non-speculative path without starting from scratch (when possible)?

So we _can_ in fact allocate and install page-tables, but we have to be
very careful about it. The interesting case is where we race with
free_pgtables() and install a page that was just taken out.

But since we already have the VMA I think we can do something like:

	if (p*g_none()) {
		p*d_t *new = p*d_alloc_one(mm, address);

		spin_lock(&mm->page_table_lock);
		if (!vma_changed_or_dead(vma,seq)) {
			if (p*d_none())
				p*d_populate(mm, p*d, new);
			else
				p*d_free(new);

			new = NULL;
		}
		spin_unlock(&mm->page_table_lock);

		if (new) {
			p*d_free(new);
			goto out_walk;
		}
	}

I just never bothered with that, figured we ought to get the basics
working before trying to be clever.

> > +	/* Transparent huge pages are not supported. */
> > +	if (unlikely(pmd_trans_huge(*pmd)))
> > +		goto out_walk;
>
> That's looks like a blocker to me.
>
> Is there any problem with making it supported (besides plain coding)?

Not that I can remember, but I never really looked at THP, I don't think
we even had that when I did the first versions.
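
[Editor's sketch, not part of the original mail.] The allocate-then-recheck
idea above can be modelled in plain userspace C. Everything here is a
hypothetical stand-in: struct model_mm, struct model_vma, model_spin_lock()
and spec_install_table() are invented for illustration, a single pointer
slot plays the role of one p*d entry, and a C11 atomic_flag plays the role
of mm->page_table_lock. The model also assumes that whoever tears the
mapping down marks the VMA changed or dead before freeing tables under the
same lock; whether that holds in the kernel is exactly the part the mail
says needs care.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one level of page table. */
struct model_table { unsigned long entries[8]; };

/* Hypothetical VMA: writers bump seq on every change, set dead on teardown. */
struct model_vma {
	unsigned int seq;
	bool dead;
};

/* Hypothetical mm: slot == NULL plays the role of p*d_none(). */
struct model_mm {
	atomic_flag page_table_lock;
	struct model_table *slot;
};

static void model_spin_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void model_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

static bool vma_changed_or_dead(const struct model_vma *vma, unsigned int seq)
{
	return vma->dead || vma->seq != seq;
}

/*
 * Returns true if the VMA was still unchanged under the lock, in which case
 * the slot is populated on return (by us or by someone who got there first).
 * Returns false if the caller has to give up on the speculative path.
 */
static bool spec_install_table(struct model_mm *mm,
			       const struct model_vma *vma, unsigned int seq)
{
	struct model_table *new;
	bool valid;

	assert(mm != NULL && vma != NULL);

	/* Allocate outside the lock, as the pseudocode does. */
	new = calloc(1, sizeof(*new));
	if (!new)
		return false;

	model_spin_lock(&mm->page_table_lock);
	valid = !vma_changed_or_dead(vma, seq);
	if (valid && !mm->slot) {
		mm->slot = new;		/* p*d_populate() */
		new = NULL;		/* ownership moved into the slot */
	}
	model_spin_unlock(&mm->page_table_lock);

	free(new);	/* lost the race or VMA changed; free(NULL) is a no-op */
	return valid;
}
```

A caller would sample vma->seq when it starts the speculative walk, call
spec_install_table() when it finds an empty entry, and fall back to the
regular fault path when the call returns false.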