From: Leonardo Bras <leonardo@linux.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Christophe Leroy <christophe.leroy@c-s.fr>,
	Steven Price <steven.price@arm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Leonardo Bras <leonardo@linux.ibm.com>,
	Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Reza Arbab <arbab@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Allison Randal <allison@lohutok.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Michal Suchanek <msuchanek@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v6 08/11] powerpc/kvm/book3s_hv: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:57 -0300
Message-ID: <20200206030900.147032-9-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new tracking functions to all book3s_hv related functions
that do lockless pagetable walks.

Adds comments explaining that some lockless pagetable walks don't need
protection, either because the guest pgd is not a target of THP
collapse/split or because they are called from real mode with
MSR_EE = 0.

kvmppc_do_h_enter: Fixes where local_irq_restore() must be placed
(after the last usage of ptep).

Given that some of these functions can be called in real mode, and
others always are, we use __{begin,end}_lockless_pgtbl_walk so we can
decide when to disable interrupts.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_nested.c | 22 ++++++++++++++++++++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 28 ++++++++++++++++++----------
 2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index dc97e5be76f6..a398061d5778 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -803,7 +803,11 @@ static void kvmhv_update_nest_rmap_rc(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find the pte */
+	/* Find the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	/*
 	 * If the pte is present and the pfn is still the same, update the pte.
@@ -853,7 +857,11 @@ static void kvmhv_remove_nest_rmap(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find and invalidate the pte */
+	/* Find and invalidate the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	/* Don't spuriously invalidate ptes if the pfn has changed */
 	if (ptep && pte_present(*ptep) && ((pte_val(*ptep) & mask) == hpa))
@@ -921,6 +929,11 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu,
 	int shift;
 
 	spin_lock(&kvm->mmu_lock);
+	/*
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	if (!shift)
 		shift = PAGE_SHIFT;
@@ -1362,6 +1375,11 @@ static long int __kvmhv_nested_page_fault(struct kvm_run *run,
 	/* See if can find translation in our partition scoped tables for L1 */
 	pte = __pte(0);
 	spin_lock(&kvm->mmu_lock);
+	/*
+	 * We are walking the secondary (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	pte_p = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift);
 	if (!shift)
 		shift = PAGE_SHIFT;
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 220305454c23..fd4d8f174f09 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -210,7 +210,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 	pte_t *ptep;
 	unsigned int writing;
 	unsigned long mmu_seq;
-	unsigned long rcbits, irq_flags = 0;
+	unsigned long rcbits, irq_mask = 0;
 
 	if (kvm_is_radix(kvm))
 		return H_FUNCTION;
@@ -252,8 +252,8 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 	 * If we had a page table table change after lookup, we would
 	 * retry via mmu_notifier_retry.
 	 */
-	if (!realmode)
-		local_irq_save(irq_flags);
+	irq_mask = __begin_lockless_pgtbl_walk(!realmode);
+
 	/*
 	 * If called in real mode we have MSR_EE = 0. Otherwise
 	 * we disable irq above.
@@ -272,8 +272,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 		 * to <= host page size, if host is using hugepage
 		 */
 		if (host_pte_size < psize) {
-			if (!realmode)
-				local_irq_restore(flags);
+			__end_lockless_pgtbl_walk(irq_mask, !realmode);
 			return H_PARAMETER;
 		}
 		pte = kvmppc_read_update_linux_pte(ptep, writing);
@@ -287,8 +286,6 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 			pa |= gpa & ~PAGE_MASK;
 		}
 	}
-	if (!realmode)
-		local_irq_restore(irq_flags);
 
 	ptel &= HPTE_R_KEY | HPTE_R_PP0 | (psize-1);
 	ptel |= pa;
@@ -302,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 
 	/*If we had host pte mapping then Check WIMG */
 	if (ptep && !hpte_cache_flags_ok(ptel, is_ci)) {
-		if (is_ci)
+		if (is_ci) {
+			__end_lockless_pgtbl_walk(irq_mask, !realmode);
 			return H_PARAMETER;
+		}
 		/*
 		 * Allow guest to map emulated device memory as
 		 * uncacheable, but actually make it cacheable.
@@ -311,6 +310,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 		ptel &= ~(HPTE_R_W|HPTE_R_I|HPTE_R_G);
 		ptel |= HPTE_R_M;
 	}
+	__end_lockless_pgtbl_walk(irq_mask, !realmode);
 
 	/* Find and lock the HPTEG slot to use */
  do_insert:
@@ -907,11 +907,19 @@ static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long gpa,
 	/* Translate to host virtual address */
 	hva = __gfn_to_hva_memslot(memslot, gfn);
 
-	/* Try to find the host pte for that virtual address */
+	/* Try to find the host pte for that virtual address :
+	 * Called by hcall_real_table (real mode + MSR_EE=0)
+	 * Interrupts are disabled here.
+	 */
+	__begin_lockless_pgtbl_walk(false);
 	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
-	if (!ptep)
+	if (!ptep) {
+		__end_lockless_pgtbl_walk(0, false);
 		return H_TOO_HARD;
+	}
 	pte = kvmppc_read_update_linux_pte(ptep, writing);
+	__end_lockless_pgtbl_walk(0, false);
+
 	if (!pte_present(pte))
 		return H_TOO_HARD;
 
-- 
2.24.1
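For readers following the kvmppc_do_h_enter change, here is a small userspace sketch of the pairing rule it enforces: every exit path ends the walk, and the walk only touches the interrupt state when the caller is not in real mode. This is not kernel code. The sim_* names, the boolean that stands in for MSR_EE, and the reduced argument list are all made up for illustration, and the helper bodies are only an assumption about what __{begin,end}_lockless_pgtbl_walk do (they are defined in another patch of this series and may also involve barriers or counters).

```c
#include <stdbool.h>

/* Hypothetical stand-in for MSR_EE: true means interrupts are enabled. */
static bool sim_irqs_enabled = true;
/* Number of walks currently open; must be 0 whenever a function returns. */
static int sim_open_walks;

/* Assumed semantics: optionally disable irqs, return the previous state. */
static unsigned long sim_begin_lockless_pgtbl_walk(bool disable_irq)
{
	unsigned long irq_mask = 0;

	if (disable_irq) {
		irq_mask = sim_irqs_enabled ? 1UL : 0UL;
		sim_irqs_enabled = false;
	}
	sim_open_walks++;
	return irq_mask;
}

/* Assumed semantics: close the walk, optionally restore the saved state. */
static void sim_end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq)
{
	sim_open_walks--;
	if (enable_irq)
		sim_irqs_enabled = (irq_mask != 0);
}

enum sim_status { SIM_OK, SIM_PARAMETER };

/*
 * Mirrors only the control flow of the patched kvmppc_do_h_enter():
 * one begin, and exactly one end on each of the three exit paths.
 */
static enum sim_status sim_h_enter(bool realmode, unsigned long host_pte_size,
				   unsigned long psize, bool cache_flags_ok,
				   bool is_ci)
{
	unsigned long irq_mask;

	/* Real-mode callers already run with interrupts off. */
	irq_mask = sim_begin_lockless_pgtbl_walk(!realmode);

	if (host_pte_size < psize) {
		sim_end_lockless_pgtbl_walk(irq_mask, !realmode);
		return SIM_PARAMETER;
	}

	if (!cache_flags_ok && is_ci) {
		sim_end_lockless_pgtbl_walk(irq_mask, !realmode);
		return SIM_PARAMETER;
	}

	/* The looked-up pte is not used past this point. */
	sim_end_lockless_pgtbl_walk(irq_mask, !realmode);
	return SIM_OK;
}
```

Calling sim_h_enter() with realmode set to false should leave sim_irqs_enabled as it was on entry for all three return paths, and with realmode set to true the flag is never modified, which matches the MSR_EE = 0 reasoning in the comments added by the patch.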