* [PATCH 1/2] KVM: PPC: Book3S HV: Remove user-triggerable WARN_ON
@ 2020-05-28 1:53 Paul Mackerras
2020-05-28 1:54 ` [PATCH 2/2] KVM: PPC: Book3S HV: Close race with page faults around memslot flushes Paul Mackerras
0 siblings, 1 reply; 2+ messages in thread
From: Paul Mackerras @ 2020-05-28 1:53 UTC (permalink / raw)
To: kvm; +Cc: kvm-ppc, Laurent Vivier, David Gibson, Nick Piggin
Although in general we do not expect valid PTEs to be found in
kvmppc_create_pte when we are inserting a large page mapping, there
is one situation where this can occur. That is when dirty page
logging is turned off for a memslot while the VM is running.
Because the new memslots are installed before the old memslot is
flushed in kvmppc_core_commit_memory_region_hv(), there is a
window where a hypervisor page fault can try to install a 2MB
(or 1GB) page where there are already small page mappings which
were installed while dirty page logging was enabled and which
have not yet been flushed.
Since we have a situation where valid PTEs can legitimately be
found by kvmppc_unmap_free_pte, and which can be triggered by
userspace, just remove the WARN_ON_ONCE, since it is undesirable
to have userspace able to trigger a kernel warning.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
arch/powerpc/kvm/book3s_64_mmu_radix.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 97b45ea..bc3f795 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -429,9 +429,13 @@ void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte, unsigned long gpa,
* Callers are responsible for flushing the PWC.
*
* When page tables are being unmapped/freed as part of page fault path
- * (full == false), ptes are not expected. There is code to unmap them
- * and emit a warning if encountered, but there may already be data
- * corruption due to the unexpected mappings.
+ * (full == false), valid ptes are generally not expected; however, there
+ * is one situation where they arise, which is when dirty page logging is
+ * turned off for a memslot while the VM is running. The new memslot
+ * becomes visible to page faults before the memslot commit function
+ * gets to flush the memslot, which can lead to a 2MB page mapping being
+ * installed for a guest physical address where there are already 64kB
+ * (or 4kB) mappings (of sub-pages of the same 2MB page).
*/
static void kvmppc_unmap_free_pte(struct kvm *kvm, pte_t *pte, bool full,
unsigned int lpid)
@@ -445,7 +449,6 @@ static void kvmppc_unmap_free_pte(struct kvm *kvm, pte_t *pte, bool full,
for (it = 0; it < PTRS_PER_PTE; ++it, ++p) {
if (pte_val(*p) == 0)
continue;
- WARN_ON_ONCE(1);
kvmppc_unmap_pte(kvm, p,
pte_pfn(*p) << PAGE_SHIFT,
PAGE_SHIFT, NULL, lpid);
--
2.7.4
^ permalink raw reply related [flat|nested] 2+ messages in thread
* [PATCH 2/2] KVM: PPC: Book3S HV: Close race with page faults around memslot flushes
2020-05-28 1:53 [PATCH 1/2] KVM: PPC: Book3S HV: Remove user-triggerable WARN_ON Paul Mackerras
@ 2020-05-28 1:54 ` Paul Mackerras
0 siblings, 0 replies; 2+ messages in thread
From: Paul Mackerras @ 2020-05-28 1:54 UTC (permalink / raw)
To: kvm; +Cc: kvm-ppc, Laurent Vivier, David Gibson, Nick Piggin
There is a potential race condition between hypervisor page faults
and flushing a memslot. It is possible for a page fault to read the
memslot before a memslot is updated and then write a PTE to the
partition-scoped page tables after kvmppc_radix_flush_memslot has
completed. (Note that this race has never been explicitly observed.)
To close this race, it is sufficient to increment the MMU sequence
number while the kvm->mmu_lock is held. That will cause
mmu_notifier_retry() to return true, and the page fault will then
return to the guest without inserting a PTE.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
arch/powerpc/kvm/book3s_64_mmu_radix.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index bc3f795..aa41183 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -1130,6 +1130,11 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
kvm->arch.lpid);
gpa += PAGE_SIZE;
}
+ /*
+ * Increase the mmu notifier sequence number to prevent any page
+ * fault that read the memslot earlier from writing a PTE.
+ */
+ kvm->mmu_notifier_seq++;
spin_unlock(&kvm->mmu_lock);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 2+ messages in thread
end of thread, other threads:[~2020-05-28 1:54 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-28 1:53 [PATCH 1/2] KVM: PPC: Book3S HV: Remove user-triggerable WARN_ON Paul Mackerras
2020-05-28 1:54 ` [PATCH 2/2] KVM: PPC: Book3S HV: Close race with page faults around memslot flushes Paul Mackerras
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).