* [PATCH 0/2] KVM: x86/mmu: Fix unsync races within TDP MMU
@ 2021-08-10 22:45 Sean Christopherson
  2021-08-10 22:45 ` [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock Sean Christopherson
  2021-08-10 22:45 ` [PATCH 2/2] KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page() Sean Christopherson
  0 siblings, 2 replies; 9+ messages in thread
From: Sean Christopherson @ 2021-08-10 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon

Patch 1 fixes a bug where concurrent TDP MMU page faults can "corrupt" the
unsync shadow page fields, notably unsync_children, as the TDP MMU holds
mmu_lock for read and the unsync logic is not thread safe.

Patch 2 is a tangentially related cleanup.

Sean Christopherson (2):
  KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with
    spinlock
  KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page()

 Documentation/virt/kvm/locking.rst |  8 +++----
 arch/x86/include/asm/kvm_host.h    |  3 +++
 arch/x86/kvm/mmu/mmu.c             | 36 ++++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/mmu_internal.h    |  3 ++-
 arch/x86/kvm/mmu/spte.c            |  4 ++--
 arch/x86/kvm/mmu/spte.h            |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c         | 20 +++++------------
 7 files changed, 52 insertions(+), 24 deletions(-)

-- 
2.32.0.605.g8dce9f2422-goog


* [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  2021-08-10 22:45 [PATCH 0/2] KVM: x86/mmu: Fix unsync races within TDP MMU Sean Christopherson
@ 2021-08-10 22:45 ` Sean Christopherson
  2021-08-11 11:47   ` Paolo Bonzini
  2021-08-10 22:45 ` [PATCH 2/2] KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page() Sean Christopherson
  1 sibling, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2021-08-10 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon

Add yet another spinlock for the TDP MMU and take it when marking indirect
shadow pages unsync.  When using the TDP MMU and L1 is running L2(s) with
nested TDP, KVM may encounter shadow pages for the TDP entries managed by
L1 (controlling L2) when handling a TDP MMU page fault.  The unsync logic
is not thread safe, e.g. the kvm_mmu_page fields are not atomic, and
misbehaves when a shadow page is marked unsync via a TDP MMU page fault,
which runs with mmu_lock held for read, not write.

Lack of a critical section manifests most visibly as an underflow of
unsync_children in clear_unsync_child_bit(), as multiple CPUs can write
the field concurrently without atomic operations or any other form of
serialization.  But underflow is the best case scenario.  The
worst case scenario is that unsync_children prematurely hits '0' and
leads to guest memory corruption due to KVM neglecting to properly sync
shadow pages.
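
To make the race concrete, below is a standalone userspace sketch
(illustrative only, not KVM code) of the same lost-update pattern: two
threads doing a non-atomic read-modify-write on a shared counter, which
is how unsync_children gets corrupted by concurrent page faults.

	/* Build with: gcc -O2 -pthread race_demo.c */
	#include <pthread.h>
	#include <stdio.h>

	static unsigned int counter;	/* stands in for unsync_children */

	static void *bump(void *arg)
	{
		for (int i = 0; i < 1000000; i++)
			counter++;	/* non-atomic: load, add, store */
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, bump, NULL);
		pthread_create(&b, NULL, bump, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* Expect 2000000; updates are routinely lost without locking. */
		printf("counter = %u\n", counter);
		return 0;
	}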

Use an entirely new spinlock even though piggybacking tdp_mmu_pages_lock
would functionally be ok.  Usurping the lock could degrade performance when
building upper level page tables on different vCPUs, especially since the
unsync flow could hold the lock for a comparatively long time depending on
the number of indirect shadow pages and the depth of the paging tree.

Note, even though L2 could theoretically be given access to its own EPT
entries, a nested MMU must hold mmu_lock for write and thus cannot race
against a TDP MMU page fault.  I.e. the additional spinlock only needs to
be taken by the TDP MMU, as opposed to being taken by any MMU for a VM
that is running with the TDP MMU enabled.  Holding mmu_lock for read also
prevents the indirect shadow page from being freed.

Alternative #1, the TDP MMU could simply pass "false" for can_unsync and
effectively disable unsync behavior for nested TDP.  Write protecting leaf
shadow pages is unlikely to noticeably impact traditional L1 VMMs, as such
VMMs typically don't modify TDP entries, but the same may not hold true for
non-standard use cases and/or VMMs that are migrating physical pages (from
L1's perspective).
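
Concretely (illustrative only, using the pre-patch make_spte() signature),
alternative #1 would amount to flipping the can_unsync argument passed by
the TDP MMU fault path:

	make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
				  pfn, iter->old_spte, prefault,
				  false /* can_unsync */, map_writable,
				  !shadow_accessed_mask, &new_spte);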

Alternative #2, the unsync logic could be made thread safe.  In theory,
simply converting all relevant kvm_mmu_page fields to atomics and using
atomic bitops for the bitmap would suffice.  However, (a) an in-depth audit
would be required, (b) the code churn would be substantial, and (c) legacy
shadow paging would incur additional atomic operations in performance
sensitive paths for no benefit (to legacy shadow paging).
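
As a rough sketch of what alternative #2 would entail (hypothetical, not
proposed code), clear_unsync_child_bit() might become:

	static void clear_unsync_child_bit(struct kvm_mmu_page *sp, int idx)
	{
		/* Hypothetical: assumes unsync_children is converted to atomic_t. */
		WARN_ON(atomic_dec_return(&sp->unsync_children) < 0);

		/* clear_bit() is the atomic flavor of __clear_bit(). */
		clear_bit(idx, sp->unsync_child_bitmap);
	}

and every other access to unsync_children and unsync_child_bitmap would
need a similar audit, hence the churn called out above.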

Fixes: a2855afc7ee8 ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/locking.rst |  8 +++----
 arch/x86/include/asm/kvm_host.h    |  3 +++
 arch/x86/kvm/mmu/mmu.c             | 36 ++++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/mmu_internal.h    |  3 ++-
 arch/x86/kvm/mmu/spte.c            |  4 ++--
 arch/x86/kvm/mmu/spte.h            |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c         |  3 ++-
 7 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 8138201efb09..b6bb56325138 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -31,10 +31,10 @@ On x86:
 
 - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
 
-- kvm->arch.mmu_lock is an rwlock.  kvm->arch.tdp_mmu_pages_lock is
-  taken inside kvm->arch.mmu_lock, and cannot be taken without already
-  holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
-  there's no need to take kvm->arch.tdp_mmu_pages_lock at all).
+- kvm->arch.mmu_lock is an rwlock.  kvm->arch.tdp_mmu_pages_lock and
+  kvm->arch.tdp_mmu_unsync_pages_lock are taken inside kvm->arch.mmu_lock, and
+  cannot be taken without already holding kvm->arch.mmu_lock (typically with
+  ``read_lock``, thus the need for additional spinlocks).
 
 Everything else is a leaf: no other lock is taken inside the critical
 sections.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4c567b05edad..0df970dc5f45 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1176,6 +1176,9 @@ struct kvm_arch {
 	 * the thread holds the MMU lock in write mode.
 	 */
 	spinlock_t tdp_mmu_pages_lock;
+
+	/* Protects marking pages unsync during TDP MMU page faults. */
+	spinlock_t tdp_mmu_unsync_pages_lock;
 #endif /* CONFIG_X86_64 */
 
 	/*
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d574c68cbc5c..bcc5dedd531a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1887,6 +1887,8 @@ static int mmu_unsync_walk(struct kvm_mmu_page *sp,
 
 static void kvm_unlink_unsync_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
 	WARN_ON(!sp->unsync);
 	trace_kvm_mmu_sync_page(sp);
 	sp->unsync = 0;
@@ -2592,10 +2594,17 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
  * were marked unsync (or if there is no shadow page), -EPERM if the SPTE must
  * be write-protected.
  */
-int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
+int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync,
+			    spinlock_t *unsync_lock)
 {
+	bool locked_write = !unsync_lock;
 	struct kvm_mmu_page *sp;
 
+	if (locked_write)
+		lockdep_assert_held_write(&vcpu->kvm->mmu_lock);
+	else
+		lockdep_assert_held_read(&vcpu->kvm->mmu_lock);
+
 	/*
 	 * Force write-protection if the page is being tracked.  Note, the page
 	 * track machinery is used to write-protect upper-level shadow pages,
@@ -2617,9 +2626,32 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
 		if (sp->unsync)
 			continue;
 
+		/*
+		 * TDP MMU page faults require an additional spinlock as they
+		 * run with mmu_lock held for read, not write, and the unsync
+		 * logic is not thread safe.
+		 */
+		if (!locked_write) {
+			locked_write = true;
+			spin_lock(unsync_lock);
+
+			/*
+			 * Recheck after taking the spinlock, a different vCPU
+			 * may have since marked the page unsync.  A false
+			 * positive on the unprotected check above is not
+			 * possible as clearing sp->unsync _must_ hold mmu_lock
+			 * for write, i.e. unsync cannot transition from 0->1
+			 * while this CPU holds mmu_lock for read.
+			 */
+			if (READ_ONCE(sp->unsync))
+				continue;
+		}
+
 		WARN_ON(sp->role.level != PG_LEVEL_4K);
 		kvm_unsync_page(vcpu, sp);
 	}
+	if (unsync_lock && locked_write)
+		spin_unlock(unsync_lock);
 
 	/*
 	 * We need to ensure that the marking of unsync pages is visible
@@ -2675,7 +2707,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	sp = sptep_to_sp(sptep);
 
 	ret = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
-			can_unsync, host_writable, sp_ad_disabled(sp), &spte);
+			can_unsync, host_writable, sp_ad_disabled(sp), &spte, NULL);
 
 	if (spte & PT_WRITABLE_MASK)
 		kvm_vcpu_mark_page_dirty(vcpu, gfn);
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 62bb8f758b3f..45179ae4a265 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -122,7 +122,8 @@ static inline bool is_nx_huge_page_enabled(void)
 	return READ_ONCE(nx_huge_pages);
 }
 
-int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync);
+int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync,
+			    spinlock_t *unsync_lock);
 
 void kvm_mmu_gfn_disallow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 3e97cdb13eb7..4a053d2022de 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -92,7 +92,7 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
 		     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
 		     bool can_unsync, bool host_writable, bool ad_disabled,
-		     u64 *new_spte)
+		     u64 *new_spte, spinlock_t *unsync_lock)
 {
 	u64 spte = SPTE_MMU_PRESENT_MASK;
 	int ret = 0;
@@ -159,7 +159,7 @@ int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
 		 * e.g. it's write-tracked (upper-level SPs) or has one or more
 		 * shadow pages and unsync'ing pages is not allowed.
 		 */
-		if (mmu_try_to_unsync_pages(vcpu, gfn, can_unsync)) {
+		if (mmu_try_to_unsync_pages(vcpu, gfn, can_unsync, unsync_lock)) {
 			pgprintk("%s: found shadow page for %llx, marking ro\n",
 				 __func__, gfn);
 			ret |= SET_SPTE_WRITE_PROTECTED_PT;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index eb7b227fc6cf..be159c5caadb 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -342,7 +342,7 @@ static inline u64 get_mmio_spte_generation(u64 spte)
 int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
 		     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
 		     bool can_unsync, bool host_writable, bool ad_disabled,
-		     u64 *new_spte);
+		     u64 *new_spte, spinlock_t *unsync_lock);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index dab6cb46cdb2..d99e064d366f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -25,6 +25,7 @@ bool kvm_mmu_init_tdp_mmu(struct kvm *kvm)
 	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
 	spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
 	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
+	spin_lock_init(&kvm->arch.tdp_mmu_unsync_pages_lock);
 
 	return true;
 }
@@ -952,7 +953,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
 		make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
 					 pfn, iter->old_spte, prefault, true,
 					 map_writable, !shadow_accessed_mask,
-					 &new_spte);
+					 &new_spte, &vcpu->kvm->arch.tdp_mmu_unsync_pages_lock);
 
 	if (new_spte == iter->old_spte)
 		ret = RET_PF_SPURIOUS;
-- 
2.32.0.605.g8dce9f2422-goog


* [PATCH 2/2] KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page()
  2021-08-10 22:45 [PATCH 0/2] KVM: x86/mmu: Fix unsync races within TDP MMU Sean Christopherson
  2021-08-10 22:45 ` [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock Sean Christopherson
@ 2021-08-10 22:45 ` Sean Christopherson
  2021-08-11 16:33   ` Ben Gardon
  1 sibling, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2021-08-10 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Ben Gardon

Drop @shared from tdp_mmu_link_page() and hardcode it to work for
mmu_lock being held for read.  The helper has exactly one caller and
in all likelihood will only ever have exactly one caller.  Even if KVM
adds a path to install translations without an initiating page fault,
odds are very, very good that the path will just be a wrapper to the
"page fault" handler (both SNP and TDX RFCs propose patches to do
exactly that).

No functional change intended.

Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index d99e064d366f..c5b901744d15 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -257,26 +257,17 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
  *
  * @kvm: kvm instance
  * @sp: the new page
- * @shared: This operation may not be running under the exclusive use of
- *	    the MMU lock and the operation must synchronize with other
- *	    threads that might be adding or removing pages.
  * @account_nx: This page replaces a NX large page and should be marked for
  *		eventual reclaim.
  */
 static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			      bool shared, bool account_nx)
+			      bool account_nx)
 {
-	if (shared)
-		spin_lock(&kvm->arch.tdp_mmu_pages_lock);
-	else
-		lockdep_assert_held_write(&kvm->mmu_lock);
-
+	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
 	if (account_nx)
 		account_huge_nx_page(kvm, sp);
-
-	if (shared)
-		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
+	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 }
 
 /**
@@ -1062,7 +1053,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 						     !shadow_accessed_mask);
 
 			if (tdp_mmu_set_spte_atomic_no_dirty_log(vcpu->kvm, &iter, new_spte)) {
-				tdp_mmu_link_page(vcpu->kvm, sp, true,
+				tdp_mmu_link_page(vcpu->kvm, sp,
 						  huge_page_disallowed &&
 						  req_level >= iter.level);
 
-- 
2.32.0.605.g8dce9f2422-goog


* Re: [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  2021-08-10 22:45 ` [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock Sean Christopherson
@ 2021-08-11 11:47   ` Paolo Bonzini
  2021-08-11 15:52     ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2021-08-11 11:47 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon

On 11/08/21 00:45, Sean Christopherson wrote:
> Use an entirely new spinlock even though piggybacking tdp_mmu_pages_lock
> would functionally be ok.  Usurping the lock could degrade performance when
> building upper level page tables on different vCPUs, especially since the
> unsync flow could hold the lock for a comparatively long time depending on
> the number of indirect shadow pages and the depth of the paging tree.

If we are to introduce a new spinlock, do we need to make it conditional 
and pass it around like this?  It would be simpler to just take it 
everywhere (just like, in patch 2, passing "shared == true" to 
tdp_mmu_link_page is always safe anyway).

Paolo


* Re: [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  2021-08-11 11:47   ` Paolo Bonzini
@ 2021-08-11 15:52     ` Sean Christopherson
  2021-08-12 15:37       ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2021-08-11 15:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon

On Wed, Aug 11, 2021, Paolo Bonzini wrote:
> On 11/08/21 00:45, Sean Christopherson wrote:
> > Use an entirely new spinlock even though piggybacking tdp_mmu_pages_lock
> > would functionally be ok.  Usurping the lock could degrade performance when
> > building upper level page tables on different vCPUs, especially since the
> > unsync flow could hold the lock for a comparatively long time depending on
> > the number of indirect shadow pages and the depth of the paging tree.
> 
> If we are to introduce a new spinlock, do we need to make it conditional and
> pass it around like this?  It would be simpler to just take it everywhere
> (just like, in patch 2, passing "shared == true" to tdp_mmu_link_page is
> always safe anyway).

It's definitely not necessary to pass it around.  I liked this approach because
the lock is directly referenced only by the TDP MMU.

My runner up was to key off of is_tdp_mmu_enabled(), which is not strictly
necessary, but I didn't like checking is_tdp_mmu() this far down the call chain.
E.g. minus comments and lockdeps

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d574c68cbc5c..651256a10cb9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2594,6 +2594,8 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
  */
 int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
 {
+       bool tdp_mmu = is_tdp_mmu_enabled(vcpu->kvm);
+       bool write_locked = !tdp_mmu;
        struct kvm_mmu_page *sp;

        /*
@@ -2617,9 +2619,19 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
                if (sp->unsync)
                        continue;

+               if (!write_locked) {
+                       write_locked = true;
+                       spin_lock(&vcpu->kvm->arch.tdp_mmu_unsync_pages_lock);
+
+                       if (READ_ONCE(sp->unsync))
+                               continue;
+               }
+
                WARN_ON(sp->role.level != PG_LEVEL_4K);
                kvm_unsync_page(vcpu, sp);
        }
+       if (tdp_mmu && write_locked)
+               spin_unlock(&vcpu->kvm->arch.tdp_mmu_unsync_pages_lock);

        /*
         * We need to ensure that the marking of unsync pages is visible



All that said, I do not have a strong preference.  Were you thinking something
like this?

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d574c68cbc5c..b622e8a13b8b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2595,6 +2595,7 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
 {
        struct kvm_mmu_page *sp;
+       bool locked = false;

        /*
         * Force write-protection if the page is being tracked.  Note, the page
@@ -2617,9 +2618,34 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
                if (sp->unsync)
                        continue;

+               /*
+                * TDP MMU page faults require an additional spinlock as they
+                * run with mmu_lock held for read, not write, and the unsync
+                * logic is not thread safe.  Take the spinlock regardless of
+                * the MMU type to avoid extra conditionals/parameters; there's
+                * no meaningful penalty if mmu_lock is held for write.
+                */
+               if (!locked) {
+                       locked = true;
+                       spin_lock(&vcpu->kvm->arch.mmu_unsync_pages_lock);
+
+                       /*
+                        * Recheck after taking the spinlock, a different vCPU
+                        * may have since marked the page unsync.  A false
+                        * positive on the unprotected check above is not
+                        * possible as clearing sp->unsync _must_ hold mmu_lock
+                        * for write, i.e. unsync cannot transition from 0->1
+                        * while this CPU holds mmu_lock for read.
+                        */
+                       if (READ_ONCE(sp->unsync))
+                               continue;
+               }
+
                WARN_ON(sp->role.level != PG_LEVEL_4K);
                kvm_unsync_page(vcpu, sp);
        }
+       if (locked)
+               spin_unlock(&vcpu->kvm->arch.mmu_unsync_pages_lock);

        /*
         * We need to ensure that the marking of unsync pages is visible

* Re: [PATCH 2/2] KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page()
  2021-08-10 22:45 ` [PATCH 2/2] KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page() Sean Christopherson
@ 2021-08-11 16:33   ` Ben Gardon
  0 siblings, 0 replies; 9+ messages in thread
From: Ben Gardon @ 2021-08-11 16:33 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, LKML

On Tue, Aug 10, 2021 at 3:46 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Drop @shared from tdp_mmu_link_page() and hardcode it to work for
> mmu_lock being held for read.  The helper has exactly one caller and
> in all likelihood will only ever have exactly one caller.  Even if KVM
> adds a path to install translations without an initiating page fault,
> odds are very, very good that the path will just be a wrapper to the
> "page fault" handler (both SNP and TDX RFCs propose patches to do
> exactly that).
>
> No functional change intended.
>
> Cc: Ben Gardon <bgardon@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Ben Gardon <bgardon@google.com>

Nice cleanup!

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 17 ++++-------------
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index d99e064d366f..c5b901744d15 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -257,26 +257,17 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
>   *
>   * @kvm: kvm instance
>   * @sp: the new page
> - * @shared: This operation may not be running under the exclusive use of
> - *         the MMU lock and the operation must synchronize with other
> - *         threads that might be adding or removing pages.
>   * @account_nx: This page replaces a NX large page and should be marked for
>   *             eventual reclaim.
>   */
>  static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> -                             bool shared, bool account_nx)
> +                             bool account_nx)
>  {
> -       if (shared)
> -               spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> -       else
> -               lockdep_assert_held_write(&kvm->mmu_lock);
> -
> +       spin_lock(&kvm->arch.tdp_mmu_pages_lock);
>         list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
>         if (account_nx)
>                 account_huge_nx_page(kvm, sp);
> -
> -       if (shared)
> -               spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> +       spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>  }
>
>  /**
> @@ -1062,7 +1053,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>                                                      !shadow_accessed_mask);
>
>                         if (tdp_mmu_set_spte_atomic_no_dirty_log(vcpu->kvm, &iter, new_spte)) {
> -                               tdp_mmu_link_page(vcpu->kvm, sp, true,
> +                               tdp_mmu_link_page(vcpu->kvm, sp,
>                                                   huge_page_disallowed &&
>                                                   req_level >= iter.level);
>
> --
> 2.32.0.605.g8dce9f2422-goog
>

* Re: [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  2021-08-11 15:52     ` Sean Christopherson
@ 2021-08-12 15:37       ` Paolo Bonzini
  2021-08-12 16:06         ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2021-08-12 15:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon

On 11/08/21 17:52, Sean Christopherson wrote:
> All that said, I do not have a strong preference.  Were you thinking something
> like this?

Yes, pretty much this.

Paolo

> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index d574c68cbc5c..b622e8a13b8b 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2595,6 +2595,7 @@ static void kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>   int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
>   {
>          struct kvm_mmu_page *sp;
> +       bool locked = false;
> 
>          /*
>           * Force write-protection if the page is being tracked.  Note, the page
> @@ -2617,9 +2618,34 @@ int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync)
>                  if (sp->unsync)
>                          continue;
> 
> +               /*
> +                * TDP MMU page faults require an additional spinlock as they
> +                * run with mmu_lock held for read, not write, and the unsync
> +                * logic is not thread safe.  Take the spinlock regardless of
> +                * the MMU type to avoid extra conditionals/parameters; there's
> +                * no meaningful penalty if mmu_lock is held for write.
> +                */
> +               if (!locked) {
> +                       locked = true;
> +                       spin_lock(&vcpu->kvm->arch.mmu_unsync_pages_lock);
> +
> +                       /*
> +                        * Recheck after taking the spinlock, a different vCPU
> +                        * may have since marked the page unsync.  A false
> +                        * positive on the unprotected check above is not
> +                        * possible as clearing sp->unsync _must_ hold mmu_lock
> +                        * for write, i.e. unsync cannot transition from 0->1
> +                        * while this CPU holds mmu_lock for read.
> +                        */
> +                       if (READ_ONCE(sp->unsync))
> +                               continue;
> +               }
> +
>                  WARN_ON(sp->role.level != PG_LEVEL_4K);
>                  kvm_unsync_page(vcpu, sp);
>          }
> +       if (locked)
> +               spin_unlock(&vcpu->kvm->arch.mmu_unsync_pages_lock);


* Re: [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
  2021-08-12 15:37       ` Paolo Bonzini
@ 2021-08-12 16:06         ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2021-08-12 16:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Ben Gardon

On Thu, Aug 12, 2021, Paolo Bonzini wrote:
> On 11/08/21 17:52, Sean Christopherson wrote:
> > All that said, I do not have a strong preference.  Were you thinking something
> > like this?
> 
> Yes, pretty much this.

Roger that, I'll work on a new version.  Thanks!

* Re: [PATCH 1/2] KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
@ 2021-08-11  5:22 kernel test robot
  0 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-08-11  5:22 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 15212 bytes --]

CC: kbuild-all@lists.01.org
In-Reply-To: <20210810224554.2978735-2-seanjc@google.com>
References: <20210810224554.2978735-2-seanjc@google.com>
TO: Sean Christopherson <seanjc@google.com>
TO: Paolo Bonzini <pbonzini@redhat.com>
CC: Sean Christopherson <seanjc@google.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Wanpeng Li <wanpengli@tencent.com>
CC: Jim Mattson <jmattson@google.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: kvm@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: Ben Gardon <bgardon@google.com>

Hi Sean,

I love your patch! Perhaps something to improve:

[auto build test WARNING on kvm/queue]
[also build test WARNING on v5.14-rc5 next-20210810]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Sean-Christopherson/KVM-x86-mmu-Fix-unsync-races-within-TDP-MMU/20210811-064845
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
:::::: branch date: 7 hours ago
:::::: commit date: 7 hours ago
config: x86_64-rhel-8.3-kselftests (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-348-gf0e6938b-dirty
        # https://github.com/0day-ci/linux/commit/d114d08445896b8e18922d411ee4240ece169793
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Sean-Christopherson/KVM-x86-mmu-Fix-unsync-races-within-TDP-MMU/20210811-064845
        git checkout d114d08445896b8e18922d411ee4240ece169793
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/kvm/

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   arch/x86/kvm/mmu/mmu.c:699:9: sparse: sparse: context imbalance in 'walk_shadow_page_lockless_begin' - different lock contexts for basic block
   arch/x86/kvm/mmu/mmu.c: note: in included file (through include/linux/rbtree.h, include/linux/mm_types.h, arch/x86/kvm/irq.h):
   include/linux/rcupdate.h:718:9: sparse: sparse: context imbalance in 'walk_shadow_page_lockless_end' - unexpected unlock
>> arch/x86/kvm/mmu/mmu.c:2622:9: sparse: sparse: context imbalance in 'mmu_try_to_unsync_pages' - different lock contexts for basic block
   arch/x86/kvm/mmu/mmu.c:4491:57: sparse: sparse: cast truncates bits from constant value (ffffff33 becomes 33)
   arch/x86/kvm/mmu/mmu.c:4493:56: sparse: sparse: cast truncates bits from constant value (ffffff0f becomes f)
   arch/x86/kvm/mmu/mmu.c:4495:57: sparse: sparse: cast truncates bits from constant value (ffffff55 becomes 55)
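
For context, the "different lock contexts for basic block" warning is sparse's
way of flagging conditional locking: the loop in mmu_try_to_unsync_pages() can
reach the same basic block both with and without the spinlock held.  A
stripped-down, hypothetical reduction of the shape sparse is complaining about
(not the actual KVM code):

	#include <linux/spinlock.h>
	#include <linux/types.h>

	struct demo_item {
		struct demo_item *next;
		bool marked;
	};

	static DEFINE_SPINLOCK(demo_lock);

	static void mark_items(struct demo_item *list)
	{
		struct demo_item *it;
		bool locked = false;

		for (it = list; it; it = it->next) {
			if (it->marked)
				continue;	/* lock may or may not be held here */

			if (!locked) {
				locked = true;
				spin_lock(&demo_lock);
			}
			it->marked = true;
		}
		if (locked)
			spin_unlock(&demo_lock);
	}

The warning is benign in this shape (the lock is always released before
returning), but sparse cannot prove it because the lock state differs between
paths into the same basic block.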

vim +/mmu_try_to_unsync_pages +2622 arch/x86/kvm/mmu/mmu.c

9cf5cf5ad43b293 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-05-24  2590  
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2591  /*
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2592   * Attempt to unsync any shadow pages that can be reached by the specified gfn,
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2593   * KVM is creating a writable mapping for said gfn.  Returns 0 if all pages
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2594   * were marked unsync (or if there is no shadow page), -EPERM if the SPTE must
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2595   * be write-protected.
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2596   */
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2597  int mmu_try_to_unsync_pages(struct kvm_vcpu *vcpu, gfn_t gfn, bool can_unsync,
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2598  			    spinlock_t *unsync_lock)
4731d4c7a07769c arch/x86/kvm/mmu.c     Marcelo Tosatti     2008-09-23  2599  {
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2600  	bool locked_write = !unsync_lock;
5c520e90af3ad54 arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2601  	struct kvm_mmu_page *sp;
9cf5cf5ad43b293 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-05-24  2602  
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2603  	if (locked_write)
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2604  		lockdep_assert_held_write(&vcpu->kvm->mmu_lock);
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2605  	else
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2606  		lockdep_assert_held_read(&vcpu->kvm->mmu_lock);
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2607  
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2608  	/*
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2609  	 * Force write-protection if the page is being tracked.  Note, the page
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2610  	 * track machinery is used to write-protect upper-level shadow pages,
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2611  	 * i.e. this guards the role.level == 4K assertion below!
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2612  	 */
3d0c27ad6ee465f arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2613  	if (kvm_page_track_is_active(vcpu, gfn, KVM_PAGE_TRACK_WRITE))
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2614  		return -EPERM;
3d0c27ad6ee465f arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2615  
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2616  	/*
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2617  	 * The page is not write-tracked, mark existing shadow pages unsync
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2618  	 * unless KVM is synchronizing an unsync SP (can_unsync = false).  In
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2619  	 * that case, KVM must complete emulation of the guest TLB flush before
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2620  	 * allowing shadow pages to become unsync (writable by the guest).
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2621  	 */
5c520e90af3ad54 arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24 @2622  	for_each_gfn_indirect_valid_sp(vcpu->kvm, sp, gfn) {
36a2e6774bfb5f3 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-06-30  2623  		if (!can_unsync)
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2624  			return -EPERM;
36a2e6774bfb5f3 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-06-30  2625  
5c520e90af3ad54 arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2626  		if (sp->unsync)
5c520e90af3ad54 arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2627  			continue;
9cf5cf5ad43b293 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-05-24  2628  
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2629  		/*
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2630  		 * TDP MMU page faults require an additional spinlock as they
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2631  		 * run with mmu_lock held for read, not write, and the unsync
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2632  		 * logic is not thread safe.
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2633  		 */
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2634  		if (!locked_write) {
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2635  			locked_write = true;
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2636  			spin_lock(unsync_lock);
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2637  
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2638  			/*
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2639  			 * Recheck after taking the spinlock, a different vCPU
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2640  			 * may have since marked the page unsync.  A false
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2641  			 * positive on the unprotected check above is not
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2642  			 * possible as clearing sp->unsync _must_ hold mmu_lock
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2643  			 * for write, i.e. unsync cannot transition from 0->1
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2644  			 * while this CPU holds mmu_lock for read.
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2645  			 */
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2646  			if (READ_ONCE(sp->unsync))
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2647  				continue;
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2648  		}
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2649  
3bae0459bcd5595 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2020-04-27  2650  		WARN_ON(sp->role.level != PG_LEVEL_4K);
5c520e90af3ad54 arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2651  		kvm_unsync_page(vcpu, sp);
9cf5cf5ad43b293 arch/x86/kvm/mmu.c     Xiao Guangrong      2010-05-24  2652  	}
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2653  	if (unsync_lock && locked_write)
d114d08445896b8 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-08-10  2654  		spin_unlock(unsync_lock);
3d0c27ad6ee465f arch/x86/kvm/mmu.c     Xiao Guangrong      2016-02-24  2655  
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2656  	/*
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2657  	 * We need to ensure that the marking of unsync pages is visible
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2658  	 * before the SPTE is updated to allow writes because
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2659  	 * kvm_mmu_sync_roots() checks the unsync flags without holding
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2660  	 * the MMU lock and so can race with this. If the SPTE was updated
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2661  	 * before the page had been marked as unsync-ed, something like the
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2662  	 * following could happen:
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2663  	 *
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2664  	 * CPU 1                    CPU 2
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2665  	 * ---------------------------------------------------------------------
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2666  	 * 1.2 Host updates SPTE
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2667  	 *     to be writable
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2668  	 *                      2.1 Guest writes a GPTE for GVA X.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2669  	 *                          (GPTE being in the guest page table shadowed
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2670  	 *                           by the SP from CPU 1.)
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2671  	 *                          This reads SPTE during the page table walk.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2672  	 *                          Since SPTE.W is read as 1, there is no
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2673  	 *                          fault.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2674  	 *
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2675  	 *                      2.2 Guest issues TLB flush.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2676  	 *                          That causes a VM Exit.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2677  	 *
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2678  	 *                      2.3 Walking of unsync pages sees sp->unsync is
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2679  	 *                          false and skips the page.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2680  	 *
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2681  	 *                      2.4 Guest accesses GVA X.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2682  	 *                          Since the mapping in the SP was not updated,
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2683  	 *                          so the old mapping for GVA X incorrectly
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2684  	 *                          gets used.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2685  	 * 1.1 Host marks SP
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2686  	 *     as unsync
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2687  	 *     (sp->unsync = true)
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2688  	 *
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2689  	 * The write barrier below ensures that 1.1 happens before 1.2 and thus
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2690  	 * the situation in 2.4 does not arise. The implicit barrier in 2.2
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2691  	 * pairs with this write barrier.
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2692  	 */
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2693  	smp_wmb();
578e1c4db22135d arch/x86/kvm/mmu.c     Junaid Shahid       2018-06-27  2694  
0337f585f57fc80 arch/x86/kvm/mmu/mmu.c Sean Christopherson 2021-06-22  2695  	return 0;
4731d4c7a07769c arch/x86/kvm/mmu.c     Marcelo Tosatti     2008-09-23  2696  }
4731d4c7a07769c arch/x86/kvm/mmu.c     Marcelo Tosatti     2008-09-23  2697  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 42141 bytes --]
