linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] add track_remove_slot and remove track_flush_slot
@ 2022-11-11 10:32 Yan Zhao
  2022-11-11 10:33 ` [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot Yan Zhao
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Yan Zhao @ 2022-11-11 10:32 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: pbonzini, seanjc, intel-gfx, intel-gvt-dev, zhenyuw, Yan Zhao

This series is based on Sean's series
https://lore.kernel.org/all/20221110014821.1548347-1-seanjc@google.com/,
which allows KVM internal user of page track not to rely on the page track
hook .track_flush_slot.

Page track hook track_flush_slot is for notification of slot flush and
is called when a slot DELETE/MOVE is on-going.

Page track hook track_remove_slot is for notification of slot removal
and is called when the slot DELETE/MOVE has been committed.

As KVMGT, the only external user of page track, actually only cares about
when slot removal indeed happens, this series switches KVMGT to use the new
hook .track_remove_slot.
And as there are no users to .track_flush_slot any more, this hook is
removed.

v2:
Corrected wrong email address of Sean. sorry.
 
Yan Zhao (3):
  KVM: x86: add a new page track hook track_remove_slot
  drm/i915/gvt: switch from track_flush_slot to track_remove_slot
  KVM: x86: Remove the unused page track hook track_flush_slot

 arch/x86/include/asm/kvm_page_track.h | 8 ++++----
 arch/x86/kvm/mmu/page_track.c         | 8 ++++----
 arch/x86/kvm/x86.c                    | 5 +++--
 drivers/gpu/drm/i915/gvt/kvmgt.c      | 6 +++---
 4 files changed, 14 insertions(+), 13 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-11 10:32 [PATCH v2 0/3] add track_remove_slot and remove track_flush_slot Yan Zhao
@ 2022-11-11 10:33 ` Yan Zhao
  2022-11-11 18:19   ` Sean Christopherson
  2022-11-11 10:34 ` [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot Yan Zhao
  2022-11-11 10:35 ` [PATCH v2 3/3] KVM: x86: Remove the unused page track hook track_flush_slot Yan Zhao
  2 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-11 10:33 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: pbonzini, seanjc, intel-gfx, intel-gvt-dev, zhenyuw, Yan Zhao

Page track hook track_remove_slot is used to notify users that a slot
has been removed and is called when a slot DELETE/MOVE is about to be
completed.

Users of this hook can drop write protections in the removed slot.

Note:
Since KVM_MR_MOVE currently never actually happens in KVM/QEMU, and has
never been properly supported in the external page track user, we just
use the hook track_remove_slot to notify users of the old slot being
removed.

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 arch/x86/include/asm/kvm_page_track.h | 11 +++++++++++
 arch/x86/kvm/mmu/page_track.c         | 26 ++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                    |  3 +++
 3 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index eb186bc57f6a..046b024d1813 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -44,6 +44,16 @@ struct kvm_page_track_notifier_node {
 	 */
 	void (*track_flush_slot)(struct kvm *kvm, struct kvm_memory_slot *slot,
 			    struct kvm_page_track_notifier_node *node);
+	/*
+	 * It is called when memory slot is moved or removed
+	 * users can drop write-protection for the pages in that memory slot
+	 *
+	 * @kvm: the kvm where memory slot being moved or removed
+	 * @slot: the memory slot being moved or removed
+	 * @node: this node
+	 */
+	void (*track_remove_slot)(struct kvm *kvm, struct kvm_memory_slot *slot,
+				  struct kvm_page_track_notifier_node *node);
 };
 
 int kvm_page_track_init(struct kvm *kvm);
@@ -76,4 +86,5 @@ kvm_page_track_unregister_notifier(struct kvm *kvm,
 void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 			  int bytes);
 void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot);
+void kvm_page_track_remove_slot(struct kvm *kvm, struct kvm_memory_slot *slot);
 #endif
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index 2e09d1b6249f..4d6bab1d61c9 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -300,3 +300,29 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
 			n->track_flush_slot(kvm, slot, n);
 	srcu_read_unlock(&head->track_srcu, idx);
 }
+
+/*
+ * Notify the node that memory slot is removed or moved so that it can
+ * drop write-protection for the pages in the memory slot.
+ *
+ * The node should figure out it has any write-protected pages in this slot
+ * by itself.
+ */
+void kvm_page_track_remove_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+	struct kvm_page_track_notifier_head *head;
+	struct kvm_page_track_notifier_node *n;
+	int idx;
+
+	head = &kvm->arch.track_notifier_head;
+
+	if (hlist_empty(&head->track_notifier_list))
+		return;
+
+	idx = srcu_read_lock(&head->track_srcu);
+	hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
+				srcu_read_lock_held(&head->track_srcu))
+		if (n->track_remove_slot)
+			n->track_remove_slot(kvm, slot, n);
+	srcu_read_unlock(&head->track_srcu, idx);
+}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 916ebbc81e52..a24a4a2ad1a0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12844,6 +12844,9 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				const struct kvm_memory_slot *new,
 				enum kvm_mr_change change)
 {
+	if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
+		kvm_page_track_remove_slot(kvm, old);
+
 	if (!kvm->arch.n_requested_mmu_pages &&
 	    (change == KVM_MR_CREATE || change == KVM_MR_DELETE)) {
 		unsigned long nr_mmu_pages;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot
  2022-11-11 10:32 [PATCH v2 0/3] add track_remove_slot and remove track_flush_slot Yan Zhao
  2022-11-11 10:33 ` [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot Yan Zhao
@ 2022-11-11 10:34 ` Yan Zhao
  2022-11-11 17:07   ` Sean Christopherson
  2022-11-11 10:35 ` [PATCH v2 3/3] KVM: x86: Remove the unused page track hook track_flush_slot Yan Zhao
  2 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-11 10:34 UTC (permalink / raw)
  To: intel-gfx, intel-gvt-dev
  Cc: zhenyuw, kvm, linux-kernel, pbonzini, seanjc, Yan Zhao

KVMGT only cares about when a slot is indeed removed.
So switch to use track_remove_slot which is called when a slot is removed.

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 drivers/gpu/drm/i915/gvt/kvmgt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 714221f9a131..9582d047471f 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -109,7 +109,7 @@ struct gvt_dma {
 static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		const u8 *val, int len,
 		struct kvm_page_track_notifier_node *node);
-static void kvmgt_page_track_flush_slot(struct kvm *kvm,
+static void kvmgt_page_track_remove_slot(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		struct kvm_page_track_notifier_node *node);
 
@@ -673,7 +673,7 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
 	gvt_cache_init(vgpu);
 
 	vgpu->track_node.track_write = kvmgt_page_track_write;
-	vgpu->track_node.track_flush_slot = kvmgt_page_track_flush_slot;
+	vgpu->track_node.track_remove_slot = kvmgt_page_track_remove_slot;
 	kvm_get_kvm(vgpu->vfio_device.kvm);
 	kvm_page_track_register_notifier(vgpu->vfio_device.kvm,
 					 &vgpu->track_node);
@@ -1617,7 +1617,7 @@ static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 						     (void *)val, len);
 }
 
-static void kvmgt_page_track_flush_slot(struct kvm *kvm,
+static void kvmgt_page_track_remove_slot(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		struct kvm_page_track_notifier_node *node)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/3] KVM: x86: Remove the unused page track hook track_flush_slot
  2022-11-11 10:32 [PATCH v2 0/3] add track_remove_slot and remove track_flush_slot Yan Zhao
  2022-11-11 10:33 ` [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot Yan Zhao
  2022-11-11 10:34 ` [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot Yan Zhao
@ 2022-11-11 10:35 ` Yan Zhao
  2 siblings, 0 replies; 16+ messages in thread
From: Yan Zhao @ 2022-11-11 10:35 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: pbonzini, seanjc, intel-gfx, intel-gvt-dev, zhenyuw, Yan Zhao

There's no users of hook track_remove_slot any more and no external page
tracker user cares about slot flush.
So remove this hook.

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 arch/x86/include/asm/kvm_page_track.h | 11 -----------
 arch/x86/kvm/mmu/page_track.c         | 26 --------------------------
 arch/x86/kvm/x86.c                    |  2 --
 3 files changed, 39 deletions(-)

diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h
index 046b024d1813..4f1d3c91fdc7 100644
--- a/arch/x86/include/asm/kvm_page_track.h
+++ b/arch/x86/include/asm/kvm_page_track.h
@@ -34,16 +34,6 @@ struct kvm_page_track_notifier_node {
 	 */
 	void (*track_write)(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 			    int bytes, struct kvm_page_track_notifier_node *node);
-	/*
-	 * It is called when memory slot is being moved or removed
-	 * users can drop write-protection for the pages in that memory slot
-	 *
-	 * @kvm: the kvm where memory slot being moved or removed
-	 * @slot: the memory slot being moved or removed
-	 * @node: this node
-	 */
-	void (*track_flush_slot)(struct kvm *kvm, struct kvm_memory_slot *slot,
-			    struct kvm_page_track_notifier_node *node);
 	/*
 	 * It is called when memory slot is moved or removed
 	 * users can drop write-protection for the pages in that memory slot
@@ -85,6 +75,5 @@ kvm_page_track_unregister_notifier(struct kvm *kvm,
 				   struct kvm_page_track_notifier_node *n);
 void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 			  int bytes);
-void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot);
 void kvm_page_track_remove_slot(struct kvm *kvm, struct kvm_memory_slot *slot);
 #endif
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index 4d6bab1d61c9..f783aea618f8 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -275,32 +275,6 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 	srcu_read_unlock(&head->track_srcu, idx);
 }
 
-/*
- * Notify the node that memory slot is being removed or moved so that it can
- * drop write-protection for the pages in the memory slot.
- *
- * The node should figure out it has any write-protected pages in this slot
- * by itself.
- */
-void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
-{
-	struct kvm_page_track_notifier_head *head;
-	struct kvm_page_track_notifier_node *n;
-	int idx;
-
-	head = &kvm->arch.track_notifier_head;
-
-	if (hlist_empty(&head->track_notifier_list))
-		return;
-
-	idx = srcu_read_lock(&head->track_srcu);
-	hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
-				srcu_read_lock_held(&head->track_srcu))
-		if (n->track_flush_slot)
-			n->track_flush_slot(kvm, slot, n);
-	srcu_read_unlock(&head->track_srcu, idx);
-}
-
 /*
  * Notify the node that memory slot is removed or moved so that it can
  * drop write-protection for the pages in the memory slot.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a24a4a2ad1a0..260288f4d741 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12872,8 +12872,6 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 				   struct kvm_memory_slot *slot)
 {
 	kvm_mmu_zap_all_fast(kvm);
-
-	kvm_page_track_flush_slot(kvm, slot);
 }
 
 static inline bool kvm_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot
  2022-11-11 10:34 ` [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot Yan Zhao
@ 2022-11-11 17:07   ` Sean Christopherson
  2022-11-12  0:05     ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-11 17:07 UTC (permalink / raw)
  To: Yan Zhao; +Cc: intel-gfx, intel-gvt-dev, zhenyuw, kvm, linux-kernel, pbonzini

On Fri, Nov 11, 2022, Yan Zhao wrote:
> KVMGT only cares about when a slot is indeed removed.
> So switch to use track_remove_slot which is called when a slot is removed.

This should capture the original motivation, i.e. that the existing
->track_flush_slot() hook is theoretically flawed.  I think it also makes sense
to call out that KVMGT undoubtedly does the wrong thing if a memslot is moved,
but that (a) KVMGT has never supported moving memslots and (b) there's no sane
use case for moving memslots that might be used by the guest for the GTT.

Bonus points if you can figure out a way to capture the restriction in the docs,
e.g. somewhere in gpu/i915.rst?

Lastly, provide a link to the original discussion which provides even more context.

Link: https://lore.kernel.org/all/20221108084416.11447-1-yan.y.zhao@intel.com

> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-11 10:33 ` [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot Yan Zhao
@ 2022-11-11 18:19   ` Sean Christopherson
  2022-11-12  0:03     ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-11 18:19 UTC (permalink / raw)
  To: Yan Zhao; +Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

TL;DR: I'm going to try to add more aggressive patches for this into my series to
clean up the KVM side of things, along with many more patches to clean up the page
track APIs.

I'll post patches next week if things go well (fingers crossed), and if not I'll
give an update 

On Fri, Nov 11, 2022, Yan Zhao wrote:
> Page track hook track_remove_slot is used to notify users that a slot
> has been removed and is called when a slot DELETE/MOVE is about to be
> completed.

Phrase this as a command, and explain _why_ the new hook is being added, e.g.

  Add a new page track hook, track_remove_slot(), that is called when a
  memslot DELETE/MOVE operation is about to be committed.  The "remove"
  hook will be used by KVMGT and will effectively replace the existing
  track_flush_slot() altogether now that KVM itself doesn't rely on the
  "flush" hook either.

  The "flush" hook is flawed as it's invoked before the memslot operation
  is guaranteed, i.e. KVM might ultimately keep the existing memslot without
  notifying external page track users, a.k.a. KVMGT.

> Users of this hook can drop write protections in the removed slot.

Hmm, actually, on second thought, after thinking about what KVGT is doing in
response to the memslot update, I think we should be more aggressive and actively
prevent MOVE if there are external page trackers, i.e. if KVMGT is attached.

Dropping write protections when a memslot is being deleted is a waste of cycles.
The memslot and thus gfn_track is literally being deleted immediately after invoking
the hook, updating gfn_track from KVMGT is completely unecessary.

I.e. if we kill off the MOVE path, then KVMGT just needs to delete its hash table
entry.

Oooh!  Looking at this code again made me realize that the larger page track cleanup
that I want to do might actually work.  Long story short, I want to stop forcing
KVMGT to poke into KVM internals, but I thought there was a lock inversion problem.

But AFAICT, there is no such problem.  And the cleanup I want to do will actually
fix an existing KVMGT bug: kvmgt_page_track_write() invokes kvmgt_gfn_is_write_protected()
without holding mmu_lock, and thus could consume garbage when walking the hash
table.

  static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa,
		const u8 *val, int len,
		struct kvm_page_track_notifier_node *node)
  {
	struct intel_vgpu *info =
		container_of(node, struct intel_vgpu, track_node);

	if (kvmgt_gfn_is_write_protected(info, gpa_to_gfn(gpa)))
		intel_vgpu_page_track_handler(info, gpa,
						     (void *)val, len);
  }

Acquiring mmu_lock isn't an option as intel_vgpu_page_track_handler() might sleep,
e.g. when acquiring vgpu_lock.

Let me see if the clean up I have in mind will actually work.  If it does, I think
the end result will be quite nice for both KVM and KVMGT.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-11 18:19   ` Sean Christopherson
@ 2022-11-12  0:03     ` Yan Zhao
  2022-11-12  0:43       ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-12  0:03 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Fri, Nov 11, 2022 at 06:19:15PM +0000, Sean Christopherson wrote:
> TL;DR: I'm going to try to add more aggressive patches for this into my series to
> clean up the KVM side of things, along with many more patches to clean up the page
> track APIs.
> 
> I'll post patches next week if things go well (fingers crossed), and if not I'll
> give an update 
> 
> On Fri, Nov 11, 2022, Yan Zhao wrote:
> > Page track hook track_remove_slot is used to notify users that a slot
> > has been removed and is called when a slot DELETE/MOVE is about to be
> > completed.
> 
> Phrase this as a command, and explain _why_ the new hook is being added, e.g.
> 
>   Add a new page track hook, track_remove_slot(), that is called when a
>   memslot DELETE/MOVE operation is about to be committed.  The "remove"
>   hook will be used by KVMGT and will effectively replace the existing
>   track_flush_slot() altogether now that KVM itself doesn't rely on the
>   "flush" hook either.
> 
>   The "flush" hook is flawed as it's invoked before the memslot operation
>   is guaranteed, i.e. KVM might ultimately keep the existing memslot without
>   notifying external page track users, a.k.a. KVMGT.
> 
> > Users of this hook can drop write protections in the removed slot.
> 
> Hmm, actually, on second thought, after thinking about what KVGT is doing in
> response to the memslot update, I think we should be more aggressive and actively
> prevent MOVE if there are external page trackers, i.e. if KVMGT is attached.
> 
> Dropping write protections when a memslot is being deleted is a waste of cycles.
> The memslot and thus gfn_track is literally being deleted immediately after invoking
> the hook, updating gfn_track from KVMGT is completely unecessary.

> I.e. if we kill off the MOVE path, then KVMGT just needs to delete its hash table
> entry.
> 
> Oooh!  Looking at this code again made me realize that the larger page track cleanup
> that I want to do might actually work.  Long story short, I want to stop forcing
> KVMGT to poke into KVM internals, but I thought there was a lock inversion problem.
> 
> But AFAICT, there is no such problem.  And the cleanup I want to do will actually
> fix an existing KVMGT bug: kvmgt_page_track_write() invokes kvmgt_gfn_is_write_protected()
> without holding mmu_lock, and thus could consume garbage when walking the hash
> table.
> 
>   static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa,
> 		const u8 *val, int len,
> 		struct kvm_page_track_notifier_node *node)
>   {
> 	struct intel_vgpu *info =
> 		container_of(node, struct intel_vgpu, track_node);
> 
> 	if (kvmgt_gfn_is_write_protected(info, gpa_to_gfn(gpa)))
> 		intel_vgpu_page_track_handler(info, gpa,
> 						     (void *)val, len);
>   }
> 
> Acquiring mmu_lock isn't an option as intel_vgpu_page_track_handler() might sleep,
> e.g. when acquiring vgpu_lock.
>
I totally agree with you and actually had the same feeling as you when
examined the code yesterday. But I thought I'd better send this series
first and do the cleanup later :)
And I'm also not sure if a slots_arch_lock is required for
kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
(Though it doesn't matter for a removing slot and we actually needn't to
call kvm_slot_page_track_remove_page() in response to a slot removal,
the two interfaces are still required elsewhere.)


> Let me see if the clean up I have in mind will actually work.  If it does, I think
> the end result will be quite nice for both KVM and KVMGT.
Yes, it would be great.

Thanks
Yan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot
  2022-11-11 17:07   ` Sean Christopherson
@ 2022-11-12  0:05     ` Yan Zhao
  0 siblings, 0 replies; 16+ messages in thread
From: Yan Zhao @ 2022-11-12  0:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, intel-gfx, linux-kernel, zhenyuw, pbonzini, intel-gvt-dev

On Fri, Nov 11, 2022 at 05:07:35PM +0000, Sean Christopherson wrote:
> On Fri, Nov 11, 2022, Yan Zhao wrote:
> > KVMGT only cares about when a slot is indeed removed.
> > So switch to use track_remove_slot which is called when a slot is removed.
> 
> This should capture the original motivation, i.e. that the existing
> ->track_flush_slot() hook is theoretically flawed.  I think it also makes sense
> to call out that KVMGT undoubtedly does the wrong thing if a memslot is moved,
> but that (a) KVMGT has never supported moving memslots and (b) there's no sane
> use case for moving memslots that might be used by the guest for the GTT.
> 
> Bonus points if you can figure out a way to capture the restriction in the docs,
> e.g. somewhere in gpu/i915.rst?
> 
> Lastly, provide a link to the original discussion which provides even more context.
> 
> Link: https://lore.kernel.org/all/20221108084416.11447-1-yan.y.zhao@intel.com
>
Got it. I'll do it next time!

Thanks
Yan

> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-12  0:03     ` Yan Zhao
@ 2022-11-12  0:43       ` Sean Christopherson
  2022-11-14  1:05         ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-12  0:43 UTC (permalink / raw)
  To: Yan Zhao; +Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Sat, Nov 12, 2022, Yan Zhao wrote:
> And I'm also not sure if a slots_arch_lock is required for
> kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().

It's not required.  slots_arch_lock protects interaction between memslot updates
mmu_first_shadow_root_alloc().  When CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y, then
the mmu_first_shadow_root_alloc() doesn't touch the memslots because everything
is pre-allocated:

bool kvm_page_track_write_tracking_enabled(struct kvm *kvm)
{
	return IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) ||
	       !tdp_enabled || kvm_shadow_root_allocated(kvm);
}

int kvm_page_track_create_memslot(struct kvm *kvm,
				  struct kvm_memory_slot *slot,
				  unsigned long npages)
{
	if (!kvm_page_track_write_tracking_enabled(kvm)) <== always true
		return 0;

	return __kvm_page_track_write_tracking_alloc(slot, npages);
}

Though now that you point it out, it's tempting to #ifdef out some of those hooks
so that's basically impossible for mmu_first_shadow_root_alloc() to cause problems.
Not sure the extra #ideffery would be worth while though.

slots_arch_lock also protects shadow_root_allocated, but that's a KVM-internal
detail that isn't relevant to the page-tracking machinery when
CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-12  0:43       ` Sean Christopherson
@ 2022-11-14  1:05         ` Yan Zhao
  2022-11-14 16:32           ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-14  1:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> On Sat, Nov 12, 2022, Yan Zhao wrote:
> > And I'm also not sure if a slots_arch_lock is required for
> > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> 
> It's not required.  slots_arch_lock protects interaction between memslot updates
In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
do you know which lock is used to protect it?

Thanks
Yan

> mmu_first_shadow_root_alloc().  When CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y, then
> the mmu_first_shadow_root_alloc() doesn't touch the memslots because everything
> is pre-allocated:
> 
> bool kvm_page_track_write_tracking_enabled(struct kvm *kvm)
> {
> 	return IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) ||
> 	       !tdp_enabled || kvm_shadow_root_allocated(kvm);
> }
> 
> int kvm_page_track_create_memslot(struct kvm *kvm,
> 				  struct kvm_memory_slot *slot,
> 				  unsigned long npages)
> {
> 	if (!kvm_page_track_write_tracking_enabled(kvm)) <== always true
> 		return 0;
> 
> 	return __kvm_page_track_write_tracking_alloc(slot, npages);
> }
> 
> Though now that you point it out, it's tempting to #ifdef out some of those hooks
> so that's basically impossible for mmu_first_shadow_root_alloc() to cause problems.
> Not sure the extra #ideffery would be worth while though.
> 
> slots_arch_lock also protects shadow_root_allocated, but that's a KVM-internal
> detail that isn't relevant to the page-tracking machinery when
> CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-14  1:05         ` Yan Zhao
@ 2022-11-14 16:32           ` Sean Christopherson
  2022-11-14 22:42             ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-14 16:32 UTC (permalink / raw)
  To: Yan Zhao; +Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Mon, Nov 14, 2022, Yan Zhao wrote:
> On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > And I'm also not sure if a slots_arch_lock is required for
> > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > 
> > It's not required.  slots_arch_lock protects interaction between memslot updates
> In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> do you know which lock is used to protect it?

mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
allocates gfn_track for all memslots when shadow paging is activated.

The cleanup series I'm prepping adds lockdep assertions for the relevant paths, e.g. 

$ git grep -B 8 -E "update_gfn_write_track.*;"
arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_add_gfn(struct kvm *kvm, struct kvm_memory_slot *slot,
arch/x86/kvm/mmu/page_track.c-                         gfn_t gfn)
arch/x86/kvm/mmu/page_track.c-{
arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
arch/x86/kvm/mmu/page_track.c-
arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
arch/x86/kvm/mmu/page_track.c-          return;
arch/x86/kvm/mmu/page_track.c-
arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, 1);
--
arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_remove_gfn(struct kvm *kvm,
arch/x86/kvm/mmu/page_track.c-                            struct kvm_memory_slot *slot, gfn_t gfn)
arch/x86/kvm/mmu/page_track.c-{
arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
arch/x86/kvm/mmu/page_track.c-
arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
arch/x86/kvm/mmu/page_track.c-          return;
arch/x86/kvm/mmu/page_track.c-
arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, -1);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-14 16:32           ` Sean Christopherson
@ 2022-11-14 22:42             ` Yan Zhao
  2022-11-14 23:24               ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-14 22:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Mon, Nov 14, 2022 at 04:32:34PM +0000, Sean Christopherson wrote:
> On Mon, Nov 14, 2022, Yan Zhao wrote:
> > On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > > And I'm also not sure if a slots_arch_lock is required for
> > > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > > 
> > > It's not required.  slots_arch_lock protects interaction between memslot updates
> > In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> > slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> > do you know which lock is used to protect it?
> 
> mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
> protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
> allocates gfn_track for all memslots when shadow paging is activated.
Hmm, thanks for the reply.
but in direct_page_fault(),
if (page_fault_handle_page_track(vcpu, fault))
	return RET_PF_EMULATE;

slot->arch.gfn_track is read without any mmu_lock is held.
> 
> The cleanup series I'm prepping adds lockdep assertions for the relevant paths, e.g. 
> 
> $ git grep -B 8 -E "update_gfn_write_track.*;"
> arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_add_gfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> arch/x86/kvm/mmu/page_track.c-                         gfn_t gfn)
> arch/x86/kvm/mmu/page_track.c-{
> arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
> arch/x86/kvm/mmu/page_track.c-
> arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
> arch/x86/kvm/mmu/page_track.c-          return;
> arch/x86/kvm/mmu/page_track.c-
> arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, 1);
> --
> arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_remove_gfn(struct kvm *kvm,
> arch/x86/kvm/mmu/page_track.c-                            struct kvm_memory_slot *slot, gfn_t gfn)
> arch/x86/kvm/mmu/page_track.c-{
> arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
> arch/x86/kvm/mmu/page_track.c-
> arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
> arch/x86/kvm/mmu/page_track.c-          return;
> arch/x86/kvm/mmu/page_track.c-
> arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, -1);
yes, it will be helpful.

Besides, will WRITE_ONCE or atomic_add in update_gfn_write_track() to
update slot->arch.gfn_track be better?

Thanks
Yan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-14 23:24               ` Sean Christopherson
@ 2022-11-14 23:22                 ` Yan Zhao
  2022-11-15  0:55                   ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Yan Zhao @ 2022-11-14 23:22 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Mon, Nov 14, 2022 at 11:24:16PM +0000, Sean Christopherson wrote:
> On Tue, Nov 15, 2022, Yan Zhao wrote:
> > On Mon, Nov 14, 2022 at 04:32:34PM +0000, Sean Christopherson wrote:
> > > On Mon, Nov 14, 2022, Yan Zhao wrote:
> > > > On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > > > > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > > > > And I'm also not sure if a slots_arch_lock is required for
> > > > > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > > > > 
> > > > > It's not required.  slots_arch_lock protects interaction between memslot updates
> > > > In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> > > > slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> > > > do you know which lock is used to protect it?
> > > 
> > > mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
> > > protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
> > > allocates gfn_track for all memslots when shadow paging is activated.
> > Hmm, thanks for the reply.
> > but in direct_page_fault(),
> > if (page_fault_handle_page_track(vcpu, fault))
> > 	return RET_PF_EMULATE;
> > 
> > slot->arch.gfn_track is read without any mmu_lock is held.
> 
> That's a fast path that deliberately reads out of mmu_lock.  A false positive
> only results in unnecessary emulation, and any false positive is inherently prone
> to races anyways, e.g. fault racing with zap.
what about false negative?
If the fast path read 0 count, no page track write callback will be called and write
protection will be removed in the slow path.


Thanks
Yan
> 
> > > arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_remove_gfn(struct kvm *kvm,
> > > arch/x86/kvm/mmu/page_track.c-                            struct kvm_memory_slot *slot, gfn_t gfn)
> > > arch/x86/kvm/mmu/page_track.c-{
> > > arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
> > > arch/x86/kvm/mmu/page_track.c-
> > > arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
> > > arch/x86/kvm/mmu/page_track.c-          return;
> > > arch/x86/kvm/mmu/page_track.c-
> > > arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, -1);
> > yes, it will be helpful.
> > 
> > Besides, will WRITE_ONCE or atomic_add in update_gfn_write_track() to
> > update slot->arch.gfn_track be better?
> 
> WRITE_ONCE() won't suffice, it needs to be atomic.  Switching to atomic_inc/dec
> isn't worth it so long as KVM's shadow MMU takes mmu_lock for write, i.e. while
> the accounting is mutually exclusive for other reasons in both KVM and KVMGT.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-14 22:42             ` Yan Zhao
@ 2022-11-14 23:24               ` Sean Christopherson
  2022-11-14 23:22                 ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-14 23:24 UTC (permalink / raw)
  To: Yan Zhao; +Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Tue, Nov 15, 2022, Yan Zhao wrote:
> On Mon, Nov 14, 2022 at 04:32:34PM +0000, Sean Christopherson wrote:
> > On Mon, Nov 14, 2022, Yan Zhao wrote:
> > > On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > > > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > > > And I'm also not sure if a slots_arch_lock is required for
> > > > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > > > 
> > > > It's not required.  slots_arch_lock protects interaction between memslot updates
> > > In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> > > slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> > > do you know which lock is used to protect it?
> > 
> > mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
> > protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
> > allocates gfn_track for all memslots when shadow paging is activated.
> Hmm, thanks for the reply.
> but in direct_page_fault(),
> if (page_fault_handle_page_track(vcpu, fault))
> 	return RET_PF_EMULATE;
> 
> slot->arch.gfn_track is read without any mmu_lock is held.

That's a fast path that deliberately reads out of mmu_lock.  A false positive
only results in unnecessary emulation, and any false positive is inherently prone
to races anyways, e.g. fault racing with zap.

> > arch/x86/kvm/mmu/page_track.c-void __kvm_write_track_remove_gfn(struct kvm *kvm,
> > arch/x86/kvm/mmu/page_track.c-                            struct kvm_memory_slot *slot, gfn_t gfn)
> > arch/x86/kvm/mmu/page_track.c-{
> > arch/x86/kvm/mmu/page_track.c-  lockdep_assert_held_write(&kvm->mmu_lock);
> > arch/x86/kvm/mmu/page_track.c-
> > arch/x86/kvm/mmu/page_track.c-  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
> > arch/x86/kvm/mmu/page_track.c-          return;
> > arch/x86/kvm/mmu/page_track.c-
> > arch/x86/kvm/mmu/page_track.c:  update_gfn_write_track(slot, gfn, -1);
> yes, it will be helpful.
> 
> Besides, will WRITE_ONCE or atomic_add in update_gfn_write_track() to
> update slot->arch.gfn_track be better?

WRITE_ONCE() won't suffice, it needs to be atomic.  Switching to atomic_inc/dec
isn't worth it so long as KVM's shadow MMU takes mmu_lock for write, i.e. while
the accounting is mutually exclusive for other reasons in both KVM and KVMGT.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-14 23:22                 ` Yan Zhao
@ 2022-11-15  0:55                   ` Sean Christopherson
  2022-11-15  1:08                     ` Yan Zhao
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2022-11-15  0:55 UTC (permalink / raw)
  To: Yan Zhao; +Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Tue, Nov 15, 2022, Yan Zhao wrote:
> On Mon, Nov 14, 2022 at 11:24:16PM +0000, Sean Christopherson wrote:
> > On Tue, Nov 15, 2022, Yan Zhao wrote:
> > > On Mon, Nov 14, 2022 at 04:32:34PM +0000, Sean Christopherson wrote:
> > > > On Mon, Nov 14, 2022, Yan Zhao wrote:
> > > > > On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > > > > > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > > > > > And I'm also not sure if a slots_arch_lock is required for
> > > > > > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > > > > > 
> > > > > > It's not required.  slots_arch_lock protects interaction between memslot updates
> > > > > In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> > > > > slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> > > > > do you know which lock is used to protect it?
> > > > 
> > > > mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
> > > > protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
> > > > allocates gfn_track for all memslots when shadow paging is activated.
> > > Hmm, thanks for the reply.
> > > but in direct_page_fault(),
> > > if (page_fault_handle_page_track(vcpu, fault))
> > > 	return RET_PF_EMULATE;
> > > 
> > > slot->arch.gfn_track is read without any mmu_lock is held.
> > 
> > That's a fast path that deliberately reads out of mmu_lock.  A false positive
> > only results in unnecessary emulation, and any false positive is inherently prone
> > to races anyways, e.g. fault racing with zap.
> what about false negative?
> If the fast path read 0 count, no page track write callback will be called and write
> protection will be removed in the slow path.

No.  For a false negative to occur, a different task would have to create a SPTE
and write-protect the GFN _while holding mmu_lock_.  And then after acquiring
mmu_lock, the vCPU that got the false negative would call make_spte(), which would
detect that making the SPTE writable is disallowed due to the GFN being write-protected.

	if (pte_access & ACC_WRITE_MASK) {
		spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask;

		/*
		 * Optimization: for pte sync, if spte was writable the hash
		 * lookup is unnecessary (and expensive). Write protection
		 * is responsibility of kvm_mmu_get_page / kvm_mmu_sync_roots.
		 * Same reasoning can be applied to dirty page accounting.
		 */
		if (is_writable_pte(old_spte))
			goto out;

		/*
		 * Unsync shadow pages that are reachable by the new, writable
		 * SPTE.  Write-protect the SPTE if the page can't be unsync'd,
		 * e.g. it's write-tracked (upper-level SPs) or has one or more
		 * shadow pages and unsync'ing pages is not allowed.
		 */
		if (mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, can_unsync, prefetch)) {
			pgprintk("%s: found shadow page for %llx, marking ro\n",
				 __func__, gfn);
			wrprot = true;
			pte_access &= ~ACC_WRITE_MASK;
			spte &= ~(PT_WRITABLE_MASK | shadow_mmu_writable_mask);
		}
	}



int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
			    gfn_t gfn, bool can_unsync, bool prefetch)
{
	struct kvm_mmu_page *sp;
	bool locked = false;

	/*
	 * Force write-protection if the page is being tracked.  Note, the page
	 * track machinery is used to write-protect upper-level shadow pages,
	 * i.e. this guards the role.level == 4K assertion below!
	 */
	if (kvm_slot_page_track_is_active(kvm, slot, gfn, KVM_PAGE_TRACK_WRITE))
		return -EPERM;

	...
}

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot
  2022-11-15  0:55                   ` Sean Christopherson
@ 2022-11-15  1:08                     ` Yan Zhao
  0 siblings, 0 replies; 16+ messages in thread
From: Yan Zhao @ 2022-11-15  1:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, pbonzini, intel-gfx, intel-gvt-dev, zhenyuw

On Tue, Nov 15, 2022 at 12:55:42AM +0000, Sean Christopherson wrote:
> On Tue, Nov 15, 2022, Yan Zhao wrote:
> > On Mon, Nov 14, 2022 at 11:24:16PM +0000, Sean Christopherson wrote:
> > > On Tue, Nov 15, 2022, Yan Zhao wrote:
> > > > On Mon, Nov 14, 2022 at 04:32:34PM +0000, Sean Christopherson wrote:
> > > > > On Mon, Nov 14, 2022, Yan Zhao wrote:
> > > > > > On Sat, Nov 12, 2022 at 12:43:07AM +0000, Sean Christopherson wrote:
> > > > > > > On Sat, Nov 12, 2022, Yan Zhao wrote:
> > > > > > > > And I'm also not sure if a slots_arch_lock is required for
> > > > > > > > kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
> > > > > > > 
> > > > > > > It's not required.  slots_arch_lock protects interaction between memslot updates
> > > > > > In kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page(),
> > > > > > slot->arch.gfn_track[mode][index] is updated in update_gfn_track(),
> > > > > > do you know which lock is used to protect it?
> > > > > 
> > > > > mmu_lock protects the count, kvm->srcu protects the slot, and shadow_root_allocated
> > > > > protects that validity of gfn_track, i.e. shadow_root_allocated ensures that KVM
> > > > > allocates gfn_track for all memslots when shadow paging is activated.
> > > > Hmm, thanks for the reply.
> > > > but in direct_page_fault(),
> > > > if (page_fault_handle_page_track(vcpu, fault))
> > > > 	return RET_PF_EMULATE;
> > > > 
> > > > slot->arch.gfn_track is read without any mmu_lock is held.
> > > 
> > > That's a fast path that deliberately reads out of mmu_lock.  A false positive
> > > only results in unnecessary emulation, and any false positive is inherently prone
> > > to races anyways, e.g. fault racing with zap.
> > what about false negative?
> > If the fast path read 0 count, no page track write callback will be called and write
> > protection will be removed in the slow path.
> 
> No.  For a false negative to occur, a different task would have to create a SPTE
> and write-protect the GFN _while holding mmu_lock_.  And then after acquiring
> mmu_lock, the vCPU that got the false negative would call make_spte(), which would
> detect that making the SPTE writable is disallowed due to the GFN being write-protected.
> 
> 	if (pte_access & ACC_WRITE_MASK) {
> 		spte |= PT_WRITABLE_MASK | shadow_mmu_writable_mask;
> 
> 		/*
> 		 * Optimization: for pte sync, if spte was writable the hash
> 		 * lookup is unnecessary (and expensive). Write protection
> 		 * is responsibility of kvm_mmu_get_page / kvm_mmu_sync_roots.
> 		 * Same reasoning can be applied to dirty page accounting.
> 		 */
> 		if (is_writable_pte(old_spte))
> 			goto out;
> 
> 		/*
> 		 * Unsync shadow pages that are reachable by the new, writable
> 		 * SPTE.  Write-protect the SPTE if the page can't be unsync'd,
> 		 * e.g. it's write-tracked (upper-level SPs) or has one or more
> 		 * shadow pages and unsync'ing pages is not allowed.
> 		 */
> 		if (mmu_try_to_unsync_pages(vcpu->kvm, slot, gfn, can_unsync, prefetch)) {
> 			pgprintk("%s: found shadow page for %llx, marking ro\n",
> 				 __func__, gfn);
> 			wrprot = true;
> 			pte_access &= ~ACC_WRITE_MASK;
> 			spte &= ~(PT_WRITABLE_MASK | shadow_mmu_writable_mask);
> 		}
> 	}
> 
> 
> 
> int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
> 			    gfn_t gfn, bool can_unsync, bool prefetch)
> {
> 	struct kvm_mmu_page *sp;
> 	bool locked = false;
> 
> 	/*
> 	 * Force write-protection if the page is being tracked.  Note, the page
> 	 * track machinery is used to write-protect upper-level shadow pages,
> 	 * i.e. this guards the role.level == 4K assertion below!
> 	 */
> 	if (kvm_slot_page_track_is_active(kvm, slot, gfn, KVM_PAGE_TRACK_WRITE))
> 		return -EPERM;
> 
> 	...
> }

Oh, you are right! I thought mmu_try_to_unsync_pages() is only for the
shadow mmu, and overlooked that TDP MMU will also go into it.

Thanks for the detailed explanation.

Thanks
Yan

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-11-15  1:32 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-11 10:32 [PATCH v2 0/3] add track_remove_slot and remove track_flush_slot Yan Zhao
2022-11-11 10:33 ` [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot Yan Zhao
2022-11-11 18:19   ` Sean Christopherson
2022-11-12  0:03     ` Yan Zhao
2022-11-12  0:43       ` Sean Christopherson
2022-11-14  1:05         ` Yan Zhao
2022-11-14 16:32           ` Sean Christopherson
2022-11-14 22:42             ` Yan Zhao
2022-11-14 23:24               ` Sean Christopherson
2022-11-14 23:22                 ` Yan Zhao
2022-11-15  0:55                   ` Sean Christopherson
2022-11-15  1:08                     ` Yan Zhao
2022-11-11 10:34 ` [PATCH v2 2/3] drm/i915/gvt: switch from track_flush_slot to track_remove_slot Yan Zhao
2022-11-11 17:07   ` Sean Christopherson
2022-11-12  0:05     ` Yan Zhao
2022-11-11 10:35 ` [PATCH v2 3/3] KVM: x86: Remove the unused page track hook track_flush_slot Yan Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).