Re: [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd
       [not found] <d79aacb5-9069-4647-9332-86f7d74b747a@email.android.com>
@ 2022-02-14 17:14 ` Sean Christopherson
  2022-02-14 18:50   ` David Woodhouse
  0 siblings, 1 reply; 3+ messages in thread
From: Sean Christopherson @ 2022-02-14 17:14 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: Paolo Bonzini, Christian Borntraeger, Janosch Frank,
	David Hildenbrand, Claudio Imbrenda, kvm, linux-kernel

On Sat, Feb 12, 2022, Woodhouse, David wrote:
> 
> (Apologies if this is HTML but I'm half-way to Austria and the laptop is
> buried somewhere in the car, and access to work email with sane email apps is
> difficult.)
> 
> On 12 Feb 2022 03:05, Sean Christopherson <seanjc@google.com> wrote:
> 
> Don't actually set a request bit in vcpu->requests when making a request
> purely to force a vCPU to exit the guest.  Logging the request but not
> actually consuming it causes the vCPU to get stuck in an infinite loop
> during KVM_RUN because KVM sees a pending request and bails from VM-Enter
> to service the request.
> 
> 
> Right, but there is no extant code which does this. The guest_uses_pa flag is
> unused.

Grr.  A WARN or something would have been nice to have.  Oh well.

> The series came with a proof-of-concept that attempted using it for
> fixing nesting UAFs but it was just that — a proof of concept to demonstrate
> that the new design of GPC was sufficient to address that problem.
> 
> IIRC, said proof of concept did also actually consume the req in question,

It did.  I saw that, but obviously didn't connect the dots to guest_uses_pa.

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9826,6 +9826,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)

                if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
                        static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+               if (kvm_check_request(KVM_REQ_GPC_INVALIDATE, vcpu))
+                       ; /* Nothing to do. It just wanted to wake us */

> and one of the existing test cases did exercise it with an additional mmap
> torture added? Of course until we have kernel code that *does* this, it's
> hard to exercise it from userspace :)

Indeed.  I'll send a new version with a different changelog, that way we're not
leaving a trap for developers and each architecture doesn't need to manually handle
the request.

Thanks!

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd
  2022-02-14 17:14 ` [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd Sean Christopherson
@ 2022-02-14 18:50   ` David Woodhouse
  0 siblings, 0 replies; 3+ messages in thread
From: David Woodhouse @ 2022-02-14 18:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Christian Borntraeger, Janosch Frank,
	David Hildenbrand, Claudio Imbrenda, kvm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2914 bytes --]

On Mon, 2022-02-14 at 17:14 +0000, Sean Christopherson wrote:
> On Sat, Feb 12, 2022, Woodhouse, David wrote:
> > (Apologies if this is HTML but I'm half-way to Austria and the laptop is
> > buried somewhere in the car, and access to work email with sane email apps is
> > difficult.)
> > 
> > On 12 Feb 2022 03:05, Sean Christopherson <seanjc@google.com> wrote:
> > 
> > Don't actually set a request bit in vcpu->requests when making a request
> > purely to force a vCPU to exit the guest.  Logging the request but not
> > actually consuming it causes the vCPU to get stuck in an infinite loop
> > during KVM_RUN because KVM sees a pending request and bails from VM-Enter
> > to service the request.
> > 
> > 
> > Right, but there is no extant code which does this. The guest_uses_pa flag is
> > unused.
> 
> Grr.  A WARN or something would have been nice to have.  Oh well.

I don't think it was clear yet what the 'wrong' behaviour would we that
we should warn about, since we really hadn't finished defining the
'correct' usage :)

> > The series came with a proof-of-concept that attempted using it for
> > fixing nesting UAFs but it was just that — a proof of concept to demonstrate
> > that the new design of GPC was sufficient to address that problem.
> > 
> > IIRC, said proof of concept did also actually consume the req in question,
> 
> It did.  I saw that, but obviously didn't connect the dots to guest_uses_pa.
> 
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9826,6 +9826,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> 
>                 if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
>                         static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
> +               if (kvm_check_request(KVM_REQ_GPC_INVALIDATE, vcpu))
> +                       ; /* Nothing to do. It just wanted to wake us */
> 

Right. That's a later iteration. I originally *did* actually use it to
trigger an action, IIRC, but Paolo suggested we do it differently.

I also pondered having each GPC able to raise a specific request, and
set the KVM_REQ_xxx bit in the GPC itself. That just made the loop in
invalidate_start() a bit more complex though because of the way that it
wakes the vCPUs after it's done its iteration and collected them in a
cpumask. So it seemed like premature deoptimisation; we can add that in
future if we really do need it, and looks like we're going in the
opposite direction.

> > and one of the existing test cases did exercise it with an additional mmap
> > torture added? Of course until we have kernel code that *does* this, it's
> > hard to exercise it from userspace :)
> 
> Indeed.  I'll send a new version with a different changelog, that way we're not
> leaving a trap for developers and each architecture doesn't need to manually handle
> the request.

Ack, thanks.

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd
@ 2022-02-12  2:04 Sean Christopherson
  0 siblings, 0 replies; 3+ messages in thread
From: Sean Christopherson @ 2022-02-12  2:04 UTC (permalink / raw)
  To: Paolo Bonzini, Christian Borntraeger, Janosch Frank
  Cc: David Hildenbrand, Claudio Imbrenda, kvm, linux-kernel,
	David Woodhouse, Sean Christopherson

Don't actually set a request bit in vcpu->requests when making a request
purely to force a vCPU to exit the guest.  Logging the request but not
actually consuming it causes the vCPU to get stuck in an infinite loop
during KVM_RUN because KVM sees a pending request and bails from VM-Enter
to service the request.

Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus
to evict_vcpus, the purpose of the request is to get vCPUs out of guest
mode, it specifically is supposed to avoid waking vCPUs that are blocking.

Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to
what it wants to accomplish, and to genericize the name so that it can
used for similar but unrelated scenarios, should they arise in the future.
Add a comment and documentation to explain why the "no action" request
exists.

Add compile-time assertions to help detect improper usage.  Use the inner
assertless helper in the one s390 path that makes requests without a
hardcoded request.

Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

This code badly needs a selftest, any triggering of the related code would
hang the target vCPU, i.e. this is likely getting zero coverage.  Found by
inspection when resolving a conflict and trying to opportunistically type
of documentation for KVM_REQ_GPC_INVALIDATE.

Verified by hacking gfn_to_pfn_cache_invalidate_start() to unconditionally
send a request to all vCPUs.

 Documentation/virt/kvm/vcpu-requests.rst | 10 +++++++++
 arch/s390/kvm/kvm-s390.c                 |  2 +-
 include/linux/kvm_host.h                 | 27 ++++++++++++++++++++++--
 virt/kvm/kvm_main.c                      |  3 ++-
 virt/kvm/pfncache.c                      | 18 ++++++++++------
 5 files changed, 49 insertions(+), 11 deletions(-)

diff --git a/Documentation/virt/kvm/vcpu-requests.rst b/Documentation/virt/kvm/vcpu-requests.rst
index ad2915ef7020..f3d38e8a1fb3 100644
--- a/Documentation/virt/kvm/vcpu-requests.rst
+++ b/Documentation/virt/kvm/vcpu-requests.rst
@@ -136,6 +136,16 @@ KVM_REQ_UNHALT
   such as a pending signal, which does not indicate the VCPU's halt
   emulation should stop, and therefore does not make the request.
 
+KVM_REQ_OUTSIDE_GUEST_MODE
+
+  This "request" ensures the target vCPU has exited guest mode prior to the
+  sender of the request continuing on.  No action needs be taken by the target,
+  and so no request is actually logged for the target.  This request is similar
+  to a "kick", but unlike a kick it guarantees the vCPU has actually exited
+  guest mode.  A kick only guarantees the vCPU will exit at some point in the
+  future, e.g. a previous kick may have started the process, but there's no
+  guarantee the to-be-kicked vCPU has fully exited guest mode.
+
 KVM_REQUEST_MASK
 ----------------
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 577f1ead6a51..54b7e0017208 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3371,7 +3371,7 @@ void exit_sie(struct kvm_vcpu *vcpu)
 /* Kick a guest cpu out of SIE to process a request synchronously */
 void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu)
 {
-	kvm_make_request(req, vcpu);
+	__kvm_make_request(req, vcpu);
 	kvm_s390_vcpu_request(vcpu);
 }
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f11039944c08..a67670201888 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -148,6 +148,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQUEST_MASK           GENMASK(7,0)
 #define KVM_REQUEST_NO_WAKEUP      BIT(8)
 #define KVM_REQUEST_WAIT           BIT(9)
+#define KVM_REQUEST_NO_ACTION      BIT(10)
 /*
  * Architecture-independent vcpu->requests bit members
  * Bits 4-7 are reserved for more arch-independent bits.
@@ -157,9 +158,18 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_UNBLOCK           2
 #define KVM_REQ_UNHALT            3
 #define KVM_REQ_VM_DEAD           (4 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
-#define KVM_REQ_GPC_INVALIDATE    (5 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQUEST_ARCH_BASE     8
 
+/*
+ * KVM_REQ_OUTSIDE_GUEST_MODE exists is purely as way to force the vCPU to
+ * OUTSIDE_GUEST_MODE.  KVM_REQ_OUTSIDE_GUEST_MODE differs from a vCPU "kick"
+ * in that it ensures the vCPU has reached OUTSIDE_GUEST_MODE before continuing
+ * on.  A kick only guarantees that the vCPU is on its way out, e.g. a previous
+ * kick may have set vcpu->mode to EXITING_GUEST_MODE, and so there's no
+ * guarantee the vCPU received an IPI and has actually exited guest mode.
+ */
+#define KVM_REQ_OUTSIDE_GUEST_MODE	(KVM_REQUEST_NO_ACTION | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+
 #define KVM_ARCH_REQ_FLAGS(nr, flags) ({ \
 	BUILD_BUG_ON((unsigned)(nr) >= (sizeof_field(struct kvm_vcpu, requests) * 8) - KVM_REQUEST_ARCH_BASE); \
 	(unsigned)(((nr) + KVM_REQUEST_ARCH_BASE) | (flags)); \
@@ -1986,7 +1996,7 @@ static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 
 void kvm_arch_irq_routing_update(struct kvm *kvm);
 
-static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
+static inline void __kvm_make_request(int req, struct kvm_vcpu *vcpu)
 {
 	/*
 	 * Ensure the rest of the request is published to kvm_check_request's
@@ -1996,6 +2006,19 @@ static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
 	set_bit(req & KVM_REQUEST_MASK, (void *)&vcpu->requests);
 }
 
+static __always_inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Request that don't require vCPU action should never be logged in
+	 * vcpu->requests.  The vCPU won't clear the request, so it will stay
+	 * logged indefinitely and prevent the vCPU from entering the guest.
+	 */
+	BUILD_BUG_ON(!__builtin_constant_p(req) ||
+		     (req & KVM_REQUEST_NO_ACTION));
+
+	__kvm_make_request(req, vcpu);
+}
+
 static inline bool kvm_request_pending(struct kvm_vcpu *vcpu)
 {
 	return READ_ONCE(vcpu->requests);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 58d31da8a2f7..90ada92ff031 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -252,7 +252,8 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
 {
 	int cpu;
 
-	kvm_make_request(req, vcpu);
+	if (likely(!(req & KVM_REQUEST_NO_ACTION)))
+		__kvm_make_request(req, vcpu);
 
 	if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
 		return;
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index ce878f4be4da..5c21f81e5491 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -27,7 +27,7 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 {
 	DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
 	struct gfn_to_pfn_cache *gpc;
-	bool wake_vcpus = false;
+	bool evict_vcpus = false;
 
 	spin_lock(&kvm->gpc_lock);
 	list_for_each_entry(gpc, &kvm->gpc_list, list) {
@@ -40,11 +40,11 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 
 			/*
 			 * If a guest vCPU could be using the physical address,
-			 * it needs to be woken.
+			 * it needs to be forced out of guest mode.
 			 */
 			if (gpc->guest_uses_pa) {
-				if (!wake_vcpus) {
-					wake_vcpus = true;
+				if (!evict_vcpus) {
+					evict_vcpus = true;
 					bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS);
 				}
 				__set_bit(gpc->vcpu->vcpu_idx, vcpu_bitmap);
@@ -67,14 +67,18 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 	}
 	spin_unlock(&kvm->gpc_lock);
 
-	if (wake_vcpus) {
-		unsigned int req = KVM_REQ_GPC_INVALIDATE;
+	if (evict_vcpus) {
+		/*
+		 * KVM needs to ensure the vCPU is fully out of guest context
+		 * before allowing the invalidation to continue.
+		 */
+		unsigned int req = KVM_REQ_OUTSIDE_GUEST_MODE;
 		bool called;
 
 		/*
 		 * If the OOM reaper is active, then all vCPUs should have
 		 * been stopped already, so perform the request without
-		 * KVM_REQUEST_WAIT and be sad if any needed to be woken.
+		 * KVM_REQUEST_WAIT and be sad if any needed to be IPI'd.
 		 */
 		if (!may_block)
 			req &= ~KVM_REQUEST_WAIT;

base-commit: 66fa226c131fb89287f8f7d004a46e39a859fbf6
-- 
2.35.1.265.g69c8d7142f-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-02-14 20:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <d79aacb5-9069-4647-9332-86f7d74b747a@email.android.com>
2022-02-14 17:14 ` [PATCH] KVM: Don't actually set a request when evicting vCPUs for GFN cache invd Sean Christopherson
2022-02-14 18:50   ` David Woodhouse
2022-02-12  2:04 Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).