linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails
@ 2021-12-10 22:29 Sean Christopherson
  2021-12-10 22:41 ` Paolo Bonzini
  2021-12-13 16:53 ` Vitaly Kuznetsov
  0 siblings, 2 replies; 3+ messages in thread
From: Sean Christopherson @ 2021-12-10 22:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Hou Wenlong

Inject a #UD if patching in the correct hypercall fails, e.g. due to
emulator_write_emulated() failing because RIP is mapped not-writable by
the guest.  The guest is likely doomed in any case, but observing a #UD
in the guest is far friendlier to debug/triage than a !WRITABLE #PF with
CR2 pointing at the RIP of the faulting instruction.

Ideally, KVM wouldn't patch at all; it's the guest's responsibility to
identify and use the correct hypercall instruction (VMCALL vs. VMMCALL).
Sadly, older Linux kernels prior to commit c1118b3602c2 ("x86: kvm: use
alternatives for VMCALL vs. VMMCALL if kernel text is read-only") do the
wrong thing and blindly use VMCALL, i.e. removing the patching would
break running VMs with older kernels.

One could argue that KVM should be "fixed" to ignore guest paging
protections instead of injecting #UD, but patching in the first place was
a mistake as it was a hack-a-fix for a guest bug.  There are myriad fatal
issues with KVM's patching:

  1. Patches using an emulated guest write, which will fail if RIP is not
     mapped writable.  This is the issue being mitigated.

  2. Doesn't ensure the write is "atomic", e.g. a hypercall that splits a
     page boundary will be handled as two separate writes, which means
     that a partial, corrupted instruction can be observed by a vCPU.

  3. Doesn't serialize other CPU cores after updating the code stream.

  4. Completely fails to account for the case where KVM is emulating due
     to invalid guest state with unrestricted_guest=0.  Patching and
     retrying the instruction will result in vCPU getting stuck in an
     infinite loop.

But, the "support" _so_ awful, especially #1, that there's practically
zero chance that a modern guest kernel can rely on KVM to patch the guest.
So, rather than proliferate KVM's bad behavior any further than the
absolute minimum needed for backwards compatibility, just try to make it
suck a little less.

Cc: Hou Wenlong <houwenlong93@linux.alibaba.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/emulate.c |  2 +-
 arch/x86/kvm/x86.c     | 13 +++++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 28b1a4e57827..3ccf7b73687f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3734,7 +3734,7 @@ static int em_hypercall(struct x86_emulate_ctxt *ctxt)
 	int rc = ctxt->ops->fix_hypercall(ctxt);
 
 	if (rc != X86EMUL_CONTINUE)
-		return rc;
+		return emulate_ud(ctxt);
 
 	/* Let the processor re-execute the fixed hypercall */
 	ctxt->_eip = ctxt->eip;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26cb3a4cd0e9..1a844ad873ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9026,11 +9026,20 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	char instruction[3];
 	unsigned long rip = kvm_rip_read(vcpu);
+	struct x86_exception e;
+	int r;
 
 	static_call(kvm_x86_patch_hypercall)(vcpu, instruction);
 
-	return emulator_write_emulated(ctxt, rip, instruction, 3,
-		&ctxt->exception);
+	/*
+	 * Eat any exceptions, e.g. if RIP is not mapped writable, and simply
+	 * signal failure to the caller.  Faults on the write are (obviously)
+	 * not from the guest, though the guest is likely doomed in any case.
+	 */
+	r = emulator_write_emulated(ctxt, rip, instruction, 3, &e);
+	if (r != X86EMUL_CONTINUE)
+		return X86EMUL_UNHANDLEABLE;
+	return X86EMUL_CONTINUE;
 }
 
 static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)
-- 
2.34.1.173.g76aa8bc2d0-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails
  2021-12-10 22:29 [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails Sean Christopherson
@ 2021-12-10 22:41 ` Paolo Bonzini
  2021-12-13 16:53 ` Vitaly Kuznetsov
  1 sibling, 0 replies; 3+ messages in thread
From: Paolo Bonzini @ 2021-12-10 22:41 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Hou Wenlong

On 12/10/21 23:29, Sean Christopherson wrote:
> Inject a #UD if patching in the correct hypercall fails, e.g. due to
> emulator_write_emulated() failing because RIP is mapped not-writable by
> the guest.  The guest is likely doomed in any case, but observing a #UD
> in the guest is far friendlier to debug/triage than a !WRITABLE #PF with
> CR2 pointing at the RIP of the faulting instruction.
> 
> Ideally, KVM wouldn't patch at all; it's the guest's responsibility to
> identify and use the correct hypercall instruction (VMCALL vs. VMMCALL).
> Sadly, older Linux kernels prior to commit c1118b3602c2 ("x86: kvm: use
> alternatives for VMCALL vs. VMMCALL if kernel text is read-only") do the
> wrong thing and blindly use VMCALL, i.e. removing the patching would
> break running VMs with older kernels.
> 
> One could argue that KVM should be "fixed" to ignore guest paging
> protections instead of injecting #UD, but patching in the first place was
> a mistake as it was a hack-a-fix for a guest bug.

Sort of.  I agree that patching is awful, but I'm not sure about 
injecting #UD vs. just doing the hypercall; the original reason for the 
patching was to allow Intel<->AMD cross-vendor migration to work somewhat.

That in turn promoted Linux's ill-conceived sloppiness of just using 
vmcall, which lasted until commit c1118b3602c2.

> There are myriad fatal
> issues with KVM's patching:
> 
>    1. Patches using an emulated guest write, which will fail if RIP is not
>       mapped writable.  This is the issue being mitigated.
> 
>    2. Doesn't ensure the write is "atomic", e.g. a hypercall that splits a
>       page boundary will be handled as two separate writes, which means
>       that a partial, corrupted instruction can be observed by a vCPU.

Only the third bytes differs between VMCALL and VMMCALL so that's not 
really a problem.  (Apparently what happened is that Microsoft asked 
Intel to use 0xc1 like AMD, and VMware asked AMD to use 0xd9 like Intel, 
or something like that; and they ended up swapping opcodes.  But this 
may be an urban legend, no matter how plausible).

The big ones are 1 and 4.

Thanks,

Paolo

>    3. Doesn't serialize other CPU cores after updating the code stream.
> 
>    4. Completely fails to account for the case where KVM is emulating due
>       to invalid guest state with unrestricted_guest=0.  Patching and
>       retrying the instruction will result in vCPU getting stuck in an
>       infinite loop.
> 
> But, the "support" _so_ awful, especially #1, that there's practically
> zero chance that a modern guest kernel can rely on KVM to patch the guest.
> So, rather than proliferate KVM's bad behavior any further than the
> absolute minimum needed for backwards compatibility, just try to make it
> suck a little less.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails
  2021-12-10 22:29 [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails Sean Christopherson
  2021-12-10 22:41 ` Paolo Bonzini
@ 2021-12-13 16:53 ` Vitaly Kuznetsov
  1 sibling, 0 replies; 3+ messages in thread
From: Vitaly Kuznetsov @ 2021-12-13 16:53 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, Hou Wenlong

Sean Christopherson <seanjc@google.com> writes:

> Ideally, KVM wouldn't patch at all; it's the guest's responsibility to
> identify and use the correct hypercall instruction (VMCALL vs. VMMCALL).
> Sadly, older Linux kernels prior to commit c1118b3602c2 ("x86: kvm: use
> alternatives for VMCALL vs. VMMCALL if kernel text is read-only") do the
> wrong thing and blindly use VMCALL, i.e. removing the patching would
> break running VMs with older kernels.
>

FWIW, we also use hypercall patching for Hyper-V emulation (when
HV_X64_MSR_HYPERCALL is written) and this complies with TLFS, we can't
get rid of this. It's a different 'patching' though...

-- 
Vitaly


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-12-13 16:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-10 22:29 [PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails Sean Christopherson
2021-12-10 22:41 ` Paolo Bonzini
2021-12-13 16:53 ` Vitaly Kuznetsov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).