All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation
@ 2022-08-10 14:00 Vitaly Kuznetsov
  2022-08-10 14:00 ` [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset Vitaly Kuznetsov
  2022-08-10 14:00 ` [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset Vitaly Kuznetsov
  0 siblings, 2 replies; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-10 14:00 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
to reboot when QEMU's 'system_reset' command is issued. The problem appears
to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
VMX root operation (post-VMXON). kvm_arch_put_registers() does call 
kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
operation, however, it only does so after kvm_put_sregs2() and there's
a good reason for that: 'real' nested state requires e.g. EFER.SVME to
be set. While swapping kvm_put_sregs2()/kvm_put_nested_state() order
in kvm_arch_put_registers() can't be done in KVM_PUT_FULL_STATE case,
doing it in KVM_PUT_RESET_STATE seems like a reasonable band aid.

The root cause of the issue seems to be that QEMU is doing quite a lot
to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
rather, it's super-set) yet. While all the numerous existing APIs for
setting a vCPU state work fine for a newly created vCPU, using them for
vCPU reset is a mess caused by various dependencies between different
components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
been possible to allow to set 'inconsistent' state and only validate it
upon VCPU_RUN from the very beginning but that ship has long sailed for
KVM. A new, dedicated API for vCPU reset is likely the way to go.

RFC part: the immediate issue could've probably been solved in KVM too
by avoiding vmx_is_valid_cr4() check from __set_sregs2() and hoping that
someone will check for the resulting inconsistency later. I don't quite
like this option so I didn't explore it in depth.

Vitaly Kuznetsov (2):
  i386: reset KVM nested state upon CPU reset
  i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is
    reset

 target/i386/kvm/kvm.c | 57 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 45 insertions(+), 12 deletions(-)

-- 
2.37.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset
  2022-08-10 14:00 [PATCH RFC v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation Vitaly Kuznetsov
@ 2022-08-10 14:00 ` Vitaly Kuznetsov
  2022-08-10 14:50   ` Maxim Levitsky
  2022-08-10 14:00 ` [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset Vitaly Kuznetsov
  1 sibling, 1 reply; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-10 14:00 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

Make sure env->nested_state is cleaned up when a vCPU is reset, it may
be stale after an incoming migration, kvm_arch_put_registers() may
end up failing or putting vCPU in a weird state.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f148a6d52fa4..4f8dacc1d4b5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1695,6 +1695,30 @@ static void kvm_init_xsave(CPUX86State *env)
            env->xsave_buf_len);
 }
 
+static void kvm_init_nested_state(CPUX86State *env)
+{
+    struct kvm_vmx_nested_state_hdr *vmx_hdr;
+    uint32_t size;
+
+    if (!env->nested_state) {
+        return;
+    }
+
+    size = env->nested_state->size;
+
+    memset(env->nested_state, 0, size);
+    env->nested_state->size = size;
+
+    if (cpu_has_vmx(env)) {
+        env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
+        vmx_hdr = &env->nested_state->hdr.vmx;
+        vmx_hdr->vmxon_pa = -1ull;
+        vmx_hdr->vmcs12_pa = -1ull;
+    } else if (cpu_has_svm(env)) {
+        env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
+    }
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
@@ -2122,19 +2146,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
         assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
 
         if (cpu_has_vmx(env) || cpu_has_svm(env)) {
-            struct kvm_vmx_nested_state_hdr *vmx_hdr;
-
             env->nested_state = g_malloc0(max_nested_state_len);
             env->nested_state->size = max_nested_state_len;
 
-            if (cpu_has_vmx(env)) {
-                env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
-                vmx_hdr = &env->nested_state->hdr.vmx;
-                vmx_hdr->vmxon_pa = -1ull;
-                vmx_hdr->vmcs12_pa = -1ull;
-            } else {
-                env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
-            }
+            kvm_init_nested_state(env);
         }
     }
 
@@ -2199,6 +2214,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
     /* enabled by default */
     env->poll_control_msr = 1;
 
+    kvm_init_nested_state(env);
+
     sev_es_set_reset_vector(CPU(cpu));
 }
 
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset
  2022-08-10 14:00 [PATCH RFC v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation Vitaly Kuznetsov
  2022-08-10 14:00 ` [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset Vitaly Kuznetsov
@ 2022-08-10 14:00 ` Vitaly Kuznetsov
  2022-08-10 14:47   ` Maxim Levitsky
  1 sibling, 1 reply; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-10 14:00 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

Setting nested state upon migration needs to happen after kvm_put_sregs2()
to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
when vCPU is in VMX root operation, certain CR bits are locked and
kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
that vCPU is *not* in VMX root opertaion.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4f8dacc1d4b5..73e3880fa57b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
-    /* must be before kvm_put_nested_state so that EFER.SVME is set */
+    /*
+     * When resetting a vCPU, make sure to reset nested state first to
+     * e.g clear VMXON state and unlock certain CR4 bits.
+     */
+    if (level == KVM_PUT_RESET_STATE) {
+        ret = kvm_put_nested_state(x86_cpu);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
     if (ret < 0) {
         return ret;
     }
 
-    if (level >= KVM_PUT_RESET_STATE) {
+    /*
+     * When putting full CPU state, kvm_put_nested_state() must happen after
+     * kvm_put_sregs{,2} so that e.g. EFER.SVME is already set.
+     */
+    if (level == KVM_PUT_FULL_STATE) {
         ret = kvm_put_nested_state(x86_cpu);
         if (ret < 0) {
             return ret;
         }
+    }
 
+    if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
             return ret;
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset
  2022-08-10 14:00 ` [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset Vitaly Kuznetsov
@ 2022-08-10 14:47   ` Maxim Levitsky
  2022-08-10 14:57     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2022-08-10 14:47 UTC (permalink / raw)
  To: Vitaly Kuznetsov, qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
> Setting nested state upon migration needs to happen after kvm_put_sregs2()
> to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
> when vCPU is in VMX root operation, certain CR bits are locked and
> kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
> vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
> kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
> that vCPU is *not* in VMX root opertaion.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 4f8dacc1d4b5..73e3880fa57b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>  
>      assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
>  
> -    /* must be before kvm_put_nested_state so that EFER.SVME is set */
> +    /*
> +     * When resetting a vCPU, make sure to reset nested state first to
> +     * e.g clear VMXON state and unlock certain CR4 bits.
> +     */
> +    if (level == KVM_PUT_RESET_STATE) {
> +        ret = kvm_put_nested_state(x86_cpu);
> +        if (ret < 0) {
> +            return ret;
> +        }

I should have mentioned this, I actually already debugged the same issue while
trying to reproduce the smm int window bug.
100% my fault.

I also share the same feeling that this might be yet another 'whack a mole' and
break somewhere else, but overall it does make sense.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


Best regards,
	Maxim Levitsky

> +    }
> +
>      ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
>      if (ret < 0) {
>          return ret;
>      }
>  
> -    if (level >= KVM_PUT_RESET_STATE) {
> +    /*
> +     * When putting full CPU state, kvm_put_nested_state() must happen after
> +     * kvm_put_sregs{,2} so that e.g. EFER.SVME is already set.
> +     */
> +    if (level == KVM_PUT_FULL_STATE) {
>          ret = kvm_put_nested_state(x86_cpu);
>          if (ret < 0) {
>              return ret;
>          }
> +    }
>  
> +    if (level >= KVM_PUT_RESET_STATE) {
>          ret = kvm_put_msr_feature_control(x86_cpu);
>          if (ret < 0) {
>              return ret;




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset
  2022-08-10 14:00 ` [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset Vitaly Kuznetsov
@ 2022-08-10 14:50   ` Maxim Levitsky
  0 siblings, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2022-08-10 14:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov, qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
> Make sure env->nested_state is cleaned up when a vCPU is reset, it may
> be stale after an incoming migration, kvm_arch_put_registers() may
> end up failing or putting vCPU in a weird state.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  target/i386/kvm/kvm.c | 37 +++++++++++++++++++++++++++----------
>  1 file changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index f148a6d52fa4..4f8dacc1d4b5 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1695,6 +1695,30 @@ static void kvm_init_xsave(CPUX86State *env)
>             env->xsave_buf_len);
>  }
>  
> +static void kvm_init_nested_state(CPUX86State *env)
> +{
> +    struct kvm_vmx_nested_state_hdr *vmx_hdr;
> +    uint32_t size;
> +
> +    if (!env->nested_state) {
> +        return;
> +    }
> +
> +    size = env->nested_state->size;
> +
> +    memset(env->nested_state, 0, size);
> +    env->nested_state->size = size;
> +
> +    if (cpu_has_vmx(env)) {
> +        env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
> +        vmx_hdr = &env->nested_state->hdr.vmx;
> +        vmx_hdr->vmxon_pa = -1ull;
> +        vmx_hdr->vmcs12_pa = -1ull;
> +    } else if (cpu_has_svm(env)) {
> +        env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
> +    }
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2122,19 +2146,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
>          assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
>  
>          if (cpu_has_vmx(env) || cpu_has_svm(env)) {
> -            struct kvm_vmx_nested_state_hdr *vmx_hdr;
> -
>              env->nested_state = g_malloc0(max_nested_state_len);
>              env->nested_state->size = max_nested_state_len;
>  
> -            if (cpu_has_vmx(env)) {
> -                env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
> -                vmx_hdr = &env->nested_state->hdr.vmx;
> -                vmx_hdr->vmxon_pa = -1ull;
> -                vmx_hdr->vmcs12_pa = -1ull;
> -            } else {
> -                env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
> -            }
> +            kvm_init_nested_state(env);
>          }
>      }
>  
> @@ -2199,6 +2214,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
>      /* enabled by default */
>      env->poll_control_msr = 1;
>  
> +    kvm_init_nested_state(env);
> +
>      sev_es_set_reset_vector(CPU(cpu));
>  }
>  
Makes sense.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset
  2022-08-10 14:47   ` Maxim Levitsky
@ 2022-08-10 14:57     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-10 14:57 UTC (permalink / raw)
  To: Maxim Levitsky, qemu-devel, Paolo Bonzini; +Cc: Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
>> Setting nested state upon migration needs to happen after kvm_put_sregs2()
>> to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
>> when vCPU is in VMX root operation, certain CR bits are locked and
>> kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
>> vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
>> kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
>> that vCPU is *not* in VMX root opertaion.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>> 
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 4f8dacc1d4b5..73e3880fa57b 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>>  
>>      assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
>>  
>> -    /* must be before kvm_put_nested_state so that EFER.SVME is set */
>> +    /*
>> +     * When resetting a vCPU, make sure to reset nested state first to
>> +     * e.g clear VMXON state and unlock certain CR4 bits.
>> +     */
>> +    if (level == KVM_PUT_RESET_STATE) {
>> +        ret = kvm_put_nested_state(x86_cpu);
>> +        if (ret < 0) {
>> +            return ret;
>> +        }
>
> I should have mentioned this, I actually already debugged the same issue while
> trying to reproduce the smm int window bug.
> 100% my fault.
>
> I also share the same feeling that this might be yet another 'whack a mole' and
> break somewhere else, but overall it does make sense.

This certainly *is* a 'whack a mole' and I'm sure there are other cases
when one of calls in kvm_arch_put_registers() fails. We need to work on
what's missing so we can expose kvm_vcpu_reset() to VMMs.

>
>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>

Thanks!

-- 
Vitaly



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-08-10 14:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-10 14:00 [PATCH RFC v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation Vitaly Kuznetsov
2022-08-10 14:00 ` [PATCH RFC v1 1/2] i386: reset KVM nested state upon CPU reset Vitaly Kuznetsov
2022-08-10 14:50   ` Maxim Levitsky
2022-08-10 14:00 ` [PATCH RFC v1 2/2] i386: reorder kvm_put_sregs2() and kvm_put_nested_state() when vCPU is reset Vitaly Kuznetsov
2022-08-10 14:47   ` Maxim Levitsky
2022-08-10 14:57     ` Vitaly Kuznetsov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.