* [PATCH 0/4] target/i386: kvm: Various nested-state fixes
@ 2019-07-05 21:06 ` Liran Alon
  0 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-05 21:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm

Hi,

This series is a bunch of small fixes to the recently added QEMU nested-state
migration support.

The 1st and 2nd patches can be considered trivial refactoring patches.

The 3rd patch fixes a bug where VMX nested-state was required to be saved
even when it is not needed.

The 4th patch removes the migration blocker added when a vCPU is exposed
with VMX, and instead demands the nested migration kernel capabilities only
when the vCPU may have enabled VMX. This allows for better
backwards-compatible migration scenarios. For more info, refer to the
relevant commit message.

Thanks,
-Liran


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX
  2019-07-05 21:06 ` [Qemu-devel] " Liran Alon
@ 2019-07-05 21:06   ` Liran Alon
  -1 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-05 21:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm, Liran Alon, Joao Martins

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 target/i386/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index e4b4f5756a34..b57f873ec9e8 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1714,7 +1714,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
         env->nested_state->size = max_nested_state_len;
 
-        if (IS_INTEL_CPU(env)) {
+        if (cpu_has_vmx(env)) {
             struct kvm_vmx_nested_state_hdr *vmx_hdr =
                 &env->nested_state->hdr.vmx;
 
-- 
2.20.1
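
For reference, cpu_has_vmx() is a CPUID feature-bit check defined in
target/i386/cpu.h (its definition is visible in the context of patch 2):

static inline bool cpu_has_vmx(CPUX86State *env)
{
    /* VMX is advertised to the guest via CPUID.1:ECX bit 5. */
    return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
}

Unlike IS_INTEL_CPU(), which only checks the CPU vendor, this reflects
whether VMX is actually exposed to the guest vCPU.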


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/4] target/i386: kvm: Init nested-state for vCPU exposed with SVM
  2019-07-05 21:06 ` [Qemu-devel] " Liran Alon
@ 2019-07-05 21:06   ` Liran Alon
  -1 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-05 21:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm, Liran Alon, Joao Martins

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 target/i386/cpu.h | 5 +++++
 target/i386/kvm.c | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 93345792f4cb..cdb0e43676a9 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1867,6 +1867,11 @@ static inline bool cpu_has_vmx(CPUX86State *env)
     return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
 }
 
+static inline bool cpu_has_svm(CPUX86State *env)
+{
+    return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
+}
+
 /* fpu_helper.c */
 void update_fp_status(CPUX86State *env);
 void update_mxcsr_status(CPUX86State *env);
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b57f873ec9e8..4e2c8652168f 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1721,6 +1721,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
             env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
             vmx_hdr->vmxon_pa = -1ull;
             vmx_hdr->vmcs12_pa = -1ull;
+        } else if (cpu_has_svm(env)) {
+            env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
         }
     }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/4] target/i386: kvm: Save nested-state only in case vCPU have set VMXON region
  2019-07-05 21:06 ` [Qemu-devel] " Liran Alon
@ 2019-07-05 21:06   ` Liran Alon
  -1 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-05 21:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm, Liran Alon, Joao Martins

Having (nested_state->hdr.vmx.vmxon_pa != -1ull) signals that vCPU have set
at some point in time a VMXON region. Note that even though when vCPU enters
SMM mode it temporarily exit VMX operation, KVM still reports (vmxon_pa != -1ull).
Therefore, this field can be used as a reliable indicator on when we require to
send VMX nested-state as part of migration stream.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 target/i386/machine.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/i386/machine.c b/target/i386/machine.c
index 851b249d1a39..20bda9f80154 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -997,9 +997,8 @@ static bool vmx_nested_state_needed(void *opaque)
 {
     struct kvm_nested_state *nested_state = opaque;
 
-    return ((nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) &&
-            ((nested_state->hdr.vmx.vmxon_pa != -1ull) ||
-             (nested_state->hdr.vmx.smm.flags & KVM_STATE_NESTED_SMM_VMXON)));
+    return (nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) &&
+           (nested_state->hdr.vmx.vmxon_pa != -1ull);
 }
 
 static const VMStateDescription vmstate_vmx_nested_state = {
-- 
2.20.1
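
For context, vmx_nested_state_needed() is used as the .needed callback of
the VMX nested-state vmstate subsection, so returning false here is what
causes the subsection to be skipped in the migration stream. A rough sketch
of how such a subsection is wired up (the field list is elided and the name
is illustrative; the real definition is in target/i386/machine.c):

static const VMStateDescription vmstate_vmx_nested_state = {
    .name = "cpu/kvm_nested_state/vmx",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = vmx_nested_state_needed,
    .fields = (VMStateField[]) {
        /* VMX header fields (vmxon_pa, vmcs12_pa, SMM flags) and the
         * opaque nested-state data blob go here. */
        VMSTATE_END_OF_LIST()
    }
};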


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
  2019-07-05 21:06 ` [Qemu-devel] " Liran Alon
@ 2019-07-05 21:06   ` Liran Alon
  -1 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-05 21:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm, Liran Alon, Joao Martins

Previous to this change, a vCPU exposed with VMX running on a kernel without KVM_CAP_NESTED_STATE
or KVM_CAP_EXCEPTION_PAYLOAD resulted in adding a migration blocker. This was because when code
was written it was thought there is no way to reliabely know if a vCPU is utilising VMX or not
at runtime. However, it turns out that this can be known to some extent:

In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Since it was set, CR4.VMXE must remain set as long as vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to above statement when vCPU enters SMM mode.
When a vCPU enters SMM mode, it temporarily exit VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to it's original value of being set.
Therefore, when vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain but assume the worse that vCPU may utilise VMX.

Summaring all the above, a vCPU may have enabled VMX in case
CR4.VMXE is set or vCPU is in SMM mode.

Therefore, remove migration blocker and check before migration (cpu_pre_save())
if vCPU may have enabled VMX. If true, only then require relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when vCPU is in guest-mode and
there is a pending/injected exception. Otherwise, this kernel capability is
not required for proper migration.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 target/i386/cpu.h      | 22 ++++++++++++++++++++++
 target/i386/kvm.c      | 26 ++++++--------------------
 target/i386/kvm_i386.h |  1 +
 target/i386/machine.c  | 24 ++++++++++++++++++++----
 4 files changed, 49 insertions(+), 24 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index cdb0e43676a9..c752c4d936ee 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1872,6 +1872,28 @@ static inline bool cpu_has_svm(CPUX86State *env)
     return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
 }
 
+/*
+ * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
+ * Since it was set, CR4.VMXE must remain set as long as vCPU is in
+ * VMX operation. This is because CR4.VMXE is one of the bits set
+ * in MSR_IA32_VMX_CR4_FIXED1.
+ *
+ * There is one exception to above statement when vCPU enters SMM mode.
+ * When a vCPU enters SMM mode, it temporarily exit VMX operation and
+ * may also reset CR4.VMXE during execution in SMM mode.
+ * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
+ * and CR4.VMXE is restored to it's original value of being set.
+ *
+ * Therefore, when vCPU is not in SMM mode, we can infer whether
+ * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
+ * know for certain.
+ */
+static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
+{
+    return cpu_has_vmx(env) &&
+           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
+}
+
 /* fpu_helper.c */
 void update_fp_status(CPUX86State *env);
 void update_mxcsr_status(CPUX86State *env);
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 4e2c8652168f..d3af445eeb5d 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -128,6 +128,11 @@ bool kvm_has_adjust_clock_stable(void)
     return (ret == KVM_CLOCK_TSC_STABLE);
 }
 
+bool kvm_has_exception_payload(void)
+{
+    return has_exception_payload;
+}
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -1341,7 +1346,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 }
 
 static Error *invtsc_mig_blocker;
-static Error *nested_virt_mig_blocker;
 
 #define KVM_MAX_CPUID_ENTRIES  100
 
@@ -1640,22 +1644,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                   !!(c->ecx & CPUID_EXT_SMX);
     }
 
-    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
-        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
-        error_setg(&nested_virt_mig_blocker,
-                   "Kernel do not provide required capabilities for "
-                   "nested virtualization migration. "
-                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
-                   kvm_max_nested_state_length() > 0,
-                   has_exception_payload);
-        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-            error_free(nested_virt_mig_blocker);
-            return r;
-        }
-    }
-
     if (env->mcg_cap & MCG_LMCE_P) {
         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
     }
@@ -1670,7 +1658,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             if (local_err) {
                 error_report_err(local_err);
                 error_free(invtsc_mig_blocker);
-                goto fail2;
+                return r;
             }
         }
     }
@@ -1741,8 +1729,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
  fail:
     migrate_del_blocker(invtsc_mig_blocker);
- fail2:
-    migrate_del_blocker(nested_virt_mig_blocker);
 
     return r;
 }
diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
index 3057ba4f7d19..06fe06bdb3d6 100644
--- a/target/i386/kvm_i386.h
+++ b/target/i386/kvm_i386.h
@@ -35,6 +35,7 @@
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
 bool kvm_has_adjust_clock_stable(void);
+bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 20bda9f80154..c04021937722 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,6 +7,7 @@
 #include "hw/isa/isa.h"
 #include "migration/cpu.h"
 #include "hyperv.h"
+#include "kvm_i386.h"
 
 #include "sysemu/kvm.h"
 #include "sysemu/tcg.h"
@@ -232,10 +233,25 @@ static int cpu_pre_save(void *opaque)
     }
 
 #ifdef CONFIG_KVM
-    /* Verify we have nested virtualization state from kernel if required */
-    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
-        error_report("Guest enabled nested virtualization but kernel "
-                "does not support saving of nested state");
+    /*
+     * In case vCPU may have enabled VMX, we need to make sure kernel have
+     * required capabilities in order to perform migration correctly:
+     *
+     * 1) We must be able to extract vCPU nested-state from KVM.
+     *
+     * 2) In case vCPU is running in guest-mode and it has a pending exception,
+     * we must be able to determine if it's in a pending or injected state.
+     * Note that in case KVM don't have required capability to do so,
+     * a pending/injected exception will always appear as an
+     * injected exception.
+     */
+    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
+        (!env->nested_state ||
+         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
+          env->exception_injected))) {
+        error_report("Guest maybe enabled nested virtualization but kernel "
+                "does not support required capabilities to save vCPU "
+                "nested state");
         return -EINVAL;
     }
 #endif
-- 
2.20.1
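
The new kvm_has_exception_payload() accessor simply exposes the existing
has_exception_payload flag from kvm.c. As a reminder of where that flag
comes from, it is typically detected and enabled during KVM initialization
along these lines (a simplified sketch, not part of this patch; the exact
kvm_arch_init() code may differ):

    has_exception_payload = kvm_check_extension(s, KVM_CAP_EXCEPTION_PAYLOAD);
    if (has_exception_payload) {
        /* Ask KVM to report exceptions split into pending/injected states. */
        ret = kvm_vm_enable_cap(s, KVM_CAP_EXCEPTION_PAYLOAD, 0, true);
        if (ret < 0) {
            error_report("kvm: Failed to enable exception payload cap: %s",
                         strerror(-ret));
            return ret;
        }
    }

With this in place, cpu_pre_save() above refuses migration only when the
missing capability actually matters: the vCPU is in guest-mode and has a
pending/injected exception.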


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
@ 2019-07-09 17:21   ` Maran Wilson
  -1 siblings, 0 replies; 24+ messages in thread
From: Maran Wilson @ 2019-07-09 17:21 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: pbonzini, Joao Martins, ehabkost, kvm

On 7/5/2019 2:06 PM, Liran Alon wrote:
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>   target/i386/kvm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index e4b4f5756a34..b57f873ec9e8 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1714,7 +1714,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>   
>           env->nested_state->size = max_nested_state_len;
>   
> -        if (IS_INTEL_CPU(env)) {
> +        if (cpu_has_vmx(env)) {
>               struct kvm_vmx_nested_state_hdr *vmx_hdr =
>                   &env->nested_state->hdr.vmx;
>   

Reviewed-by: Maran Wilson <maran.wilson@oracle.com>

Thanks,
-Maran

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] target/i386: kvm: Init nested-state for vCPU exposed with SVM
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
@ 2019-07-09 17:21   ` Maran Wilson
  -1 siblings, 0 replies; 24+ messages in thread
From: Maran Wilson @ 2019-07-09 17:21 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: pbonzini, Joao Martins, ehabkost, kvm

On 7/5/2019 2:06 PM, Liran Alon wrote:
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>   target/i386/cpu.h | 5 +++++
>   target/i386/kvm.c | 2 ++
>   2 files changed, 7 insertions(+)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 93345792f4cb..cdb0e43676a9 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1867,6 +1867,11 @@ static inline bool cpu_has_vmx(CPUX86State *env)
>       return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
>   }
>   
> +static inline bool cpu_has_svm(CPUX86State *env)
> +{
> +    return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
> +}
> +
>   /* fpu_helper.c */
>   void update_fp_status(CPUX86State *env);
>   void update_mxcsr_status(CPUX86State *env);
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index b57f873ec9e8..4e2c8652168f 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1721,6 +1721,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>               env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
>               vmx_hdr->vmxon_pa = -1ull;
>               vmx_hdr->vmcs12_pa = -1ull;
> +        } else if (cpu_has_svm(env)) {
> +            env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
>           }
>       }
>   

Reviewed-by: Maran Wilson <maran.wilson@oracle.com>

Thanks,
-Maran

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] target/i386: kvm: Save nested-state only in case vCPU have set VMXON region
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
@ 2019-07-09 17:21   ` Maran Wilson
  -1 siblings, 0 replies; 24+ messages in thread
From: Maran Wilson @ 2019-07-09 17:21 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: pbonzini, Joao Martins, ehabkost, kvm

On 7/5/2019 2:06 PM, Liran Alon wrote:
> Having (nested_state->hdr.vmx.vmxon_pa != -1ull) signals that vCPU have set
> at some point in time a VMXON region. Note that even though when vCPU enters
> SMM mode it temporarily exit VMX operation, KVM still reports (vmxon_pa != -1ull).
> Therefore, this field can be used as a reliable indicator on when we require to

s/on when we require/for when we are required/

Also, note that your commit message lines are longer than 76
characters (longer than 80, for that matter).

But aside from those nits:

Reviewed-by: Maran Wilson <maran.wilson@oracle.com>

Thanks,
-Maran

> send VMX nested-state as part of migration stream.
>
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>   target/i386/machine.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index 851b249d1a39..20bda9f80154 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -997,9 +997,8 @@ static bool vmx_nested_state_needed(void *opaque)
>   {
>       struct kvm_nested_state *nested_state = opaque;
>   
> -    return ((nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) &&
> -            ((nested_state->hdr.vmx.vmxon_pa != -1ull) ||
> -             (nested_state->hdr.vmx.smm.flags & KVM_STATE_NESTED_SMM_VMXON)));
> +    return (nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) &&
> +           (nested_state->hdr.vmx.vmxon_pa != -1ull);
>   }
>   
>   static const VMStateDescription vmstate_vmx_nested_state = {


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
@ 2019-07-09 17:21   ` Maran Wilson
  -1 siblings, 0 replies; 24+ messages in thread
From: Maran Wilson @ 2019-07-09 17:21 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: pbonzini, Joao Martins, ehabkost, kvm

Just a bunch of grammar and style nits below. As for the actual code 
changes:

Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>

Thanks,
-Maran

On 7/5/2019 2:06 PM, Liran Alon wrote:
> Previous to this change, a vCPU exposed with VMX running on a kernel without KVM_CAP_NESTED_STATE
> or KVM_CAP_EXCEPTION_PAYLOAD resulted in adding a migration blocker. This was because when code

s/code/the code/

Also, note that you have commit message lines that are much longer than 
76 chars here.

> was written it was thought there is no way to reliabely know if a vCPU is utilising VMX or not
> at runtime. However, it turns out that this can be known to some extent:
>
> In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> Since it was set, CR4.VMXE must remain set as long as vCPU is in

s/vCPU/the vCPU/

> VMX operation. This is because CR4.VMXE is one of the bits set
> in MSR_IA32_VMX_CR4_FIXED1.
> There is one exception to above statement when vCPU enters SMM mode.

s/above/the above/

> When a vCPU enters SMM mode, it temporarily exit VMX operation and

s/exit/exits/

> may also reset CR4.VMXE during execution in SMM mode.
> When vCPU exits SMM mode, vCPU state is restored to be in VMX operation

s/vCPU exits SMM mode, vCPU/the vCPU exits SMM mode, its/

> and CR4.VMXE is restored to it's original value of being set.

s/it's/its/

> Therefore, when vCPU is not in SMM mode, we can infer whether

s/vCPU/the vCPU/

> VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> know for certain but assume the worse that vCPU may utilise VMX.
>
> Summaring all the above, a vCPU may have enabled VMX in case
> CR4.VMXE is set or vCPU is in SMM mode.
>
> Therefore, remove migration blocker and check before migration (cpu_pre_save())
> if vCPU may have enabled VMX. If true, only then require relevant kernel capabilities.
>
> While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when vCPU is in guest-mode and

s/vCPU/the vCPU/

> there is a pending/injected exception. Otherwise, this kernel capability is
> not required for proper migration.
>
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>   target/i386/cpu.h      | 22 ++++++++++++++++++++++
>   target/i386/kvm.c      | 26 ++++++--------------------
>   target/i386/kvm_i386.h |  1 +
>   target/i386/machine.c  | 24 ++++++++++++++++++++----
>   4 files changed, 49 insertions(+), 24 deletions(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index cdb0e43676a9..c752c4d936ee 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1872,6 +1872,28 @@ static inline bool cpu_has_svm(CPUX86State *env)
>       return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
>   }
>   
> +/*
> + * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> + * Since it was set, CR4.VMXE must remain set as long as vCPU is in

s/vCPU/the vCPU/

> + * VMX operation. This is because CR4.VMXE is one of the bits set
> + * in MSR_IA32_VMX_CR4_FIXED1.
> + *
> + * There is one exception to above statement when vCPU enters SMM mode.
> + * When a vCPU enters SMM mode, it temporarily exit VMX operation and

instead of:
    statement when vCPU enters SMM mode. When a vCPU enters SMM mode, it 
temporarily exit
use:
    statement: When a vCPU enters SMM mode, it temporarily exits

> + * may also reset CR4.VMXE during execution in SMM mode.
> + * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation

s/vCPU exits SMM mode, vCPU/the vCPU exits SMM mode, its/


> + * and CR4.VMXE is restored to it's original value of being set.

s/it's/its/

> + *
> + * Therefore, when vCPU is not in SMM mode, we can infer whether

s/vCPU/the vCPU/

> + * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> + * know for certain.
> + */
> +static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
> +{
> +    return cpu_has_vmx(env) &&
> +           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
> +}
> +
>   /* fpu_helper.c */
>   void update_fp_status(CPUX86State *env);
>   void update_mxcsr_status(CPUX86State *env);
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 4e2c8652168f..d3af445eeb5d 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -128,6 +128,11 @@ bool kvm_has_adjust_clock_stable(void)
>       return (ret == KVM_CLOCK_TSC_STABLE);
>   }
>   
> +bool kvm_has_exception_payload(void)
> +{
> +    return has_exception_payload;
> +}
> +
>   bool kvm_allows_irq0_override(void)
>   {
>       return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
> @@ -1341,7 +1346,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>   }
>   
>   static Error *invtsc_mig_blocker;
> -static Error *nested_virt_mig_blocker;
>   
>   #define KVM_MAX_CPUID_ENTRIES  100
>   
> @@ -1640,22 +1644,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>                                     !!(c->ecx & CPUID_EXT_SMX);
>       }
>   
> -    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
> -        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
> -        error_setg(&nested_virt_mig_blocker,
> -                   "Kernel do not provide required capabilities for "
> -                   "nested virtualization migration. "
> -                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
> -                   kvm_max_nested_state_length() > 0,
> -                   has_exception_payload);
> -        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
> -        if (local_err) {
> -            error_report_err(local_err);
> -            error_free(nested_virt_mig_blocker);
> -            return r;
> -        }
> -    }
> -
>       if (env->mcg_cap & MCG_LMCE_P) {
>           has_msr_mcg_ext_ctl = has_msr_feature_control = true;
>       }
> @@ -1670,7 +1658,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>               if (local_err) {
>                   error_report_err(local_err);
>                   error_free(invtsc_mig_blocker);
> -                goto fail2;
> +                return r;
>               }
>           }
>       }
> @@ -1741,8 +1729,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>   
>    fail:
>       migrate_del_blocker(invtsc_mig_blocker);
> - fail2:
> -    migrate_del_blocker(nested_virt_mig_blocker);
>   
>       return r;
>   }
> diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
> index 3057ba4f7d19..06fe06bdb3d6 100644
> --- a/target/i386/kvm_i386.h
> +++ b/target/i386/kvm_i386.h
> @@ -35,6 +35,7 @@
>   bool kvm_allows_irq0_override(void);
>   bool kvm_has_smm(void);
>   bool kvm_has_adjust_clock_stable(void);
> +bool kvm_has_exception_payload(void);
>   void kvm_synchronize_all_tsc(void);
>   void kvm_arch_reset_vcpu(X86CPU *cs);
>   void kvm_arch_do_init_vcpu(X86CPU *cs);
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index 20bda9f80154..c04021937722 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -7,6 +7,7 @@
>   #include "hw/isa/isa.h"
>   #include "migration/cpu.h"
>   #include "hyperv.h"
> +#include "kvm_i386.h"
>   
>   #include "sysemu/kvm.h"
>   #include "sysemu/tcg.h"
> @@ -232,10 +233,25 @@ static int cpu_pre_save(void *opaque)
>       }
>   
>   #ifdef CONFIG_KVM
> -    /* Verify we have nested virtualization state from kernel if required */
> -    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
> -        error_report("Guest enabled nested virtualization but kernel "
> -                "does not support saving of nested state");
> +    /*
> +     * In case vCPU may have enabled VMX, we need to make sure kernel have
> +     * required capabilities in order to perform migration correctly:
> +     *
> +     * 1) We must be able to extract vCPU nested-state from KVM.
> +     *
> +     * 2) In case vCPU is running in guest-mode and it has a pending exception,
> +     * we must be able to determine if it's in a pending or injected state.
> +     * Note that in case KVM don't have required capability to do so,
> +     * a pending/injected exception will always appear as an
> +     * injected exception.
> +     */
> +    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
> +        (!env->nested_state ||
> +         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
> +          env->exception_injected))) {
> +        error_report("Guest maybe enabled nested virtualization but kernel "
> +                "does not support required capabilities to save vCPU "
> +                "nested state");
>           return -EINVAL;
>       }
>   #endif


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] target/i386: kvm: Init nested-state for vCPU exposed with SVM
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
  (?)
@ 2019-07-11 13:35   ` Paolo Bonzini
  -1 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2019-07-11 13:35 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: Joao Martins, ehabkost, kvm

On 05/07/19 23:06, Liran Alon wrote:
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>  target/i386/cpu.h | 5 +++++
>  target/i386/kvm.c | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 93345792f4cb..cdb0e43676a9 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1867,6 +1867,11 @@ static inline bool cpu_has_vmx(CPUX86State *env)
>      return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
>  }
>  
> +static inline bool cpu_has_svm(CPUX86State *env)
> +{
> +    return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
> +}
> +
>  /* fpu_helper.c */
>  void update_fp_status(CPUX86State *env);
>  void update_mxcsr_status(CPUX86State *env);
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index b57f873ec9e8..4e2c8652168f 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1721,6 +1721,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>              env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
>              vmx_hdr->vmxon_pa = -1ull;
>              vmx_hdr->vmcs12_pa = -1ull;
> +        } else if (cpu_has_svm(env)) {
> +            env->nested_state->format = KVM_STATE_NESTED_FORMAT_SVM;
>          }
>      }
>  
> 

I'm not sure about it.  We have no idea what the format will be, so we
shouldn't set the format carelessly.

Paolo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
  (?)
  (?)
@ 2019-07-11 13:45   ` Paolo Bonzini
  2019-07-11 14:36       ` Liran Alon
  -1 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2019-07-11 13:45 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: Joao Martins, ehabkost, kvm

On 05/07/19 23:06, Liran Alon wrote:
> -        if (IS_INTEL_CPU(env)) {
> +        if (cpu_has_vmx(env)) {
>              struct kvm_vmx_nested_state_hdr *vmx_hdr =
>                  &env->nested_state->hdr.vmx;
>  

I am not sure this is enough, because kvm_get_nested_state and kvm_put_nested_state would run anyway later.  If we want to cull them completely for a non-VMX virtual machine, I'd do something like this:

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 5035092..73ab102 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1748,14 +1748,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
     max_nested_state_len = kvm_max_nested_state_length();
     if (max_nested_state_len > 0) {
         assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
-        env->nested_state = g_malloc0(max_nested_state_len);
 
-        env->nested_state->size = max_nested_state_len;
-
-        if (IS_INTEL_CPU(env)) {
+        if (cpu_has_vmx(env)) {
             struct kvm_vmx_nested_state_hdr *vmx_hdr =
                 &env->nested_state->hdr.vmx;
 
+            env->nested_state = g_malloc0(max_nested_state_len);
+            env->nested_state->size = max_nested_state_len;
             env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
             vmx_hdr->vmxon_pa = -1ull;
             vmx_hdr->vmcs12_pa = -1ull;
@@ -3682,7 +3681,7 @@ static int kvm_put_nested_state(X86CPU *cpu)
     CPUX86State *env = &cpu->env;
     int max_nested_state_len = kvm_max_nested_state_length();
 
-    if (max_nested_state_len <= 0) {
+    if (!env->nested_state) {
         return 0;
     }
 
@@ -3696,7 +3695,7 @@ static int kvm_get_nested_state(X86CPU *cpu)
     int max_nested_state_len = kvm_max_nested_state_length();
     int ret;
 
-    if (max_nested_state_len <= 0) {
+    if (!env->nested_state) {
         return 0;
     }
 

What do you think?  (As a side effect, this completely disables
KVM_GET/SET_NESTED_STATE on SVM, which I think is safer since it
will have to save at least the NPT root and the paging mode.  So we
could remove vmstate_svm_nested_state as well).

Paolo

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX
  2019-07-11 13:45   ` Paolo Bonzini
@ 2019-07-11 14:36       ` Liran Alon
  0 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-11 14:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Joao Martins, ehabkost, kvm



> On 11 Jul 2019, at 16:45, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 05/07/19 23:06, Liran Alon wrote:
>> -        if (IS_INTEL_CPU(env)) {
>> +        if (cpu_has_vmx(env)) {
>>             struct kvm_vmx_nested_state_hdr *vmx_hdr =
>>                 &env->nested_state->hdr.vmx;
>> 
> 
> I am not sure this is enough, because kvm_get_nested_state and kvm_put_nested_state would run anyway later.  If we want to cull them completely for a non-VMX virtual machine, I'd do something like this:
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 5035092..73ab102 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1748,14 +1748,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
>     max_nested_state_len = kvm_max_nested_state_length();
>     if (max_nested_state_len > 0) {
>         assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
> -        env->nested_state = g_malloc0(max_nested_state_len);
> 
> -        env->nested_state->size = max_nested_state_len;
> -
> -        if (IS_INTEL_CPU(env)) {
> +        if (cpu_has_vmx(env)) {
>             struct kvm_vmx_nested_state_hdr *vmx_hdr =
>                 &env->nested_state->hdr.vmx;
> 
> +            env->nested_state = g_malloc0(max_nested_state_len);
> +            env->nested_state->size = max_nested_state_len;
>             env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
>             vmx_hdr->vmxon_pa = -1ull;
>             vmx_hdr->vmcs12_pa = -1ull;
> @@ -3682,7 +3681,7 @@ static int kvm_put_nested_state(X86CPU *cpu)
>     CPUX86State *env = &cpu->env;
>     int max_nested_state_len = kvm_max_nested_state_length();
> 
> -    if (max_nested_state_len <= 0) {
> +    if (!env->nested_state) {
>         return 0;
>     }
> 
> @@ -3696,7 +3695,7 @@ static int kvm_get_nested_state(X86CPU *cpu)
>     int max_nested_state_len = kvm_max_nested_state_length();
>     int ret;
> 
> -    if (max_nested_state_len <= 0) {
> +    if (!env->nested_state) {
>         return 0;
>     }
> 
> 
> What do you think?  (As a side effect, this completely disables
> KVM_GET/SET_NESTED_STATE on SVM, which I think is safer since it
> will have to save at least the NPT root and the paging mode.  So we
> could remove vmstate_svm_nested_state as well).
> 
> Paolo

I like your suggestion better than my commit. It is indeed more elegant and correct. :)
The code change above looks good to me as nested_state_needed() will return false anyway if env->nested_state is false.
Will you submit a new patch or should I?

-Liran
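
For clarity, the point Liran makes is that the top-level nested-state
vmstate subsection has a .needed predicate that bails out when no
nested-state buffer was allocated. A minimal sketch of such a predicate
(illustrative only, not the exact QEMU code):

static bool nested_state_needed(void *opaque)
{
    X86CPU *cpu = opaque;
    CPUX86State *env = &cpu->env;

    /* Nothing was allocated for non-VMX guests, so there is nothing to migrate. */
    return env->nested_state != NULL;
}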



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX
  2019-07-11 14:36       ` Liran Alon
@ 2019-07-11 17:27         ` Paolo Bonzini
  -1 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2019-07-11 17:27 UTC (permalink / raw)
  To: Liran Alon; +Cc: qemu-devel, Joao Martins, ehabkost, kvm

On 11/07/19 16:36, Liran Alon wrote:
> Will you submit a new patch or should I?

I've just sent it, I was waiting for you to comment on the idea.  I
forgot to CC you though.

Paolo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
  2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
@ 2019-07-15  9:20     ` Liran Alon
  -1 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-15  9:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, ehabkost, kvm, Joao Martins

Gentle ping.

Should this be considered for merging into QEMU even though QEMU is now in hard freeze?
It touches a mechanism which is already merged but too restrictive.

Anyway, I would like this to be reviewed for early feedback even if its merge is delayed.

Thanks,
-Liran

> On 6 Jul 2019, at 0:06, Liran Alon <liran.alon@oracle.com> wrote:
> 
> Previous to this change, a vCPU exposed with VMX running on a kernel without KVM_CAP_NESTED_STATE
> or KVM_CAP_EXCEPTION_PAYLOAD resulted in adding a migration blocker. This was because when code
> was written it was thought there is no way to reliabely know if a vCPU is utilising VMX or not
> at runtime. However, it turns out that this can be known to some extent:
> 
> In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> Since it was set, CR4.VMXE must remain set as long as vCPU is in
> VMX operation. This is because CR4.VMXE is one of the bits set
> in MSR_IA32_VMX_CR4_FIXED1.
> There is one exception to above statement when vCPU enters SMM mode.
> When a vCPU enters SMM mode, it temporarily exit VMX operation and
> may also reset CR4.VMXE during execution in SMM mode.
> When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
> and CR4.VMXE is restored to it's original value of being set.
> Therefore, when vCPU is not in SMM mode, we can infer whether
> VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> know for certain but assume the worse that vCPU may utilise VMX.
> 
> Summaring all the above, a vCPU may have enabled VMX in case
> CR4.VMXE is set or vCPU is in SMM mode.
> 
> Therefore, remove migration blocker and check before migration (cpu_pre_save())
> if vCPU may have enabled VMX. If true, only then require relevant kernel capabilities.
> 
> While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when vCPU is in guest-mode and
> there is a pending/injected exception. Otherwise, this kernel capability is
> not required for proper migration.
> 
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
> target/i386/cpu.h      | 22 ++++++++++++++++++++++
> target/i386/kvm.c      | 26 ++++++--------------------
> target/i386/kvm_i386.h |  1 +
> target/i386/machine.c  | 24 ++++++++++++++++++++----
> 4 files changed, 49 insertions(+), 24 deletions(-)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index cdb0e43676a9..c752c4d936ee 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1872,6 +1872,28 @@ static inline bool cpu_has_svm(CPUX86State *env)
>     return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
> }
> 
> +/*
> + * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> + * Since it was set, CR4.VMXE must remain set as long as vCPU is in
> + * VMX operation. This is because CR4.VMXE is one of the bits set
> + * in MSR_IA32_VMX_CR4_FIXED1.
> + *
> + * There is one exception to above statement when vCPU enters SMM mode.
> + * When a vCPU enters SMM mode, it temporarily exit VMX operation and
> + * may also reset CR4.VMXE during execution in SMM mode.
> + * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
> + * and CR4.VMXE is restored to it's original value of being set.
> + *
> + * Therefore, when vCPU is not in SMM mode, we can infer whether
> + * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> + * know for certain.
> + */
> +static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
> +{
> +    return cpu_has_vmx(env) &&
> +           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
> +}
> +
> /* fpu_helper.c */
> void update_fp_status(CPUX86State *env);
> void update_mxcsr_status(CPUX86State *env);
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 4e2c8652168f..d3af445eeb5d 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -128,6 +128,11 @@ bool kvm_has_adjust_clock_stable(void)
>     return (ret == KVM_CLOCK_TSC_STABLE);
> }
> 
> +bool kvm_has_exception_payload(void)
> +{
> +    return has_exception_payload;
> +}
> +
> bool kvm_allows_irq0_override(void)
> {
>     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
> @@ -1341,7 +1346,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
> }
> 
> static Error *invtsc_mig_blocker;
> -static Error *nested_virt_mig_blocker;
> 
> #define KVM_MAX_CPUID_ENTRIES  100
> 
> @@ -1640,22 +1644,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>                                   !!(c->ecx & CPUID_EXT_SMX);
>     }
> 
> -    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
> -        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
> -        error_setg(&nested_virt_mig_blocker,
> -                   "Kernel do not provide required capabilities for "
> -                   "nested virtualization migration. "
> -                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
> -                   kvm_max_nested_state_length() > 0,
> -                   has_exception_payload);
> -        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
> -        if (local_err) {
> -            error_report_err(local_err);
> -            error_free(nested_virt_mig_blocker);
> -            return r;
> -        }
> -    }
> -
>     if (env->mcg_cap & MCG_LMCE_P) {
>         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
>     }
> @@ -1670,7 +1658,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>             if (local_err) {
>                 error_report_err(local_err);
>                 error_free(invtsc_mig_blocker);
> -                goto fail2;
> +                return r;
>             }
>         }
>     }
> @@ -1741,8 +1729,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
> 
>  fail:
>     migrate_del_blocker(invtsc_mig_blocker);
> - fail2:
> -    migrate_del_blocker(nested_virt_mig_blocker);
> 
>     return r;
> }
> diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
> index 3057ba4f7d19..06fe06bdb3d6 100644
> --- a/target/i386/kvm_i386.h
> +++ b/target/i386/kvm_i386.h
> @@ -35,6 +35,7 @@
> bool kvm_allows_irq0_override(void);
> bool kvm_has_smm(void);
> bool kvm_has_adjust_clock_stable(void);
> +bool kvm_has_exception_payload(void);
> void kvm_synchronize_all_tsc(void);
> void kvm_arch_reset_vcpu(X86CPU *cs);
> void kvm_arch_do_init_vcpu(X86CPU *cs);
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index 20bda9f80154..c04021937722 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -7,6 +7,7 @@
> #include "hw/isa/isa.h"
> #include "migration/cpu.h"
> #include "hyperv.h"
> +#include "kvm_i386.h"
> 
> #include "sysemu/kvm.h"
> #include "sysemu/tcg.h"
> @@ -232,10 +233,25 @@ static int cpu_pre_save(void *opaque)
>     }
> 
> #ifdef CONFIG_KVM
> -    /* Verify we have nested virtualization state from kernel if required */
> -    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
> -        error_report("Guest enabled nested virtualization but kernel "
> -                "does not support saving of nested state");
> +    /*
> +     * In case the vCPU may have enabled VMX, we need to make sure the kernel
> +     * has the required capabilities in order to perform migration correctly:
> +     *
> +     * 1) We must be able to extract vCPU nested-state from KVM.
> +     *
> +     * 2) In case the vCPU is running in guest-mode and has a pending exception,
> +     * we must be able to determine whether it is in a pending or injected state.
> +     * Note that if KVM does not have the required capability to do so,
> +     * a pending/injected exception will always appear as an
> +     * injected exception.
> +     */
> +    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
> +        (!env->nested_state ||
> +         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
> +          env->exception_injected))) {
> +        error_report("Guest may have enabled nested virtualization but kernel "
> +                "does not support required capabilities to save vCPU "
> +                "nested state");
>         return -EINVAL;
>     }
> #endif
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread
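
For readers who want to poke at the new gating logic outside of QEMU, below is a minimal standalone C sketch of it. It is not QEMU code: the vcpu_state struct, the HF_* placeholder bit values and the small main() harness are made up for illustration; only the shape of the two checks mirrors cpu_vmx_maybe_enabled() and the cpu_pre_save() condition added by the patch.

/*
 * Standalone sketch -- NOT QEMU code.  Types and bit positions other than
 * CR4.VMXE are simplified stand-ins used purely for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

#define CR4_VMXE_MASK  (1u << 13)  /* real CR4.VMXE bit */
#define HF_SMM_MASK    (1u << 0)   /* placeholder: vCPU is in SMM */
#define HF_GUEST_MASK  (1u << 1)   /* placeholder: vCPU is in guest-mode */

/* Hypothetical, trimmed-down stand-in for CPUX86State. */
struct vcpu_state {
    bool     has_vmx_cpuid;       /* CPUID exposes VMX to the guest */
    unsigned cr4;
    unsigned hflags;
    bool     nested_state_valid;  /* nested state was obtained from KVM */
    bool     exception_injected;
};

/* Mirrors cpu_vmx_maybe_enabled(): VMX may be in use if CR4.VMXE is set,
 * or if the vCPU is in SMM, where CR4.VMXE may be temporarily cleared. */
static bool vmx_maybe_enabled(const struct vcpu_state *s)
{
    return s->has_vmx_cpuid &&
           ((s->cr4 & CR4_VMXE_MASK) || (s->hflags & HF_SMM_MASK));
}

/* Mirrors the new cpu_pre_save() condition: fail migration only when VMX
 * may be enabled and the kernel lacks what is needed to save its state. */
static bool migration_allowed(const struct vcpu_state *s,
                              bool kern_nested_state,
                              bool kern_exception_payload)
{
    if (!vmx_maybe_enabled(s)) {
        return true;                      /* nothing nested to save */
    }
    if (!kern_nested_state || !s->nested_state_valid) {
        return false;                     /* cannot extract nested state */
    }
    if (!kern_exception_payload &&
        (s->hflags & HF_GUEST_MASK) && s->exception_injected) {
        return false;                     /* cannot tell pending vs. injected */
    }
    return true;
}

int main(void)
{
    struct vcpu_state s = { .has_vmx_cpuid = true };

    printf("VMX exposed, never enabled, old kernel: %s\n",
           migration_allowed(&s, false, false) ? "allowed" : "blocked");

    s.cr4 |= CR4_VMXE_MASK;
    printf("CR4.VMXE set, old kernel:               %s\n",
           migration_allowed(&s, false, false) ? "allowed" : "blocked");
    return 0;
}

The first case is exactly the scenario the old per-vCPU migration blocker got wrong: merely exposing VMX in CPUID no longer prevents migration on kernels without KVM_CAP_NESTED_STATE.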

* Re: [Qemu-devel] [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
@ 2019-07-15  9:20     ` Liran Alon
  0 siblings, 0 replies; 24+ messages in thread
From: Liran Alon @ 2019-07-15  9:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, Joao Martins, ehabkost, kvm

Gentle ping.

Should this be considered for merging into QEMU even though QEMU is now in hard freeze?
It touches a mechanism that is already merged but is too restrictive.

Anyway, I would like this to be reviewed for early feedback even if its merge is delayed.

Thanks,
-Liran

> On 6 Jul 2019, at 0:06, Liran Alon <liran.alon@oracle.com> wrote:
> 
> Prior to this change, a vCPU exposed with VMX running on a kernel without KVM_CAP_NESTED_STATE
> or KVM_CAP_EXCEPTION_PAYLOAD resulted in adding a migration blocker. This was because, when the
> code was written, it was thought there was no way to reliably know whether a vCPU is utilising
> VMX at runtime. However, it turns out that this can be known to some extent:
> 
> In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> Once set, CR4.VMXE must remain set as long as the vCPU is in
> VMX operation. This is because CR4.VMXE is one of the bits set
> in MSR_IA32_VMX_CR4_FIXED1.
> There is one exception to the above statement: SMM mode.
> When a vCPU enters SMM mode, it temporarily exits VMX operation and
> may also reset CR4.VMXE during execution in SMM mode.
> When the vCPU exits SMM mode, its state is restored to VMX operation
> and CR4.VMXE is restored to its original (set) value.
> Therefore, when the vCPU is not in SMM mode, we can infer whether
> VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> know for certain and must assume the worst: the vCPU may be utilising VMX.
> 
> Summarizing all the above, a vCPU may have enabled VMX if
> CR4.VMXE is set or the vCPU is in SMM mode.
> 
> Therefore, remove the migration blocker and instead check before migration (cpu_pre_save())
> whether the vCPU may have enabled VMX. Only if so, require the relevant kernel capabilities.
> 
> While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in guest-mode and
> there is a pending/injected exception. Otherwise, this kernel capability is
> not required for proper migration.
> 
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
> target/i386/cpu.h      | 22 ++++++++++++++++++++++
> target/i386/kvm.c      | 26 ++++++--------------------
> target/i386/kvm_i386.h |  1 +
> target/i386/machine.c  | 24 ++++++++++++++++++++----
> 4 files changed, 49 insertions(+), 24 deletions(-)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index cdb0e43676a9..c752c4d936ee 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1872,6 +1872,28 @@ static inline bool cpu_has_svm(CPUX86State *env)
>     return env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM;
> }
> 
> +/*
> + * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
> + * Once set, CR4.VMXE must remain set as long as the vCPU is in
> + * VMX operation. This is because CR4.VMXE is one of the bits set
> + * in MSR_IA32_VMX_CR4_FIXED1.
> + *
> + * There is one exception to the above statement: SMM mode.
> + * When a vCPU enters SMM mode, it temporarily exits VMX operation and
> + * may also reset CR4.VMXE during execution in SMM mode.
> + * When the vCPU exits SMM mode, its state is restored to VMX operation
> + * and CR4.VMXE is restored to its original (set) value.
> + *
> + * Therefore, when the vCPU is not in SMM mode, we can infer whether
> + * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
> + * know for certain.
> + */
> +static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
> +{
> +    return cpu_has_vmx(env) &&
> +           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
> +}
> +
> /* fpu_helper.c */
> void update_fp_status(CPUX86State *env);
> void update_mxcsr_status(CPUX86State *env);
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 4e2c8652168f..d3af445eeb5d 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -128,6 +128,11 @@ bool kvm_has_adjust_clock_stable(void)
>     return (ret == KVM_CLOCK_TSC_STABLE);
> }
> 
> +bool kvm_has_exception_payload(void)
> +{
> +    return has_exception_payload;
> +}
> +
> bool kvm_allows_irq0_override(void)
> {
>     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
> @@ -1341,7 +1346,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
> }
> 
> static Error *invtsc_mig_blocker;
> -static Error *nested_virt_mig_blocker;
> 
> #define KVM_MAX_CPUID_ENTRIES  100
> 
> @@ -1640,22 +1644,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>                                   !!(c->ecx & CPUID_EXT_SMX);
>     }
> 
> -    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
> -        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
> -        error_setg(&nested_virt_mig_blocker,
> -                   "Kernel do not provide required capabilities for "
> -                   "nested virtualization migration. "
> -                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
> -                   kvm_max_nested_state_length() > 0,
> -                   has_exception_payload);
> -        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
> -        if (local_err) {
> -            error_report_err(local_err);
> -            error_free(nested_virt_mig_blocker);
> -            return r;
> -        }
> -    }
> -
>     if (env->mcg_cap & MCG_LMCE_P) {
>         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
>     }
> @@ -1670,7 +1658,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>             if (local_err) {
>                 error_report_err(local_err);
>                 error_free(invtsc_mig_blocker);
> -                goto fail2;
> +                return r;
>             }
>         }
>     }
> @@ -1741,8 +1729,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
> 
>  fail:
>     migrate_del_blocker(invtsc_mig_blocker);
> - fail2:
> -    migrate_del_blocker(nested_virt_mig_blocker);
> 
>     return r;
> }
> diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
> index 3057ba4f7d19..06fe06bdb3d6 100644
> --- a/target/i386/kvm_i386.h
> +++ b/target/i386/kvm_i386.h
> @@ -35,6 +35,7 @@
> bool kvm_allows_irq0_override(void);
> bool kvm_has_smm(void);
> bool kvm_has_adjust_clock_stable(void);
> +bool kvm_has_exception_payload(void);
> void kvm_synchronize_all_tsc(void);
> void kvm_arch_reset_vcpu(X86CPU *cs);
> void kvm_arch_do_init_vcpu(X86CPU *cs);
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index 20bda9f80154..c04021937722 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -7,6 +7,7 @@
> #include "hw/isa/isa.h"
> #include "migration/cpu.h"
> #include "hyperv.h"
> +#include "kvm_i386.h"
> 
> #include "sysemu/kvm.h"
> #include "sysemu/tcg.h"
> @@ -232,10 +233,25 @@ static int cpu_pre_save(void *opaque)
>     }
> 
> #ifdef CONFIG_KVM
> -    /* Verify we have nested virtualization state from kernel if required */
> -    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
> -        error_report("Guest enabled nested virtualization but kernel "
> -                "does not support saving of nested state");
> +    /*
> +     * In case the vCPU may have enabled VMX, we need to make sure the kernel
> +     * has the required capabilities in order to perform migration correctly:
> +     *
> +     * 1) We must be able to extract vCPU nested-state from KVM.
> +     *
> +     * 2) In case the vCPU is running in guest-mode and has a pending exception,
> +     * we must be able to determine whether it is in a pending or injected state.
> +     * Note that if KVM does not have the required capability to do so,
> +     * a pending/injected exception will always appear as an
> +     * injected exception.
> +     */
> +    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
> +        (!env->nested_state ||
> +         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
> +          env->exception_injected))) {
> +        error_report("Guest may have enabled nested virtualization but kernel "
> +                "does not support required capabilities to save vCPU "
> +                "nested state");
>         return -EINVAL;
>     }
> #endif
> -- 
> 2.20.1
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
  2019-07-15  9:20     ` [Qemu-devel] " Liran Alon
@ 2019-07-15  9:25       ` Paolo Bonzini
  -1 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2019-07-15  9:25 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: ehabkost, kvm, Joao Martins

On 15/07/19 11:20, Liran Alon wrote:
> Gentle ping.
> 
> Should this be considered for merging into QEMU even though QEMU is now in hard freeze?
> It touches a mechanism that is already merged but is too restrictive.

Yes, I have it queued and will send the pull request later today.

Paolo

> Anyway, I would like this to be reviewed for early feedback even if its merge is delayed.
> 
> Thanks,
> -Liran


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
@ 2019-07-15  9:25       ` Paolo Bonzini
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2019-07-15  9:25 UTC (permalink / raw)
  To: Liran Alon, qemu-devel; +Cc: Joao Martins, ehabkost, kvm

On 15/07/19 11:20, Liran Alon wrote:
> Gentle ping.
> 
> Should this be considered for merging into QEMU even though QEMU is now in hard freeze?
> It touches a mechanism that is already merged but is too restrictive.

Yes, I have it queued and will send the pull request later today.

Paolo

> Anyway, I would like this to be reviewed for early feedback even if its merge is delayed.
> 
> Thanks,
> -Liran



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-07-15  9:25 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-05 21:06 [PATCH 0/4] target/i386: kvm: Various nested-state fixes Liran Alon
2019-07-05 21:06 ` [Qemu-devel] " Liran Alon
2019-07-05 21:06 ` [PATCH 1/4] target/i386: kvm: Init nested-state for VMX when vCPU expose VMX Liran Alon
2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
2019-07-09 17:21   ` Maran Wilson
2019-07-11 13:45   ` Paolo Bonzini
2019-07-11 14:36     ` Liran Alon
2019-07-11 14:36       ` Liran Alon
2019-07-11 17:27       ` Paolo Bonzini
2019-07-11 17:27         ` Paolo Bonzini
2019-07-05 21:06 ` [PATCH 2/4] target/i386: kvm: Init nested-state for vCPU exposed with SVM Liran Alon
2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
2019-07-09 17:21   ` Maran Wilson
2019-07-11 13:35   ` Paolo Bonzini
2019-07-05 21:06 ` [PATCH 3/4] target/i386: kvm: Save nested-state only in case vCPU have set VMXON region Liran Alon
2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
2019-07-09 17:21   ` Maran Wilson
2019-07-05 21:06 ` [PATCH 4/4] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX Liran Alon
2019-07-05 21:06   ` [Qemu-devel] " Liran Alon
2019-07-09 17:21   ` Maran Wilson
2019-07-15  9:20   ` Liran Alon
2019-07-15  9:20     ` [Qemu-devel] " Liran Alon
2019-07-15  9:25     ` Paolo Bonzini
2019-07-15  9:25       ` [Qemu-devel] " Paolo Bonzini
